Version: Summer 23

Environment Setup

This is not a programming class; this is a “workflows” class

Why reproducible research?

To reduce wasted resources.

Computing

  • Use the right tool for the job.
  • Running computationally intensive jobs on your laptop gives them too little computing power.
  • Running jobs that are precise wastes compute time on the cluster that you or someone else could be using for other things.

Money

Funding runs out. Waiting for results to come back or reproducing old results takes the time of highly trained individuals, which is expensive.

Time

Save time for your future self, and for researchers who come after you.

  • What if you're about to graduate and your hard drive dies (and the results are too large to back up on a USB drive)? How quickly could you reproduce that work?
  • How would you work with a new undergraduate in your lab?
    • How would they make substantial contributions?
    • How could you trust them to work on your research without fearing that they might break something?
  • When you go to write a manuscript, how would you share your code?

How are we going to be more reproducible?

  • Avoiding homegrown solutions in favor of community-curated efforts
  • Documentation
  • Version Control
  • Utilizing build systems

How big of a problem is this?

Awesome Reproducible Research

Logging into Sysbio

Just to check that everyone has access and get any issues out of the way!

Connecting Remotely

Utilize the UTD VPN Service:

  • OIT support link
  • Enroll in NetIDplus
  • Install the Cisco AnyConnect VPN Software
  • Connect to the Cisco AnyConnect VPN System

Setting up VS Code

  1. Select the correct installer for your platform
  2. Install
  3. Search the Extensions view for "Functional Genomics"

Login using Remote-SSH

  1. Click the green remote button in the bottom-left corner of the VS Code window
  2. Choose Connect to Host...
  3. Select Add New SSH Host...
  4. Enter ssh <netid>@sysbio.utdallas.edu (or see this guide to connect to giant.utdallas.edu if we haven't gotten accounts on sysbio yet)
  5. Select Linux as the OS of the remote host
  6. Enter your usual UTD password
  7. Open a terminal (Ctrl+Shift+~)
  8. Run sinfo or ls to confirm that the connection works
caution

If you're on Windows and get the VS Code Remote connection error "The process tried to write to a nonexistent pipe", you need to set your SSH config manually.
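
Setting the SSH config manually means adding a host entry to the config file in your user's .ssh folder. A minimal sketch, assuming the hostname from step 4 above; the alias sysbio is only an illustrative label, and <netid> stays a placeholder for your own NetID:

```
# ~/.ssh/config (on Windows this is usually C:\Users\<you>\.ssh\config)
# "sysbio" is a hypothetical alias; pick any short name you like.
Host sysbio
    HostName sysbio.utdallas.edu
    User <netid>
```

With an entry like this in place, the alias should appear in the Remote-SSH Connect to Host... list, and ssh sysbio should work from a terminal.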

There are plenty of other ways to login remotely. Here are some alternatives for you to play around with:

Windows:

MacOS:

Biostar Handbook: 8. Installing on a Computer Cluster

We'll be following the textbook for this section!

You'll need to run:

module load anaconda3
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

curl http://data.biostarhandbook.com/install/conda.txt | xargs conda create -n biostars -q -y
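
Before moving on, it can help to confirm that the environment was actually created. The sketch below is not from the textbook; check_env is a hypothetical helper name, and it assumes that conda env list prints one environment name per line in its first column:

```shell
# check_env NAME: report whether a conda environment called NAME exists.
# Hypothetical helper for illustration; not part of conda or the handbook.
check_env() {
  env_name="$1"
  # conda is only on the PATH after `module load anaconda3`.
  if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found: run 'module load anaconda3' first"
    return 1
  fi
  # The first column of `conda env list` holds the environment names.
  if conda env list | awk '{print $1}' | grep -qx "$env_name"; then
    echo "environment '$env_name' is available"
  else
    echo "environment '$env_name' is missing"
    return 1
  fi
}

# Check for the environment created above; do not abort the shell on failure.
check_env biostars || true
```

Once the environment is reported as available, conda activate biostars switches into it.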

Update prompt

curl http://data.biostarhandbook.com/install/bashrc.txt >> ~/.bashrc
curl http://data.biostarhandbook.com/install/bash_profile.txt >> ~/.bash_profile
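
Because >> appends, running these two commands more than once adds the same lines again. One cautious option, sketched here as an assumption rather than a step from the textbook, is to copy the startup files before appending so you can restore them; backup_file is a hypothetical helper name:

```shell
# backup_file PATH: copy PATH to PATH.bak, preserving timestamps and permissions.
# Does nothing if PATH does not exist yet. Hypothetical helper for illustration.
backup_file() {
  target="$1"
  if [ -f "$target" ]; then
    cp -p "$target" "$target.bak"
    echo "backed up $target to $target.bak"
  fi
}

# Run these before the two curl commands above.
backup_file "$HOME/.bashrc"
backup_file "$HOME/.bash_profile"
```

To undo the change later, copy the .bak file back over the original.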

Assignment 1

Create a GitHub account and submit your username through eLearning.