Environment Setup
This is not a programming class; this is a “workflows” class
Why Reproducible Research?
To reduce wasted resources.
Computing
- Use the right tool for the job.
- Running computationally intensive jobs on your laptop gives them too little computing power.
- Running jobs that are precise wastes compute time in the cluster that you or someone else could be using for other things.
Money
Funding runs out. Waiting for results to come back, or reproducing old results, takes the time of highly trained individuals, which is expensive.
Time
Save time for your future self, and for researchers who come after you.
- What if you're about to graduate and your hard drive dies (and the results are too large to back up on a USB drive)? How quickly could you reproduce that work?
- How would you work with a new undergraduate in your lab?
- How would they make substantial contributions?
- How could you trust them to work on your research without fear that they may break something?
- When you go to write a manuscript, how would you share your code?
How are we going to be more reproducible?
- Avoiding homegrown solutions and instead favoring community-curated efforts
- Documentation
- Version Control
- Utilizing build systems
How big of a problem is this?
Logging into Sysbio
Just to check that everyone has access and get any issues out of the way!
Connecting Remotely
Utilize the UTD VPN Service:
- OIT support link
- Enroll in NetIDplus
- Install the Cisco AnyConnect VPN Software
- Connect to the Cisco AnyConnect VPN System
Setting up VS Code
- Select the correct install for your platform
- Install
- Search in extensions for Functional Genomics
Login using Remote-SSH
- Click the green button
- Connect to Host...
- Select Add new ssh host
- Enter ssh <netid>@sysbio.utdallas.edu (or see this guide to connect to giant.utdallas.edu if we haven't gotten accounts on sysbio yet)
- Select Linux as the OS of the remote host
- Password is your usual UTD password
- Open up a terminal (Ctrl+Shift+~)
- Run sinfo or ls
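Once the terminal is open, a quick check along these lines can confirm that you are on the remote host and whether Slurm's sinfo is available. This is only a sketch: the scheduler variable is a hypothetical name introduced here, and the fallback to ls simply mirrors the last step of the list.

```shell
# Print the name of the machine this terminal is running on.
uname -n

# sinfo is the Slurm command that lists partitions; fall back to ls if it is absent.
if command -v sinfo >/dev/null 2>&1; then
    scheduler="slurm"   # hypothetical variable, used only for this check
    sinfo
else
    scheduler="none"
    ls
fi
echo "scheduler: $scheduler"
```

If the printed machine name is your own laptop rather than the remote host, the terminal was probably opened in a local VS Code window.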
Caution
If you're on Windows and get the VS Code remote connection error "The process tried to write to a nonexistent pipe", you may need to set your SSH config manually.
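Setting the SSH config manually could look roughly like the sketch below. The NetID abc123456, the alias sysbio, and the scratch file name are placeholders assumed for illustration; only the hostname comes from the login steps. Host, HostName, and User are standard OpenSSH client config keywords.

```shell
# Hypothetical NetID; replace with your own.
NETID="abc123456"

# Write to a scratch file first so the entry can be reviewed before use.
CONFIG_SNIPPET="./sysbio_ssh_config"

cat > "$CONFIG_SNIPPET" <<EOF
Host sysbio
    HostName sysbio.utdallas.edu
    User $NETID
EOF

# Show what was written.
cat "$CONFIG_SNIPPET"

# After reviewing, append it to your real config, for example:
#   cat "$CONFIG_SNIPPET" >> ~/.ssh/config
```

With an entry like this in place, the host should appear under its alias in the Remote-SSH host list. On Windows, the same three config lines can be pasted into your user SSH config file by hand.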
There are plenty of other ways to log in remotely. Here are some alternatives for you to play around with:
Windows:
MacOS:
Biostar Handbook: 8. Installing on a Computer Cluster
We'll be following the textbook for this section!
You'll need to run:
module load anaconda3
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
curl http://data.biostarhandbook.com/install/conda.txt | xargs conda create -n biostars -q -y
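After the create command finishes, a check like the following can confirm that the biostars environment exists. It is a minimal sketch, assuming conda is on the PATH (for example after module load anaconda3); env_status is a hypothetical variable name used only here.

```shell
# Look for an environment named "biostars" in conda's environment list.
if command -v conda >/dev/null 2>&1; then
    if conda env list | grep -q '^biostars[[:space:]]'; then
        env_status="present"
    else
        env_status="missing"
    fi
else
    # conda itself was not found; the module may not be loaded in this shell.
    env_status="no-conda"
fi
echo "biostars environment: $env_status"

# If it is present, it can usually be activated with:
#   conda activate biostars
```

If the status is missing, rerunning the create command and reading its output without the -q flag may show which package failed to install.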
Update prompt
curl http://data.biostarhandbook.com/install/bashrc.txt >> ~/.bashrc
curl http://data.biostarhandbook.com/install/bash_profile.txt >> ~/.bash_profile
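Because >> appends to the files, it may be worth backing them up before running the two commands above, so that a repeated or unwanted append is easy to undo. The helper backup_file below is a hypothetical name, not part of the textbook's instructions.

```shell
# Copy a file to <file>.bak if it exists; do nothing otherwise.
backup_file() {
    if [ -f "$1" ]; then
        cp -p "$1" "$1.bak"
    fi
}

# Back up the two startup files that the curl commands append to.
backup_file "$HOME/.bashrc"
backup_file "$HOME/.bash_profile"
```

To restore a file, copy the .bak version back over it, for example cp ~/.bashrc.bak ~/.bashrc.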
Assignment 1
Create a GitHub account and submit your username through elearning.