Welcome to BIS180L

Dear Students,

Welcome to BIS180L. I am looking forward to teaching you in this class. This class will give you hands-on experience with bioinformatics data analysis.

Class will meet in person in SLB 2020.

Pre-recorded Lectures

For the most part I will provide lectures in a pre-recorded format. This will allow the best use of our time together in the classroom. Pre-recorded lectures will be available no later than 9AM the day before the lecture/lab time. You will need to watch them in advance and answer the embedded quiz questions via PlayPosit. The quiz is due at 9AM the day of lecture. This gives me and the TA time to review questions before class.

Class meeting time

Even though lectures are pre-recorded, we will still meet at 1:10 PM.

In class

We will devote the beginning of each class / lab to questions on the lecture material. You will then work on the lab material in small groups of two or three. John (TA) and I will be available to help when you have questions.

I expect you to be in class/lab each day at 1:10. If you have a Covid or other health emergency that prevents you from coming to class, please let me know (in advance if possible).

Outside of class

You can expect to spend a fair bit of time outside of class working on the assignments. This can be done either in one of the computer labs or on your laptop. If you are not willing to put in time outside of class, this is probably not the best class for you. If you do put in the time outside of class, you will be rewarded by learning a great deal about how to perform bioinformatics analyses.

Reading Material / Text Book

We will use Vince Buffalo’s excellent book Bioinformatics Data Skills.

This book is available online for free through the UC Davis Library (on campus or VPN connection required). UC is licensed for simultaneous access by 28 people. If that doesn’t work for you, you can buy it directly from the publisher or from Amazon. As of this posting Amazon is cheaper and has used copies as well.

We will also use Hadley Wickham’s equally excellent book R for Data Science, which is available online for free. If you would like a physical copy, here is the Amazon link.

Installing Requisite Software on your Mac

If you want to try doing the labs directly on your Mac instead of on JetStream, here is how. This document may have errors; if you find one, please let me know and I can help (and update the document).

web downloads

Install the following packages by clicking on the link and following the instructions on the web page:

  • The Atom editor
  • The R statistical programming language
  • The R IDE RStudio
  • The sequence alignment viewer Jalview (may not be used)
  • The network viewer Cytoscape (may not be used)

Atom packages

Open Atom, click on File > Preferences, and then install the following Atom packages. Not all of these are needed for BIS180L, but they are good to have.

  • Sublime-Style_Column-Selection
  • language-markdown
  • git-plus
  • markdown-pdf
  • line-ending-converter

In Atom go to preferences and search for the ‘whitespace’ package. Disable it.

XCode command line tools

Installing packages from source requires the compilers that come with Xcode, Apple’s development software.

We only need the Xcode command-line tools, which you can install by entering the command below in the Unix terminal:

xcode-select --install

(If you prefer, you can install the whole GUI instead but it takes up a lot of space and you will probably never use it.)
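
To see whether the command-line tools are already present before running the installer, a small check like the one below may help. This is only a sketch: the function name clt_status is made up for illustration, and it relies on xcode-select -p returning a non-zero exit status when no developer directory is configured.

```shell
# Report whether the Xcode command-line tools appear to be installed.
# clt_status is a hypothetical helper name, not part of Xcode.
clt_status() {
  if xcode-select -p >/dev/null 2>&1; then
    echo "installed"
  else
    echo "missing"
  fi
}

clt_status
```

If this prints "missing", run the xcode-select --install command above.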

Homebrew

Homebrew is a package manager for OS X that makes it much easier to install Linux/Unix packages.

Go to the Homebrew webpage and follow the install instructions there.

After Homebrew is installed, run the following commands from a terminal window.

Upgrade Homebrew:

brew update
brew upgrade

Install the following packages through Homebrew:

brew tap brewsci/bio # additional packages
brew install htop
brew install wget
brew install git
brew install --cask igv
brew install mafft
brew install bowtie2
brew install bwa
brew install bwa-mem2
brew install samtools
brew install bedtools
brew install blast
brew install emboss
brew install gatk
brew install freebayes
brew install fasttree
brew install star
brew install fastqc
brew install --cask dendroscope
brew install --cask mega
brew install --cask git-it
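
Once the installs finish, it can be worth confirming that the tools are actually on your PATH. The sketch below defines a hypothetical helper, missing_tools, that prints the name of every command it cannot find. The executable names are assumptions (for example, the blast formula is assumed to provide blastn), and only a subset of the packages above is checked.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Executable names are assumptions; some differ from the Homebrew formula name.
missing_tools htop wget git mafft bowtie2 bwa samtools bedtools blastn freebayes fastqc
```

No output means every listed command was found. Any name that is printed probably needs its brew install step repeated.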

R packages

Open RStudio and then

install.packages(c('swirl','genetics','hwde','seqinr','qtl','evaluate','formatR','highr','markdown','yaml','htmltools','caTools','bitops','knitr','rmarkdown','devtools','shiny','pvclust','gplots','cluster','igraph','scatterplot3d','ape','SNPassoc','rsconnect','dplyr','tidyverse','learnr','LDheatmap'), dependencies=TRUE)

Still within R, install Bioconductor

install.packages("BiocManager")
BiocManager::install(c("Rsubread","snpStats","rtracklayer","goseq","impute","multtest","VariantAnnotation","chopsticks","edgeR"))

perl modules

For auto_barcode to work, the following must be installed (From the Unix command line):

sudo cpan install Statistics::Descriptive
sudo cpan install Statistics::R
sudo cpan install Text::Levenshtein::XS
sudo cpan install Text::Table
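
To check whether the modules loaded correctly, one option is to ask perl to load each one and report the result. This is a sketch rather than part of auto_barcode; perl_module_ok is a hypothetical helper name.

```shell
# Succeed (exit status 0) if the named Perl module can be loaded.
# perl_module_ok is a hypothetical helper name used for illustration.
perl_module_ok() {
  perl -M"$1" -e 1 >/dev/null 2>&1
}

# Report the status of each module that auto_barcode needs.
for mod in Statistics::Descriptive Statistics::R Text::Levenshtein::XS Text::Table; do
  if perl_module_ok "$mod"; then
    echo "ok       $mod"
  else
    echo "MISSING  $mod"
  fi
done
```

Any module reported as MISSING can be reinstalled with the matching cpan command above.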

tophat

cd /usr/local/bin
wget https://ccb.jhu.edu/software/tophat/downloads/tophat-2.1.1.OSX_x86_64.tar.gz
tar -xvzf tophat-2.1.1.OSX_x86_64.tar.gz
ln -s tophat-2.1.1.OSX_x86_64/tophat ./

add class data

IMPORTANT: I AM GIVING THIS A DIFFERENT PATH THAN USED IN THE LAB TO REDUCE THE CHANCE OF YOU OVERWRITING FILES ON YOUR COMPUTER. On the website this is ~/data, but here it is ~/bis180l/data. ~/bis180l/data will be overwritten, so make sure that nothing critical is there.

cd
mkdir -p bis180l
cd bis180l
wget http://malooflab.phytonetworks.org/media/maloof-lab/filer_public/8f/d5/8fd59de6-e311-4d50-8320-acc58402982f/bis180l_class_data_2020tar.gz
tar xzvf bis180l_class_data_2020tar.gz
rm bis180l_class_data_2020tar.gz
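
If ~/bis180l/data already exists and you are not sure what is in it, one cautious option is to move it aside before running the tar command above. The sketch below assumes the archive unpacks into a directory named data; backup_dir is a hypothetical helper name, and the timestamped backup name is just one possible convention.

```shell
# Move an existing directory aside, adding a timestamp to its name,
# so that it is not overwritten. Does nothing if the directory is absent.
# backup_dir is a hypothetical helper name used for illustration.
backup_dir() {
  if [ -d "$1" ]; then
    mv "$1" "$1.backup-$(date +%Y%m%d-%H%M%S)"
  fi
}

backup_dir "$HOME/bis180l/data"
```

After extraction you can compare the backup with the new data directory and delete the backup once you are sure nothing in it is needed.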

Other

I still need to add instructions for trimmomatic and auto_barcode, but the instructions in the labs should more or less work.

fastStructure

I can’t get this to work, so you will need to use your instance for this one.