Genome browsers are now a fundamental tool in research labs for accessing the sequence and annotations of hundreds of species. Here we will use one of the most popular: the UCSC Genome Browser. We will focus on the protocol for navigating the basic information that can be used to annotate genes and regulatory regions in the human genome. Other genomes will be used to explore the comparative genomics capabilities of these bioinformatics tools. One of the most powerful features of UCSC is how easy it is for non-expert users to upload their own data and integrate it with existing annotations. Users can also save multiple configurations of custom tracks as password-protected sessions that can be shared with collaborators. In this exercise we will create a UCSC account for our sessions and upload our own custom annotation tracks in BED, bedGraph and interact formats.
[ETC: 5 mins]
Open the UCSC Genome Browser:
Find the Help link (Browser documentation)
Read the Getting started using Sessions text
Open the Sessions Tool and Create an account
Once you receive the confirmation e-mail, follow the activation link
Now log in to your account
[ETC: 15 mins]
Click the Genomes link in the menu on top:
Select to work with Human
Explore the Human assembly box
Set the Human assembly to hg38
Read the Search the assembly instructions
Search for the LRRTM1 gene (RefSeq Genes track in the output)
Notice general features of this gene: location, exons, strand, etc.
Play with the move/zoom in/zoom out/base buttons
Practice dragging the landscape and zooming in the viewer
Change the order of the tracks by dragging and dropping them vertically
Learn to use the reverse and resize buttons
Click and drag over the ruler at the top to select a region
First, highlight the region in orange
Second, repeat the selection and zoom in instead
Use View->DNA to get the DNA sequence of the current window
Use View->PDF to get a graphical representation of this region
Click the Genome Browser link in the top menu to return
Press configure: increase text size and press submit
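The coordinates typed into the browser position box and those used later in BED custom tracks follow different conventions: the position box is 1-based with both ends included, while BED is 0-based and half-open. The following Python sketch converts between the two. The function names are illustrative and are not part of any UCSC tool.

```python
import re


def position_to_bed(position):
    """Convert a browser position such as 'chr2:80,301,879-80,304,752'
    (1-based, both ends included) into a (chrom, start, end) tuple
    in BED convention (0-based, half-open)."""
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", position)
    if match is None:
        raise ValueError(f"not a valid position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in: {position!r}")
    # Only the start shifts by one; the end value stays the same.
    return chrom, start - 1, end


def bed_to_position(chrom, start, end):
    """Convert BED coordinates back into a browser position string."""
    return f"{chrom}:{start + 1:,}-{end:,}"


if __name__ == "__main__":
    print(position_to_bed("chr2:80,301,879-80,304,752"))
    print(bed_to_position("chr2", 80301878, 80304752))
```

Keeping this one-base shift in mind helps when comparing a region selected in the viewer with the lines pasted into a custom track.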
[ETC: 20 mins]
Let us focus on the blocks of data tracks:
Press the hide all button
Examine the different blocks of options
Find the NCBI RefSeq Genes supertrack in the Genes and Gene Prediction Tracks block
Click the NCBI RefSeq Genes link to get more information
Set the UCSC RefSeq subtrack to pack and press Submit
Switch NCBI RefSeq Genes to the dense or squish visibility mode
Use the refresh button to update the screen
Return to pack visualization mode
Click one RefSeq exon of LRRTM1 in the viewer to access the RefSeq Gene record
Use the Entrez Gene, PUBMED and OMIM links to gain knowledge on this gene
Scroll down to the Links to sequence section of this record
Learn how to download the protein and the transcript sequences
Click the Genomic Sequence link to open the sequence retrieval interface
Extract the CDS sequence of LRRTM1
Extract the promoter sequence (1 kb)
Click the Genome Browser link in the top menu to return
Find the Conservation track in the Comparative Genomics block
Change the visualization mode to pack and refresh
Explore the new tracks added to the screen
Click the Conservation link or the grey bar on the left to open the configuration menu
Hide all the subtracks
Set phastCons alignment for vertebrates to pack and press submit
Zoom out 10x and analyze the conservation landscape of the LRRTM1 gene
Right-click the Conservation track in the browser
Set the phastCons track to dense
Right-click again and choose View image
Left-click the Conservation track to switch to full mode
Click the same track again to access its statistics
Click the Genome Browser link in the top menu to return
Go to My Data -> My Sessions
In the Save Settings block, save your session as UB_Master_Session1_hg38
Use the link to this session to reopen this session in a separate tab
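As a complement to the sequence retrieval steps above, the 1 kb promoter can also be expressed in coordinates: it is the 1000 bases immediately upstream of the transcription start, which lies at the left end of the gene on the + strand and at the right end on the - strand. Below is a minimal Python sketch, assuming BED-style (0-based, half-open) coordinates. The function name and the example coordinates are hypothetical, and the result is only clipped at the start of the chromosome, not at its end.

```python
def promoter_interval(tx_start, tx_end, strand, size=1000):
    """Return (start, end) of the `size` bases upstream of a transcript.

    tx_start and tx_end are BED-style coordinates (0-based, half-open)
    of the transcript; strand is '+' or '-'.
    """
    if tx_start < 0 or tx_end <= tx_start:
        raise ValueError("transcript coordinates must satisfy 0 <= start < end")
    if strand == "+":
        # Transcription starts at the left end: upstream is to the left.
        start, end = tx_start - size, tx_start
    elif strand == "-":
        # Transcription starts at the right end: upstream is to the right.
        start, end = tx_end, tx_end + size
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the beginning of the chromosome.
    return max(start, 0), end


if __name__ == "__main__":
    # Hypothetical transcript spanning 5000-8000 on each strand.
    print(promoter_interval(5000, 8000, "+"))
    print(promoter_interval(5000, 8000, "-"))
```

Checking the strand of LRRTM1 in the viewer before reading the extracted promoter sequence shows which side of the gene the retrieved bases come from.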
[ETC: 20 mins]
We are going to explore how to expand the collection of UCSC tracks:
Click the Genome Browser link in the top menu to return home
Go back to the RefSeq internal entry for the LRRTM1 gene
Download the mRNA sequence
On the main page, open the BLAT program (Tools)
Paste the LRRTM1 transcript and press the I'm feeling lucky button
Click the BLAT track in the genome viewer to see the alignments
Try the same with the predicted protein sequence
Repeat both BLAT queries, this time pressing Submit, to explore the list of hits
Switch to Mouse in the BLAT interface and repeat the query with the human transcript
Open the Help->Browser Documentation link
Read the Creating and managing custom annotation tracks text
Read the documentation on the BED format
Go back to the browser (hg38)
Press the Add custom tracks button
Copy and paste this line:
chr2 80301878 80304752
Press Submit
Press the go button and examine the result
Press the Manage custom tracks button
Click the User track link
Edit the track line to have this:
track name=mygene description=text color=200,100,50
Press the go button and examine the result
Explore this website to find more RGB colors
Go to My Data -> My Sessions and save the session
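The custom track pasted in the steps above consists of a track line followed by one BED line per region. The same text can be generated with a short script, which is convenient once there are more than a few regions. This is a minimal Python sketch: the helper names are made up for this example, and the lines are tab-separated, which the BED format accepts.

```python
def _quote(value):
    """Quote a track-line value only when it contains spaces."""
    return f'"{value}"' if " " in value else value


def track_line(name, description, color):
    """Build a track line; color is an (R, G, B) tuple of 0-255 integers."""
    if len(color) != 3 or not all(0 <= v <= 255 for v in color):
        raise ValueError("color must be three integers between 0 and 255")
    r, g, b = color
    return (f"track name={_quote(name)} "
            f"description={_quote(description)} color={r},{g},{b}")


def bed_line(chrom, start, end):
    """Build a three-column BED line (0-based start, end not included)."""
    if start < 0 or end <= start:
        raise ValueError("BED coordinates must satisfy 0 <= start < end")
    return f"{chrom}\t{start}\t{end}"


if __name__ == "__main__":
    # Same track and region as in the exercise.
    lines = [
        track_line("mygene", "text", (200, 100, 50)),
        bed_line("chr2", 80301878, 80304752),
    ]
    print("\n".join(lines))
```

The printed text can be pasted into the Add custom tracks box in place of the lines typed by hand.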
Open the Help->Browser Documentation link
Read the Creating and managing custom annotation tracks text
Read the documentation on the bedGraph format
Click the Genome Browser link in the top menu to return
Press Manage custom tracks and then Add custom tracks
Enter the following lines:
track type=bedGraph name=myprofile description=text color=50,100,200
chr2 80301000 80302000 5
chr2 80302000 80303000 10
chr2 80303000 80304000 20
chr2 80304000 80305000 10
chr2 80305000 80306000 5
Press the go button and examine the result
Open the properties of the myprofile track (via its link, a right-click or the grey bar on the left)
Set the visibility mode to full and the scaling to the vertical viewing range (min 0, max 20)
Change the Track height to 60
Go to My Data -> My Sessions and save the session
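Before uploading a longer bedGraph profile, it can help to check the data lines and to find the value range, which is what the vertical viewing range above is set from. The Python sketch below parses the bedGraph text from this exercise, checks that each interval is well formed and that consecutive intervals on the same chromosome do not overlap, and returns the minimum and maximum values. The function names are illustrative, and treating overlaps as errors is an assumption of this sketch; it also assumes the lines are already sorted by position.

```python
EXAMPLE = """track type=bedGraph name=myprofile description=text color=50,100,200
chr2 80301000 80302000 5
chr2 80302000 80303000 10
chr2 80303000 80304000 20
chr2 80304000 80305000 10
chr2 80305000 80306000 5
"""


def parse_bedgraph(text):
    """Return a list of (chrom, start, end, value) from bedGraph text."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and track/browser lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()
        rows.append((chrom, int(start), int(end), float(value)))
    return rows


def check_bedgraph(rows):
    """Validate intervals and return (min_value, max_value)."""
    if not rows:
        raise ValueError("no data lines found")
    for chrom, start, end, _ in rows:
        if start < 0 or end <= start:
            raise ValueError(f"bad interval: {chrom} {start} {end}")
    for prev, curr in zip(rows, rows[1:]):
        # Consecutive intervals on one chromosome must not overlap.
        if prev[0] == curr[0] and curr[1] < prev[2]:
            raise ValueError(f"overlapping intervals at {curr[0]} {curr[1]}")
    values = [row[3] for row in rows]
    return min(values), max(values)


if __name__ == "__main__":
    print(check_bedgraph(parse_bedgraph(EXAMPLE)))
```

For the profile in this exercise the values run from 5 to 20, so a viewing range of 0 to 20 shows the full signal.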
Click the Genome Browser link in the top menu to return
Read the Creating and managing custom annotation tracks text
Read the documentation on the interact format
Go back to the browser
Press the Manage custom tracks button and then Add custom tracks
Enter the following lines:
track type=interact name="myinteraction" description="text" interactDirectional=true maxHeightPixels=50:50:50 visibility=full
chr2 80301878 80304752 sample 0 1 tissue #00AA00 chr2 80301878 80301878 LRRTM1 . chr2 80304752 80304752 INTERACTION +
Press the Go to genome browser button and examine the result
Go to My Data -> My Sessions and save the session
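The interact line used above is long, and it is easy to lose track of which column is which. The following Python sketch labels the 18 whitespace-separated columns of that line. The field names are given as they appear in the interact format documentation to the best of our knowledge, so check them against the version you read in the previous step. The function name is hypothetical.

```python
INTERACT_FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "value", "exp",
    "color", "sourceChrom", "sourceStart", "sourceEnd", "sourceName",
    "sourceStrand", "targetChrom", "targetStart", "targetEnd",
    "targetName", "targetStrand",
]

INTEGER_FIELDS = ("chromStart", "chromEnd", "score",
                  "sourceStart", "sourceEnd", "targetStart", "targetEnd")

EXAMPLE = ("chr2 80301878 80304752 sample 0 1 tissue #00AA00 "
           "chr2 80301878 80301878 LRRTM1 . "
           "chr2 80304752 80304752 INTERACTION +")


def parse_interact(line):
    """Split one interact data line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(INTERACT_FIELDS):
        raise ValueError(
            f"expected {len(INTERACT_FIELDS)} columns, found {len(values)}")
    record = dict(zip(INTERACT_FIELDS, values))
    # Convert the numeric columns so they can be compared or sorted.
    for key in INTEGER_FIELDS:
        record[key] = int(record[key])
    record["value"] = float(record["value"])
    return record


if __name__ == "__main__":
    for field, value in parse_interact(EXAMPLE).items():
        print(f"{field:>13}: {value}")
```

Printing the line this way makes it easier to see that the source end is labeled LRRTM1 and the target end INTERACTION before the track is uploaded.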