CSE 291: Genomic Data Science
Spring 2019 - UCSD - San Diego, USA
Lectures:
- Monday, Wednesday
- 10:30 - 11:50 am, EBU3B 4140
Logistics
- Join the class on Gradescope using the code 922GZP
- Join the class on Piazza for discussions
Course staff:
- Yana Safonova, instructor: isafonova at ucsd dot edu
- Andrey Bzikadze, teaching assistant: abzikadze at ucsd dot edu
Office hours:
- Yana: Wed, 1:00 - 2:00 pm, Atkinson Hall 4105 (by appointment)
- Andrey: Mon, 12:00 - 1:00 pm, EBU3B 4250
Syllabus
UCSD Academic and Administrative Calendar 2018–2019
This graduate course covers several areas of genomic data science, including but not limited to genome assembly, haplotype assembly, analysis of RNA-seq data, computational immunology, and structural genomics. We will discuss computational problems arising in these fields, along with the algorithms and state-of-the-art tools that solve them. Students will complete a number of homework assignments involving the analysis of real sequencing data and interpretation of the results. Students will also complete a scientific project dedicated to one of the topics discussed.
Prerequisites for the course:
- Python or any other programming language
- Confidence working with the command line
- Knowledge of algorithms
- Knowledge of basic biology
- Access to a server (at least 20 GB of RAM)
The detailed syllabus is available here.
Schedule (subject to change)
Week 1.
- Mon, Apr 1st: Introduction to genomic data science
- Wed, Apr 3rd: Sequencing Data. History of immunology
Week 2. Analysis of immunosequencing (Rep-seq) data - 1
- Mon, Apr 8th:
- Wed, Apr 10th: Repertoire sequencing data: protocols, artifacts, and error correction. Repertoire construction problem.
Week 3. Analysis of immunosequencing (Rep-seq) data - 2
- Mon, Apr 15th:
- Population analysis of immunoglobulin genes and immunoglobulin locus
- HW #2: Finding IGH genes in the reference genome
- Deadline for submission of final project abstracts
Week 4. Analysis of immunosequencing (Rep-seq) data - 3
- Mon, Apr 22nd:
- Wed, Apr 24th:
- Sun, Apr 28th, 11:59 pm: Deadline for HW 1 & 2.
Week 5. Analysis of immunosequencing (Rep-seq) data - 4. Midterm 1.
- Mon, Apr 29th:
- Wed, May 1st: Midterm 1
- Sun, May 5th, 11:59 pm: Deadline for HW 3
Week 6. Analysis of immunosequencing (Rep-seq) data - 5.
- Mon, May 6th: Properties of antibodies from various vertebrate species. TCR repertoires.
- Wed, May 8th: Interim presentations of final projects
Week 7. Algorithms and tools for genome assembly (invited lecturer: Anton Bankevich)
- Mon, May 13th: Genome assembly, part I (slides 1 - 39)
- Wed, May 15th:
- Genome assembly, part II (slides 40 - 46)
- HW #4: Choose a genome assembler and read the paper describing it. Answer the following questions:
- How does the assembler perform the main genome assembly steps described on slide 45 (lecture slides)?
- Which steps does the assembler particularly focus on, and what innovations does it introduce into them?
- What assumptions does the assembler make to formulate a computational problem?
- Recommended assemblers: Celera, ARACHNE, Velvet, SPAdes, IDBA-UD, SOAPdenovo, Canu, Flye
Week 8. Haplotype assembly. RNA-seq data.
- Mon, May 20th:
- Genome assembly, part III (slides 47 - 71).
- Haplotype assembly, part I (slides 1 - 32)
- Wed, May 22nd:
- Haplotype assembly, part II (slides 33 - 51).
- Analysis of RNA-seq data.
- HW #5: Differential Gene Expression
- Sun, May 26th, 11:59 pm: Deadline for HW 4
Week 9. Analysis of genomic 3D structure using Hi-C data.
- Mon, May 27th: Memorial Day observance
- Wed, May 29th: Analysis of Hi-C data
Week 10. Midterm 2
- Mon, Jun 3rd: Journal club
- Wed, Jun 5th: Midterm 2
- Sun, Jun 9th, 11:59 pm: Deadline for HW 5
Final week. Presentations of student projects
- Mon, Jun 10th: 8:00 am - 11:00 am.