CSE 291: Genomic Data Science

Spring 2019 - UCSD - San Diego, USA

Lectures:

  • Monday, Wednesday
  • 10:30 - 11:50 am, EBU3B 4140

Logistics

Course staff:

  • Yana Safonova, instructor: isafonova at ucsd dot edu
  • Andrey Bzikadze, teaching assistant: abzikadze at ucsd dot edu

Office hours:

  • Yana: Wed, 1:00 - 2:00 pm, Atkinson Hall 4105 (by appointment)
  • Andrey: Mon, 12:00 - 1:00 pm, EBU3B 4250

Syllabus

UCSD Academic and Administrative Calendar 2018–2019


This graduate course covers various fields of genomic data science, including but not limited to genome assembly, haplotype assembly, analysis of RNA-seq data, computational immunology, and structural genomics. We will discuss computational problems arising in these fields, along with the algorithms and state-of-the-art tools that solve them. Students will complete several homework assignments involving the analysis of real sequencing data and the interpretation of the results. Students will also complete a scientific project dedicated to one of the discussed topics.

Prerequisites for the course:

  • Python or any other programming language
  • Confidence working with the command line
  • Knowledge of algorithms
  • Knowledge of basic biology
  • Access to a server (at least 20 GB of RAM)

The detailed syllabus is available here.

Schedule

(subject to change)

Week 1.

Week 2. Analysis of immunosequencing (Rep-seq) data - 1

Week 3. Analysis of immunosequencing (Rep-seq) data - 2

Week 4. Analysis of immunosequencing (Rep-seq) data - 3

Week 5. Analysis of immunosequencing (Rep-seq) data - 4. Midterm 1.

Week 6. Analysis of immunosequencing (Rep-seq) data - 5.

Week 7. Algorithms and tools for genome assembly (invited lecturer: Anton Bankevich)

  • Mon, May 13th: Genome assembly, part I (slides 1 - 39)
  • Wed, May 15th:
    • Genome assembly, part II (slides 40 - 46)
    • HW #4: Choose a genome assembler and read the paper describing it. Answer the following questions:
      • How does the assembler perform the main genome assembly steps described on slide 45 (lecture slides)?
      • Which steps does the assembler particularly focus on, and what innovations does it introduce into these steps?
      • What assumptions does the assembler make to formulate a computational problem?
      • Recommended assemblers: Celera, ARACHNE, Velvet, SPAdes, IDBA-UD, SOAPdenovo, Canu, Flye

Week 8. Haplotype assembly. RNA-seq data.

Week 9. Analysis of genomic 3D structure using Hi-C data.

Week 10. Midterm 2

  • Mon, Jun 3rd: Journal club
  • Wed, Jun 5th: Midterm 2
  • Sun, Jun 9th, 11:59 pm: Deadline for HW 5

Final week. Presentations of student projects

  • Mon, Jun 10th: 8:00 am - 11:00 am.