CSE 291: Genomic Data Science

Spring 2019 - UCSD - San Diego, USA


  • Monday, Wednesday
  • 10:30 - 11:50 am, EBU3B 4140


Course staff:

  • Yana Safonova, instructor: isafonova at ucsd dot edu
  • Andrey Bzikadze, teaching assistant: abzikadze at ucsd dot edu

Office hours:

  • Yana: Wed, 1:00 - 2:00 pm, Atkinson Hall 4105 (by appointment)
  • Andrey: Mon, 12:00 - 1:00 pm, EBU3B 4250


UCSD Academic and Administrative Calendar 2018–2019

This graduate course covers several areas of genomic data science, including but not limited to genome assembly, haplotype assembly, analysis of RNA-seq data, computational immunology, and structural genomics. We will discuss the computational problems that arise in these fields, along with the algorithms and state-of-the-art tools that solve them. Students will complete a number of homework assignments that involve analyzing real sequencing data and interpreting the results. Students will also complete a scientific project on one of the topics discussed.

Prerequisites for the course:

  • Python or any other programming language
  • Confidence working with the command line
  • Knowledge of algorithms
  • Knowledge of basic biology
  • Access to a server (at least 20 GB of RAM)

The detailed syllabus is available here.


(subject to change)

Week 1.

Week 2. Analysis of immunosequencing (Rep-seq) data - 1

Week 3. Analysis of immunosequencing (Rep-seq) data - 2

Week 4. Analysis of immunosequencing (Rep-seq) data - 3

Week 5. Analysis of immunosequencing (Rep-seq) data - 4. Midterm 1.

Week 6. Analysis of immunosequencing (Rep-seq) data - 5

Week 7. Algorithms and tools for genome assembly (invited lecturer: Anton Bankevich)

  • Mon, May 13th: Genome assembly, part I (slides 1 - 39)
  • Wed, May 15th:
    • Genome assembly, part II (slides 40 - 46)
    • HW #4: Choose a genome assembler and read the paper describing it. Answer the following questions:
      • How does the assembler perform the main genome assembly steps described on slide 45 (lecture slides)?
      • Which steps does the assembler particularly focus on, and what innovations does it introduce into these steps?
      • What assumptions does the assembler make to formulate a computational problem?
      • Recommended assemblers: Celera, ARACHNE, Velvet, SPAdes, IDBA-UD, SOAPdenovo, Canu, Flye

Week 8. Haplotype assembly. RNA-seq data.

Week 9. Analysis of genomic 3D structure using Hi-C data.

Week 10. Midterm 2

  • Mon, Jun 3rd: Journal club
  • Wed, Jun 5th: Midterm 2
  • Sun, Jun 9th, 11:59 pm: Deadline for HW 5

Final week. Presentations of student projects

  • Mon, Jun 10th: 8:00 am - 11:00 am.