CSE428: Computational Biology Capstone

Project Options:


Projects will involve problems in "Regulatory Genomics", where the goal is to predict various types of information from genomic DNA sequence. Each project will consist of two phases: i) reproducing existing results, and ii) implementing a novel extension.


i) The "reproducibility phase" will provide an opportunity to jointly make progress on understanding the research area and the current state-of-the-art methods and results.

  • All groups will work with the same data and model during the first 5 weeks of the course.

  • The goal is to re-implement a multi-task CNN, using the data provided here (and the corresponding GitHub repository here).

  • During this phase, groups will rigorously evaluate the re-implementation of the model and assess their results.
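
To make the term "multi-task CNN" concrete, the following is a minimal illustrative sketch of a forward pass in plain Python. It is not the course model. It assumes a one-hot encoding over A/C/G/T, a single convolutional layer with ReLU and global max pooling, and one sigmoid output per task. All function names, layer sizes, and the random weights are hypothetical and chosen only for illustration.

```python
import math
import random

ALPHABET = "ACGT"  # assumed base ordering for the one-hot encoding


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Bases outside the alphabet (for example N) become all-zero rows.
    """
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def conv1d_relu_maxpool(x, filters):
    """Slide each filter along the sequence, apply ReLU, keep the max per filter."""
    pooled = []
    for filt in filters:  # each filter is a (width x 4) weight matrix
        width = len(filt)
        best = 0.0  # starting at 0 is equivalent to ReLU followed by max pooling
        for start in range(len(x) - width + 1):
            act = sum(filt[i][j] * x[start + i][j]
                      for i in range(width) for j in range(4))
            best = max(best, act)
        pooled.append(best)
    return pooled


def multi_task_head(features, weights, biases):
    """Produce one sigmoid output per task from the shared features."""
    outputs = []
    for w, b in zip(weights, biases):
        z = sum(wi * fi for wi, fi in zip(w, features)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def predict(seq, filters, weights, biases):
    """Full forward pass: encode, convolve and pool, then score every task."""
    features = conv1d_relu_maxpool(one_hot(seq), filters)
    return multi_task_head(features, weights, biases)


def random_params(n_filters, width, n_tasks, seed=0):
    """Hypothetical random initialization; a real model would be trained."""
    rng = random.Random(seed)
    filters = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(width)]
               for _ in range(n_filters)]
    weights = [[rng.uniform(-1, 1) for _ in range(n_filters)]
               for _ in range(n_tasks)]
    biases = [0.0] * n_tasks
    return filters, weights, biases


if __name__ == "__main__":
    filters, weights, biases = random_params(n_filters=8, width=5, n_tasks=3)
    probs = predict("ACGTTGCANNACGTAGCTAG", filters, weights, biases)
    print([round(p, 3) for p in probs])  # one probability per task
```

The design point this illustrates is that every task shares the same convolutional features and differs only in its output weights. In practice, a re-implementation would likely use a deep learning framework and learn the weights from the provided data.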


ii) The "model extension phase" will provide an opportunity for each group to use its creativity and knowledge to pose a research question, implement a new approach, and evaluate the results.

  • Each group will have the freedom to pose its own question that extends the model created in phase i.

  • We will devote a lecture to brainstorming viable, interesting, well-formulated directions.