# Research

I'm currently an associate research scientist in Mark Gerstein's lab in Computational Biology and Bioinformatics at Yale. My interests are in computational neuroscience, genomics, machine learning and evolutionary theory. I'm also interested in music theory and analysis (the area of my PhD). Below is a summary of some of my work in these areas.

## Neuroscience and Genomics

The brain exhibits multiple layers of organization and dynamics, from the genetic and molecular to the behavioral and computational. In our work on PsychENCODE below, we developed an integrative deep-learning approach for tracing risk for psychiatric disorders across multiple levels of organization. Further, our forthcoming work on cancer genomics develops an approach that extends the standard individual-driver-based model of cancer progression to include the effects of weak drivers acting in tandem.

Emani, P.S.*, **Warrell, J.*,** Anticevic, A., Bekiranov, S., Gandal, M., McConnell, M.J., Sapiro, G., Aspuru-Guzik, A., Baker, J., Bastiani, M., McClure, P., Murray, J., Sotiropoulos, S.N., Taylor, J., Senthil, J., Lehner, T., Gerstein, M., Harrow, A.W., 2021. Quantum Computing at the Frontiers of Biological Sciences. *Nature Methods*.

**(* equal contribution)**

Sushant Kumar*, **Jonathan Warrell***, Shantao Li, Patrick D. McGillivray, William Meyerson, Leonidas Salichos, Arif Harmanci, Alexander Martinez-Fundichely, Calvin W.Y. Chan, Morten Muhlig Nielsen, Lucas Lochovsky, Yan Zhang, Xiaotong Li, Jakob Skou Pedersen, Carl Herrmann, Gad Getz, Ekta Khurana, Mark B. Gerstein, 2020. Passenger mutations in more than 2500 cancer genomes: Overall molecular functional impact and consequences. *Cell*.

**(* equal contribution)** [Weak driver additive effects code]

D. Wang*, S. Liu*, **J. Warrell***, H. Won*, X. Shi*, F. C. P. Navarro*, D. Clarke*, M. Gu*, P. Emani*, Y. T. Yang, M. Xu, M. J. Gandal, S. Lou, J. Zhang, J. J. Park, C. Yan, S. K. Rhie, K. Manakongtreecheep, H. Zhou, A. Nathan, M. Peters, E. Mattei, D. Fitzgerald, T. Brunetti, J. Moore, Y. Jiang, K. Girdhar, G. E. Hoffman, S. Kalayci, Z. H. Gumus, G. E. Crawford, PsychENCODE Consortium, P. Roussos, S. Akbarian, A. E. Jaffe, K. P. White, Z. Weng, N. Sestan, D. H. Geschwind, J. A. Knowles, M. B. Gerstein, 2018. Comprehensive functional genomic resource and integrative model for the human brain. *Science*, *362*(6420), p.eaat8464.

**(* equal contribution)** [DSPN code]

## Interpretable Machine Learning

Deep learning provides models of universal function classes, and has proved capable of learning generalizable predictive models for real-world tasks of (effectively) arbitrary complexity. To integrate the knowledge learnt by such models with prior scientific and cultural knowledge, techniques are required to analyze or interpret the learnt models. In the papers below, we develop techniques for extracting and representing knowledge in deep neural nets, and analyze the sense in which deep-learning models carry implicit semantics.

**J. Warrell***, H. Mohsen*, P. Emani and M. B. Gerstein, 2021. Interpretability and Implicit Model Semantics in Biomedicine and Deep Learning. *Preprint*. **(* equal contribution)**

**J. Warrell**, H. Mohsen and M. B. Gerstein, 2020. Compression based network interpretability schemes. *bioRxiv*.

**J. Warrell**, H. Mohsen, M. B. Gerstein, 2018. Rank Projection Trees for Multilevel Neural Network Interpretation, *NeurIPS Workshop on Machine Learning for Health*. [Rank projection tree code]

## Statistical Learning Theory and Probabilistic Programming

Probabilistic Programming and Dependent Type Theory are powerful frameworks for modeling recursion and compositionality mathematically. We use techniques drawn from these frameworks to derive novel generalization bounds in the context of PAC-Bayes analysis, motivating new algorithms for problems such as transfer learning and meta-learning.

**J. Warrell** and M. B. Gerstein, 2021. Higher-Order Generalization Bounds: Learning Deep Probabilistic Programs via PAC-Bayes Objectives. *Preprint*.

**J. Warrell** and M. B. Gerstein, 2019. Hierarchical PAC-Bayes Bounds via Deep Probabilistic Programming. *Bayesian Deep Learning Workshop at NeurIPS*.

**J. Warrell** and M. B. Gerstein, 2018. Dependent Type Networks: A Probabilistic Logic via the Curry-Howard Correspondence in a System of Probabilistic Dependent Types, *Uncertainty in Artificial Intelligence Workshop on Uncertainty in Deep Learning*.

**Warrell, J.H.**, 2016. A probabilistic dependent type system based on non-deterministic beta reduction. *arXiv*.

## Evolution and Theoretical Biology

Evolution is both a physical and a computational process. Evolutionary processes exhibit structure and are capable of information processing at a range of levels, from molecular and neural networks, to behavior and culture. In the papers below, we consider how to define evolutionary processes embedding cyclic and multilevel notions of causality, and the relationship of global and local features in certain observed cellular molecular networks.

Salichos, L.*, **Warrell, J.*,** Cevasco, H., Chung, A. and Gerstein, M., 2021. Genetic determination of regional connectivity in modelling the spread of COVID-19 outbreak for improved mitigation strategies. *medRxiv*. **(* equal contribution)**

**J. Warrell***, L. Salichos*, and M. B. Gerstein, 2020. Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution. *bioRxiv*. **(* equal contribution)**

**J. Warrell** and M. B. Gerstein, 2020. Cyclic and Multilevel Causality in Evolutionary Processes. *Biology & Philosophy*, 35(5), pp.1-36.

**J. Warrell** and M. Mhlanga, 2017. Stability and structural properties of gene regulation networks with coregulation rules. *Journal of Theoretical Biology*, *420*, pp.304-317.

## Music Theory and Analysis

Musical structure is created through a rich interaction of cultural, physical, abstract and psychological causes. In my thesis, I develop an integrative theory and analytical approach, focused on the music of the 20th-century composer Arnold Schoenberg, which explores general notions of form and function applicable to entities across all musical dimensions.

**J. Warrell***, L. Salichos*, and M. B. Gerstein, 2020. Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution. *bioRxiv*. **(* equal contribution)**

**J. Warrell**, 2006. Repetition in the Music of Arnold Schoenberg. PhD thesis, King's College London.