# Research

I'm currently an associate research scientist in Mark Gerstein's lab in Computational Biology and Bioinformatics at Yale. My interests are in computational biology, neuroscience, AGI, machine learning and theoretical computer science. I'm also interested in music theory and analysis (the area of my PhD). Below is a summary of some of my work in these areas.

## AGI

Below are the links to my slides/paper from AGI-22 (main conference paper and workshop) and my keynote from AGI-21.

Warrell, J., Potapov, A., Vandervorst, A. and Goertzel, B., 2022. A meta-probabilistic-programming language for bisimulation of probabilistic and non-well-founded type systems. In *AGI-22*, see *arXiv preprint arXiv:2203.15970*. (slides) (code) (paper)

Warrell, J. Formalizing Higher-order Probabilistic Type Systems in MeTTa. *AGI-22 Workshop on Scaling up Neural-Symbolic AGI with OpenCog Hyperon*. (slides) (code)

Warrell, J. Probabilistic Dependent Types and Semantics in AGI: Formal and Philosophical Perspectives. AGI-21 Keynote. (slides)

## Neuroscience and Genomics

The brain exhibits multiple layers of organization and dynamics, from the genetic and molecular to the behavioral and computational. In our work on PsychENCODE below, we developed an integrative deep-learning approach for tracing risk for psychiatric disorders across multiple levels of organization. Further, our forthcoming work on cancer genomics develops an approach that extends the standard individual-driver-based model of cancer progression to include the effects of weak drivers acting in tandem.

Emani, P.S.*, **Warrell, J.*,** Anticevic, A., Bekiranov, S., Gandal, M., McConnell, M.J., Sapiro, G., Aspuru-Guzik, A., Baker, J., Bastiani, M., McClure, P., Murray, J., Sotiropoulos, S.N., Taylor, J., Senthil, J., Lehner, T., Gerstein, M., Harrow, A.W., 2021. Quantum Computing at the Frontiers of Biological Sciences. *Nature Methods*.

**(* equal contribution)**

Sushant Kumar*, **Jonathan Warrell***, Shantao Li, Patrick D. McGillivray, William Meyerson, Leonidas Salichos, Arif Harmanci, Alexander Martinez-Fundichely, Calvin W.Y. Chan, Morten Muhlig Nielsen, Lucas Lochovsky, Yan Zhang, Xiaotong Li, Jakob Skou Pedersen, Carl Herrmann, Gad Getz, Ekta Khurana, Mark B. Gerstein, 2020. Passenger mutations in more than 2500 cancer genomes: Overall molecular functional impact and consequences. *Cell*.

**(* equal contribution)** [Weak driver additive effects code]

D. Wang*, S. Liu*, **J. Warrell***, H. Won*, X. Shi*, F. C. P. Navarro*, D. Clarke*, M. Gu*, P. Emani*, Y. T. Yang, M. Xu, M. J. Gandal, S. Lou, J. Zhang, J. J. Park, C. Yan, S. K. Rhie, K. Manakongtreecheep, H. Zhou, A. Nathan, M. Peters, E. Mattei, D. Fitzgerald, T. Brunetti, J. Moore, Y. Jiang, K. Girdhar, G. E. Hoffman, S. Kalayci, Z. H. Gumus, G. E. Crawford, PsychENCODE Consortium, P. Roussos, S. Akbarian, A. E. Jaffe, K. P. White, Z. Weng, N. Sestan, D. H. Geschwind, J. A. Knowles, M. B. Gerstein, 2018. Comprehensive functional genomic resource and integrative model for the human brain. *Science*, *362*(6420), p.eaat8464.

**(* equal contribution)** [DSPN code]

## (Meta) Probabilistic Programming and Homotopy Type Theory

Probabilistic programming and dependent type theory are powerful frameworks for modeling recursion and compositionality mathematically. Below, we develop methods for defining probabilistic dependent type systems and non-well-founded types, which may include cycles in the type relationship. A rigorous semantics for these systems is provided by introducing a formal meta-language for probabilistic programming, capable of expressing both programs and the type systems in which they are embedded. We are motivated here by the desire to allow a system to learn not only relevant knowledge (programs/proofs), but also appropriate ways of reasoning (logics/type systems). We draw on the frameworks of cubical type theory and dependently typed metagraphs to formalize our approach.

**J. Warrell**, A. Potapov, A. Vandervorst and B. Goertzel, 2022. A meta-probabilistic-programming language for bisimulation of probabilistic and non-well-founded type systems. *arXiv*.

**J. Warrell** and M. B. Gerstein, 2022. Higher-Order Generalization Bounds: Learning Deep Probabilistic Programs via PAC-Bayes Objectives. *Preprint*.

**J. Warrell** and M. B. Gerstein, 2019. Hierarchical PAC-Bayes Bounds via Deep Probabilistic Programming. In *Bayesian Deep Learning Workshop at NeurIPS*.

**J. Warrell** and M. B. Gerstein, 2018. Dependent Type Networks: A Probabilistic Logic via the Curry-Howard Correspondence in a System of Probabilistic Dependent Types. *Uncertainty in Artificial Intelligence Workshop on Uncertainty in Deep Learning*.

**Warrell, J.H.**, 2016. A probabilistic dependent type system based on non-deterministic beta reduction. *arXiv*.

## Interpretable Machine Learning

Deep learning provides models of universal function classes, and has proved capable of learning generalizable predictive models for real-world tasks of (effectively) arbitrary complexity. To integrate the knowledge learnt by such models with prior scientific and cultural knowledge, techniques are required to analyze or interpret the learnt models. In the papers below, we develop techniques for extracting and representing knowledge in deep neural nets, and analyze the sense in which deep-learning models carry implicit semantics.

**J. Warrell***, H. Mohsen*, P. Emani and M. B. Gerstein, 2022. Interpretability and Implicit Model Semantics in Biomedicine and Deep Learning. *Preprint*. **(* equal contribution)**

**J. Warrell**, H. Mohsen and M. B. Gerstein, 2020. Compression based network interpretability schemes. *bioRxiv*.

**J. Warrell**, H. Mohsen, M. B. Gerstein, 2018. Rank Projection Trees for Multilevel Neural Network Interpretation. *NeurIPS Workshop on Machine Learning for Health*. [Rank projection tree code]

## Evolution and Theoretical Biology

Evolution is both a physical and a computational process. Evolutionary processes exhibit structure and are capable of information processing at a range of levels, from molecular and neural networks to behavior and culture. In the papers below, we consider how to define evolutionary processes embedding cyclic and multilevel notions of causality, and the relationship between global and local features in certain observed cellular molecular networks.

Salichos, L.*, **Warrell, J.*,** Cevasco, H., Chung, A. and Gerstein, M., 2021. Genetic determination of regional connectivity in modelling the spread of COVID-19 outbreak for improved mitigation strategies. *medRxiv*. **(* equal contribution)**

**J. Warrell***, L. Salichos*, and M. B. Gerstein, 2020. Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution. *bioRxiv*. **(* equal contribution)**

**J. Warrell** and M. B. Gerstein, 2020. Cyclic and Multilevel Causality in Evolutionary Processes. *Biology & Philosophy*, 35(5), pp.1-36.

**J. Warrell** and M. Mhlanga, 2017. Stability and structural properties of gene regulation networks with coregulation rules. *Journal of Theoretical Biology*, *420*, pp.304-317.

## Music Theory and Analysis

Musical structure is created through a rich interaction of cultural, physical, abstract and psychological causes. In my thesis, I develop an integrative theory and analytical approach, focused on the music of the 20th-century composer Arnold Schoenberg, which explores general notions of form and function applicable to entities across all musical dimensions.

**J. Warrell***, L. Salichos*, and M. B. Gerstein, 2020. Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution. *bioRxiv*. **(* equal contribution)**

**J. Warrell**, 2006. Repetition in the Music of Arnold Schoenberg. PhD thesis, King's College London.