Research

I'm currently a visiting fellow with Mark Gerstein's lab in Computational Biology and Bioinformatics at Yale and a research scientist with the NEC Labs Princeton Machine Learning Group. My interests are in computational biology, neuroscience, AGI, machine learning and theoretical computer science. I'm also interested in computational music theory and analysis. Below is a summary of some of my work in these areas.

AI in Cancer


The rapid progress of AI and integrated genomics offers the promise of developing tractable models of cancer progression to uncover novel insights into underlying biological mechanisms and clinical treatments. I am particularly interested in integrative approaches to AI in cancer, including spatial transcriptomics, models of somatic evolution, biomarker discovery and cancer vaccine development. In my work below, I have combined machine-learning approaches with theoretical models based on population genetics, designed to capture the effects of strong and weak drivers along with neutral and deleterious passenger variants acting in tandem.


Song, T., Cosatto, E., Wang, G., Kuang, R., Gerstein, M., Min, M.R. and Warrell, J.*, 2024. Predicting Spatially Resolved Gene Expression via Tissue Morphology using Adaptive Spatial GNNs. bioRxiv, pp.2024-06. (*corresponding author, accepted as an oral presentation to ECCB and Bioinformatics)


Aung, T.N.*, Warrell, J.*, Martinez-Morilla, S., Gavrielatou, N., Vathiotis, I., Yaghoobi, V., Kluger, H.M., Gerstein, M. and Rimm, D.L., 2024. Spatially informed gene signatures for response to immunotherapy in melanoma. Clinical Cancer Research. (*equal contribution)


Kumar, S.*, Warrell, J.*, Li, S., McGillivray, P.D., Meyerson, W., Salichos, L., Harmanci, A., Martinez-Fundichely, A., Chan, C.W., Nielsen, M.M. and Lochovsky, L., 2020. Passenger mutations in more than 2500 cancer genomes: Overall molecular functional impact and consequences. Cell, 180(5), pp.915-927. (*equal contribution)

Theoretical Biology and Evolution


Evolutionary processes may be viewed from multiple perspectives, for instance, as learning algorithms, dynamical systems, causal networks, and open-ended generative models. In my work below, I have been interested in probing the intersection of Universal Darwinism and machine learning, particularly with respect to models of multilevel selection and cultural evolution. In the process, my goal is both to use machine-learning based methods to make higher-order evolutionary concepts more precise and to provide concrete approaches for applying such models to diverse kinds of data.


Warrell, J.*, Salichos, L.*, Gancz, M.* and Gerstein, M.B., 2024. Latent evolutionary signatures: a general framework for analysing music and cultural evolution. Journal of the Royal Society Interface, 21(212), p.20230647. (*equal contribution)


J. Warrell and M. B. Gerstein, 2020. Cyclic and Multilevel Causality in Evolutionary Processes. Biology & Philosophy, 35(5), pp.1-36.


J. Warrell and M. Mhlanga, 2017. Stability and structural properties of gene regulation networks with coregulation rules. Journal of theoretical biology, 420, pp.304-317.


Integrative Psychiatric Genomics


The brain exhibits multiple layers of organization and dynamics, from the genetic and molecular, to the behavioral and computational.  In my work on PsychENCODE below, I have used the extensive resources of psychiatric genomics data collected by the PsychENCODE consortium to develop integrative deep-learning methods for tracing risk for psychiatric disorders across multiple levels of organization.  This work has involved a vast collaborative effort.  In the process, I have proposed a set of integrated theoretical and machine-learning based models, including LNCTP (2024) and DSPN (2018), which allow the tracing of heritable variation, co-heritability of intermediate genomic phenotypes, imputation of cell-specific expression, and identification of subclasses of psychiatric disorders, in a unified framework.

D. Wang*, S. Liu*, J. Warrell*, H. Won*, X. Shi*, F. C. P. Navarro*, D. Clarke*, M. Gu*, P. Emani*, Y. T. Yang, M. Xu, M. J. Gandal, S. Lou, J. Zhang, J. J. Park, C. Yan, S. K. Rhie, K. Manakongtreecheep, H. Zhou, A. Nathan, M. Peters, E. Mattei, D. Fitzgerald, T. Brunetti, J. Moore, Y. Jiang, K. Girdhar, G. E. Hoffman, S. Kalayci, Z. H. Gumus, G. E. Crawford, PsychENCODE Consortium, P. Roussos, S. Akbarian, A. E. Jaffe, K. P. White, Z. Weng, N. Sestan, D. H. Geschwind, J. A. Knowles, M. B. Gerstein, 2018. Comprehensive functional genomic resource and integrative model for the human brain. Science, 362(6420), p.eaat8464. (* equal contribution)


Prashant S. Emani*, Jason J. Liu*, Declan Clarke*, Matthew Jensen*, Jonathan Warrell*, Chirag Gupta*, Ran Meng*, Che Yu Lee*, Siwei Xu*, Cagatay Dursun*, Shaoke Lou*, Yuhang Chen, Zhiyuan Chu, Timur Galeev, Ahyeon Hwang, Yunyang Li, Pengyu Ni, Xiao Zhou, PsychENCODE Consortium, Trygve E. Bakken, Jaroslav Bendl, Lucy Bicks, Tanima Chatterjee, Lijun Cheng, Yuyan Cheng, Yi Dai, Ziheng Duan, Mary Flaherty, John F. Fullard, Michael Gancz, Diego Garrido-Martín, Sophia Gaynor-Gillett, Jennifer Grundman, Natalie Hawken, Ella Henry, Gabriel E. Hoffman, Ao Huang, Yunzhe Jiang, Ting Jin, Nikolas L. Jorstad, Riki Kawaguchi, Saniya Khullar, Jianyin Liu, Junhao Liu, Shuang Liu, Shaojie Ma, Michael Margolis, Samantha Mazariegos, Jill Moore, Jennifer R. Moran, Eric Nguyen, Nishigandha Phalke, Milos Pjanic, Henry Pratt, Diana Quintero, Ananya S. Rajagopalan, Tiernon R. Riesenmy, Nicole Shedd, Manman Shi, Megan Spector, Rosemarie Terwilliger, Kyle J. Travaglini, Brie Wamsley, Gaoyuan Wang, Yan Xia, Shaohua Xiao, Andrew C. Yang, Suchen Zheng, Michael J. Gandal, Donghoon Lee, Ed S. Lein, Panos Roussos, Nenad Sestan, Zhiping Weng, Kevin P. White, Hyejung Won, Matthew J. Girgenti, Jing Zhang, Daifeng Wang, Daniel Geschwind, Mark Gerstein. 2024. Single-cell genomics and regulatory networks for 388 human brains. Science, 384(6698), p.eadi5199. (* equal contribution)

Quantum Machine Learning in Biology


Quantum machine learning methods are rapidly becoming practical for handling real-world problems. I am particularly interested in the potential for combining probabilistic and quantum methods in machine learning, and in developing approaches which make use of quantum mixed states. Such approaches have great potential in applications involving generative machine learning and combinatorial optimization in computational biology and beyond.

Wang, G.*, Warrell, J.*, Emani, P.S. and Gerstein, M., 2024. ζ-QVAE: A Quantum Variational Autoencoder utilizing Regularized Mixed-state Latent Representations. arXiv preprint arXiv:2402.17749. (* equal contribution) 


Emani, P.S.*, Warrell, J.*, Anticevic, A., Bekiranov, S., Gandal, M., McConnell, M.J., Sapiro, G., Aspuru-Guzik, A., Baker, J., Bastiani, M., McClure, P., Murray, J., Sotiropoulos, S.N., Taylor, J., Senthil, J., Lehner, T., Gerstein, M. and Harrow, A.W., 2021. Quantum Computing at the Frontiers of Biological Sciences. Nature Methods. (* equal contribution)

Type Theory in Machine Learning


Probabilistic programming and dependent type theory are powerful frameworks for modeling recursion and compositionality mathematically. In the work below, I have developed methods for defining probabilistic dependent type systems and non-well-founded types, which may include cycles in the type relationship. A rigorous semantics for these systems is provided by introducing a formal meta-language for probabilistic programming, capable of expressing both programs and the type systems in which they are embedded. Here, I am motivated by the desire to allow a system to learn not only relevant knowledge (programs/proofs), but also appropriate ways of reasoning (logics/type systems). This work has helped formulate a precise semantics for the AGI programming language MeTTa.

Meredith, L.G., Goertzel, B., Warrell, J. and Vandervorst, A., 2023. Meta-MeTTa: an operational semantics for MeTTa. arXiv preprint arXiv:2305.17218.


J. Warrell, A. Potapov, A. Vandervorst and B. Goertzel, 2022. A meta-probabilistic-programming language for bisimulation of probabilistic and non-well-founded type systems. In International Conference on Artificial General Intelligence. (oral presentation at AGI 2022)


J. Warrell and M. B. Gerstein, 2019. Hierarchical PAC-Bayes Bounds via Deep Probabilistic Programming. In Bayesian Deep Learning Workshop at NeurIPS. (oral presentation)


J. Warrell and M. B. Gerstein, 2018. Dependent Type Networks: A Probabilistic Logic via the Curry-Howard Correspondence in a System of Probabilistic Dependent Types. In Uncertainty in Artificial Intelligence Workshop on Uncertainty in Deep Learning.

Variational Methods


I have worked extensively on developing variational methods for complex optimization problems, focusing particularly on mixed combinatorial-continuous problems that involve optimization over graph partitions, probabilistic programs, and structured bags for multiple-instance learning.


Wang, G.*, Warrell, J.*, Zheng, S. and Gerstein, M., 2024. A Variational Graph Partitioning Approach to Modeling Protein Liquid-liquid Phase Separation. bioRxiv, pp.2024-01. (*equal contribution)


Warrell, J. and Gerstein, M., 2019. PAC-Bayes Objectives for Meta-Learning using Deep Probabilistic Programs. In MetaLearn 2019 at NeurIPS.


Warrell, J. and Torr, P.H., 2011, July. Multiple-instance learning with structured bag models. In International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition (pp. 369-384). Berlin, Heidelberg: Springer Berlin Heidelberg. (oral presentation)

Music Theory and Analysis


Musical structure is created through a rich interaction of cultural, physical, abstract and psychological causes. In my thesis, I develop an integrative theory and analytical approach, focused on the music of the 20th-century composer Arnold Schoenberg, which explores general notions of form and function applicable to entities across arbitrary musical dimensions.


J. Warrell, 2006. Repetition in the Music of Arnold Schoenberg. PhD thesis, King's College London.