28 Yu-Chen Cheng, Hyemin Gu, Thomas O McDonald, Wenbo Wu, Shubham Tripathi, Cristina Guarducci, Douglas Russo, Daniel L Abravanel, Madeline Bailey, Yue Wang, Yun Zhang, Yannis Pantazis, Herbert Levine, Rinath Jeselsohn, Markos A Katsoulakis, Franziska Michor. (2025). "PROFET predicts continuous gene expression dynamics from scRNA-seq data to elucidate heterogeneity of cancer treatment responses." bioRxiv, 2025.06.27.662030, submitted. (Initiation: 2024. First submission: 2025.)
Single-cell RNA sequencing captures static snapshots of gene expression but lacks the ability to track continuous gene expression dynamics over time. To overcome this limitation, we developed PROFET (Particle-based Reconstruction Of generative Force-matched Expression Trajectories), a computational framework that reconstructs continuous, nonlinear single-cell gene expression trajectories from sparsely sampled scRNA-seq data. PROFET first generates particle flows between time-stamped samples using a novel Lipschitz-regularized gradient flow approach and then learns a global vector field for trajectory reconstruction using neural force-matching.
27 Yue Wang, and Zeyu Zheng. "Online learning with deliberately poisoned offline data." SSRN 5123361, to be submitted. (Initiation: 2024.)
We studied an online learning problem with offline data. In this reinforcement learning scenario, a player repeatedly chooses an action and receives a reward, while also partially learning the system parameters. This scenario resembles a continuous version of the multi-armed bandit problem. Some historical (offline) data are available to improve learning. The player's goal is to choose a policy that maximizes the cumulative reward. The offline data are poisoned by an adversary who wants to minimize the reward. When the player is unaware of the adversary, we studied how many poisoned offline data points are needed to ruin the player's policy, as well as the adversary's optimal poisoning strategy. When the player is aware of the adversary, the problem becomes a game, and we studied the properties of its Nash equilibrium.
26 Yue Wang, Zeyu Zheng, and Zuo-Jun Max Shen. "Online pricing with polluted offline data." SSRN 4320324, submitted. (Initiation: 2021. First submission: 2022.)
We studied an online learning problem with offline data. In this reinforcement learning scenario, a player repeatedly chooses an action and receives a reward, while also partially learning the system parameters. This scenario resembles a continuous version of the multi-armed bandit problem. Some historical (offline) data are available to improve learning. The player's goal is to choose a policy that maximizes the cumulative reward. The offline data are polluted, so they do not faithfully reflect the true system parameters. When the player is unaware of the pollution, we studied its effect on the player's performance. When the player is aware of the pollution, we developed the optimal method for using the polluted offline data.
25 Yue Wang (Corresponding author), and Xueying Tian. (2025). "QWENDY: gene regulatory network inference by quadruple covariance matrices." Bulletin of Mathematical Biology, 87: 167. (Initiation: 2025. First submission: 2025.)
We developed a method, QWENDY, for inferring the structure of gene regulatory networks from single-cell gene expression data measured at four time points after an intervention, where the joint distribution of different time points is unknown. With data from four time points, we can determine the gene regulatory network uniquely without solving a non-convex optimization problem, which is an advantage over WENDY. As with TRENDY, we used a transformer model to enhance this method, obtaining TEQWENDY. Since large language models contain pre-trained transformer layers, we replaced the transformer layers in TEQWENDY with the pre-trained transformer layers of the RoBERTa-large model and fine-tuned them with the LoRA method, obtaining another inference method, LEQWENDY. TEQWENDY outperforms other methods on synthetic and experimental datasets.
24 Xueying Tian, Yash Patel, and Yue Wang (Corresponding author). (2025). "TRENDY: gene regulatory network inference enhanced by transformer." Bioinformatics, 41(6), btaf314. (Initiation: 2024. First submission: 2024.)
We developed a method, TRENDY, for inferring the structure of gene regulatory networks from single-cell gene expression data measured at multiple time points after an intervention, where the joint distribution of different time points is unknown. It uses transformer neural networks to enhance the traditional WENDY method. The same idea is applied to enhance other inference methods, and the transformer-enhanced methods all perform better than their non-transformer counterparts on synthetic and experimental datasets.
23 Yue Wang (Corresponding author), Peng Zheng, Yu-Chen Cheng, Zikun Wang, and Aleksandr Aravkin. (2024). "WENDY: gene regulatory network inference with covariance dynamics." Mathematical Biosciences, 377, 109284. (Initiation: 2023. First submission: 2024.)
We developed a method, WENDY, for inferring the structure of gene regulatory networks from single-cell gene expression data measured at multiple time points after an intervention, where the joint distribution of different time points is unknown. The idea is to build a linear model for gene expression and derive the dynamics of the covariance matrix. The resulting equations are then solved with non-convex optimization methods.
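To illustrate why a linear model leads to closed-form covariance dynamics, here is a generic sketch. It assumes a linear stochastic differential equation with a drift matrix $B$ and noise matrix $\Sigma$; these symbols and this exact model are assumptions made for illustration, not necessarily the formulation used in the paper.

```latex
% Assumed linear model (illustrative, not taken from the paper):
%   dX_t = B X_t \, dt + \Sigma \, dW_t
% The covariance K(t) = \mathrm{Cov}(X_t) then satisfies a Lyapunov-type ODE:
\frac{dK}{dt} = B K + K B^{\top} + \Sigma \Sigma^{\top},
\qquad
K(t) = e^{Bt} K(0) e^{B^{\top} t}
     + \int_0^t e^{Bs} \Sigma \Sigma^{\top} e^{B^{\top} s} \, ds .
```

Under this assumed model, only the covariance matrices at each time point are needed, not the joint distribution across time points. The matrix $B$ enters the relation between $K(0)$ and $K(t)$ through products, which suggests why fitting it to measured covariances is generally a non-convex problem.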
22 Yue Wang, Blerta Shtylla, and Tom Chou. (2024). "Order-of-mutation effects on cancer progression: models for myeloproliferative neoplasm." Bulletin of Mathematical Biology, 86: 32. (Initiation: 2022. First submission: 2023.)
Some patients with myeloproliferative neoplasms carry two mutations, JAK2 and TET2. It has been observed that the regulatory effect of one mutation on certain genes depends on whether the other mutation is present. Moreover, when both mutations are present, the order in which they appeared can affect gene expression, cell population, and age at diagnosis. We built an ODE model to explain the observations regarding gene expression, and a generalized Moran process model to explain the other observations.
21 Yue Wang. (2023). "Algorithms for the uniqueness of the longest common subsequence." Journal of Bioinformatics and Computational Biology, 21(06), 2350027. (Initiation: 2012. First submission: 2022.)
For different individuals of the same species, their gene sequences (not DNA sequences) might differ, due to the existence of transposons (jumping genes). I translated the problem of locating transposons into a mathematical problem of determining the longest common subsequence. Depending on whether sequences have repeated genes, and whether the sequences are linear or cyclic, I designed fast algorithms to determine the transposons. When the longest common subsequence is not unique, such algorithms can determine whether a gene appears in all/some/none of these sequences.
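As background for the problem above, the following is the textbook dynamic-programming computation of one longest common subsequence of two gene orders. It is not the paper's algorithm, which additionally handles uniqueness, repeated genes, and cyclic sequences; the function name `lcs` and the gene labels are hypothetical.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of the suffixes a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover one optimal subsequence.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical gene orders of two individuals; g2 has moved in the second one.
individual_1 = ["g1", "g2", "g3", "g4"]
individual_2 = ["g1", "g3", "g4", "g2"]
common = lcs(individual_1, individual_2)
moved = [g for g in individual_1 if g not in common]
print(common, moved)
```

In this toy case the genes outside the common subsequence are the candidates for having moved. When several longest common subsequences exist, a single run like this cannot tell which genes belong to all, some, or none of them, which is the question the paper addresses.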
20 Yue Wang, Joseph X. Zhou, Edoardo Pedrini, Irit Rubin, May Khalil, Roberto Taramelli, Hong Qian, and Sui Huang. (2023). "Cell population growth kinetics in the presence of stochastic heterogeneity of cell phenotype." Journal of Theoretical Biology, 575, 111645. (Initiation: 2016. First submission: 2023.)
We found in experiments that the growth patterns of leukemia cell populations depend on their initial cell numbers. We analyzed the data and proposed that leukemia cells are heterogeneous. We also built a branching process model and ran simulations to support this hypothesis.
19 Yue Wang (Corresponding author), and Siqi He. (2023). "Inference on autoregulation in gene expression with variance-to-mean ratio." Journal of Mathematical Biology, 86(5), 87. (Initiation: 2021. First submission: 2022.)
Some genes can regulate (activate or inhibit) their own expression, which is called autoregulation. We proved some probabilistic theorems and designed a simple and robust method to detect the existence of autoregulation from observational gene expression data. This method only needs the variance-to-mean ratio (Fano factor) of expression levels.
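The statistic itself is simple to compute. The sketch below calculates the variance-to-mean ratio of a sample of expression counts; the use of the population variance, the function name `fano_factor`, and the sample values are assumptions for illustration. The conditions under which a given ratio indicates autoregulation are those established in the paper and are not reproduced here.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Variance-to-mean ratio (population variance) of expression counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean expression is zero; the ratio is undefined")
    return pvariance(counts) / m


# Hypothetical expression counts of one gene across eight cells.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
print(fano_factor(counts))  # mean 5, variance 4, ratio 0.8
```

For reference, a Poisson distribution has a variance-to-mean ratio of exactly 1, so it is the natural baseline against which such a ratio is compared.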
18 Yue Wang (Corresponding author), and Zeyu Zheng. (2023). "Measuring policy performance in online pricing with offline data." Journal of Systems Science and Systems Engineering, 32(3), 352-371. (Initiation: 2020. First submission: 2021.)
We studied an online learning problem with offline data. In this reinforcement learning scenario, a player repeatedly chooses an action and receives a reward, while also partially learning the system parameters. This scenario resembles a continuous version of the multi-armed bandit problem. Some historical (offline) data are available to improve learning. The player's goal is to choose a policy that maximizes the cumulative reward. We found that, for measuring the performance of a policy, the Bayesian approach has some good properties compared to the frequentist approach.
17 Yue Wang. (2022). "Impossibility results about inheritance and order of death." PLOS ONE, 17(11), e0277430. (Initiation: 2021. First submission: 2021.)
In inheritance, the order of death of several relatives can affect the result. When the order of death is unknown, the inheritance cannot be carried out. Civil laws in different countries treat this problem differently. I proved that under some basic criteria, the approach in the French Civil Code is the only valid solution to the order of death problem. I also studied inheritance methods that do not depend on the order of death.
16 Erin Angelini, Yue Wang, Joseph Xu Zhou, Hong Qian, and Sui Huang. (2022). "A model for the intrinsic limit of cancer therapy: duality of treatment-induced cell death and treatment-induced stemness." PLOS Computational Biology, 18(7), e1010319. (Initiation: 2017. First submission: 2021.)
We found that when treating cancer cells with drugs, the treatment effect does not always become better when the drug dose increases. We developed an ODE model to explain this phenomenon. At higher drug doses, the killing rate does not further increase, but the drug-induced transition rate from the sensitive state to the resistant state still increases.
15 Yue Wang, Bhaven A. Mistry, and Tom Chou. (2022). "Discrete stochastic models of SELEX: aptamer capture probabilities and protocol optimization." Journal of Chemical Physics, 156(24), 244103. (Initiation: 2021. First submission: 2022.)
We built a Markov chain model for a biochemical protocol called SELEX. Here target molecules and different types of aptamer molecules are mixed. After reaching equilibrium, the bound aptamers are separated and amplified. Thus aptamers with better binding abilities are concentrated. We proved some theorems and designed the optimal protocol. We found that the optimal protocol in this stochastic model is different from that in the traditional deterministic model.
14 Yue Wang. (2022). "Two metrics on rooted unordered trees with labels." Algorithms for Molecular Biology, 17, 13. (Initiation: 2018. First submission: 2021.)
Consider the space of rooted unordered trees, where each node has a label, and different nodes can have the same label. I defined two metrics on this space to compare different trees and designed corresponding fast algorithms to calculate them. The background is to compare developmental trees (early embryonic developments) of different species.
13 Yue Wang, Renaud Dessalles, and Tom Chou. (2022). "Modeling the impact of birth control policies on China's population and age: effects of delayed births and minimum birth age constraints." Royal Society Open Science, 9(6), 211619. (Initiation: 2021. First submission: 2021.)
We built a PDE model (generalized McKendrick equations) for the impact of birth control policies on China's population and solved it. We found that a longer interval between births can effectively control the population. Besides, with a strict one-child policy, increasing the minimum marriage age can delay the decrease in population.
12 Yue Wang (Corresponding author), and Zikun Wang. (2022). "Inference on the structure of gene regulatory networks." Journal of Theoretical Biology, 539, 111055. (Initiation: 2021. First submission: 2021.)
Genes can regulate the expression of other genes. Genes and such regulations form gene regulatory networks (GRNs). Many methods exist to infer the GRN structure from experimental data. We classified this problem into 20 scenarios based on data types. For scenarios that have been well studied, we summarized known methods. For other scenarios, we developed new inference methods or proved that such scenarios are unsolvable.
11 Yue Wang (Corresponding author), Boyu Zhang, Jérémie Kropp, and Nadya Morozova. (2021). "Inference on tissue transplantation experiments." Journal of Theoretical Biology, 520, 110645. (Initiation: 2019. First submission: 2020.)
Consider tissue transplantation experiments, where the results can be normal/abnormal, host-like/donor-like. Since most experimental results are unknown, we developed a penalty-based method to infer unknown results from known results. We also studied how to design experiments to better apply this inference method.
10 Yue Wang, Jérémie Kropp, and Nadya Morozova. (2020). "Biological notion of positional information/value in morphogenesis theory." International Journal of Developmental Biology, 64, 453-463. (Initiation: 2019. First submission: 2019.)
We studied the role of position in developmental biology (why cells at different positions know what they should become). We clarified the definition of positional information/value with some criteria.
9 Yue Wang (Corresponding author), and Hong Qian. (2020). "Mathematical Representation of Clausius' and Kelvin's Statements of the Second Law and Irreversibility." Journal of Statistical Physics, 179(3), 808-837. (Initiation: 2015. First submission: 2018.)
We found a special lifting of a finite Markov chain, where there is a global potential, and different states have different potentials. We proved that the asymptotic entropy production rate of the lifted chain equals the entropy production rate of the original chain. This provides a new interpretation of the second law of thermodynamics.
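For readers unfamiliar with the quantity involved, the standard definition of the entropy production rate of a stationary finite Markov chain is recalled below. The notation (transition probabilities $p_{ij}$, stationary distribution $\pi$) is chosen here for illustration and may differ from the paper's.

```latex
% Entropy production rate of a stationary Markov chain with
% transition probabilities p_{ij} and stationary distribution \pi:
e_p = \frac{1}{2} \sum_{i,j}
      \left( \pi_i p_{ij} - \pi_j p_{ji} \right)
      \log \frac{\pi_i p_{ij}}{\pi_j p_{ji}} \;\ge\; 0 .
```

Each term is non-negative, and the sum vanishes exactly when detailed balance $\pi_i p_{ij} = \pi_j p_{ji}$ holds for all pairs, that is, when the chain is reversible.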
8 Yue Wang (Corresponding author), and Linbo Wang. (2020). "Causal inference in degenerate systems: An impossibility result." Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR, 108, 3383-3392. (Initiation: 2016. First submission: 2018.)
We found that when the joint probability distribution has a special property (non-uniqueness of Markov boundary), some known methods that quantify the causal effect between variables could fail. We proved that in this pathological scenario, no reasonable causal quantity exists. We then designed algorithms to detect this scenario from finite observational data.
7 Yue Wang, Andrey Minarsky, Robert Penner, Christophe Soulé, and Nadya Morozova. (2020). "Model of morphogenesis." Journal of Computational Biology, 27(9), 1373-1383. (Initiation: 2018. First submission: 2019.)
We developed a mathematical model to explain the early development of an embryo. The distribution of certain markers on the membrane can determine cell behavior.
6 Da-Quan Jiang, Yue Wang (Corresponding author), and Da Zhou. (2017). "Phenotypic equilibrium as probabilistic convergence in multi-phenotype cell population dynamics." PLOS ONE, 12(2), e0170916. (Initiation: 2012. First submission: 2014.)
For cancer cells of several types, the proportions of different types converge to the same constants regardless of initial proportions. We explained this phenomenon by proving a strong law of large numbers in multi-type branching processes.
5 Felix X.-F. Ye, Yue Wang, and Hong Qian. (2016). "Stochastic dynamics: Markov chains and random transformations." Discrete and Continuous Dynamical Systems - Series B, 21(7), 2337-2361. (Initiation: 2015. First submission: 2016.)
We studied a discrete-state space, discrete-time random dynamical system and proved some theorems, especially about synchronization.
4 Xiufang Chen, Yue Wang, Tianquan Feng, Ming Yi, Xingan Zhang, and Da Zhou. (2016). "The overshoot and phenotypic equilibrium in characterizing cancer dynamics of reversible phenotypic plasticity." Journal of Theoretical Biology, 390, 40-49. (Initiation: 2014. First submission: 2015.)
We compared the hierarchical model and the reversible model for cancer dynamics. We showed that two experimental phenomena, phenotypic equilibrium and overshooting, only exist in the reversible model.
3 Yuanling Niu, Yue Wang, and Da Zhou. (2015). "The phenotypic equilibrium of cancer cells: From average-level stability to path-wise convergence." Journal of Theoretical Biology, 386, 7-17. (Initiation: 2014. First submission: 2015.)
For cancer cells of several types, the proportions of different types converge to the same constants regardless of initial proportions. We further explained this phenomenon in a branching process model.
2 Yu Kang, Chaohao Gu, Lina Yuan, Yue Wang, Yanmin Zhu, Xinna Li, Qibin Luo, Jingfa Xiao, Daquan Jiang, Minping Qian, Aftab Ahmed Khan, Fei Chen, Zhang Zhang, and Jun Yu. (2014). "Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks." mBio, 5(6), e01867-14. (Initiation: 2012. First submission: 2014.)
In bacterial gene sequences, some core genes are relatively stable in chromosome order. We designed a method to locate such stable core genes and analyzed them for different species. Species with closer phylogenetic relations tend to have similar properties for stable core genes.
1 Da Zhou, Yue Wang, and Bin Wu. (2014). "A multi-phenotypic cancer model with cell plasticity." Journal of Theoretical Biology, 357, 35-45. (Initiation: 2013. First submission: 2013.)
For cancer cells of several types, the proportions of different types converge to the same constants regardless of initial proportions. We explained this phenomenon by proving the existence of a unique stable fixed point (and no limit cycle) in an ODE model.
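The phenomenon can be illustrated with a toy two-phenotype linear ODE, integrated here with a plain Euler scheme. This is a minimal sketch under assumed dynamics: the model form, the function name `final_proportion`, and all parameter values are hypothetical and are not taken from the paper.

```python
def final_proportion(x, y, r1=0.5, r2=0.5, a=0.3, b=0.1, dt=0.01, steps=5000):
    """Euler-integrate a toy two-phenotype model and return the share of type 1.

    Assumed dynamics (illustrative only):
        dx/dt = (r1 - a) * x + b * y
        dy/dt = a * x + (r2 - b) * y
    r1, r2 are growth rates; a and b are switching rates between phenotypes.
    """
    for _ in range(steps):
        dx = (r1 - a) * x + b * y
        dy = a * x + (r2 - b) * y
        x, y = x + dt * dx, y + dt * dy
    return x / (x + y)


# Three different initial compositions of the population.
for x0, y0 in [(90.0, 10.0), (10.0, 90.0), (50.0, 50.0)]:
    print(round(final_proportion(x0, y0), 4))
```

With equal growth rates, the proportion of type 1 in this toy model obeys dp/dt = -a p + b (1 - p), whose unique stable fixed point is b / (a + b) = 0.25 for the values above, so all three runs approach the same proportion regardless of the initial mix.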