News:
2025.9: submitted materials for NeurIPS AI education track!
2025.8: our abstract was accepted by SfN (Society for Neuroscience), see you this November in San Diego!
2025.7: our work on neural network initialization received positive feedback from NeurIPS reviewers!
2025.4: our two abstracts were accepted by the UCSD Education Innovation Expo!
2025.1: after 1.5 years of review, our paper was officially accepted by Neural Computation!
(Neural Computation, https://direct.mit.edu/neco/article-abstract/37/8/1409/131382/A-Categorical-Framework-for-Quantifying-Emergent?redirectedFrom=fulltext )
This is our foundational work that quantifies emergence! Our measure is being applied to a range of models in machine learning and neurobiology, yielding interesting insights.
Emergence is crucial to understanding how complex systems acquire properties that are absent from their basic units, yet there has been a lack of theory to measure it and explain its mechanisms. In this paper, we establish a framework based on homological algebra that encodes emergence in the mathematical structure of cohomology, and we apply it to network models to develop a computational measure of emergence. This framework ties the emergence of a system to its network topology and local structures, paving the way to predicting and understanding the causes of emergent effects. In numerical experiments, we show that our measure of emergence correlates with an existing information-theoretic measure of emergence.
(submitted to NeurIPS, https://arxiv.org/pdf/2407.19044)
We propose a neural network initialization scheme with higher emergence values, resulting in better performance than Kaiming/Xavier initialization.
We introduce a novel yet straightforward neural network initialization scheme that modifies conventional methods like Xavier and Kaiming initialization. Inspired by the concept of emergence and leveraging the emergence measures proposed by Li (2023), our method adjusts the layer-wise weight scalar multiplier to achieve higher emergence values. This enhancement is easy to implement and, unlike GradInit, requires no additional optimization steps at initialization. We evaluate our approach across various architectures, including MLP and convolutional architectures for image recognition, and transformers for machine translation. We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization. The simplicity, theoretical innovation, and demonstrable empirical advantages of our method make it a potent enhancement to neural network initialization practices. These results suggest a promising direction for leveraging emergence to improve neural network training methodologies. Code is available at: https://github.com/johnnyjingzeli/EmergenceInit.
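To make the layer-wise rescaling idea concrete, here is a minimal PyTorch sketch (not the released implementation in the repository above): standard Kaiming initialization followed by a per-layer scalar multiplier. The helper name scaled_kaiming_init and the example scale values are placeholders; in the paper the scales are chosen to raise the emergence measure, which is not reproduced here.

import torch
import torch.nn as nn

def scaled_kaiming_init(model, layer_scales):
    """Kaiming init, then multiply each layer's weights by a per-layer scalar.
    layer_scales: one multiplier per Linear/Conv2d layer (placeholder values here;
    the paper selects them to increase the layer's emergence value)."""
    idx = 0
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
            with torch.no_grad():
                m.weight.mul_(layer_scales[idx])   # layer-wise rescaling
                if m.bias is not None:
                    m.bias.zero_()
            idx += 1

# usage sketch with arbitrary scale factors (not the values from the paper)
mlp = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
scaled_kaiming_init(mlp, layer_scales=[1.5, 1.5])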
(accepted by AppliedMath, https://www.mdpi.com/2673-9909/5/3/93)
This is our endeavor to connect emergent phenomena with dynamical systems and stochastic processes.
We present a formal framework for modeling neural network dynamics using Category Theory, specifically through Markov categories. In this setting, neural states are represented as objects and state transitions as Markov kernels, i.e., morphisms in the category. This categorical perspective offers an algebraic alternative to traditional approaches based on stochastic differential equations, enabling a rigorous and structured approach to studying neural dynamics as a stochastic process with topological insights. By abstracting neural states as submeasurable spaces and transitions as kernels, our framework bridges biological complexity with formal mathematical structure, providing a foundation for analyzing emergent behavior. As part of this approach, we incorporate concepts from Interacting Particle Systems and employ mean-field approximations to construct Markov kernels, which are then used to simulate neural dynamics via the Ising model. Our simulations reveal a shift from unimodal to multimodal transition distributions near critical temperatures, reinforcing the connection between emergent behavior and abrupt changes in system dynamics.
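For a flavor of the Ising-model simulations, here is a toy NumPy sketch (not the paper's categorical construction): a mean-field (Curie-Weiss) Markov kernel that samples the next magnetization given the current one. Near and below the critical temperature (T_c = J) the sampled magnetizations shift from a single cluster around 0 to clusters around +/- m*, echoing the unimodal-to-multimodal shift described above. The function names and parameters are illustrative.

import numpy as np

def mean_field_kernel_sample(m, T, N, J=1.0, rng=None):
    """One synchronous mean-field Ising update viewed as a Markov kernel:
    given magnetization m, each of N spins is set to +1 with heat-bath
    probability sigmoid(2*J*m/T); return the resulting magnetization."""
    rng = np.random.default_rng() if rng is None else rng
    p_up = 1.0 / (1.0 + np.exp(-2.0 * J * m / T))
    n_up = rng.binomial(N, p_up)
    return (2 * n_up - N) / N

def stationary_samples(T, N=200, steps=2000, burn=500, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    m, out = 0.0, []
    for t in range(steps):
        m = mean_field_kernel_sample(m, T, N, rng=rng)
        if t >= burn:
            out.append(m)
    return np.array(out)

# Above T_c (= J = 1) the magnetization stays near 0; below T_c it settles
# near +/- m*, so |m| becomes large -- a crude signature of the transition.
for T in (1.5, 0.8):
    s = stationary_samples(T)
    print(f"T={T}: mean |m| = {np.abs(s).mean():.2f}")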
(submitted to ICLR, https://arxiv.org/pdf/2409.01568)
This is joint work with my undergrads! We study what happens to the network's emergent abilities as training progresses.
Emergence, where complex behaviors develop from the interactions of simpler components within a network, plays a crucial role in enhancing neural network capabilities. We introduce a quantitative framework to measure emergence during the training process and examine its impact on network performance, particularly in relation to pruning and training dynamics. Our hypothesis posits that the degree of emergence—defined by the connectivity between active and inactive nodes—can predict the development of emergent behaviors in the network. Through experiments with feedforward and convolutional architectures on benchmark datasets, we demonstrate that higher emergence correlates with improved trainability and performance. We further explore the relationship between network complexity and the loss landscape, suggesting that higher emergence indicates a greater concentration of local minima and a more rugged loss landscape. Pruning, which reduces network complexity by removing redundant nodes and connections, is shown to enhance training efficiency and convergence speed, though it may lead to a reduction in final accuracy. These findings provide new insights into the interplay between emergence, complexity, and performance in neural networks, offering valuable implications for the design and optimization of more efficient architectures.
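As a rough illustration of the kind of quantity tracked during training, the sketch below scores one Linear layer by the total absolute weight connecting inactive input nodes to active output nodes on a batch. This is only an illustrative proxy for "connectivity between active and inactive nodes"; the function emergence_proxy and the activity threshold are assumptions, not the paper's exact definition.

import torch
import torch.nn as nn

@torch.no_grad()
def emergence_proxy(layer, x, act=torch.relu):
    """Illustrative proxy: batch-averaged sum of |W[j, i]| over pairs of
    inactive input node i and active output node j for one Linear layer."""
    h = act(layer(x))                    # (batch, out) post-activations
    active_out = (h > 0).float()         # 1 if output node fires
    inactive_in = (x <= 0).float()       # 1 if input node is silent
    W = layer.weight.abs()               # (out, in)
    score = torch.einsum("bo,oi,bi->b", active_out, W, inactive_in).mean()
    return score.item()

# usage on random data; in the paper such a measure is tracked over training
layer = nn.Linear(64, 32)
x = torch.randn(128, 64)
print(emergence_proxy(layer, x))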
Physical Review E 101.1 (2020): 013312.
We present a class of exponential integrators to compute solutions of the stochastic Schrödinger equations arising from the modeling of open quantum systems. To be able to implement the methods within the same framework as the deterministic counterpart, we express the solution using Kunita's representation. With appropriate truncations, the solution operator can be written as matrix exponentials, which can be efficiently implemented by the Krylov subspace projection. The accuracy is examined in terms of the strong convergence by comparing trajectories, and in terms of the weak convergence by comparing the density-matrix operators. We show that the local accuracy can be further improved by introducing third-order commutators in the exponential. The effectiveness of the proposed methods is tested using the example from Di Ventra et al. [J. Phys.: Condens. Matter 16, 8025 (2004)].
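To illustrate the exponential-integrator idea in the simplest possible setting (a minimal sketch, assuming NumPy/SciPy, and omitting the third-order commutator corrections and the Kunita-representation machinery from the paper): one Euler-Magnus step psi_{n+1} = exp((A - 0.5 B^2) dt + B dW) psi_n for a linear Ito SDE dpsi = A psi dt + B psi dW, with the matrix exponential applied to the state via scipy.sparse.linalg.expm_multiply (a Krylov-style action). The two-level system, operators, and step size below are illustrative choices.

import numpy as np
from scipy.sparse.linalg import expm_multiply

def exp_euler_step(psi, A, B, dt, dW):
    """One exponential (Euler-Magnus) step for dpsi = A psi dt + B psi dW."""
    M = (A - 0.5 * B @ B) * dt + B * dW
    return expm_multiply(M, psi)   # applies exp(M) to psi without forming exp(M)

# toy two-level open system: H = sigma_x, noise operator L = sqrt(gamma) sigma_-
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)
gamma = 0.5
H, L = sx, np.sqrt(gamma) * sm
A = -1j * H - 0.5 * (L.conj().T @ L)   # drift of the linear stochastic Schrodinger eq.
B = L                                  # diffusion operator

rng = np.random.default_rng(0)
psi = np.array([1.0, 0.0], dtype=complex)
dt, T = 1e-3, 1.0
for _ in range(int(T / dt)):
    dW = rng.normal(0.0, np.sqrt(dt))
    psi = exp_euler_step(psi, A, B, dt, dW)
    psi /= np.linalg.norm(psi)         # renormalize the trajectory
print("final state:", psi)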
Works in preparation:
(in preparation) This is a math theory paper proposing the concept of "activation paths" and connecting it with the flatness of the loss landscape, generalizability, etc. It gives a theoretical grounding for our work on emergence in neural networks.
(in preparation) This is a paper toward emergence-based neural network/circuit design. It helps us understand the properties of such systems from a new perspective.
My research in education:
Johnny Jingze Li, “Fostering Interdisciplinary AI Education Through Project-Based Learning: A Case Study” (in preparation)
Johnny Jingze Li, “Concept Maps based Learning Path Recommendation and Navigation” (in preparation)
Kalyan Basu, Johnny Jingze Li, Faisal AlShinaifi, Zeyad Almoaigel, “A Mathematical and Algorithmic Framework for Efficient Concept Acquisition by Learners” (in preparation)