Jordan Taylor

Academic Bio

Understanding ML & quantum systems.

I work on new methods for understanding, steering, and revealing hidden structure within machine learning models and quantum systems.

I'm currently interning at the NTT Physics & Informatics (PHI) Laboratories under Jess Riedel, before another internship at the Center for Human-Compatible Artificial Intelligence under Erik Jenner.

I'm also finishing up my PhD thesis at the University of Queensland, Australia, under Ian McCulloch. I've been working on new "tensor network" algorithms, which can be used to simulate entangled quantum materials and quantum computers, or to perform machine learning.

Contact me: jordantensor [at] gmail [dot] com. Also see my CV, LinkedIn, or Twitter.


Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
Dan Braun, Jordan Taylor, Nicholas Goldowsky-Dill, Lee Sharkey (2024)

Sparse autoencoders (SAEs) are an unsupervised method for interpreting machine learning models, allowing you to decompose any space of activations within the model into a set of sparsely activating "features". But vanilla SAEs often aren't able to explain as much of the model's performance as we'd like, and they may learn features which reflect the structure of the input data rather than the structure of the model we're trying to interpret.
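
Concretely, the basic idea looks something like the sketch below (a minimal vanilla SAE with illustrative dimensions, not the exact architecture or hyperparameters from the paper):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decompose d_model-dimensional activations into a much wider set of
    sparsely-activating features, then reconstruct the activations from them."""
    def __init__(self, d_model=768, dict_ratio=60):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_model * dict_ratio)
        self.decoder = nn.Linear(d_model * dict_ratio, d_model)

    def forward(self, acts):
        feature_acts = torch.relu(self.encoder(acts))  # sparse "feature" activations
        recon = self.decoder(feature_acts)             # reconstructed activations
        return recon, feature_acts
```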

Rather than training sparse autoencoders to reproduce the activations at the current layer, we trained them to reproduce the activations of all downstream layers, as well as the output logits of the original model. This significantly improves the trade-off between performance degradation (e.g. CE loss) and interpretability (e.g. L_0, or autointerp scores), while removing functionally irrelevant classes of features.
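
Schematically, the difference between the usual local loss and this end-to-end loss is something like the sketch below (the `model.forward_from(layer, acts)` helper, loss weights, and other details are illustrative assumptions of mine, not the exact formulation in the paper or the e2e_sae library):

```python
import torch
import torch.nn.functional as F

def local_loss(sae, acts, sparsity_coeff=1e-3):
    # Vanilla SAE loss: reconstruct the activations at the hooked layer.
    recon, feature_acts = sae(acts)
    return F.mse_loss(recon, acts) + sparsity_coeff * feature_acts.abs().sum(-1).mean()

def end_to_end_loss(sae, model, acts, layer, sparsity_coeff=1e-3):
    # End-to-end SAE loss: push the reconstruction through the rest of the model
    # and match the downstream activations and output logits of the clean run.
    recon, feature_acts = sae(acts)
    with torch.no_grad():
        clean_downstream, clean_logits = model.forward_from(layer, acts)  # assumed helper
    sae_downstream, sae_logits = model.forward_from(layer, recon)
    logits_loss = F.kl_div(F.log_softmax(sae_logits, -1),
                           F.softmax(clean_logits, -1), reduction="batchmean")
    downstream_loss = sum(F.mse_loss(s, c)
                          for s, c in zip(sae_downstream, clean_downstream))
    return logits_loss + downstream_loss + sparsity_coeff * feature_acts.abs().sum(-1).mean()
```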

The cost is that there are some (apparently functionally irrelevant) intermediate subspaces of activations which these SAEs don't reconstruct, and it takes much longer to train them with this loss. We believe that best practice going forward will probably be to pre-train SAEs with regular local loss, then fine-tune them with the downstream end-to-end loss we've introduced. 

We have open sourced our library for training SAEs with this new method: https://github.com/ApolloResearch/e2e_sae  
A selection of our SAEs are hosted on neuronpedia so you can play with them in your browser: https://www.neuronpedia.org/gpt2sm-apollojt 

Defining wavefunction branches

Ian McCulloch and I recently proposed a criterion for finding effectively decohered wavefunction "branches" in arbitrary quantum states, without the need for a system / environment split. We say that branches are defined if you can write your quantum state as a superposition of terms which are easy to distinguish, but hard to interfere (as measured by the number of local operations required).

We argue that these branches behave like the effectively decohered outcomes of a measurement: once formed, recombining them is infeasible, so each can be treated as a separate classical possibility going forward.

We conjecture that branch formation is a ubiquitous process in nature, occurring generically in the time evolution of many-body quantum systems (even when there is no clear "environment"). We're currently looking for branches in various numerical time-evolution simulations, by developing an algorithm to find them in tensor-network states.

Check out the paper or the poster! We give many examples of states with good branch decompositions.

Good branches are effectively the opposite of good error-correcting codes. 

The complexity of distinguishing two states |a⟩ and |b⟩ is roughly the size of the smallest circuit that swaps |a⟩+|b⟩ with |a⟩−|b⟩.

The complexity of interfering two states |a⟩ and |b⟩ is roughly the size of the smallest circuit that swaps |a⟩ with |b⟩.
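
Putting the two together, the criterion is roughly that a decomposition counts as a good branch decomposition when distinguishing is far cheaper than interfering (schematic notation of mine, not necessarily the paper's):

```latex
% |psi> has a good branch decomposition into |a> and |b> when the
% distinguishing complexity is far smaller than the interference complexity:
\[
  |\psi\rangle \propto |a\rangle + |b\rangle ,
  \qquad
  \mathcal{C}_{\mathrm{dist}}(a,b) \;\ll\; \mathcal{C}_{\mathrm{int}}(a,b)
\]
% where C_dist is the size of the smallest circuit swapping |a>+|b> with |a>-|b>,
% and C_int is the size of the smallest circuit swapping |a> with |b>.
```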

Graphical tensor notation for ML interpretability

Jordan K. Taylor (2024) arxiv:2402.01790 [or blog post]

A tutorial applying Penrose graphical notation to mechanistic interpretability: understanding how Transformer AI systems like GPT work, and the most basic kinds of algorithms they can learn internally. Beyond AI systems, I also apply the notation to the singular value decomposition and its higher-order extensions, and introduce tensor-network decompositions.
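
As a small taste of the kind of decomposition the tutorial draws (a numpy sketch of mine, not code from the paper): the SVD splits one big tensor into two smaller ones joined by a single contracted "bond" index.

```python
import numpy as np

# Truncated SVD of a weight matrix W: in graphical notation, one blob
# becomes two smaller blobs connected by a single contracted index.
rng = np.random.default_rng(0)
W = rng.normal(size=(512, 256))

U, S, Vh = np.linalg.svd(W, full_matrices=False)
k = 32                           # bond dimension (number of singular values) to keep
A = U[:, :k] * S[:k]             # shape (512, k)
B = Vh[:k, :]                    # shape (k, 256)

W_approx = A @ B                 # contract the shared index to reconstruct W
error = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"rank-{k} relative error: {error:.3f}")
```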

Detecting beeps from brains

Jordan K. Taylor and Melvyn Yap (2021)  [link]

The result of a two-month internship at Max Kelsen, applying machine learning to neuroscience research. I used data from a few neurons in the brains of rats to predict the timing of audio beeps they were hearing. I applied unsupervised clustering techniques, as well as supervised neural networks and gradient boosting. I found interpretability tools to be vital for improving generalization robustness to new neurons, new sessions, new audio tones, and new rats.
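
The overall shape of that pipeline was roughly as below (hypothetical file names, data shapes, and model choices, not the internship code itself):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GroupKFold
from sklearn.metrics import roc_auc_score

# Hypothetical data: binned spike counts per neuron, with a label for whether
# a beep was playing in each time bin, grouped by recording session.
X = np.load("spike_counts.npy")      # shape (n_bins, n_neurons)
y = np.load("beep_labels.npy")       # shape (n_bins,), 1 if a beep was playing
sessions = np.load("session_ids.npy")

# Evaluate generalization to held-out sessions rather than held-out time bins,
# since robustness to new sessions and new rats is the hard part.
cv = GroupKFold(n_splits=5)
for train_idx, test_idx in cv.split(X, y, groups=sessions):
    model = GradientBoostingClassifier()
    model.fit(X[train_idx], y[train_idx])
    score = roc_auc_score(y[test_idx], model.predict_proba(X[test_idx])[:, 1])
    print(f"held-out session AUC: {score:.2f}")
```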

Solving PDEs in parallel

Iterative implicit parallelisation of the Crank-Nicolson method
J. K. Taylor, M. W. J. Bromley, L. Rabenhorst, M Richards (2020)

A pretty simple new way to run the Crank–Nicolson method in parallel. Crank–Nicolson is an implicit method for solving partial differential equations (PDEs) such as the time-dependent Schrodinger equation. Our parallel modification provides a speedup even though it increases total compute usage, and it extends the Crank–Nicolson method to handle PDEs with non-linear terms. C++ code is available at https://github.com/jordansauce/iterative_parallel_CN
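
For reference, the baseline (serial) Crank–Nicolson step for the 1D free-particle Schrodinger equation looks roughly like the sketch below; the iterative parallel modification itself lives in the C++ repo above, and the grid parameters here are just illustrative.

```python
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import splu

# 1D free-particle Schrodinger equation (hbar = m = 1) on a uniform grid.
N, dx, dt = 400, 0.1, 0.01
x = np.arange(N) * dx

# Hamiltonian H = -1/2 d^2/dx^2 via a finite-difference Laplacian.
lap = diags([1, -2, 1], [-1, 0, 1], shape=(N, N)) / dx**2
H = -0.5 * lap

# Crank-Nicolson: (I + i dt/2 H) psi_{n+1} = (I - i dt/2 H) psi_n,
# i.e. one sparse linear solve per time step (this is the implicit part).
I = identity(N)
A = (I + 0.5j * dt * H).tocsc()
B = (I - 0.5j * dt * H).tocsc()
solver = splu(A)

# Initial Gaussian wavepacket with momentum k0 = 2.
psi = np.exp(-((x - 20.0) ** 2) / 4.0 + 2.0j * x).astype(complex)
psi /= np.linalg.norm(psi)

for _ in range(100):
    psi = solver.solve(B @ psi)

print("norm after evolution:", np.linalg.norm(psi))  # stays ~1, since the update is unitary
```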

Simulating atomic clocks 

Hyperpolarisability calculations for optical lattice atomic clocks (Honours thesis)
Jordan K. Taylor, supervised by Michael W. J. Bromley (2019)

I ran numerical calculations to characterize errors in the world's most accurate atomic clocks: specifically, the AC Stark frequency shifts induced by the trapping laser in strontium optical lattice clocks. The effects of these frequency shifts can be partially cancelled by operating the trapping laser at a "magic wavelength", so characterizing the dominant remaining errors required going to 4th-order perturbation theory, calculating the "hyperpolarisability" of strontium using C++ code.
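
Schematically (my shorthand for orientation, not the thesis's exact conventions or prefactors), the clock shift is an expansion in powers of the trapping-laser field E:

```latex
% Differential light shift of the clock transition in the trapping field E.
% The differential polarisability (the E^2 term) is tuned to zero at the
% "magic wavelength"; the E^4 hyperpolarisability term is the 4th-order
% quantity computed in the thesis.
\[
  \Delta\nu \;\sim\; \Delta\alpha(\omega)\, E^2 \;+\; \Delta\gamma(\omega)\, E^4 \;+\; \cdots
\]
```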

Talks

My PhD confirmation talk on isometric tensor networks and gauge fixing (July 2021)

Confirmation Talk

Connections between tensor networks and machine learning (May 2021)

Tensor Networks for ML
TN_ML_talk_20_05_2021.mp4

Other slides 

Defining Wavefunction Branches
Branch finding algorithm
Tensor network states
2D Tensor Network Methods
Complexity growth
Kondo Lattice Model
Papers
Hyperpolarisability Calculations for Atomic Clocks
Jordan_Taylor_Academic_CV.pdf