Brian McWilliams

AI Research Scientist @ Google DeepMind

I am an AI research scientist at Google DeepMind, where I work on generative models for music and audio, including the dialogue generation model powering NotebookLM, which was named one of the best inventions of 2024 by TIME magazine. You can also try it here in Google Cloud TTS.

I have won several awards, including, most recently, an Outstanding Paper Award at ICLR 2021. See my Google Scholar profile and some representative projects below for more.

I've held positions at Google Research, Twitter, and Disney Research, where I led the Deep Learning group and developed ML algorithms for production-quality rendering and image and video processing, contributing to Toy Story 4 and Frozen 2.

✉️ bmcw@google.com 

Generative AI

Pushing the frontiers of audio generation (NotebookLM audio dialogue generation model)
"Our pioneering speech generation technologies are helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools."

DeepMind blog post | NotebookLM | Time: The Best Inventions of 2024

New generative AI tools open the doors of music creation

DeepMind blog post | Jacob Collier x Gen Music

Generating audio for video

"Video-to-audio research uses video pixels and text prompts to generate rich soundtracks"

DeepMind blog post | The Verge 

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

"Natively multimodal, with an updated long context window of up to two million tokens — the longest of any large-scale foundation model."

Gemini site | Google blog post | tech report 

Transforming the future of music creation

"Music AI tools – a set of tools we’re designing with artists, songwriters, and producers to help bolster their creative processes."

DeepMind blog post

MusicLM + MusicFX

"MusicLM generates high-fidelity music from text descriptions. Recent improvements include the integration of classifier-free guidance, improved acoustic tokens, and a new backbone architecture specifically designed to operate on such acoustic tokens, as well as applying SoundStorm, to achieve efficient high-fidelity audio generation."

website | MusicFX | TechCrunch 
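The classifier-free guidance mentioned above is a general sampling trick rather than anything specific to MusicLM: the model is trained with the text conditioning occasionally dropped, and at sampling time the conditional and unconditional predictions are blended to trade diversity for prompt adherence. A minimal Python sketch of the guidance step (the model call, text embedding, and guidance scale below are illustrative placeholders, not the actual MusicLM interface):

import numpy as np

def guided_logits(cond_logits, uncond_logits, guidance_scale):
    # Classifier-free guidance: move the conditional prediction further
    # away from the unconditional one by a factor of guidance_scale.
    return uncond_logits + guidance_scale * (cond_logits - uncond_logits)

def sample_next_token(model, prefix_tokens, text_embedding, guidance_scale=3.0, rng=None):
    # Two forward passes: one with the text conditioning, one with the
    # "null" conditioning the model saw during conditioning dropout.
    if rng is None:
        rng = np.random.default_rng()
    cond = model(prefix_tokens, text_embedding)   # logits over the acoustic-token vocabulary
    uncond = model(prefix_tokens, None)           # null / dropped conditioning
    logits = guided_logits(cond, uncond, guidance_scale)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)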

MusicRL: Aligning Music Generation to Human Preferences

The first text-to-music model that incorporates human feedback at scale.

G Cideron et al. | ICML 2024 | website | arXiv
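At a high level, aligning a generative model to human preferences follows the familiar RLHF recipe: fit a reward model to preference data, then fine-tune the generator with a KL-regularized policy-gradient objective so it chases reward without drifting too far from the pretrained model. A toy single-token REINFORCE sketch of that objective (the reward function and hyperparameters are stand-ins, not MusicRL's actual setup):

import numpy as np

rng = np.random.default_rng(0)
vocab = 8                        # toy "audio token" vocabulary
theta = np.zeros(vocab)          # logits of the policy being fine-tuned
theta_ref = np.zeros(vocab)      # frozen pretrained reference logits
beta, lr = 0.1, 0.5              # KL weight and learning rate

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reward(token):
    # Stand-in for a learned reward model scoring a generated sample.
    return 1.0 if token == 3 else 0.0

for step in range(300):
    pi, pi_ref = softmax(theta), softmax(theta_ref)
    tok = rng.choice(vocab, p=pi)
    # KL-regularized reward keeps the policy close to the pretrained model.
    r = reward(tok) - beta * (np.log(pi[tok]) - np.log(pi_ref[tok]))
    # REINFORCE: grad of log pi[tok] w.r.t. the logits is one_hot(tok) - pi.
    grad_log_prob = -pi
    grad_log_prob[tok] += 1.0
    theta += lr * r * grad_log_prob

print(np.round(softmax(theta), 3))   # probability mass concentrates on the rewarded token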

Games and Multi-Agent

Game theory as an engine for large-scale data analysis

"EigenGame maps out a new approach to solve fundamental ML problems."

B McWilliams, I Gemp, C Vernade | DeepMind blog post

The generalized eigenvalue problem as a Nash equilibrium

We formulate the solution to the generalized eigenvalue problem as the Nash equilibrium of a game, design a state-of-the-art algorithm to solve it, and solve problems 100x larger than previously possible.

I Gemp, C Chen, B McWilliams | ICLR 2023 notable paper (top 25% of accepted papers) | arXiv | code
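For context, the problem being reformulated is the symmetric generalized eigenvalue problem; a sketch of the setup (the exact player utilities are spelled out in the paper):

\[
  A v \;=\; \lambda B v, \qquad A = A^\top, \quad B \succ 0,
\]

where each of the top-$k$ generalized eigenvectors is assigned to a player whose utility rewards its Rayleigh quotient $v^\top A v \,/\, v^\top B v$ and penalizes alignment, in the $B$-inner product, with the vectors of the players above it; the desired eigenvectors then constitute the Nash equilibrium of the resulting game.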

EigenGame Unloaded: When playing games is better than optimizing

EigenGame's updates are biased if computed using minibatches of data, which hinders convergence and more sophisticated parallelism in the stochastic setting. We propose an unbiased stochastic update that is asymptotically equivalent to EigenGame, enjoys greater parallelism allowing computation on datasets of larger sample sizes, and outperforms EigenGame in experiments. We present applications to finding the principal components of massive datasets and performing spectral clustering of graphs. We analyze and discuss our proposed update in the context of EigenGame and the shift in perspective from optimization to games.

I Gemp, B McWilliams, C Vernade, T Graepel | ICLR 2022 | arXiv | code
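The source of the bias is that the original EigenGame direction involves ratios and products of the same minibatch covariance estimate, which do not average out to the full-data direction; an update that is linear in the covariance estimate avoids the problem. A rough numpy sketch in that spirit (the exact update used in the paper may differ in detail):

import numpy as np

def stochastic_eigengame_step(V, X_batch, lr=1e-2):
    # V: (d, k) unit-norm columns, one approximate eigenvector per player.
    # X_batch: (b, d) minibatch of centred data.
    M_hat = X_batch.T @ X_batch / X_batch.shape[0]    # unbiased covariance estimate
    MV = M_hat @ V
    V_new = V.copy()
    for i in range(V.shape[1]):
        # Penalize alignment with the "parent" players j < i only; every term
        # is linear in M_hat, so the stochastic direction is unbiased.
        penalty = V[:, :i] @ (V[:, :i].T @ MV[:, i])
        grad = MV[:, i] - penalty
        grad -= (grad @ V[:, i]) * V[:, i]            # project onto the sphere's tangent space
        V_new[:, i] = V[:, i] + lr * grad
        V_new[:, i] /= np.linalg.norm(V_new[:, i])
    return V_new

Because each player only needs the vectors of its parents, the per-player updates parallelize naturally, which is what enables the massive-dataset and graph-clustering applications described above.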

EigenGame: PCA as a Nash Equilibrium

We present a novel view on principal component analysis (PCA) as a competitive game in which each approximate eigenvector is controlled by a player whose goal is to maximize their own utility function. We analyze the properties of this PCA game and the behavior of its gradient based updates. The resulting algorithm is naturally decentralized and hence parallelizable through message passing. We demonstrate the scalability of the algorithm with experiments on large image datasets and neural network activations. We discuss how this new view of PCA as a differentiable game can lead to further algorithmic developments and insights.

I Gemp, B McWilliams, C Vernade, T Graepel | ICLR 2021 Outstanding Paper Award | video | arXiv | code
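Concretely, with data covariance $M$ and unit-norm vectors $v_1, \ldots, v_k$, player $i$'s utility takes the form (up to notation)

\[
  u_i\!\left(v_i \mid v_{j<i}\right) \;=\; v_i^\top M v_i \;-\; \sum_{j<i} \frac{\left(v_i^\top M v_j\right)^2}{v_j^\top M v_j},
\]

which rewards captured variance and penalizes alignment with the parent players' directions; under mild conditions (distinct top eigenvalues) the top-$k$ eigenvectors of $M$ are the unique strict Nash equilibrium, and each player ascends its own utility with Riemannian gradient steps on the unit sphere, exchanging only its current vector with the other players.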

Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning

K McKee, E Hughes, J Leibo, I Gemp, B McWilliams, E Duéñez-Guzmán | AAMAS 2020 | arXiv

The Unreasonable Effectiveness of Adam on Cycles

I Gemp, B McWilliams | Bridging Game Theory & Deep Learning 2019 | paper

Graphics, Rendering and Vision

Neural Importance Sampling 

T Müller, B McWilliams, F Rousselle, M Gross, J Novak | SIGGRAPH 2019 | arXiv | video | interactive test suite

Denoising with Kernel Prediction and Asymmetric Loss Functions

T Vogels, F Rousselle, B McWilliams, G Rothlin, A Harvill, D Adler, M Meyer, J Novak | SIGGRAPH 2018 | paper | video | online test suite

PhaseNet for Video Frame Interpolation 

S Meyer, A Djelouah, C Schroers, B McWilliams, A Sorkine-Hornung, M Gross | CVPR 2018 | arXiv | video

A Fully Progressive Approach to Single-Image Super-Resolution 

Y Wang, F Perazzi, B McWilliams, A Sorkine-Hornung, O Sorkine-Hornung, C Schroers | CVPR NTIRE 2018 | arXiv | code | 2 minute summary

Deep Scattering: Rendering Atmospheric Clouds with Radiance-Predicting Neural Networks 

S Kallweit, T Müller, B McWilliams, M Gross, J Novak | SIGGRAPH Asia 2017 | arXiv | video | project page | 2 minute summary

Kernel-predicting Convolutional Networks for Denoising Monte Carlo Renderings

S Bako, T Vogels, B McWilliams, M Meyer, J Novak, A Harvill, P Sen, T DeRose, F Rousselle | SIGGRAPH 2017 | Project page

A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation

F Perazzi, J Pont-Tuset, B McWilliams, M Gross, L Van Gool, A Sorkine-Hornung | CVPR 2016 | Project page

Representation Learning

Using machine learning to accelerate ecological research

"DeepMind is collaborating with ecologists and conservationists to develop machine learning methods to help study the behavioural dynamics of an entire African animal community in the Serengeti National Park and Grumeti Reserve in Tanzania."

S Petersen et al. | DeepMind blog post

TwHIN-BERT: A Socially-Enriched Pre-trained Language Model for Multilingual Tweet Representations

X Zhang, Y Malkov, O Florez, S Park, B McWilliams, J Han, A El-Kishky | arXiv | code

Pushing the limits of self-supervised ResNets: can we outperform supervised learning without labels on ImageNet?

N Tomasev, I Bica, B McWilliams et al. | arXiv

Representation Learning via Invariant Causal Mechanisms

J Mitrovic, B McWilliams, J Walker, L Buesing, C Blundell | ICLR 2021 | arXiv | poster

Correlated random features for fast semi-supervised learning

This paper presents a fast semi-supervised algorithm for regression and classification. The algorithm draws on two main ideas: first, it generates two views consisting of computationally inexpensive random features; second, it performs multi-view regression, using Canonical Correlation Analysis on unlabeled data to bias the regression towards useful features.

B McWilliams, D Balduzzi, J Buhmann | NeurIPS 2013 | arXiv
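A loose sketch of the recipe, not the paper's exact algorithm: build two cheap random-feature views of the same inputs, use CCA on plentiful unlabeled data to find the directions on which the views agree, and regress on those directions using the few labeled points. The scikit-learn components below (RBFSampler, CCA, Ridge) are my illustrative choices, not necessarily those of the paper:

import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_unlab = rng.normal(size=(5000, 10))                 # plentiful unlabeled inputs
X_lab = rng.normal(size=(50, 10))                     # scarce labeled inputs
y_lab = np.sin(X_lab[:, 0]) + 0.1 * rng.normal(size=50)

# Two "views": independent random-feature maps of the same inputs.
view1 = RBFSampler(n_components=200, random_state=1).fit(X_unlab)
view2 = RBFSampler(n_components=200, random_state=2).fit(X_unlab)

# CCA on unlabeled data finds feature directions the two views agree on.
cca = CCA(n_components=20, max_iter=2000)
cca.fit(view1.transform(X_unlab), view2.transform(X_unlab))

# Regress on the canonical variates of one view using only the labeled points.
T_lab = cca.transform(view1.transform(X_lab))
model = Ridge(alpha=1.0).fit(T_lab, y_lab)
# Predict on new inputs: model.predict(cca.transform(view1.transform(X_new)))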

Subspace clustering of high-dimensional data: a predictive approach

B McWilliams, G Montana | Data Mining and Knowledge Discovery 2014 | arXiv

Multi-view predictive partitioning in high dimensions

Many modern data mining applications are concerned with the analysis of datasets in which the observations are described by paired high-dimensional vectorial representations or "views". Typical examples can be found in web mining and genomics. In this article we present an algorithm for clustering data with multiple views that relies on a novel criterion of predictive similarity between data points.

B McWilliams, G Montana | Statistical Analysis and Data Mining 2012 | arXiv

Deep Learning Theory, Optimization and Statistics

The Shattered Gradients Problem: If resnets are the answer, then what is the question? 

D Balduzzi, M Frean, L Leary, JP Lewis, K Ma, B McWilliams | ICML 2017 | PADL best paper | arXiv | video

Neural Taylor Approximation: Convergence and Exploration in Rectifier Networks 

D Balduzzi, B McWilliams, T Butler-Yeoman | ICML 2017 | arXiv | video

Preserving Differential Privacy Between Features in Distributed Estimation

C Heinze-Deml, B McWilliams, N Meinshausen | Stat | arXiv

Scalable Adaptive Stochastic Optimization Using Random Projections

G Krummenacher, B McWilliams, Y Kilcher, J Buhmann, N Meinshausen | NeurIPS 2016 | arXiv

DUAL-LOCO: Distributing Statistical Estimation Using Random Projections

C Heinze, B McWilliams, N Meinshausen | AISTATS 2016 Oral (top 6%) | arXiv | software 

Variance Reduced Stochastic Gradient Descent with Neighbors

T Hofmann, A Lucchi, S Lacoste-Julien, B McWilliams | NeurIPS 2015 | arXiv

LOCO: Distributing Ridge Regression with Random Projections

C Heinze, B McWilliams, N Meinshausen, G Krummenacher | arXiv | software

Fast and Robust Least Squares Estimation in Corrupted Linear Models

B McWilliams, G Krummenacher, M Lučić, J Buhmann | NeurIPS 2014 Spotlight (top 4%) | arXiv | slides | video