Ajay Nagesh

NLP/AI Researcher — LLM Evaluation · Agentic Systems · Representation Learning

[publications] [resume] [LinkedIn] [writing]

I am a Senior Applied Scientist at Amazon (Sunnyvale, CA), working on LLM evaluation and agentic systems.

I was previously a Staff Research Scientist at DiDi AI Labs and a postdoctoral researcher at the University of Arizona and UMass Amherst. I hold a joint PhD from IIT Bombay and Monash University.

My research spans LLM evaluation methodology, agentic systems, semi-supervised learning, representation learning, and information extraction.

Research Writing

Think Before You Judge

On the hard problem of evaluating AI — LLM-as-judge methodology, analysis-first prompting, and what systematic evaluation requires.

We Built an AI Agent in 2020

Retrospective on agentic AI through the lens of a 2020 ride-hailing agent — what the field has caught up on, and what it hasn't.

Filter, Augment, Synthesize

Practitioner's notes on the data problem — filtering noisy corpora, augmenting limited labels, and building structured synthetic pipelines.

When Models Lose the Plot

On semantic drift, Ladder Networks, and why the EMA stabilization trick at the heart of Mean Teacher keeps reappearing in modern ML.

What the Model Knew Without Labels

On emergent geometry, interpretable representations, and what 2018 NEC research says about modern SAEs and the linear representation hypothesis.

Explorations

Dissecting Large Language Models

Working through Karpathy's nanochat repository — annotated observations on modern LLM architecture.

Dissecting LLMs: Tools to Inspect the Engine

A workflow for understanding the attention mechanism — visual, hands-on, and practical.
