Thank you for stopping by my personal website. I am a methodologist, applied scientist, and research PM with a passion for making complex methods usable in the real world. My work lives at the intersection of education, AI, and human data. Over the past decade, I’ve led large-scale projects that not only advance how we learn and evaluate interventions, but also translate those insights into tools, datasets, and platforms that people actually use.
My focus is rigorous, human-centered research. I specialize in building and running human data and evaluation programs: designing labeling schemes, training and calibrating annotators, monitoring quality, and turning messy text and behavioral data into reliable signals. Those skills once served research synthesis work; with the rise of large language models, they now sit inside AI pipelines. A growing share of my work is about how we collect, structure, and benchmark human feedback so AI systems are not only powerful but also trustworthy, transparent, and usable for the educators, researchers, and developers who rely on them.
This commitment led me to create MetaReviewer, a free, collaborative platform now used by more than 1,300 researchers across more than 150 projects. In building and scaling MetaReviewer, I owned the product vision and research roadmap, led user studies on screening and coding workflows, analyzed adoption and reliability patterns, and partnered with engineers to turn those insights into features. More recently, I have designed and evaluated LLM-assisted extraction workflows inside MetaReviewer and other pipelines, creating gold-standard datasets and benchmarks so we can measure when automation is safe to use, when humans must stay in the loop, and how to allocate their time where it matters most.
Earlier in my career, I authored national research standards and led major evidence initiatives for the U.S. Department of Education’s What Works Clearinghouse, including a five-year, multi-organization project that coordinated large human-coded datasets across nine content areas. I’ve been PI or co-PI on more than fifteen funded projects from agencies and foundations such as NSF, IES, and NIJ, studying topics like social and emotional learning, cyberbullying prevention, financial aid, and college access. Across all of this work, the throughline has been the same: design data, methods, and technology so they actually help people make better decisions.
Today, through my consulting practice, I partner with organizations like 3ie, Development Services Group, and Callahan & Associates on AI- and data-intensive problems. That includes building LLM-assisted extraction and evaluation pipelines, creating benchmarks to compare model and human performance, and developing analytics products that turn complex data into clear, actionable metrics for practitioners and leaders.
I also care deeply about teaching and mentorship. I’m currently an Adjunct Professor at the University of San Francisco. I’ve taught graduate research methods, mentored junior researchers who now lead their own projects, trained hundreds of analysts at national meta-analysis institutes, and authored more than 80 publications spanning peer-reviewed articles, technical reports, and practitioner-facing pieces. Alongside academic writing, I enjoy giving talks, leading workshops, and writing for public outlets to share what we are learning about AI, evaluation, and open science.
Looking ahead, my focus is on shaping responsible, human-centered uses of AI in education and research. The future of evidence-based practice will depend not only on methodological rigor, but also on building intuitive, reliable, and inclusive systems for collecting and using human data at scale.
Thanks again for visiting. Feel free to reach out with questions, collaboration ideas, or just a curious thought.