Highly efficient, large-scale learning systems have shown the potential to drastically improve traditional fields such as infrastructure and transportation control, business decision support, and resource management. For example, reinforcement learning (RL) agents for cities and built environments can reduce energy usage by 10% without compromising performance, which extrapolates to over 130 million dollars and 32 million tons of CO2 reduction annually in the U.S. alone if deployed en masse. However, these learning-based agents face significant challenges moving from laboratories to industry. Regulators and practitioners need verifiable safety bounds and trustworthy procedures before they can approve AI systems for critical infrastructure. Building operators, for example, need guarantees that a learning-based HVAC controller will not suddenly fail during a heat wave, and business practitioners need evidence that a large language model (LLM)-based decision support system stays faithful to trusted sources of information and respects corporate guidelines. Yet today's AI systems, whether RL controllers or LLM agents, remain largely black boxes: we lack standardized evaluation protocols, interpretable decision frameworks, and mechanisms to align these systems with diverse human values in contested domains.
My research in technical AI governance builds the infrastructure that makes AI systems verifiable, steerable, and democratically aligned before they are deployed in high-stakes settings. I develop the tools, metrics, and workflows that turn abstract commitments to responsible AI into concrete mechanisms that regulators and practitioners can actually use. This includes:
uncertainty-aware decision and knowledge support agents (see the sketch after this list),
steering mechanisms that make LLM internal reasoning transparent and contestable, and
democratic aggregation methods that align AI systems with pluralistic values through participatory design.
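To give the first thrust a concrete flavor: one way a decision-support agent can attach calibrated, auditable uncertainty to its recommendations is split conformal prediction. The sketch below is a minimal illustration of that technique, not my deployed system; the function name, the HVAC setpoint example, and all numbers are illustrative assumptions.

```python
# Minimal sketch: split conformal prediction for an uncertainty-aware
# recommendation. The coverage guarantee holds for any underlying model,
# assuming exchangeable calibration data. All names/values are illustrative.
import numpy as np

def conformal_interval(cal_residuals: np.ndarray, alpha: float = 0.1) -> float:
    """Half-width q such that prediction +/- q covers the true value
    with probability >= 1 - alpha."""
    n = len(cal_residuals)
    # Finite-sample correction: take the ceil((n+1)(1-alpha))/n quantile.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(cal_residuals, level, method="higher"))

# Calibration residuals |y - f(x)| from a held-out set (synthetic here).
rng = np.random.default_rng(0)
cal_residuals = np.abs(rng.normal(0.0, 1.0, size=500))
q = conformal_interval(cal_residuals, alpha=0.1)

prediction = 21.5  # e.g., a hypothetical HVAC setpoint recommendation
print(f"Recommend {prediction:.1f} +/- {q:.2f} (90% coverage guarantee)")
```

The appeal of conformal methods in this setting is that the coverage bound is distribution-free and model-agnostic, which is exactly the kind of verifiable guarantee regulators can audit.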
In my work, I combine mathematical tools from uncertainty quantification, verification, and social choice theory with machine learning methods from interpretability and preference learning. I collaborate with computer scientists and domain practitioners in energy, business, and civic technology.
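To make the social-choice side tangible, the sketch below aggregates stakeholder rankings with the classical Borda count, one of many rules a participatory design process might use; the stakeholder groups and policy options are hypothetical placeholders, not data from my studies.

```python
# Minimal sketch: Borda-count aggregation of pluralistic stakeholder
# rankings into a single ordering an AI system can be aligned to.
# Stakeholders and options below are hypothetical placeholders.
from collections import defaultdict

def borda_aggregate(rankings: list[list[str]]) -> list[str]:
    """Each ranking lists options from most to least preferred; an option
    in position i of an m-item ranking scores m - 1 - i points."""
    scores: dict[str, int] = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for i, option in enumerate(ranking):
            scores[option] += m - 1 - i
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical stakeholder groups rank candidate control policies.
rankings = [
    ["comfort-first", "balanced", "energy-first"],
    ["energy-first", "balanced", "comfort-first"],
    ["balanced", "comfort-first", "energy-first"],
]
print(borda_aggregate(rankings))  # 'balanced' ranks first (score 4)
```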