PhD | AI innovator and evangelist
About Me
I am a Principal Solutions Architect at Amazon Web Services (AWS) based in Seattle, where I’ve been helping customers innovate since 2020. Before joining AWS, I held a variety of technical and product-focused roles at Nokia in the UK and US, including Solutions Architect, Technical Sales Consultant, Product Manager, and Support Engineer. I hold a PhD in Electronic and Electrical Engineering from the University of Leeds, UK.
My work over the past two decades has centered on Machine Learning, Neural Networks, Deep Learning, and Natural Language Processing/Understanding. Since 2022, my focus has shifted toward Artificial General Intelligence (AGI), encompassing cutting-edge LLM and agentic AI research and application design. My interests span LLM fine-tuning and reinforcement learning with human preference alignment (SFT, RLHF, PPO, DPO, GRPO), as well as multimodal LLMs and Vision-Language Models (VLMs). I also work on optimization techniques such as PEFT, LoRA/qLoRA, quantization, and distillation; autonomous AI agents, MCP, A2A, and multi-agent collaboration; and LLM/AI agent evaluation, benchmarking, and interpretability.
I have published research papers, patents, and blog posts on AI and Machine Learning, Data Science, Deep Learning, Generative AI, and Agentic AI, solving real-world problems across multiple industries.
My AGI and ML Projects
Agentic AI | LLM and agent evaluation | Reasoning, tool-use, multi-turn | AWS re:Invent 2025
A framework for evaluating complex AI agents in production, with real-world examples from Amazon shopping, seller support, and advertising agents. It elaborates on how to measure AI performance beyond traditional metrics and presents approaches for assessing language-model reasoning, tool usage, and memory management.
LinkedIn: Post
Text2Image | Text2Video | Image2Image | Image2Video | Text2Speech
An all-in-one repository for open-source, state-of-the-art Multimodal Large Language Models (MLLMs).
ReAct agent | Multimodality | VLM fine-tuning | MCP | Strands Agents
A next-generation multimodal agentic AI solution designed to automate and accelerate structural inspection processes, combining a reasoning LLM, RL/GRPO fine-tuned VLMs, and MCP servers for seamless orchestration. Built with Strands Agents and Bedrock AgentCore, the framework delivers significant efficiency gains, including a 50% reduction in manual inspection effort.
Blogpost: to be published
LLM post-training | Causal graph reasoning | RL | SFT and DPO
A multi-agent framework using reinforcement fine-tuning (RFT) to enhance reasoning accuracy on expert-curated causal graphs, allowing smaller specialized models to outperform larger foundation models on domain-specific tasks and improving the F1 score from 0.69 to 0.97.
Paper: https://openreview.net/pdf?id=rbQLTUDZiH (accepted by AAAI 2026)
Graph RAG | LLM explainability | LLM knowledge base
A novel framework that combines RDF graph databases and RAG pipeline with LLMs to process natural language queries for precise audience targeting, while providing transparent reasoning through a planning-orchestration architecture.
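The core Graph RAG step described above can be illustrated conceptually: retrieve the facts attached to an entity in an RDF-style triple store, then ground the LLM prompt in those facts. The triples, entity names, and helper functions below are hypothetical placeholders, not the actual system.

```python
# Conceptual sketch of a Graph RAG step: retrieve facts about one subject
# from an RDF-style triple store, then build an LLM prompt grounded in them.
# All data and names here are illustrative, not the production schema.

TRIPLES = [
    ("segment:outdoor", "hasInterest", "hiking"),
    ("segment:outdoor", "hasAgeRange", "25-44"),
    ("segment:urban", "hasInterest", "dining"),
]

def retrieve_subgraph(entity: str):
    """Return all (predicate, object) facts whose subject is `entity`."""
    return [(p, o) for s, p, o in TRIPLES if s == entity]

def build_grounded_prompt(query: str, entity: str) -> str:
    """Ground the user query in retrieved graph facts for the LLM."""
    facts = "\n".join(f"- {p}: {o}" for p, o in retrieve_subgraph(entity))
    return f"Known facts about {entity}:\n{facts}\n\nQuestion: {query}"

prompt = build_grounded_prompt("Which audience should see the tent ad?",
                               "segment:outdoor")
print(prompt)
```

Because the retrieved facts are explicit triples, the same subgraph can be surfaced to the user as the reasoning trace, which is what makes this style of pipeline more transparent than embedding-only retrieval.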
LLM migration | Data-aware prompt optimization | DSPy:MIPROv2
An LLM migration paradigm and architecture, including a continuous process of model evaluation and data-aware prompt optimization with MIPROv2, iteratively optimizing LLM prompts using a user-provided dataset and objective metrics.
LLM evaluation and benchmarking | Cost-performance analysis
An LLM evaluation framework focused on LLM performance, responsibility, infrastructure, and cost, driven by LLM tasks and providing insights on cost-performance trade-offs for model selection and optimization.
Github: https://github.com/aws-samples/sample-llm-task-ben-mig-aws
RAG Optimization | SFT | RLHF/RLAIF | PPO, DPO | Compound AI System | DSPy
A continuous self-instruct fine-tuning framework, with its pipeline implemented in DSPy. The framework generates a synthetic dataset from the domain knowledge base and drives SFT, HITL review, and RL with human preference alignment.
Github: https://github.com/aws-samples/amlc-2024-tutorial-continuous-fine-tuning-compound-ai
Presented at Amazon Machine Learning Conference 2024
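The self-instruct step above, generating a synthetic dataset from a knowledge base to seed SFT, can be sketched as follows. In a real pipeline an LLM writes the questions and answers from each knowledge chunk; here a fixed template stands in for that call, and all data and names are hypothetical.

```python
# Sketch of the self-instruct step: turn domain knowledge chunks into
# synthetic instruction-tuning records for SFT. A production pipeline
# would use an LLM, not a template, to author questions and answers.

KNOWLEDGE_BASE = [
    ("S3 lifecycle rules", "Objects can transition to Glacier after N days."),
    ("SageMaker endpoints", "Real-time endpoints serve low-latency inference."),
]

def make_synthetic_dataset(kb):
    """Map (topic, passage) chunks to instruction/response records."""
    return [{"instruction": f"Explain {topic}.", "response": passage}
            for topic, passage in kb]

dataset = make_synthetic_dataset(KNOWLEDGE_BASE)
print(dataset[0]["instruction"])
```

Records in this instruction/response shape feed directly into SFT, and the same prompts can later be re-answered and judged to produce the preference data that drives the RL stage.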
Machine Learning | AutoML | Explainable AI | VIX | Predictability | Forecasting | Quantitative Trading | Big Data | S&P 500 | Futures | US markets
Paper: https://www.tandfonline.com/doi/full/10.1080/14697688.2024.2439458
RAG | Instruction Tuning | RLHF/RLAIF
Presentation of RAG and LLM fine-tuning techniques, their advantages, limitations, and best-practice adoption strategies for various LLM tasks. Demonstration of advanced methods for optimizing RAG and fine-tuned LLM architectures for domain-specific applications.
Proceeding: https://doi.org/10.1145/3637528.3671445
Github: https://github.com/aws-samples/kdd-2024-domain-driven-llm-development
LLM | HITL | RLHF/RLAIF | PEFT | LoRA | RAG | AWS | SageMaker
Using Reinforcement Learning from AI Feedback (RLAIF) to scale up human-in-the-loop feedback data via LLM-as-a-judge, fine-tuning a 7B LLM with RL/PPO, improving overall accuracy to 89%, and reducing human annotation effort by 80%.
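The labeling mechanism behind this, an AI judge replacing human annotators, can be sketched simply: a judge model scores candidate responses, and the best/worst become chosen/rejected preference pairs for PPO or DPO training. The `judge_score` below is a crude keyword-overlap stand-in for a real judge-model call, and all data is hypothetical.

```python
# Sketch of scaling preference data with an LLM-as-a-judge (RLAIF):
# score candidate responses per prompt, then pair the top- and
# bottom-ranked ones as chosen/rejected training records.

def judge_score(prompt: str, response: str) -> float:
    """Toy judge: counts prompt words appearing in the response.
    (Substring matching is deliberately crude; a real judge is an LLM.)"""
    return sum(1.0 for w in prompt.lower().split() if w in response.lower())

def build_preference_pairs(samples):
    """samples: list of (prompt, [candidate responses]) -> preference records."""
    pairs = []
    for prompt, responses in samples:
        ranked = sorted(responses,
                        key=lambda r: judge_score(prompt, r), reverse=True)
        pairs.append({"prompt": prompt,
                      "chosen": ranked[0],
                      "rejected": ranked[-1]})
    return pairs

samples = [("reset my router",
            ["Unplug the router, wait 30s, plug it back in.",
             "Have you tried calling support?"])]
pairs = build_preference_pairs(samples)
print(pairs[0]["chosen"])
```

Because the judge runs on every model output automatically, this is how a small pool of human labels can be amplified into the much larger preference dataset that RL/PPO fine-tuning needs.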
RAG | LLM | Instruction Tuning | PEFT | LoRA | AWS | SageMaker
A RAG optimization solution with SFT fine-tuned LLMs that improved accuracy from 50% to 81%. Launched an enterprise-grade GenAI Q&A bot, reducing cost by 60% across mission-critical workflows.
Github: https://github.com/aws-samples/aws-rag-llmft-sagemaker
An automated workflow using the AutoGluon AutoML framework on Amazon SageMaker that performs high-accuracy (91%) theme detection, surfacing top customer contact reasons and enabling faster, more effective issue resolution.
An Auto Machine Translation and Synchronization (AMTS) system: an extensible, language-agnostic framework with customizable sub-pipelines that incorporate language characteristics and translator preferences, reducing human translation workload by 80%.
My Telecom ML and Analytics Projects
Patent US20170331673A1: https://patents.google.com/patent/US20170331673A1/ko
My Early Deep Learning Research
Maximum power points tracking | photovoltaic (PV) | Radial basis function networks | Grid-connected
Associative memory | Taylor series | Multilayer perceptrons | Artificial neural networks | Radial basis function networks
Genetic algorithms | Neural networks | Radial basis function networks | Pattern classification | Multi-layer neural network | Clustering algorithms
Stay In Touch