Date: (13.6.2026)
Post: 1
Changing from Making Models to Building with Them
For a long time, getting into the field of artificial intelligence felt like joining a secret guild. If you didn’t have a background in advanced statistics, a thorough understanding of calculus, and a significant budget to train systems from scratch, you were mostly on the sidelines. You had to build the brain before you could try to get it to function.
Recently, the whole landscape has changed. The advent of sophisticated, pre-trained foundation models that are available on demand has transformed a highly specialized field into a practical software engineering problem. The barrier to entry has not just been lowered; for many applications it has all but disappeared. Today, developers who have never written a single line of training code are building incredibly complex applications.
This is a brand-new frontier, and AI engineer, developer, and educator Chip Huyen lays it all out in her new work. It is a book for the practical builder: anyone who wants to learn how to assemble, test, and scale applications with the robust building blocks that are now available to everyone.
A New System for a New Age
The first thing to understand about where the industry is heading is how it differs from traditional machine learning (ML) engineering.
Traditional ML engineering: [Data Prep] -> [Train Model] -> [Evaluate] -> [Deploy Model]
Modern AI engineering: [API/Open Model] -> [Prompt/RAG] -> [System Evaluation]
Engineers would previously dedicate their time to sourcing proprietary datasets, selecting architectures, and tuning hyperparameters to train a specific model for a specific task. It was a long cycle and used a lot of resources. The current approach largely skips training from scratch. Engineers view the model as a utility, a very powerful service that can be plugged into an app through an API or run locally via open-weights alternatives. This change has produced an entirely new software stack, with a different set of tools, design patterns, and architectural decisions. Huyen dissects this new ecosystem for developers and walks them through the dizzying array of models, vector databases, and orchestrators.
Going Beyond the Prompt
Getting a prototype going can be as easy as writing a clever prompt, but moving from a cool demo to a robust production system is notoriously hard. A plain text prompt can only get you so far. The core of the book is a guide to making your application smarter, taking developers through a logical sequence of methods:
Prompt Engineering: Mastering in-context learning, structured outputs, and behavioral framing.
Retrieval-Augmented Generation (RAG): Combining static models with dynamic, external data sources to deliver answers grounded in real, current data.
Fine-Tuning: Knowing when and how to make surgical adjustments to the weights of an existing model for specific tones, formats, or niche domains.
Agents: Building systems capable of reasoning, using external tools, and executing multi-step workflows.
Crucially, Huyen emphasizes the why behind these methods, not just the how. Builders learn to weigh the tradeoffs of each approach, so they don’t waste time over-engineering a solution when a simpler one will do.
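To make the retrieval-augmented generation idea above concrete, here is a minimal sketch in Python. It is not taken from the book. The knowledge base, the word-overlap scoring, and the prompt wording are all assumptions made for illustration, and a real system would typically use embeddings and a vector database instead of word counts.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base, invented for this example.
DOCUMENTS = [
    "The return window for online orders is 30 days from delivery.",
    "Standard shipping takes 3 to 5 business days.",
    "Gift cards cannot be exchanged for cash.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents, ranked by how many query words they share."""
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        overlap = sum((tokenize(doc) & query_tokens).values())
        if overlap > 0:  # drop documents with nothing in common with the query
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the retrieved passages and the question into one prompt."""
    passages = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{passages}\n"
        f"Question: {query}"
    )


question = "How many days is the return window?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)  # this string would then be sent to a model of your choice
```

The model call itself is left out on purpose: the point of the sketch is that the model stays static while the context placed in front of it changes with the data.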
Open-Ended Evaluation Chaos
The hardest obstacle in this new paradigm may be answering a deceptively simple question: "Is my application actually getting better?" When the outputs are free-form text, traditional metrics such as accuracy percentages or error rates break down. If a model changes one word in a paragraph, a strict programmatic test may mark the response as a failure even though the change actually improved it. And as these applications handle more user interactions, the chance of a silent, catastrophic failure slipping through keeps growing.
Standard testing: [Actual: "Blue"] = [Expected: "Blue"] -> Pass
Open-ended testing: [Expected: "helpful text"] vs [Actual: "political text"]
To address this problem, the book examines current evaluation frameworks. It discusses in considerable detail the practicalities of the “AI-as-a-judge” approach, where very capable models are used to grade the outputs of other models. It also spells out the strict safeguards that must be in place to ensure the judge itself is impartial and consistent.
The Realities of Production Cost and Latency
In a lab setting, a five-second delay for a brilliant response feels like magic. In a commercial product, a five-second delay means the user has closed the tab. The final part of the book covers the hard operational realities of running these systems at scale. Foundation models are large, costly, and power-hungry. Huyen exposes the hidden roadblocks that degrade user experience and bleed budgets, and gives readers actionable ways to optimize token usage, manage context windows, and minimize latency so that applications stay fast and cost-efficient.
About the Author
Chip Huyen brings a rare combination of clarity as an educator and deep systems expertise. Her career spans roles at Snorkel AI and NVIDIA, as well as her own startup in the infrastructure space. She is currently working on fast data analytics on GPUs at Voltron Data. She has taught Machine Learning Systems Design at Stanford University and has worked on software deployment and hardware optimization for years. Her first book, Designing Machine Learning Systems, became a widely read reference, and this new book is a modern companion to it.
This book is for anyone who wants to leverage foundation models to solve real-world problems. This is a technical book, so the language of this book is geared toward technical roles, including AI engineers, ML engineers, data scientists, engineering managers, and technical product managers. This book is for you if you can relate to one of the following scenarios:
You’re building or optimizing an AI application, whether you’re starting from scratch or looking to move beyond the demo phase into a production-ready stage. You may also be facing issues like hallucinations, security, latency, or costs, and need targeted solutions.
You want to streamline your team’s AI development process, making it more systematic, faster, and reliable.
You want to understand how your organization can leverage foundation models to improve the business’s bottom line and how to build a team to do so.
Chip Huyen works at the intersection of AI, data, and storytelling. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup (acquired), worked on GPU optimization for data processing, and taught Machine Learning Systems Design at Stanford. Her last book, Designing Machine Learning Systems, is an Amazon bestseller in AI and has been translated into over 10 languages.
Product Details
ASIN: B0DPLNK9GN
Publisher: O'Reilly Media
Publication Date: December 4, 2024
Edition: 1st Edition
Language: English
File Size: 38.6 MB
Print Length: 854 Pages
ISBN-13: 978-1098166267
Kindle Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 2
Traditional software development is like building a house. You use materials like wood and steel, and you have a plan. You put the roof on, and the building usually stays where it is unless something really bad happens. Developing a machine learning system is more like managing a living ecosystem. Data pipelines are complex, hardware is limited, algorithms keep changing, and engineers, product managers, and business leaders don’t always share the same priorities. What sets it apart most, though, is that it is driven by data more than anything else. And data is living, breathing, and inherently unpredictable. It responds to market movements, human behavior, and seasonal trends.
Once a model is deployed in the wild, it begins to degrade. In Designing Machine Learning Systems, engineer and educator Chip Huyen offers a roadmap for navigating this unpredictability. Huyen treats machine learning as a holistic engineering discipline, not a stand-alone mathematical exercise. She helps designers navigate the messy reality of building systems that are not just accurate in the lab, but also reliable, scalable, and resilient in production.
The Isolated Model Catch-22
Teams just getting started in the field often spend 95% of their effort on the model itself: the specific math or architecture used to make a prediction. In practice, the model is often the least complicated part. The real points of friction are at the interfaces between the model and the outside world.
[Data Ingestion] -> [Feature Engineering] -> [THE MODEL] -> [Deployment and Monitoring]
[Continuous Feedback Loop]
Huyen’s guiding principle is that every design decision, from how raw logs are converted into training data to how often a system triggers an automated update, should support the larger business goal. If your user expects a response within fifty milliseconds, a very accurate model that takes three seconds to respond is useless. In this book, engineers learn to think about their code from the perspective of holistic system design and to break free from technical silos.
Design for the Whole Lifecycle
The book’s iterative structure divides the daunting task of system architecture into manageable, real-world phases. Huyen grounds each concept in industry case studies rather than abstract theory, showing how leading tech companies solve these problems at scale. Developers will learn the fundamental pillars of production-grade machine learning:
Data Engineering with a Purpose: Build robust data pipelines instead of relying on static CSV files. You will learn to select, clean, and engineer features that really capture the problem you are trying to solve, while making sure your training data reflects what the system will actually see in the real world.
The Continuous Delivery Circle: Machine learning shouldn’t need manual deployment every time something changes. The book shows how to automate the infrastructure to continuously evaluate, test, and update models without disrupting the end user.
Navigating the Reality After Deployment: When you hit “deploy”, the real work starts. Huyen gives a masterclass on building monitoring systems that catch silent failures before they hurt revenue or erode user trust. One example is data drift, where real-world behavior slowly diverges from the historical data on which the model was trained.
From Individual Models to Shared Platforms
When a company scales, building a brand-new pipeline for every feature is a huge waste of engineering time. The book addresses this maturity curve by explaining how to move from working on individual projects to building a unified machine learning platform.
Traditional method: [Project A Pipe] -> [Model A]; [Project B Pipe] -> [Model B]
Platform approach: [Centralized Feature Store and Model Registry] -> [Model A], [Model B]
By investing in centralized feature stores, standard evaluation blocks, and standard monitoring tools, organizations can use a single, common infrastructure to support dozens of use cases across business units.
The Call to Responsibility
Finally, the book discusses responsible system design, a topic often overlooked but critical for safety in modern software. Machine learning systems can easily mirror historical biases, violate privacy laws, or go off the rails on edge cases. Huyen views fairness, interpretability, and security as basic engineering requirements that should be built into the system from day one, not tacked on as ethical afterthoughts.
About the Author
Chip Huyen writes from deep, real-world experience. She works on accelerating data analytics on GPUs at Voltron Data and is a co-founder of Claypot AI. She has spent years designing the infrastructure that links raw data to live systems. She has taught Machine Learning Systems Design at Stanford University, sharing her hard-earned insights with the next generation of engineers. Her previous roles include early engineering positions at Snorkel AI and NVIDIA. With this background, Designing Machine Learning Systems is a useful, tried-and-true guide for developers who want to build systems that last.
This book is for anyone who wants to leverage ML to solve real-world problems. ML in this book refers to both deep learning and classical algorithms, with a leaning toward ML systems at scale, such as those seen at medium to large enterprises and fast-growing startups. Systems at a smaller scale tend to be less complex and might benefit less from the comprehensive approach laid out in this book.
Because my background is engineering, the language of this book is geared toward engineers, including ML engineers, data scientists, data engineers, ML platform engineers, and engineering managers.
Ever since the first machine learning course I taught at Stanford in 2017, many people have asked me for advice on how to deploy ML models at their organizations. These questions can be generic, such as "What model should I use?" "How often should I retrain my model?" "How can I detect data distribution shifts?" "How do I ensure that the features used during training are consistent with the features used during inference?"
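One of the questions above, how to detect data distribution shifts, can be illustrated with a small sketch. It uses the Population Stability Index (PSI), a common drift measure that is swapped in here for illustration and is not presented as the book's method. The bin count, the floor value for empty bins, and the sample data are assumptions.

```python
import math


def population_stability_index(reference, live, bins=10):
    """Compare two samples of one numeric feature; larger values mean more shift."""
    low, high = min(reference), max(reference)
    width = (high - low) / bins or 1.0  # avoid zero width for constant features

    def bin_fractions(values):
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor each fraction so empty bins do not cause log(0).
        return [max(count / len(values), 1e-6) for count in counts]

    ref_frac = bin_fractions(reference)
    live_frac = bin_fractions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_frac, live_frac))


training_sample = [i / 100 for i in range(100)]       # spread over [0, 1)
shifted_sample = [0.5 + i / 200 for i in range(100)]  # concentrated on [0.5, 1)

print(population_stability_index(training_sample, training_sample))  # 0.0
print(population_stability_index(training_sample, shifted_sample))
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right alert threshold depends on the feature and the cost of a false alarm.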
Product Details
ASIN: B0B1LGL2SR
Publisher: O'Reilly Media
Publication Date: May 17, 2022
Edition: 1st Edition
Language: English
File Size: 8.7 MB
Print Length: 617 Pages
ISBN-13: 978-1098107925
Kindle Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 3
Language artificial intelligence (AI) has exploded in recent years. Thanks to huge advances in deep learning, machines can now process, interpret, and generate human text with a nuance that was purely the realm of science fiction a decade ago. We are now teaching computers to understand context, not just to recognize words. This change is quietly transforming software development and creating a wave of text-driven industries and brand-new product categories.
You don’t need a PhD in machine learning to harness this power as a Python developer. The tools already exist.
What You’ll Learn
This guide is not going to drown you in dense theoretical math; it focuses on actionable implementation. It’s a practical, hands-on guide to closing the gap between complex AI research and production-ready Python code, so you can add sophisticated text-intelligence features to your applications today. You’ll learn how to use pre-trained Large Language Models (LLMs) to solve real-world problems and build scalable text systems.
Generative AI for Copywriting & Summarization: Learn how to prompt and anchor models to write high-quality content, summarize huge articles into digestible bullet points, and keep a consistent brand voice without human intervention.
Semantic Search: Break free from the rigid confines of traditional keyword matching. You will learn how to build search engines that really understand user intent and retrieve relevant information even when the exact search terms are not present in the source text.
Scalable Document Intelligence: When you are handling thousands of PDFs, emails, or customer reviews, manual sorting is not possible. You will learn how to build automated systems that classify, tag, and cluster enormous text datasets to reveal hidden trends quickly.
Leveraging the Open-Source Ecosystem: You won’t have to reinvent the wheel. We will show how to use existing Python libraries and open source hubs to deploy world-class classification and search tools in just a few lines of code.
Core Competencies In Depth
You will move from being an AI API user to an architect of sophisticated language pipelines as you work through the chapters. Here’s a closer look at the specific engineering skills you’ll acquire:
1. Advanced LLM and Document Clustering Pipelines. Sorting text into predefined buckets is one thing, but what if you don’t know what topics exist in your data? You will create pipelines that process unstructured documents, cluster them according to conceptual similarity, and automatically extract the high-level themes or topics.
2. Modern Search Architecture: Dense Retrieval and Rerankers. Building a modern search engine takes far more than a simple database query. You will build multi-stage search pipelines and learn the inner workings of dense retrieval, the process of transforming text into mathematical vectors that encode meaning. You will learn how to use bi-encoders for fast initial retrieval and how to fine-tune cross-encoder rerankers to give the user the best possible results.
3. Deconstructing the Transformer (BERT & GPT). To write good code for these models, you need to know what’s going on inside. We’ll unpack the Transformer, the neural network architecture driving the AI revolution. You will learn about the structural differences between encoder models like BERT, which excel at understanding and classifying text, and decoder models like GPT, which are designed to generate text.
4. Understanding How LLMs Are Trained. How does a model go from a random string of numbers to something that writes poetry or code? We discuss the training process in stages: self-supervised pretraining on massive amounts of raw internet data, followed by alignment techniques that make the model safe, useful, and coherent.
5. Strategic Fine-Tuning & Optimization. No single AI model can solve every business problem right out of the gate. This book features a complete breakdown of the different ways to adapt a model to your domain:
Prompting and in-context learning. Best suited for: few-shot examples, rapid prototyping. Key advantage: no training cost, instant results.
Generative fine-tuning. Best suited for: tone adjustment, specialized formats, domain jargon. Key advantage: permanent change of model behavior for generation.
Contrastive fine-tuning. Best suited for: training custom embeddings for search/similarity.
This book is for Python developers, data analysts, and software engineers who want to level up their skills. If you are comfortable with Python dictionaries, loops, and basic APIs, you have all the prerequisites covered. We avoid academic gatekeeping and place a heavy emphasis on real-world usefulness, architectural best practices, and clean code. By the time you finish this book, you won’t just be talking about the AI revolution; you will be building it. This book is your roadmap whether you want to improve internal company workflows, create a disruptive new SaaS product, or just understand how modern AI works behind the scenes.
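The multi-stage search pipeline described in item 2 above can be sketched in a few lines. This is an illustration of the control flow only, and everything in it is an assumption: `embed` is a toy hashed word-count vector standing in for a real bi-encoder, and `pair_score` is a toy word and phrase overlap standing in for a real cross-encoder. Neither captures meaning the way a trained model does.

```python
import math
import re
import zlib

DIM = 64  # size of the toy embedding vectors (an arbitrary choice)


def tokens(text):
    """Split text into lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for a bi-encoder: encodes one text on its own into a unit vector."""
    vector = [0.0] * DIM
    for token in tokens(text):
        vector[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def pair_score(query, doc):
    """Toy stand-in for a cross-encoder: scores the query and document together."""
    q, d = tokens(query), tokens(doc)
    shared_words = len(set(q) & set(d))
    shared_bigrams = len(set(zip(q, q[1:])) & set(zip(d, d[1:])))
    return shared_words + 2 * shared_bigrams


def search(query, docs, doc_vectors, first_stage_k=3, final_k=1):
    """Stage 1: cheap vector similarity. Stage 2: slower pairwise reranking."""
    query_vector = embed(query)
    similarities = [
        (sum(a * b for a, b in zip(query_vector, vector)), index)
        for index, vector in enumerate(doc_vectors)
    ]
    candidates = [index for _, index in sorted(similarities, reverse=True)[:first_stage_k]]
    reranked = sorted(candidates, key=lambda i: pair_score(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:final_k]]


docs = [
    "how to reset your password",
    "password policy for contractors",
    "reset the office router",
    "holiday schedule",
]
doc_vectors = [embed(doc) for doc in docs]  # computed once, ahead of query time
print(search("reset your password", docs, doc_vectors))
```

The design point is the split in cost: document vectors are computed once and compared cheaply, while the more expensive pairwise scorer only ever sees the short list of candidates.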
From the Preface
Large language models (LLMs) have had a profound and far-reaching impact on the world. By enabling machines to better understand and generate human-like language, LLMs have opened new possibilities in the field of AI and impacted entire industries.
This book provides a comprehensive and highly visual introduction to the world of LLMs, covering both the conceptual foundations and practical applications. From word representations that preceded deep learning to the cutting-edge (at the time of this writing) Transformer architecture, we will explore the history and evolution of LLMs. We delve into the inner workings of LLMs, exploring their architectures, training methods, and fine-tuning techniques. We also examine various applications of LLMs in text classification, clustering, topic modeling, chatbots, search engines, and more.
With its unique blend of intuition-building, applications, and illustrative style, we hope that this book provides the ideal foundation for those looking to explore the exciting world of LLMs. Whether you are a beginner or an expert, we invite you to join us on this journey to start building with LLMs.
Jay Alammar is Director and Engineering Fellow at Cohere (pioneering provider of large language models as an API). In this role, he advises and educates enterprises and the developer community on using language models for practical use cases. Through his popular AI/ML blog, Jay has helped millions of researchers and engineers visually understand machine learning tools and concepts from the basic (ending up in the documentation of packages like NumPy and pandas) to the cutting-edge (Transformers, BERT, GPT-3, Stable Diffusion). Jay is also a co-creator of popular machine learning and natural language processing courses on Deeplearning.ai and Udacity.
Product Details
ASIN: B0DGZ46G88
Publisher: O'Reilly Media
Publication Date: September 11, 2024
Edition: 1st Edition
Language: English
File Size: 21.6 MB
Print Length: 693 Pages
ISBN-13: 978-1098150921
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 4
The time it takes to go from concept to prototype has been slashed from months to days, thanks to the rise of Generative AI. But as foundation models get more sophisticated, we are moving beyond single text prompts. We are in the middle of a huge paradigm shift in the software engineering world: the transition to autonomous AI agents.
By combining large language models, external tools, knowledge bases, persistent memory, and continuous feedback loops, we can coordinate multiple layers of machine reasoning. This allows us to tackle highly ambiguous, multi-step organizational challenges. Intelligent agents are already boosting team productivity across industries, whether as autonomous coding assistants, deep-dive research engines, or predictive financial analysts.
However, building an agent that can handle real-world chaos is very different from building one that works well in a controlled demonstration. Real autonomy requires extensive planning, drafting, error handling, and self-correction. Many companies still struggle to move these systems from the research lab into production, especially as the underlying frameworks and best practices evolve rapidly.
This book aims to be your steady guide through a technical landscape that is both complex and rapidly changing. Michael Albada’s book offers a practical, research-based blueprint for designing, testing, and assembling both single-agent systems and collaborative multi-agent systems. It cuts through the academic jargon and hype to give you the tooling and architectural approaches you need to build reliable, production-ready agentic solutions.
Core Pillars of Agent Architecture
To build effective agents, it helps to understand how different cognitive components interact with the language model. This book distills the development lifecycle into actionable, bite-sized design principles.
Defining Foundation-Model Agency: Discover what makes a fully autonomous agent different from a simple API wrapper. You will see how advanced models serve as the central reasoning engine, deciding when to act, when to think, and when to bring in humans.
Component Blueprinting: Know the anatomy of an agent. We go deep into building robust planning modules, handling short-term and long-term memory streams, and building sandboxed environments in which models can safely execute code or query databases.
Managing Multi-Agent Ecosystems: Single agents often buckle under heavy and diverse workloads. You’ll learn how to break down complex corporate goals into specialized microtasks and assign them to a network of specialized agents that talk, argue, and work together to accomplish a goal.
Architecture Trade-offs: Engineering is about making trade-offs. We weigh latency, operational cost, and accuracy to help you decide when to use a lightweight single agent and when to use a more accurate but resource-heavy multi-agent network.
Technical Deep Dives and Deliverables
As you work your way through the text, you will move from theory to practical application. This book gives you concrete execution tracks to help you develop and deploy your own AI solutions.
1. The Anatomy of an Agent. To create a system capable of operating independently, you need a good understanding of its inner workings. You will learn how to build the four fundamental pillars that turn a static model into an active, agentic workflow:
Reasoning and planning. Responsibility: breaking down big goals into smaller, more manageable milestones. Technical implementation: ReAct loops, Chain-of-Thought, and Plan-and-Solve.
Memory systems. Responsibility: keeping context over long-running conversations. Technical implementation: conversational buffers, semantic caches, and vector databases.
Tool integration. Technical implementation: APIs, web scrapers, and sandboxed code environments.
Self-correction. Technical implementation: critic-generator loops, backtesting, and automated unit tests.
2. Multi-Agent Choreography and Multi-Agent Orchestration
A single agent can easily end up with too much on its plate when it has to span multiple domains: writing software, testing it, and writing the documentation. You will learn the two main design patterns for multi-agent systems. First, you will create orchestrated networks in which a central manager agent assigns tasks to others. Next, you’ll explore choreographed systems, where dedicated agents pass work back and forth like human co-workers on an agile team, interacting dynamically through shared event streams.
3. Production Deployment and Guardrails. An agent that runs unchecked can be dangerous, even on a local machine. This book makes enterprise security and reliability a top priority. You will learn how to install strict guardrails to prevent infinite loops, handle rate limits gracefully, secure API keys, and sandbox agent activity so that agents never accidentally corrupt database environments or leak sensitive corporate data.
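The agent loop and the infinite-loop guardrail described in the three items above can be sketched together. This is an illustration under stated assumptions, not code from the book: `scripted_policy` is a hard-coded stand-in for the language model that would normally choose each action, and the tool names, the `run_agent` function, and the `StepLimitExceeded` exception are all hypothetical.

```python
from typing import Callable

# Allowlist of tools the agent may call; anything else is rejected.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(part) for part in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


class StepLimitExceeded(RuntimeError):
    """Raised when the agent fails to finish within its step budget."""


def run_agent(policy, goal, max_steps=5):
    """Alternate between choosing an action and observing its result."""
    history = []  # (action, argument, observation) tuples seen so far
    for _ in range(max_steps):  # guardrail: a hard cap prevents infinite loops
        action, argument = policy(goal, history)
        if action == "finish":
            return argument
        if action not in TOOLS:  # guardrail: only allowlisted tools may run
            observation = f"error: unknown tool {action!r}"
        else:
            try:
                observation = TOOLS[action](argument)
            except Exception as exc:  # feed tool failures back instead of crashing
                observation = f"error: {exc}"
        history.append((action, argument, observation))
    raise StepLimitExceeded(f"no answer after {max_steps} steps")


def scripted_policy(goal, history):
    """Hard-coded stand-in for a model: call one tool, then report the result."""
    if not history:
        return ("add", "2+3")
    return ("finish", f"The answer is {history[-1][2]}")


print(run_agent(scripted_policy, "What is 2+3?"))  # The answer is 5
```

Returning errors to the policy as observations, rather than raising them, gives the agent a chance to self-correct, while the step cap bounds the cost of an agent that never does.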
Who Should Read This Book: This guide is for you if you are an engineering leader charged with bringing bleeding-edge AI innovation into your company, a data scientist looking to go beyond basic RAG pipelines, or a software architect who wants to deploy autonomous workflows. You won’t find speculative AI philosophy here, only clean architecture, robust code patterns, and scalable deployment strategies. By the time you finish this book, you will not only understand what the buzz around AI agents is about, but you will also have the practical blueprints, code foundations, and architectural confidence you need to implement them successfully in your own field.
What This Book Is About
This book provides a practical framework for building robust applications using AI agents. It addresses key challenges and offers solutions to questions such as:
What defines an AI agent, and when should I use one? How do agents differ from traditional machine learning (ML) systems? How do I design agent architectures for specific use cases, including scenario selection, and core components like tools, memory, planning, and orchestration?
What are effective strategies for agent planning, reasoning, execution, tool selection, and topologies like chains, trees, and graphs?
How can I enable agents to learn from experience through nonparametric methods, fine-tuning, and transfer learning?
How do I scale from single-agent to multiagent systems, including coordination patterns like democratic, hierarchical, or actor-critic approaches?
How do I evaluate and improve agent performance with metrics, testing, and production monitoring?
What tools and frameworks are best for development, deployment, and securing agents against risks?
How do I ensure agents are safe, ethical, and scalable, with considerations for user experience (UX), trust, bias, fairness, and regulatory compliance?
The content draws from established engineering principles and emerging practices in AI agents, with case studies (such as customer support, personal assistant, legal, advertising, and code review agents) and discussions on trade-offs to help you tailor solutions to your needs.
What This Book Is Not
This book isn’t an introduction to AI or ML basics. It assumes familiarity with concepts like neural networks, natural language processing, and basic programming in languages like Python. If you’re new to these, pointers to resources are provided, but the focus is on applied agent building.
It’s also not a step-by-step tutorial for specific tools, as technologies evolve rapidly. Instead, it offers guidance on evaluating and selecting tools, with pseudocode and examples to illustrate concepts. For hands-on implementation, online tutorials and documentation are recommended, including frameworks like LangChain and AutoGen.
Who This Book Is For
This book is for engineers, developers, and technical leaders aiming to build AI agent-based applications. It’s geared toward roles like AI engineers, software developers, ML engineers, data scientists, and product managers with a technical bent. You might relate to scenarios like the following:
You’re tasked with building an autonomous system for decision support or interactive services.
You have a working agent prototype and you want to harden it and get it ready for production.
Your team struggles with agent reliability—handling failures, adapting to dynamic environments, or orchestrating complex tasks—and you want systematic approaches including orchestration, memory, and learning from experience.
You’re integrating agents into existing workflows and seek best practices for scalability, multiagent coordination, UX design, measurement, validation, monitoring, and security.
Michael Albada is a machine learning engineer with nine years of experience designing, building, and deploying large-scale machine learning solutions at Uber, ServiceNow, and Microsoft, with experience in recommendation systems, geospatial modeling, cybersecurity, natural language processing, large language models, and the development of large scale multi-agent systems for cybersecurity. He received his B.A. from Stanford University, M.Phil. from the University of Cambridge, and M.S. from Georgia Tech with a concentration in machine learning.
ASIN: B0FR9PN9RX
Publisher: O'Reilly Media
Publication Date: September 16, 2025
Edition: 1st Edition
Language: English
File Size: 3.7 MB
Print Length: 610 Pages
ISBN-13: 978-1098176464
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 5
A company spends a lot of money. A great team puts six months into developing a state-of-the-art AI model. On a local machine, it’s a masterpiece. But once it meets real infrastructure, it falls apart in production. Latency goes up, costs balloon, data drift destroys accuracy, and the project slowly dies. This is the quiet crisis of modern technology. While the headlines are all about theoretical advances, The AI Engineering Bible fills that gap, arming engineering teams with the tools to fight back against brittle pipelines and unpredictable deployments.
A Systematic Approach to Production AI
Building an AI system is not an extended data science experiment; it is a complex software engineering problem. This is a rigorous, full-stack playbook for engineers, technical leads, architects, and product owners who need to turn volatile machine learning models into reliable, enterprise-grade software. It is not academic theory but a proven framework for the whole system life cycle:
[Design & Strategy] -> [Pipeline Architecture] -> [Production Deployment] -> [Optimization and Scale]
1. Architecture, Strategy, and Foundations. A successful system is well thought out before the first model is trained. In the first part of this guide, you will learn:
Tying technical metrics to business results: Translate fuzzy business goals into concrete optimization targets.
Compliance architecture: Build strong data policies, privacy protocols, and ethical protections into the DNA of the system from the outset.
Human-AI interaction design: Create intuitive interfaces and open feedback loops, balancing automation and human oversight.
2. The core system layers need to be built with structural integrity to go from a notebook script to an automated pipeline. You’ll learn how to build the solid backbone of your system: Data Orchestration – Construct scalable ingestion pipelines and pre-processing workflows to deal with dirty, real-world data. Deterministic training cycles. Strict version control of data, code, and weights to make sure that every experiment is perfectly reproducible. Model Selection and Integration: Select the right model architecture (proprietary, open source, or custom) and integrate smoothly with existing software stacks.
3. Model & Infrastructure Deployment in the Enterprise. Deploying a model takes more than exposing an API endpoint.
Security Architecture: Secure API integrations, mitigate adversarial inputs, and enforce strict access controls.
Cloud Orchestration & Containerization: Build environment-agnostic, self-healing deployments with Docker and Kubernetes.
MLOps (CI/CD for Machine Learning): Build automated test and deployment pipelines.
4. Maximize Performance, Efficiency & Scale. An AI system that works perfectly for a hundred users can quickly fail when it has to serve a hundred thousand. Architecture and Scaling Approach:
Focus Area: Throughput & Load. Key Strategies: distributed inference, better load balancing, and request batching. Target Result: no architectural bottlenecks at high load.
Focus Area: Data Privacy. Key Strategies: federated learning and distributed data processing. Target Result: improving the model while keeping data close to the user.
Focus Area: Hardware Alignment. Key Strategies: system profiling to identify memory bottlenecks, plus CPU/GPU optimization. Target Result: no wasted resources.
Optimization Deep-Dive. Compute costs money. Real AI engineering is elegant optimization: squeezing the most out of every watt and dollar.
Model Compression & Quantization: Learn how to shrink model footprints with weight pruning and mixed-precision quantization (FP32 to INT8) to significantly lower inference cost while maintaining accuracy.
Latency Reduction: Learn how to optimize your software stack for sub-millisecond response times in mission-critical applications.
Long-Term Sustainability and Management. The work starts after deployment. AI systems that aren’t watched tend to rot.
Automated Drift Alerts: Get alerted on data and concept drift to reduce the negative impact on user experience.
Automated Retraining Strategies: Create secure, closed-loop systems that retrain and redeploy models on new production data without regressions.
Governance and Auditing: Enable transparent logs and lineage tracking for strong regulatory compliance and internal security audits.
Cut Through the Hype. The industry doesn’t need more proof-of-concept models; it needs real AI systems built for the real world. Remove the guesswork from architecture. Give your team a trusted technical point of reference.
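To make the FP32-to-INT8 idea mentioned above concrete, here is a minimal sketch in plain Python. It assumes symmetric, per-tensor quantization, which is only one common scheme; the function names are illustrative and nothing here is taken from the book. It covers neither weight pruning nor mixed precision.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]          # toy FP32 weights (made up)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of four, at the price of a rounding error of at most half a step (`scale / 2`). Whether that error is acceptable depends on the model, so accuracy has to be measured after quantizing.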
Thomas R. Caldwell is a technology strategist, AI researcher, and best-selling author specializing in applied artificial intelligence and next-generation intelligent systems. With a career focused on bridging cutting-edge research and real-world implementation, he has advised startups, enterprise teams, and product leaders on building scalable AI-driven solutions.
Caldwell is the author of three internationally recognized titles—The AI Engineering Bible, The Agentic AI Bible, and The LLM Bible—widely used by engineers, founders, and innovation teams seeking practical frameworks for designing, deploying, and managing modern AI architectures. His work focuses on making complex AI concepts accessible, actionable, and production-ready for organizations operating in rapidly evolving technological landscapes.
ASIN: B0F4KZJN6Z
Publication Date: April 11, 2025
Language: English
File Size: 13.2 MB
Print Length: 286 Pages
Features
Screen Reader: Supported
Enhanced Typesetting: Enabled
X-Ray: Enabled
Word Wise: Not Enabled
Page Flip: Enabled
Post: 6
A Practical Guide to LLMOps at Production Scale: Creating the LLM Twin
Much of what we read about large language models today lives in a Jupyter notebook. You write a few lines of Python, call a wrapper API, and watch a chatbot produce text. It looks like magic. But if you’ve ever tried to turn that prototype into a high-availability, cost-effective, stable product that real users can depend on, you know the magic fades fast. There is a huge gap between a cool demo and a production-quality system. This is precisely why we wrote this book. We focus on the real workings of modern LLMOps, not superficial tutorials. We will work on a real-world project: creating a digital LLM Twin. Along the way, you will learn how to design, train, deploy, and monitor a sophisticated AI architecture that works on your local machine and scales smoothly in the cloud.
What Is Different about This Book?
This is not a shallow book of API recipes or a dry, formula-filled academic textbook. It’s a blueprint for AI and software developers who need to ship real code. We treat the LLM application as a living ecosystem, not as separate parts. You’ll learn how to handle messy, real-world data engineering, set up pipelines for Retrieval-Augmented Generation (RAG), and fine-tune models to suit specific human preferences. Each chapter builds on the previous one to move you from isolated sandboxes to a modular, scalable, production-grade reality.
Skills You Will Acquire:
Architecting the LLM Twin: Understand how the data, training, and inference layers interact, the foundation of the high-level system design needed to build an AI replica.
Production Data Engineering: Build robust data pipelines that feed your models accurate, clean data without crashing at scale.
Advanced RAG Pipelines: Go beyond basic vector search.
Inference Optimization and Cloud Deployment: Close the gap between data science and DevOps. Use AWS and modern orchestrators to deploy the complete solution and optimize your models for low latency and high availability.
Inside the Blueprint: Contents
We have structured your journey in a logical order, similar to how you would run an enterprise AI project:
The LLM Twin: Idea, Architecture, and Vision
The Setting: Environments, Tools, and Local Installation
The Basics: Generative AI Data Engineering
The Importance of Context: How to Build the RAG Feature Pipeline
Deep Dive: Supervised Fine-Tuning (SFT)
Behavioral Control: Tuning the Brain with Preference Alignment
The Truth Metrics: Measuring LLMs
Efficient and Fast: State-of-the-Art Inference Optimization
Assembling the Pieces: The Complete RAG Inference Pipeline
Before Going Live: Production Inference Pipeline Deployment
Playing the Long Game: Best Practices for MLOps and LLMOps
Who Is This Book For?
This book is for AI engineers, NLP practitioners, and software architects who want to go from prompt engineering to full systems engineering. We expect you to be a proficient Python developer, comfortable with cloud environments such as AWS, and already familiar with the basics of the generative AI landscape. We won’t bore you with what a token is; instead, we’ll focus on how to handle thousands of them without breaking your cloud budget.
Your copy comes with exclusive bonuses. With your purchase, you get a whole ecosystem of learning tools, so you’re not just reading, you’re building:
Digital PDF Edition: Fully searchable and portable across devices for fast access while working.
Interactive AI Assistant: A dedicated, context-aware AI companion trained specifically on the contents of this book to answer your implementation questions in real time.
Next-Gen Reader Experience: Access to a more advanced, interactive reading platform that makes it easy to copy, test, and adapt code snippets to your own projects.
ASIN: B0D1WR77BZ
Publisher: Packt Publishing
Publication Date: October 22, 2024
Edition: 1st Edition
Language: English
File Size: 16.0 MB
ISBN-13: 978-1836200062
Print Length: 783 Pages
Features
Screen Reader: Supported
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 7
The current buzzword in the technology sector is “AI agents”. We’ve all seen the demos where a chatbot supposedly manages your calendar, books a flight, or creates a new marketing campaign. But anyone who has tried to implement these systems in the real world knows the truth: most of them are fragile. They break if a user asks an unexpected question, they get stuck in infinite loops, and they need constant human care. The jump from a slick tech demo to a robust, autonomous software system is huge. AI Agents in Action was written specifically to address that gap. This book is not about building dumb script-following bots that just repeat pre-written responses. It is a practical engineering guide to designing, building, and deploying truly autonomous, trustworthy AI assistants that can operate reliably in unpredictable, high-stakes environments.
Taking Chatbots to the Next Level: Most early-stage AI applications are passive; they just sit there waiting for a user to type a prompt, generate a single response, and forget that the conversation ever happened. Real AI agents completely change this paradigm. They don’t just react to things; they plan, they decide, they use external software tools, and they learn from their mistakes behind the scenes.
Consider the complexity of a modern business operation. Users, databases, APIs, and business logic all have to dance together in an orchestrated way. An autonomous agent tames this complexity by taking these messy interactions and breaking them down into self-contained components that can handle workflows on their own. If you want to build a fully autonomous customer service infrastructure or a set of background agents that automate data analysis, this book will give you a battle-tested framework to do it.
What You Will Learn
Author Micheal Lanham skips the fluff tutorials and goes straight into architecture, reliability, and system design. You’ll learn how to:
Develop Core Behavior Patterns: Move beyond trial-and-error prompting. Learn to program predictable behavior patterns into your agents, so they act logically and don’t get distracted.
Implement True Memory and Knowledge Systems: An agent is only as good as its context. You’ll develop robust retrieval-augmented generation (RAG) architectures and long-term memory systems to ensure your agents use corporate data properly and remember past interactions.
Orchestrate Multi-Agent Systems: Complex problems seldom have one solution. Learn to build specialized agents like a researcher, a writer, and a critic, and make them collaborate, debate, and solve problems as a team.
Build Self-Improving Feedback Loops: Design software architectures where agents can inspect their own results, incorporate user corrections, and progressively improve their abilities without re-coding.
Give Agents Hands-On Capabilities: Don’t just let your AI talk.
The Blueprint: An Inside Look
Introduction to Agents and Their World: Autonomy defined. The new software landscape.
Using the Power of Large Language Models: Tuning the core engine of your agents.
Modern Assistant Frameworks: Getting started with engaging GPT assistants.
Exploring Multi-Agent Systems: Designing collaborative AI teams to solve complex problems.
Agent Empowerment with Action: From models to tools, APIs, and external software.
Autonomous Development Frameworks (Framework/Tool: What to Do With It):
OpenAI Assistants API: Use native code execution and deal with long-running threads.
LangChain & Prompt Flow: Chain complex prompts together and visualize execution paths.
AutoGen & CrewAI: Create, operate, and optimize multi-agent cooperative networks.
GPT Nexus: Deploy and scale your assistants in production.
This is not a book that teaches programming basics. For software engineers, data scientists, and intermediate Python programmers, this is your ticket to the next level in AI engineering. You don’t need a PhD in machine learning to read it. You’re ready if you know basic Python syntax, know how to make API calls, and are looking to learn how to build AI systems that can handle high-stakes negotiations and complex workflows with zero supervision.
About the Author
Micheal Lanham is a veteran software innovator with over 20 years of experience navigating major technological shifts. He has authored several widely respected books on artificial intelligence, one of which is Manning’s Evolutionary Deep Learning. Throughout his career, he has been developing practical solutions to complex engineering problems. His unique background combines rigorous academic research and production-first development.
Micheal Lanham is a proven software and tech innovator with over 20 years of experience. He has developed a broad range of software applications in areas such as games, graphics, web, desktop, engineering, artificial intelligence, GIS, and machine learning applications for a variety of industries. At the turn of the millennium, Micheal began working with neural networks and evolutionary algorithms in game development.
ASIN: B0DWNH8BJ4
Publisher: Manning
Publication Date: March 4, 2025
Language: English
File Size: 42.2 MB
ISBN-13: 978-1638357377
Print Length: 562 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 8
The physicist Richard Feynman famously said that if you could not build it, then you did not really understand it. This philosophy is the best answer to the “black box” problem of artificial intelligence. We’re in the age of massive, multi-billion parameter language models that seem like sorcery, but underneath the marketing hype is a beautifully elegant architecture of pure mathematics and code. If you want to stop treating AI like magic and start treating it like engineering, you need to pop the hood, pull out the components, and build them up with your own hands. Build a Large Language Model (from Scratch) is your step-by-step guide to doing just that.
It is a practical guide from renowned AI research engineer Sebastian Raschka that takes you step by step through building a fully functional, GPT-style large language model from scratch with nothing but raw Python and PyTorch.
From Blank Script to Personal Assistant. The journey is carefully constructed to mirror the true development cycle of an enterprise-grade foundational model:
Data Preparation and Tokenization: Convert unstructured, unclean text data into clean numerical tensors that a neural network can digest.
The Attention Mechanism: The book explains the attention mechanism, which is the core of modern generative AI. How does a machine look at a sentence and know which words depend on which? The answer is self-attention. You will learn the attention equations and code them line by line, and see how the Query, Key, and Value vectors enable a model to dynamically decide which words in a sequence matter. Then you will extend your code to causal multi-head attention, the very mechanism that powers GPT-2. This allows the model to handle complex linguistic relationships while ensuring that it only looks at past tokens to predict the next word.
Model Architecture: Lay out the structural bones of a language model: the basic Transformer blocks, embedding layers, layer normalization, and feed-forward networks.
The Pretraining Pipeline: Build a full training loop from scratch so your model can learn the nuances of human language by predicting the next token over a general corpus.
Technical Stack and Tooling. This book deliberately avoids higher-level abstractions such as Hugging Face and LangChain, so that you know exactly what is going on under the hood. Instead, you work directly with the underlying layers:
Component/Phase: Language Framework. Implementation Strategy: pure Python 3.x with readable, explicit programming paradigms.
Component/Phase: Deep Learning Engine. Implementation Strategy: vanilla PyTorch tensors and modules (nn.Module).
This book is written specifically for intermediate Python developers, machine learning engineers, and data scientists who want a definitive, code-first mastery of transformer architectures. You don’t need any background in advanced physics or high-level calculus to follow along. If you’re comfortable writing object-oriented code, know Python, know the basics of machine learning (training loops, loss gradients), and want to understand what’s going on under the hood in modern AI, this book will give you the deepest understanding possible, the understanding that comes from building it yourself.
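As a rough illustration of the Query/Key/Value idea and the causal mask described above, here is a dependency-free sketch of single-head scaled dot-product attention. It is not the book's PyTorch code. The toy vectors are invented, and reusing the same vectors as queries, keys, and values (in place of learned projections) is a simplifying assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        # Causal mask: token i may only attend to tokens 0..i, never to later ones.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the visible value vectors.
        context = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(context)
        # Pad with zeros so each row shows the masked (future) positions.
        all_weights.append(weights + [0.0] * (len(queries) - i - 1))
    return outputs, all_weights


# Toy embeddings for three tokens (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row is one token's attention distribution. The first token can only attend to itself, and every entry to the right of the diagonal is zero, which is what "only looks at past tokens" means in practice.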
About the Author
Sebastian Raschka, PhD, is an LLM Research Engineer at Lightning AI and a former Assistant Professor of Statistics at the University of Wisconsin-Madison. Sebastian’s hands-on artificial intelligence experience bridges the gap between academic rigor and production-first software engineering. He is a prolific open-source contributor and author of several international best-sellers, including Machine Learning with PyTorch and Scikit-Learn, and Machine Learning Q and AI.
Sebastian Raschka has been working on machine learning and AI for more than a decade. Sebastian joined Lightning AI in 2022, where he now focuses on AI and LLM research, developing open-source software, and creating educational material. Prior to that, Sebastian worked at the University of Wisconsin-Madison as an assistant professor in the Department of Statistics, focusing on deep learning and machine learning research. He has a strong passion for education and is best known for his bestselling books on machine learning using open-source software.
ASIN: B0DGQXVK62
Publisher: Manning
Publication Date: October 29, 2024
Language: English
File Size: 20.7 MB
ISBN-13: 978-1638355731
Print Length: 628 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 9
There are a lot of flashy demos of AI agents doing amazing things under strict control right now. But talk to the engineers behind the curtain and you’ll quickly hear about the chaos: bloated prototypes that fall apart under real-world traffic, brittle tools held together with duct tape, and unpredictable agent behavior. One of the biggest engineering challenges of this decade is going from a cool proof-of-concept of an agentic system to a dependable, revenue-generating piece of infrastructure. The Agentic AI Bible was created to be the cure for this chaotic development cycle. It is a comprehensive engineering manual for software engineers, system architects, and technical product leaders who need to ship robust, goal-driven, autonomous AI systems that perform predictably at scale.
From Passive Prompting to Continuous Action Loops
Historically, LLM apps have been built on a simple, static paradigm: a user enters a prompt, the model consumes it, and it produces a static answer. Agentic AI turns this paradigm completely upside down. Real agents do not wait for the next keystroke; they act in a continuous loop of perception, reasoning, and action. They sense changes in their environment, analyze options against a core objective, call external APIs or software tools, look at the results of those actions, and dynamically adapt their next steps based on successes and failures. This change requires a whole new approach to software architecture. You’re building a living ecosystem that manages memory and state, orchestrates complex workflows, and evaluates its own logic in real time.
What You Will Learn
This book cuts through the academic theory and fluffy intro tutorials to get right to the architectural patterns, safety guardrails, and deployment strategies used by top-tier engineering teams. You will learn to:
Build Clean, Decoupled Systems: Design reasoning engines, short-term memory registers, and long-term knowledge bases as independent, maintainable components.
Master Advanced Behavioral Dynamics: Teach your agents to reason recursively, reflect deeply on their own output, and re-prioritize goals in real time so they can deal with unpredictable environments without getting stuck in infinite loops.
Lock Down the System with Heavy-Duty Guardrails: Master engineering strategies for safety, reliability, and automated testing to avoid catastrophic failures and manage token spend.
The book walks you through the entire operational life cycle of an enterprise-grade agentic system, divided into clear and actionable structural zones (lifecycle phase, then focus and deliverables):
Basic Design: Decomposing the agent loop; trade-offs between autonomy and deterministic software controls.
Cognitive Frameworks: Long-term memory with vector-based storage, episodic memory, and state management.
Tool Integration: Securing API environments, schema validation, and building resilient execution pipelines.
Safety &
This book shows how to tailor agent behaviors to the unique constraints of your domain, ensuring that your system complies with industry-specific data privacy laws, latency requirements, and compliance standards. If you are a Senior Software Engineer, System Architect, Data Scientist, or AI Product Lead frustrated with working on fragile toy projects, this book is your roadmap. It was written for people who are paid to build software that works. We assume you know the basics of machine learning and modern software engineering practices. You don’t need an advanced statistics background. You just need to want to build autonomous systems that are predictable, trustworthy, and very capable.
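The continuous perception, reasoning, and action loop described above can be sketched in a few lines. This is a toy under stated assumptions, not the book's architecture. `choose_action` is a hypothetical rule-based stand-in for the model's reasoning step, the two "tools" are invented arithmetic functions, and `max_steps` is the simplest possible guard against infinite loops and runaway spend.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    goal: int
    value: int = 0
    history: list = field(default_factory=list)  # short-term memory of past steps


# Hypothetical tools the agent may call.
TOOLS = {
    "add_one": lambda v: v + 1,
    "double": lambda v: v * 2,
}


def choose_action(state):
    """Stand-in for the reasoning step (a real agent would call a model here)."""
    if state.value == state.goal:
        return None  # goal reached, stop acting
    if state.value > 0 and state.value * 2 <= state.goal:
        return "double"
    return "add_one"


def run_agent(goal, max_steps=20):
    """Perceive -> reason -> act loop with a hard step budget as a guardrail."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        action = choose_action(state)                # reason over the current state
        if action is None:
            break
        state.value = TOOLS[action](state.value)     # act by calling a tool
        state.history.append((action, state.value))  # observe and remember the result
    return state


final = run_agent(goal=10)
print(final.value, final.history)
```

Keeping the decision function, the tool table, and the state object separate mirrors the decoupling the text argues for: each piece can be swapped or tested on its own, and the step budget bounds the loop no matter what the reasoning step returns.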
Thomas R. Caldwell is a technology strategist, AI researcher, and best-selling author specializing in applied artificial intelligence and next-generation intelligent systems. With a career focused on bridging cutting-edge research and real-world implementation, he has advised startups, enterprise teams, and product leaders on building scalable AI-driven solutions.
Caldwell is the author of three internationally recognized titles—The AI Engineering Bible, The Agentic AI Bible, and The LLM Bible—widely used by engineers, founders, and innovation teams seeking practical frameworks for designing, deploying, and managing modern AI architectures. His work focuses on making complex AI concepts accessible, actionable, and production-ready for organizations operating in rapidly evolving technological landscapes.
ASIN: B0FJ9QGK8S
Publication Date: July 21, 2025
Language: English
File Size: 6.0 MB
Print Length: 465 Pages
Features
Screen Reader: Supported
Enhanced Typesetting: Enabled
X-Ray: Enabled
Word Wise: Not Enabled
Page Flip: Enabled
Post: 10
Building with Large Language Models is a very different experience from traditional software engineering. In the old days, code was deterministic. You wrote a function, you gave it an input, and you got a predictable, repeatable result. LLMs break that rulebook. They introduce a world of probabilistic output where the system can hallucinate facts, change its tone unexpectedly, or ignore your instructions completely, no matter how hard you work at prompt engineering. The challenge of modern AI engineering isn’t to get an LLM to do something cool once. The trick is to make it do that cool thing ten thousand times safely, reliably, and cheaply. This book is a comprehensive catalog of solutions to this problem. We see LLM development as a disciplined engineering practice, not an unpredictable art form. This guide distills years of accumulated research and production-grade field experience into a structured library of 32 tried-and-true design patterns, designed to neutralize the innate limitations of generative AI.
Beyond the Shortcomings of the Black Box
All engineers who work with foundational models eventually hit the same brick walls: hallucinations, nondeterministic behavior, token limits, context drift, and rigid knowledge cutoffs. Trying to solve these issues by tweaking your prompt, changing a word here or there or adding more exclamation points, will make your application fragile. In this book, you will learn how to design around the limitations of LLMs by putting structural, algorithmic, and architectural guardrails in place.
Architectural Patterns You Must Know. The 32 design patterns are grouped into five major execution areas, providing a complete tactical toolkit for building dependable enterprise applications. Each pattern covers:
The Core Problem: A deep dive into a specific, real-world failure mode or systemic bottleneck.
The Proven Solution: A structural blueprint that demonstrates exactly how to orchestrate your data, prompts, and code to bypass the issue.
The Fully Coded Example: The pattern is implemented in clean, production-grade Python code from scratch so you can see it in action.
1. Content and Format Controls. Don’t hold your breath waiting for your model to output clean JSON. You will learn structural patterns that force the model to follow a given format, style, and schema. You can wrap your model calls in validation layers so that downstream software components don’t crash on an unexpected text format or an unclosed bracket.
2. Managing Creativity and Risk. Different business problems need different levels of imagination. A creative writing assistant needs high temperature and open-ended generation. A financial compliance tool needs absolute factual accuracy and zero variance. You will learn architectural patterns that allow you to manage this delicate balance, maximizing the cognitive power of the model while systematically mitigating operational, factual, and legal risks.
3. Advanced Agentic Workflows. Real automation means models that can run by themselves. We’ll discuss patterns for building autonomous agents that can plan multi-step processes, detect their own logical errors with self-reflection loops, execute external software commands, and dynamically adjust their course if an API call fails.
4. Multi-Agent Orchestration and Collaboration. Complex business operations are too big for one prompt or agent to handle. You will learn how to break down big workflows into a network of specialized, collaborating agents. When you build systems that allow models to communicate, talk through solutions, review each other’s work, and assign tasks based on expertise, you can solve problems that would easily fill up the context window of a single model.
5. Real-World Production Composability. One pattern generally doesn’t solve a big problem. In the last section of the book, you will learn how to compose, stack, and chain multiple design patterns to create very complex, production-grade systems.
The blueprint of the technical toolkit classifies the strategies according to their main architectural goal, giving you a bird’s-eye view of how these patterns fit into your day-to-day development workflow:
Pattern Category: Structural Guardrails. Primary Goal: fix formatting errors and hallucinations in text. Main Deliverable: schema-validated, deterministic outputs (JSON/YAML).
Pattern Category: Context and Recollection. Primary Goal: ingest huge data volumes, bypassing knowledge cutoffs. Main Deliverable: production-ready RAG and dynamic state management registers.
Pattern Category: Cognitive Loops. Primary Goal: support autonomous problem-solving and error recovery. Main Deliverable: recursive reasoning pipelines for self-correcting agents.
Pattern Category: Distributed Networks. Primary Goal: scale execution across complex multi-variable workflows. Main Deliverable: automated task transitions.
This book is for Software Engineers, AI Architects, and Technical Product Leaders who are moving from simple prompt hacking to serious systems engineering. We assume you know Python and have a basic understanding of generative AI. We won’t go into the details of what an LLM is. We focus exclusively on the architectural frameworks, code patterns, and systems design principles needed to take volatile models and turn them into solid, deployable software infrastructure for business.
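The validation layer mentioned under content and format controls above might look something like the following sketch. It uses only the standard library. The two-field schema, the `call_model` callable, and the retry count are all hypothetical choices for illustration; the book's own patterns may be structured quite differently.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "priority": int}


class ValidationError(ValueError):
    """Raised when model output does not match the expected schema."""


def parse_and_validate(raw_text):
    """Parse model output as JSON and check it against REQUIRED_FIELDS."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValidationError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValidationError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValidationError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValidationError(f"field {name!r} has the wrong type")
    return data


def call_with_validation(call_model, prompt, max_attempts=3):
    """Call the model, retrying until the output validates or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(prompt)  # call_model is any callable returning text
        try:
            return parse_and_validate(raw)
        except ValidationError as exc:
            last_error = exc
    raise ValidationError(f"no valid output after {max_attempts} attempts: {last_error}")


# Fake model for demonstration: first reply has an unclosed bracket, second is valid.
replies = iter(['{"title": "Fix login"', '{"title": "Fix login", "priority": 2}'])
result = call_with_validation(lambda prompt: next(replies), "Summarize the ticket as JSON.")
print(result)
```

Downstream code only ever sees a dictionary that passed the checks, or a single well-defined exception. The unclosed bracket in the first fake reply is caught and retried instead of crashing the caller.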
From the Preface:
If you’re an AI engineer building generative AI (GenAI) applications, you’ve likely experienced the frustrating gap between the ease of creating impressive prototypes and the complexity of deploying them reliably in production. While foundational models make it easy to build compelling demos, production systems demand solutions to fundamental challenges: hallucinations that compromise accuracy, inconsistent outputs that break downstream processes, knowledge gaps that limit enterprise applicability, and reliability issues that make systems unsuitable for critical applications.
This book bridges that gap by providing 32 battle-tested design patterns that address the recurring problems you’ll encounter when building production-grade GenAI applications. These patterns aren’t theoretical constructs—they codify proven solutions that are often derived from cutting-edge research and refined by practitioners who have successfully deployed GenAI systems at scale.
Valliappa (Lak) Lakshmanan works closely with management teams across a range of enterprises to help them employ data and AI-driven innovation to grow their businesses. Previously, he was the Director for Data Analytics and AI Solutions on Google Cloud and a Research Scientist at NOAA. He co-founded Google's Advanced Solutions Lab and is the author of several O'Reilly books and Coursera courses. He was elected a Fellow of the American Meteorological Society (the highest honor offered by the AMS) for pioneering machine learning algorithms in severe weather prediction.
Hannes Hapke is a Senior Machine Learning Engineer at Digits, and has co-authored multiple machine learning publications, including the books Building Machine Learning Pipelines and Machine Learning Production Systems from O'Reilly Media. He has also presented state-of-the-art ML work at conferences such as ODSC and O’Reilly’s TensorFlow World and is an active contributor to TensorFlow's TFX Addons project. Hannes is passionate about machine learning engineering and production machine learning use cases using the latest machine learning developments.
ASIN: B0FTTBPVVP
Publisher: O'Reilly Media
Publication Date: October 3, 2025
Edition: 1st Edition
Language: English
File Size: 18.5 MB
ISBN-13: 979-8341622630
Print Length: 921 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 11
The technology landscape has completely changed beneath our feet. Tools like ChatGPT and Stable Diffusion began as cool internet fads and have become the latest software development tools. Trained on vast oceans of public text and imagery, they have an astonishingly broad ability to help with a mind-boggling variety of tasks. The entry barrier has been removed, and now any developer with an API key can write software to automate complex tasks that were previously out of reach. But nearly every developer runs into the same problem as soon as they start to experiment: reliability.
Making an AI model do something cool once is very easy. Getting it to do the same thing consistently enough to bake into an automated system of production-grade quality is very hard. You write code. You send a prompt. You cross your fingers that the output doesn't randomly break your app logic.
That’s the thinking behind Generative AI in Practice. James Phoenix and Mike Taylor provide a deeply practical, engineering-first guide to mastering prompt engineering beyond the superficial “hype.” They don’t think of prompts as magic spells, but rather as a structured programming interface for getting deterministic, reliable behavior out of probabilistic machines.
The Shift in Mindset: The Interface Is Document Completion
To build trustworthy AI-based software, you need to know how these models actually see the world. The mistake most developers make is treating an LLM as a human assistant that understands intent, context, and nuance. A language model is actually an extremely complex statistical engine whose main job is document completion. Training an LLM means teaching it to predict the next likely word in a sequence. So in an application prompt, you are not really “asking a question.”
Your task is to produce the exact beginning of a document that the model will statistically complete with the answer or data structure your application needs. This book sheds light on this fundamental consequence of training. You will learn exactly how your high-level application logic is transformed into raw token strings, and how to structure those strings so that you can take advantage of the model’s architectural strengths while avoiding its inherent blind spots.
Unraveling the Chain of Interactions. In the real world, an AI application is rarely a single API call.
It's a long chain of small steps. In this book, you’ll walk through the anatomy of the interaction chain and how to manage data moving back and forth between your application and the model: User input → context injection → prompt optimization → LLM evaluation → output parsing → application action. You will learn to intercept and optimize every link in this chain. The authors go into detail on tokenization, temperature controls, and context windows and explain how each of these factors influences your system's accuracy, speed, and cost. Rather than guessing, you'll develop a systematic way to debug failed prompts, manage latency, and organize outputs to fit in with your existing codebases.
Applying Mastery to 4 Technical Areas
In this book, we study how these fundamental principles apply across four critical domains of software engineering to bridge the gap between theory and practice:
Natural Language Processing (NLP): Master complex text classification, semantic search, entity extraction, and sentiment analysis without having to train specialized models from scratch.
Structured Text Generation: Learn how to get LLMs to consistently generate reliable, valid JSON, YAML, or markdown, turning unstructured thoughts into clean data that software can parse.
Automated Code Execution: Use models to dynamically write, refactor, test, and debug code snippets to make your application an intelligent development engine.
Diffusion for Image Generation: Get beyond text models. Use these same principles of prompt engineering to control consistency, style, lighting, and composition in diffusion models such as Stable Diffusion for programmatic visual creation.
Who is this book for?
The target audience for this book is software engineers, data developers, and product architects who want to transform themselves from casual users of ChatGPT to professional AI systems engineers. We assume you know the basics of coding and how to use APIs.
We’re not going to waste your time telling you what artificial intelligence is, or singing the praises of the human race and what we might achieve in the future. We give you the real-world code examples, architectural insights, and concrete principles you need to make unpredictable generative models predictable components of your production stack.
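As a rough illustration of the interaction chain and structured-output ideas described above, the sketch below wires context injection, a model call, and output parsing together with a bounded retry. It is not code from the book: `build_prompt`, `parse_output`, `run_chain`, and the stand-in model `make_flaky_model` are hypothetical names, and no real LLM API is called.

```python
import json
from typing import Callable


def build_prompt(user_input: str, context: list[str]) -> str:
    """Context injection: put background facts ahead of the request."""
    facts = "\n".join(f"- {fact}" for fact in context)
    return (
        f"Known facts:\n{facts}\n\n"
        f"Request: {user_input}\n"
        'Reply with one JSON object: {"action": "...", "reason": "..."}'
    )


def parse_output(raw: str) -> dict:
    """Output parsing: accept only a JSON object with the expected keys."""
    data = json.loads(raw)  # raises json.JSONDecodeError on non-JSON text
    if not isinstance(data, dict) or not {"action", "reason"} <= set(data):
        raise ValueError("reply is missing required keys")
    return data


def run_chain(user_input: str, context: list[str],
              call_model: Callable[[str], str], max_attempts: int = 3) -> dict:
    """Send the prompt and retry a bounded number of times on unparsable replies."""
    prompt = build_prompt(user_input, context)
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_output(call_model(prompt))
        except ValueError as exc:  # JSONDecodeError is a subclass of ValueError
            last_error = exc
    raise RuntimeError(f"no valid reply after {max_attempts} attempts: {last_error}")


def make_flaky_model() -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: one chatty reply, then valid JSON."""
    replies = iter([
        "Sure! Here you go:",
        '{"action": "refund", "reason": "late delivery"}',
    ])
    return lambda prompt: next(replies)


result = run_chain(
    "Customer says the parcel arrived late.",
    ["Refunds are allowed within 30 days."],
    make_flaky_model(),
)
print(result["action"])
```

The application only ever sees a validated dictionary or an explicit error, so an unparsable reply cannot silently reach downstream logic.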
The rapid pace of innovation in generative AI promises to change how we live and work, but it’s getting increasingly difficult to keep up. The number of AI papers published on arXiv is growing exponentially, Stable Diffusion has been among the fastest growing open source projects in history, and AI art tool Midjourney’s Discord server has tens of millions of members, surpassing even the largest gaming communities. What most captured the public’s imagination was OpenAI’s release of ChatGPT, which reached 100 million users in two months, making it the fastest-growing consumer app in history. Learning to work with AI has quickly become one of the most in-demand skills.
We've been doing prompt engineering since the GPT-3 beta in 2020, and when GPT-4 arrived we found a lot of the tricks and hacks we used were no longer necessary. This motivated us to define a set of future-proof principles that are transferrable across models and modalities, that will still be useful with GPT-5, or whatever model we use in the future.
ASIN: B0D4FBPLX1
Publisher: O'Reilly Media
Publication Date: May 16, 2024
Edition: 1st Edition
Language: English
File Size: 23.1 MB
ISBN-13: 978-1098153397
Print Length: 691 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 12
Most people become angry when something in their world goes wrong. When Linh's powerful father left her mother for a much younger woman, Linh didn’t throw a fit, scream, or make a fuss. She took a deep breath and watched instead. She waited patiently until her new stepmother was pregnant, then used that leverage with clinical precision to negotiate her absolute independence and financial freedom.
At 19, Linh believed she could fool anyone alive. Then she met Quan, a rising corporate executive six years her senior, who moved through life with perfect self-control and for whom emotions were not capricious whims but strategic variables. Linh was intrigued, not frightened. She did what she always did: she worked out a strategy, tested his limits, and went for him. But what had begun as a game of psychological chess slowly slipped from her control. Over the course of a decade, their relationship devolved into a fevered entanglement that completely blurred the lines between true love, dark obsession, and mutual manipulation.
[Book Title] is a sweeping, psychological coming-of-age novel set against the vivid, shifting backdrop of contemporary Vietnam. It chronicles Linh’s rocky rise over the course of fourteen years from a bright but rebellious teenager to a very successful corporate businesswoman. In the end, Linh must face the emotional bankruptcy of reducing every human interaction, including love, to a cold, calculating transaction.
The Anatomy of a Psychological Game
This is not your average, clean-cut contemporary romance. Instead, it’s a messy, intoxicating one that threatens to consume Linh's identity. As a child from a broken home, she learns to cope by viewing life as a series of strategic negotiations, a habit that readers will watch become both her greatest professional asset and her ultimate personal ruin.
Early reader responses to Linh and Quan’s story point to a narrative that is as intellectually stimulating as it is emotionally addictive:
“The Machiavellian undertone combined with the dysfunctional relationships is what I am here for. The relationship between her and Quan is an interesting look at two people who can see through everyone else’s masks but are utterly trapped by each other.”
“A cleverly written story with strategy, romance, twists, and just the right amount of reality. It’s addictive. Linh is such a strong character. She has a messy, toxic, and utterly captivating relationship with Quan. I liked that she was still turned on by his brain, talent, and drive even when she knew exactly who he was and what he’d done. She just understood him. The beautifully made portrayal of Hanoi allowed me to relive familiar childhood memories while living my twenties as if I were right there in Vietnam. This novel is a love letter to the changing landscapes of Vietnam.”
The setting grows up as Linh does, from the atmospheric, nostalgic streets of Hanoi to the hyper-paced, glittering corporate arenas of modern Saigon. The novel is a powerful story of the psychological warfare between its two main characters, but also a richly textured portrait of modern Vietnam.
The atmospheric prose captures the sensory reality of the country, the humidity, the coffee culture, and the hidden alleys, while at the same time mapping the rise of a new generation of fierce, independent young entrepreneurs struggling to define themselves outside of traditional family expectations.
Publishing Roadmap
The novel is available now as a special Early Release eBook for readers who want to experience Linh’s world early. The worldwide release in all formats, with everything final, follows in December 2025.
Early Release eBook. Availability: now available. Features: main narrative layers optimized for digital reading.
Official Full Release. Availability: December 2025. Features: final all-encompassing edit, extra character perspectives, professional print layout.
Who is this Story For?
This book is your way out of passive protagonists and formulaic plots. It's for readers who like stories with sharp dialogue, complicated character studies, and characters smart enough to be their own worst enemies. If you enjoy well-crafted storytelling with a rich intellectual framework and a raw, beating heart, you'll be turning pages long into the night with Linh's fourteen-year journey.
I’m Chip Huyen, a writer and computer scientist. I grew up chasing grasshoppers in a small rice-farming village in Vietnam.
I work in the intersection of AI, data, and storytelling. Previously, I built machine learning tools at NVIDIA, Snorkel AI, Netflix, and founded an AI infrastructure startup (acquired).
I also taught Machine Learning Systems Design at Stanford.
My last book, Designing Machine Learning Systems, is an Amazon bestseller in AI and has been translated into over 10 languages (very proud!).
In my free time, I like writing stories. I'm also the author of 4 Vietnamese story books.
ASIN: B0F1KSDZZL
Publisher: Tép Studio
Publication Date: March 14, 2025
Language: English
File Size: 1.2 MB
Print Length: 382 Pages
Features
Screen Reader: Supported
Enhanced Typesetting: Enabled
Word Wise: Enabled
X-Ray: Not Enabled
Page Flip: Enabled
Post: 13
Large Language Models Are Changing the Rules of Software Development. Today, one model can draft legal contracts, write production-quality code, and diagnose rare mechanical failures. But as thousands of developers have already discovered, there’s a world of difference between getting an LLM to perform a neat trick on your laptop and getting it to work reliably in commercial software. The models are very capable but also very unpredictable. They know statistical patterns, not human intentions. Prompt engineering is the bridge between raw machine capability and reliable production software. In [Book Title], seasoned industry veterans John Berryman and Albert Ziegler cut through the hype to provide a definitive engineering-oriented guide to mastering this new interface.
Prompt engineering is an art and a hard science rolled into one, and it is fast becoming the defining skill of the modern software age.
Moving from Human Intent to Document Completion
When you prompt an LLM, you are not talking to a human. You're talking to a massive mathematical engine built to do one thing: take a string of text and guess the most statistically probable way to complete it. Once you understand this architectural reality, it will change how you approach development. You stop asking questions of the model. You start to design the beginning of a document in such a strategic way that the only logical, statistical way for the model to fill in the blanks is with the exact data structure, tone, or answer that your application needs. This book opens the lid on this token-prediction domain and gives you an intuitive understanding of model architecture so that you can take advantage of its inherent structural strengths while systematically avoiding its blind spots.
Deconstructed Engineering Framework
If you are building a commercial AI application, you need a full and repeatable strategy for prompt engineering.
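To make the document-completion framing above concrete, here is a minimal sketch of a prompt written as the start of a document rather than as a question. Everything in it is an assumption for illustration: the triage-log layout, the names `build_document`, `finish_document`, and `JSON_OPENING`, and the hard-coded `completion` string that stands in for a real model reply.

```python
import json

# The prompt ends with this opening fragment, so the most likely continuation
# is the rest of a JSON object in the same shape as the worked example.
JSON_OPENING = '{"category": "'


def build_document(ticket: str) -> str:
    """Return the start of a document that the model is asked to continue."""
    return (
        "Support ticket triage log\n"
        "Each entry lists a ticket and its classification as one JSON object.\n\n"
        'Ticket: "The app crashes when I upload a photo."\n'
        'Classification: {"category": "bug", "priority": "high"}\n\n'
        f"Ticket: {json.dumps(ticket)}\n"
        f"Classification: {JSON_OPENING}"
    )


def finish_document(completion: str) -> dict:
    """Rejoin the pre-written opening with the first line of the completion."""
    first_line = completion.split("\n", 1)[0]
    return json.loads(JSON_OPENING + first_line)


prompt = build_document("How do I change my billing address?")

# Hypothetical completion text; a real system would get this from a model.
completion = 'billing", "priority": "low"}\n\nTicket: "Next one"'
record = finish_document(completion)
print(record)
```

Because the application writes the first characters of the answer itself, the reply is constrained to continue a JSON object, and anything after the first line is discarded before parsing.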
This book takes you through the entire lifecycle of an enterprise prompt: Application Purpose → Context Collection and Triage → Prompt Synthesis → LLM Evaluation → Deterministic Output. It shows you how to treat context as a vital resource that needs to be managed, not an afterthought. You will learn the sophisticated science of context management. In the real world, you can’t just dump thousands of pages of raw data on an application and hope for the best. You will learn how to programmatically collect, triage, prioritize, and compress background information so that your prompts are lean, super-efficient, and perfectly targeted to the task at hand.
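The triage step described above can be sketched as a greedy selection under a token budget. This is one simple approach under stated assumptions, not the authors' method: the `Snippet` type, its `relevance` score, and the word-count `estimate_tokens` function are hypothetical stand-ins for a real ranker and tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    relevance: float  # hypothetical ranker score; higher means more useful


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def triage(snippets: list[Snippet], budget: int) -> list[Snippet]:
    """Keep the most relevant snippets that still fit in the token budget."""
    chosen = []
    remaining = budget
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if cost <= remaining:
            chosen.append(snippet)
            remaining -= cost
    return chosen


snippets = [
    Snippet("Refund policy: refunds are allowed within 30 days of purchase.", 0.9),
    Snippet("Company history: founded in 1999 as a mail order catalogue business.", 0.1),
    Snippet("Order 1234 shipped on Monday and arrived on Friday.", 0.8),
]

selected = triage(snippets, budget=20)
for snippet in selected:
    print(snippet.relevance, snippet.text)
```

With a budget of 20 estimated tokens, the two high-relevance snippets fit and the low-relevance company history is dropped.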
Advanced Techniques You Will Learn
The book goes far beyond a guide to writing instructions and explores the programmatic structures that underpin the most advanced AI systems:
Advanced Few-Shot Learning: Learn the art of designing templates. Know how many examples to give (too many adds latency, causes context drift, and eats your token budget), how to order them to avoid model bias, and how to format them so the model mirrors your required output structure perfectly.
Chain-of-Thought (CoT) Prompting: Learn how to get models to produce explicit step-by-step reasoning paths before they output a final answer, dramatically improving accuracy on logical, mathematical, and complex symbolic tasks.
Retrieval-Augmented Generation (RAG) Integration: Discover how to seamlessly integrate your prompt pipelines with external databases and vector search engines, injecting real-time, proprietary corporate data right into the model’s active attention window.
Technical Toolset Roadmap
To give you an immediate sense of how these strategies translate to production environments, the book organizes the core techniques by their specific operational objective:
Engineering Stage: Architectural Alignment. Core Focus: tokenization and temperature adjustment. Practical Benefit: removes non-deterministic output drift and random formatting breakage.
Engineering Stage: Context Triage. Core Focus: semantic ranking and input compression. Practical Benefit: cuts your cloud infrastructure cost and improves the accuracy of your prompts.
Engineering Stage: Cognitive Structuring. Core Focus: Chain-of-Thought and structural self-reflection. Practical Benefit: overcomes logical bottlenecks in complex multi-variable problems.
Who Should Read This Book?
This book is written specifically for Software Engineers, System Architects, and Data Professionals who want to go from casual prompt hacking to professional AI systems engineering. We assume that you are familiar with writing clean code and have a good understanding of the fundamentals of software architecture. We’re not going to waste your time with introductory pages about what artificial intelligence is, or philosophizing about the future of work. Instead, we provide you with the exact frameworks, linguistic principles, and production-ready examples that you need to transform unstable, erratic language models into rock-solid, dependable components of your enterprise software stack.
This book is written for application engineers. If you build software products that customers use, then this book is for you. If you build internal applications or data-processing workflows, then this book is also for you. The reason that we are being so inclusive is because we believe that the usage of LLMs will soon become ubiquitous. Even if your day-to-day work doesn’t involve prompt engineering or LLM workflow design, your codebase will be filled with usages of LLMs, and you’ll need to understand how to interact with them just to get your job done.
However, a subset of application engineers will be the dedicated LLM wranglers—these are the prompt engineers. It’s their job to convert problems into a packet of information that the LLM can understand—which we call the prompt—and then convert the LLM completions back into results that bring value to those who use the application. If this is your current role—or if you want this to be your role—then this book is especially for you.
John Berryman is the founder and principal consultant of Arcturus Labs, where he specializes in LLM application development. His expertise helps businesses harness the power of advanced AI technologies. As an early engineer on GitHub Copilot, John contributed to the development of its completions and chat functionalities, working at the forefront of AI-assisted coding tools.
ASIN: B0DM3VLNSK
Publisher: O'Reilly Media
Publication Date: November 4, 2024
Edition: 1st Edition
Language: English
File Size: 24.2 MB
ISBN-13: 978-1098156114
Print Length: 467 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 14
There is a dangerous myth in software engineering that you can just take a well-known database and put it in your cloud infrastructure, and it will scale as your user base grows. But anyone who has run a system under a heavy production load knows that applications die in the data. When your system starts to buckle, it rarely breaks in the application layer. It fails because of data bottlenecks. Suddenly, you have to deal with systemic problems such as network partitions, race conditions, replica lag, write amplification, and data corruption.
The sector is also rife with buzzwords, which makes matters worse. You are told you need document databases, graph databases, NoSQL key-value stores, relational databases, data warehouses, and data lakes. How do you cut through the marketing noise when every vendor says their tool is the fastest, most dependable, and easiest to scale? How do you make engineering decisions that you won’t regret three years from now? Designing Data-Intensive Applications, Second Edition, is the definitive guide for navigating this chaos. Martin Kleppmann and Chris Riccomini have built upon the storied foundation of the first edition and revised the text to reflect the current reality of the cloud-native ecosystem. This book does not endorse any particular vendor or brand of cloud. Instead, it teaches you the fundamental principles of distributed systems so you can build software that is truly scalable, highly reliable, and simple to maintain.
Peering Under the Hood: The Reality of Trade-offs
The book’s core message is simple: there are no magic solutions, only trade-offs. Very fast write performance may come at the expense of read latency or weaker consistency guarantees. Every database on the market has made a conscious decision to give up some capability in order to do one thing well. Because of the laws of physics, if a cloud service promises absolute consistency of data across multiple continents, it must pay for that in network latency. This book trains you to recognize these trade-offs quickly. You’ll be able to analyze systems by their underlying architectural DNA, not by superficial feature lists. You will learn the mechanics of Log-Structured Merge-trees (LSM-trees) and B-trees, and compare them to understand how data is actually laid out on physical disks or in memory.
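For a feel of the LSM-tree side of that comparison, the toy store below buffers writes in an in-memory memtable, flushes them as sorted immutable segments, and searches the newest data first on reads. It is a simplified teaching sketch, not the book's code: the class name `TinyLSM` and its methods are hypothetical, everything lives in memory, and deletes (tombstones), write-ahead logging, and on-disk formats are left out.

```python
import bisect


class TinyLSM:
    """Toy log-structured store: a memtable plus sorted, immutable segments."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}
        self.segments = []  # oldest first; each segment is (sorted_keys, values)
        self.memtable_limit = memtable_limit

    def put(self, key: str, value) -> None:
        """Writes only touch the memtable, which is why LSM writes are cheap."""
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Freeze the memtable into a new sorted segment."""
        if not self.memtable:
            return
        items = sorted(self.memtable.items())
        self.segments.append(([k for k, _ in items], [v for _, v in items]))
        self.memtable = {}

    def get(self, key: str):
        """Reads check the memtable, then segments from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.segments):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None

    def compact(self) -> None:
        """Merge all segments into one, keeping the newest value per key."""
        merged = {}
        for keys, values in self.segments:  # oldest first, so newer values win
            merged.update(zip(keys, values))
        self.segments = []
        if merged:
            items = sorted(merged.items())
            self.segments.append(([k for k, _ in items], [v for _, v in items]))


db = TinyLSM(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(key, value)

print(db.get("a"), db.get("b"), len(db.segments))
db.compact()
print(db.get("a"), len(db.segments))
```

The trade-off shows up directly: writes never modify existing segments, but a read may have to look through several segments until compaction merges them. A B-tree makes the opposite choice by updating pages in place so that each key lives in exactly one location.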
By understanding these basic storage engines, you can predict far more accurately how a database will behave when your application goes from a trickle of traffic to millions of concurrent requests.
Unraveling the Labyrinth of Distributed Systems
Building modern software means dealing with distributed systems. This book is a masterclass in distributed systems theory, translated into the realities of practical engineering. Once your data is on more than one machine, you are in a hostile environment. Networks drop packets, clocks go out of sync, and servers crash without warning.
Partitioning and Sharding: How do you split up huge datasets across many nodes without creating hot spots that bottleneck performance?
Transactions and Consensus: You will explore the deep mechanics of ACID properties and isolation levels, and how distributed systems agree on a single truth using consensus algorithms.
Replication: Application Writes → Network Lag → Partition → Split-Brain Chaos.
Knowing about these failure modes can help you stop treating cloud infrastructure as a magical black box.
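The partitioning question above can be illustrated with a small hash-partitioning sketch. It assumes a fixed number of partitions and uses a stable hash from the standard library. The function name `partition_for` and the `user-N` keys are hypothetical, and this is only one of several partitioning schemes.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a hash that is stable across processes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Sequential keys would all land in one range under key-range partitioning;
# hashing spreads them roughly evenly instead.
keys = [f"user-{i}" for i in range(10_000)]
counts = Counter(partition_for(key, 4) for key in keys)

for partition in sorted(counts):
    print(partition, counts[partition])
```

Hashing removes hot spots caused by adjacent keys, at the cost of efficient range scans. Note also that plain modulo assignment moves most keys when the partition count changes, which is why production systems tend to use a fixed, larger number of partitions or a consistent-hashing variant.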
Architecture of the Global Cloud Stack
The second edition places great emphasis on how the major cloud providers architect their managed services for global scale:
Architectural Challenge: Fault Tolerance. Underlying Mechanism: automated consensus and health checks. Real-World Impact: the system remains operational even when an entire data center goes offline.
Architectural Challenge: Global Scale. Underlying Mechanism: active-active, multi-region partitioning. Real-World Impact: managed data drift for international users with low-latency access.
Architectural Challenge: Data Analytics. Underlying Mechanism: columnar storage engines. Real-World Impact: scan and roll up billions of rows.
This book is aimed at Software Engineers, System Architects, DevOps Leads, and Technical Managers who are responsible for delivering and managing production systems in the real world. We assume you already know how to write clean code and build basic web applications. This book cuts the introductory fluff. It won’t show you how to start a Docker container or teach basic SQL syntax. Instead, it gives you the deep structural knowledge you need to go from being a developer who just hooks parts together to an architect who can confidently design robust, industrial-grade data systems that can handle huge scale.
What’s New in the Second Edition?
This second edition has the same goals and scope as the first edition of Designing Data-Intensive Applications, which was published in 2017. However, we have thoroughly revised the entire book to reflect technological changes that have happened in the last decade and to improve the clarity of the explanations.
The biggest technical changes that have affected this book since the first edition are the explosion of interest in AI and the rise of cloud native data systems architectures. While this book is not about AI per se, we have added coverage of data systems that support AI and machine learning, including vector indexes (used for semantic search), DataFrames (used for training datasets), and batch processing systems for preparing large amounts of training data. Cloud native ideas, such as building data systems on top of object stores instead of local disks, have been woven in throughout the book.
Chris Riccomini is a software engineer, startup investor, and author with 15+ years of experience at PayPal, LinkedIn, and WePay. He runs Materialized View Capital, where he invests in infrastructure startups. He is also the co-creator of Apache Samza and SlateDB, and co-author of The Missing README: A Guide for the New Software Engineer.
Martin Kleppmann is a researcher in distributed systems at the University of Cambridge. Previously he was a software engineer and entrepreneur at Internet companies including LinkedIn and Rapportive, where he worked on large-scale data infrastructure. In the process he learned a few things the hard way, and he hopes this book will save you from repeating the same mistakes.
Martin is a regular conference speaker, blogger, and open source contributor. He believes that profound technical ideas should be accessible to everyone, and that deeper understanding will help us develop better software.
ASIN: B0GNTX59CY
Publisher: O'Reilly Media
Publication Date: February 17, 2026
Edition: 2nd Edition
Language: English
File Size: 8.2 MB
ISBN-13: 978-1098119027
Print Length: 1194 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 15
Large Language Models have taken software development past simple text generation. If you’ve ever tried to coordinate multiple LLM calls, you know how quickly linear code breaks down. Monolithic scripts just don’t cut it for tasks requiring real-world reasoning, the ability to change course when an API fails, or the ability to break down a large goal into smaller, logical milestones. If we want to develop truly autonomous software, we should turn to architectural design patterns that resemble human organizational structures. Architecting Autonomous AI Agents is your engineering blueprint for just such a transition. This is a practical guide written by industry-leading AI architects who are actively shaping global enterprise AI standards.
It will take you out of the brittle, one-prompt hack world and show you how to use the Coordinator-Worker-Delegator approach, a highly scalable, predictable framework for multi-agent systems engineering.
Core Blueprint: Coordinator, Worker, and Delegator
When humans tackle a big business project, there’s rarely one person doing everything alone. We split into specialized roles to handle complexity. This book applies that philosophy to software architecture. Instead of one LLM context window doing everything at once (planning, tool execution, error checking, and final formatting), you will see how to break these tasks into a structured hierarchy: the COORDINATOR (strategic planning and scheduling) sits above the WORKER (uses a specific tool) and the DELEGATOR (spawns an array of sub-agents).
The Coordinator: The strategic heart of your system. This component is responsible for managing the high-level objective, breaking down the user’s goal into chronological phases, maintaining the overall state, and evaluating whether the final output meets quality benchmarks.
The Worker: An agent of extreme specialization, limited to doing one thing and doing it well. The Worker has access to certain tools and is subject to strict execution controls when querying a vector database, parsing a specific financial document, or writing Python code.
The Delegator: The scalable routing engine. If a task is too complex to be solved in a single execution loop, the Delegator dynamically spawns sub-agents, hands isolated subtasks off to them, and collects their outputs to feed back to the Coordinator.
Most traditional chatbots never get beyond scripted automation: they are passive and fully reactive. Truly agentic systems run on a continuous, self-correcting feedback loop. By decoupling the thinking layer (the Coordinator) from the doing layer (the Worker), we have an AI infrastructure that can deal with unpredictable, high-stakes environments without getting stuck in infinite loops or burning through your cloud budget.
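The division of labor described above might look roughly like the following skeleton. It is a sketch of the idea, not the book's implementation: the class layout, the fixed two-step `plan`, the `accept` check, and the lambda "tools" are all hypothetical, and a real system would call an LLM where this code uses plain functions.

```python
from typing import Callable

Tool = Callable[[str], str]


class Worker:
    """Doing layer: runs exactly one tool on one subtask."""

    def __init__(self, tool: Tool):
        self.tool = tool

    def run(self, subtask: str) -> str:
        return self.tool(subtask)


class Delegator:
    """Routing layer: spawns a Worker per subtask and collects the outputs."""

    def __init__(self, tools: dict[str, Tool]):
        self.tools = tools

    def dispatch(self, subtasks: list[tuple[str, str]]) -> list[str]:
        outputs = []
        for kind, payload in subtasks:
            if kind not in self.tools:
                raise KeyError(f"no tool registered for {kind!r}")
            outputs.append(Worker(self.tools[kind]).run(payload))
        return outputs


class Coordinator:
    """Thinking layer: plans, checks quality, and bounds the number of rounds."""

    def __init__(self, delegator: Delegator, max_rounds: int = 2):
        self.delegator = delegator
        self.max_rounds = max_rounds

    def plan(self, goal: str) -> list[tuple[str, str]]:
        # Hypothetical fixed plan; a real Coordinator would derive it from the goal.
        return [("search", goal), ("summarize", goal)]

    def accept(self, outputs: list[str]) -> bool:
        # Placeholder quality check: every subtask produced some output.
        return all(outputs)

    def run(self, goal: str) -> str:
        for _ in range(self.max_rounds):  # hard cap prevents infinite loops
            outputs = self.delegator.dispatch(self.plan(goal))
            if self.accept(outputs):
                return " | ".join(outputs)
        raise RuntimeError("goal not met within the round limit")


tools = {
    "search": lambda query: f"found 3 documents about {query}",
    "summarize": lambda topic: f"summary of {topic}",
}
report = Coordinator(Delegator(tools)).run("vector databases")
print(report)
```

The `max_rounds` cap is the simplest form of the guard against runaway loops and runaway cost mentioned in the text: the Coordinator gives up explicitly instead of retrying forever.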
A Sneak Peek Inside: Contents. The chapters are arranged in a logical order to take you from the core generative ideas to production-ready enterprise deployments.
Enabling Tool Use and Planning: How agents interact safely with external APIs.
The Coordinator, Worker, and Delegator Approach: When software is given the autonomy to run code and interact with external data sources, security must not be compromised. The authors spend considerable time discussing the engineering side of AI ethics and security.
Technical Prerequisites
This book is written for professionals paid to ship working code.
Audience: AI Developers. Required Stack Knowledge: solid Python 3.x experience. Expected Result: moving from prompt-hacking to resilient agent platforms.
Audience: ML Engineers. Required Stack Knowledge: familiarity with LLM APIs and embeddings. Expected Result: multi-agent system design and state control.
Audience: Software Architects. Required Stack Knowledge: knowledge of cloud infrastructure and microservices. Expected Result: create decoupled, scalable AI systems to work with legacy
Anjanava Biswas is an award-winning senior AI specialist solutions architect with over 17 years of industry experience. Specializing in machine learning, Generative AI, natural language processing, deep learning, data analytics, and cloud architecture, he partners with large enterprises to build and scale advanced AI systems in the cloud. Anjanava is widely recognized for his contributions to the field of applied AI. He has published research in multiple scientific journals and actively contributes to open-source AI/ML projects. His professional accolades include Fellowships with BCS (UK), the IET (UK), and IETE (India), and he is a senior IEEE member. A frequent public speaker, Anjanava has held key positions at industry giants like IBM and Oracle Corp. Originally from India, he now resides in San Diego, CA, with his wife and son, where he continues to innovate and inspire within the tech community.
ASIN: B0F22KNJ7C
Publisher: Packt Publishing
Publication Date: April 21, 2025
Edition: 1st Edition
Language: English
File Size: 7.6 MB
ISBN-13: 978-1801079273
Print Length: 507 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 16
When engineers first come to the field of data science, they typically think that the hardest part will be the math: fancy calculus, optimization algorithms, and advanced statistical theories. But once you begin building real systems, you learn the truth pretty fast. The real nightmare of machine learning isn’t math. It's the pipework. It’s addressing data pipeline drift, fragile feature engineering, unreproducible training pipelines, and models that work beautifully on a data scientist’s laptop but totally fall apart the moment they hit production traffic. Software engineering survived its early days as a discipline by creating standard design patterns: repeatable, reliable solutions to recurring structural problems (think of the classic Model-View-Controller pattern). Machine learning demands precisely the same structural discipline.
Without a standard language for architectural patterns, each data science team has to reinvent the wheel, leading to messy, unmaintainable codebases that don’t scale. Machine Learning Design Patterns was written to provide that essential missing discipline. This comprehensive guide, written by three veteran Google engineers, distills the tribal wisdom of hundreds of AI experts into a simple, understandable list of 30 different design patterns.
Diving into the ML Pattern Framework
This book removes the theoretical hand-waving and corporate hype to provide you with real-world blueprints to overcome the most stubborn, recurring bottlenecks throughout the whole machine learning lifecycle. All of the patterns are presented in the standard engineering blueprint style, from the messy real-world problem to the optimized production state.
Each entry is divided into three distinct phases:
The Specific Anti-Pattern or Challenge: An in-depth analysis of a common systemic bottleneck, e.g., data leakage, hidden feedback loops, or representation bias.
The Array of Structural Solutions: A step-by-step breakdown of proven architectural ways to solve the problem, with code and infrastructure layouts.
The Trade-Off Matrix: A simple estimate of the cost of each solution in terms of computational overhead, latency, and training time.
Data Representation → Model Customization → Training Loop Design → Scale and Equity in Production
1. Problem and Data Encodings. Everything downstream is determined by how you feed data into a neural network. You will learn patterns for complex data representation that go far beyond basic normalization. You’ll learn the specifics of how to build stable embeddings, mathematically sound feature crosses, and a format for representing complex multi-modal data that your models can ingest effectively without overfitting.
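As a small illustration of the feature crosses mentioned above, the sketch below combines two categorical features into a single crossed feature and hashes it into a fixed number of buckets so the input width stays bounded. It is a plain-Python approximation under stated assumptions, not code from the book: the function names, the `_x_` separator, the bucket count of 16, and the example feature values are all hypothetical.

```python
import hashlib


def hashed_cross(a: str, b: str, num_buckets: int) -> int:
    """Cross two categorical values and hash the pair into a bucket index."""
    joined = f"{a}_x_{b}".encode("utf-8")
    digest = hashlib.sha256(joined).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def one_hot(index: int, size: int) -> list[int]:
    """Encode a bucket index as a one-hot vector a model can ingest."""
    vector = [0] * size
    vector[index] = 1
    return vector


NUM_BUCKETS = 16
bucket = hashed_cross("monday", "rush_hour", NUM_BUCKETS)
vector = one_hot(bucket, NUM_BUCKETS)
print(bucket, vector)
```

The cross lets a model learn an effect specific to the combination (a Monday rush hour) rather than to each feature separately. The trade-off of hashing is that unrelated combinations can collide in the same bucket, so the bucket count has to be chosen against the number of distinct pairs.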
Design of Resilient Training Loop: A production-grade training loop is much different than a simple sandbox. fit command.
The authors demonstrate how to build robust, industrial-strength training pipelines. You'll find automated checkpointing systems, scalable GPU cluster distribution strategies, and algorithmic hyperparameter tuning routines that optimize your architecture without blowing your cloud budget.
3. Reproducibility and Operationalization at Scale. A model is a living asset, not a static artifact. In this book, you will learn how to build deployable architectures that can be reused again and again. You'll learn how to build automated retraining pipelines that ingest new data streams, detect feature drift in real time, and hot-swap production models without user disruption or downtime.
4. Fairness, Explainability, and Interpretability. A very accurate model doesn’t help you if it’s a black box your stakeholders don’t trust, or worse, a source of systemic bias against a subset of your users. As you master patterns for model explainability, you’ll learn to extract clear feature importances and explain predictions to non-technical leaders.
Data Engineering | Embeddings, feature crosses, and windowing | Model inputs that are clean and mathematically optimized
System Operations | Distributed training, checkpoints, and tuning | Repeatable, high-efficiency training loops
Production DevOps | Model hot-swap, drift detection, CI/CD | Deployments that scale and keep improving
Ethics & Governance | Explainable AI (XAI) and fairness audits | Transparent, unbiased, and compliant model output
We assume you are a solid Python programmer and have a working familiarity with deep learning frameworks such as TensorFlow or PyTorch. This book is for Data Scientists, Machine Learning Engineers, and System Architects who have graduated from basic tutorials and are responsible for shipping real, reliable enterprise infrastructure. You won’t spend time learning about data frame manipulation or what a neural network is. Instead, the book provides the detailed structural design patterns you need to move from a developer who tinkers with parameters to an architect who can confidently design robust, industrial-quality AI systems that handle massive scale.
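The feature drift detection mentioned above can be sketched very roughly as follows. This is a crude heuristic written for this post, not the book's method: the function names, the default threshold of 3.0, and the sample numbers are all assumptions. It measures how far the mean of recent live values has moved from the training mean, in units of the training standard deviation.

```python
import statistics


def drift_score(training: list[float], live: list[float]) -> float:
    """Distance of the live mean from the training mean,
    measured in training standard deviations."""
    mu = statistics.fmean(training)
    sigma = statistics.pstdev(training)
    live_mu = statistics.fmean(live)
    if sigma == 0:
        # Constant training feature: any change at all counts as drift.
        return 0.0 if live_mu == mu else float("inf")
    return abs(live_mu - mu) / sigma


def has_drifted(training: list[float], live: list[float],
                threshold: float = 3.0) -> bool:
    """Flag drift when the score exceeds a (hypothetical) threshold."""
    return drift_score(training, live) > threshold


training_values = [10.0, 11.0, 9.0, 10.5, 9.5]
print(has_drifted(training_values, [10.2, 9.8, 10.1]))   # similar data
print(has_drifted(training_values, [15.0, 16.0, 14.0]))  # shifted data
```

A production system would more likely compare whole distributions over sliding windows, but the shape is the same: keep a summary of the training data, and compare incoming data against it continuously.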
Sara Robinson is a Developer Advocate on Google's Cloud Platform team, focusing on machine learning. She inspires developers and data scientists to integrate ML into their applications through demos, online content, and events. Sara has a bachelor’s degree from Brandeis University. Before Google, she was a Developer Advocate on the Firebase team.
Michael Munn is an ML Solutions Engineer at Google where he works with customers of Google Cloud on helping them design, implement, and deploy machine learning models. He also teaches an ML Immersion Program at the Advanced Solutions Lab. Michael has a PhD in mathematics from the City University of New York. Before joining Google, he worked as a research professor.
Valliappa (Lak) Lakshmanan is Global Head for Data Analytics and AI Solutions on Google Cloud. His team builds software solutions for business problems using Google Cloud's data analytics and machine learning products. He founded Google's Advanced Solutions Lab ML Immersion program. Before Google, Lak was a Director of Data Science at Climate Corporation and a Research Scientist at NOAA.
ASIN: B08L8GRRBM
Publisher: O'Reilly Media
Publication Date: October 15, 2020
Edition: 1st Edition
Language: English
File Size: 21.9 MB
ISBN-13: 978-1098115746
Print Length: 410 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 17
To an outsider, the field of Artificial Intelligence can seem incredibly intimidating. Most machine learning textbooks plunge the reader directly into a sea of heavy mathematical notation, complicated statistical proofs, and theoretical abstractions. It’s easy to come away with the impression that you shouldn’t attempt to build intelligent systems unless you have a PhD in advanced mathematics. But the worst-kept secret in the industry is that you don’t need to know the underlying calculus to write software that learns from data.
The tools of deep learning have become very accessible in recent years. Today, the entry barrier has vanished completely. If you can write a basic Python script and understand logic, you can build systems for computer vision, language processing, and complex predictive modeling using production-ready frameworks. You don’t need more theory; you need an intuitive, hands-on understanding of how these pieces fit together in the real world. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition by Aurélien Géron is widely considered the canonical way to make this transition. This popular book skips the boring academic lectures and gives you concrete code examples plus only the theory you need to get that intuitive, muscle-memory understanding of modern machine learning.
The Journey: From Linear Regression to Modern Transformers. A common mistake of people learning AI is to jump into the most complex neural networks without first learning the basics. This book is aware of that trap and avoids it. The format is a progressive roadmap that reflects the real-world progression of a data scientist’s career. You learn the fundamentals of classical machine learning with Scikit-Learn and go through a complete end-to-end project from start to finish. From there you will move on to more advanced classical architectures, mastering:
Support Vector Machines (SVMs): Finding optimal decision boundaries in high-dimensional space.
Decision Trees & Random Forests: Constructing intuitive, explainable logic models and aggregating them into powerful ensembles that resist overfitting.
Unsupervised Learning: Uncovering hidden structures in data without pre-existing labels using clustering, dimensionality reduction, and advanced anomaly detection.
[Data Cleaning] > [Feature Selection] > [Classical Models] > [Deep Neural Networks] > [Generative AI]
The book will then take you to the world of Deep Learning with TensorFlow and Keras. It delves into particular neural networks in detail, such as Convolutional Networks (CNNs) for state-of-the-art computer vision, Recurrent Networks (RNNs) for sequential data, and the revolutionary Transformer architectures that drive modern language processing. You will be able to open up the black box of neural networks and see how to build, compile, and train deep architectures. You will also get to experience the cutting edge of generative AI by writing your own Generative Adversarial Networks (GANs), autoencoders, and diffusion models.
A blueprint built on engineering first. This book is unlike almost all other books on the shelves in that it is totally focused on being production-ready.
Learning Phase | Tooling Focus | Key Takeaway / Capability
Data & Foundations | Scikit-Learn | Understand how to clean data, build pipelines, and avoid data leakage.
Deep Learning | TensorFlow / Keras | Build multi-layer neural nets optimized for speed and accuracy.
Advanced | Custom Layers and Tokenizers | Learn natural language processing and computer vision frameworks.
Scale | TF Data API | Architect and deploy efficient data loading pipelines that don't choke your GPUs.
We expect you to already know how to code in Python, and to have a good understanding of basic programming concepts like loops, functions, and object-oriented design. This book is for software engineers, developers, and programmers who want to move beyond treating AI as magic and start treating it as a concrete part of their software toolbox. You don’t need to be a senior data scientist. This book is your definitive guide to building real, working systems that can predict trends, classify images, and generate human-like text on production infrastructure.
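The data leakage point in the summary above is easy to show in a few lines. The sketch below uses plain Python instead of Scikit-Learn so it runs anywhere; the helper names fit_scaler and transform and the sample numbers are assumptions made for this illustration. The key move is that scaling statistics are learned from the training split only and then reused, unchanged, on the held-out data.

```python
import statistics


def fit_scaler(train: list[float]) -> tuple[float, float]:
    """Learn mean and standard deviation from the training split only."""
    mean = statistics.fmean(train)
    std = statistics.pstdev(train) or 1.0  # guard against zero spread
    return mean, std


def transform(values: list[float], mean: float, std: float) -> list[float]:
    """Standardize values with statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


data = [1.0, 2.0, 3.0, 4.0, 100.0]
train, test = data[:4], data[4:]

# Correct: the test value never influences the fitted statistics.
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

# Leaky: fitting on all the data lets the test value shift the mean.
leaky_mean, leaky_std = fit_scaler(data)
print(mean, leaky_mean)
```

Scikit-Learn's Pipeline is built around this same fit-on-train, transform-everything pattern, which is one reason pipelines help prevent leakage.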
Machine Learning in Your Projects
So, naturally you are excited about Machine Learning and would love to join the party! Perhaps you'd like to give your homemade robot a brain of its own? Make it recognize faces? Or learn to walk around? Or maybe your company has tons of data (user logs, financial data, production data, machine sensor data, hotline stats, HR reports, etc.), and more than likely you could unearth some hidden gems if you just knew where to look. With Machine Learning, you can accomplish the following & much more:
Segment customers and find the best marketing strategy for each group.
Recommend products for each client based on what similar clients bought.
Detect which transactions are likely to be fraudulent.
Forecast next year’s revenue.
Prerequisites
This book assumes that you have some Python programming experience and that you are familiar with Python’s main scientific libraries, in particular NumPy, Pandas, and Matplotlib.
Also, if you care about what’s under the hood, you should have a reasonable understanding of college-level math as well (calculus, linear algebra, probabilities, and statistics).
Aurélien Géron is a Machine Learning consultant. A former Googler, he led YouTube's video classification team from 2013 to 2016. He was also a founder and CTO of Wifirst from 2002 to 2012, a leading Wireless ISP in France, and a founder and CTO of Polyconseil in 2001, a telecom consulting firm.
Before this he worked as an engineer in a variety of domains: finance (JP Morgan and Société Générale), defense (Canada’s DOD), and healthcare (blood transfusion). He published a few technical books (on C++, WiFi, and Internet architectures), and was a Computer Science lecturer in a French engineering school.
A few fun facts: he taught his 3 children to count in binary with their fingers (up to 1023), he studied microbiology and evolutionary genetics before going into software engineering, and his parachute didn’t open on the 2nd jump.
ASIN: B0BHCFNY9Q
Publisher: O'Reilly Media
Publication Date: October 4, 2022
Edition: 3rd Edition
Language: English
File Size: 26.2 MB
ISBN-13: 978-1098122461
Print Length: 1449 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 18
There’s no doubt that the software development industry is going through a major shift. The demand for engineers who actually know how to implement artificial intelligence is skyrocketing, but traditional programming jobs are getting more crowded year after year. It’s easy to get discouraged and think that unless you have a PhD in mathematics, you have no business touching machine learning. But if you think about it, that’s not how you learned to build web or mobile apps.
You didn’t start by reading theory. You started by writing a “Hello, World” script, watching it run, and building your confidence from there. Yet as a programmer trying to make the pivot into AI, you’ve probably run into a frustrating wall: most introductory books immediately throw you into a sea of advanced calculus, linear algebra, and intimidating statistical theory before you even get to write a single line of code. AI and Machine Learning for Coders follows the “Hello, World” philosophy instead. Written by Google’s lead AI advocate Laurence Moroney, this practical guide takes a hands-on, code-first approach aimed at working programmers and cuts through the academic gatekeeping. It gets you away from the scary math lectures and into the code, providing you with the hands-on skills that modern employers really want to see.
Changing Your Mental Model: Data versus Logic. As a programmer, your entire career has been based on a certain paradigm: you code rules, you feed in data, and your program spits out an answer. This is where machine learning turns the equation on its head.
Traditional Programming: [ Data ] + [ Rules ] -> [ Answers ]
Machine Learning: [ Data ] + [ Answers ] -> [ Rules ]
In this book, you will soon learn to take on this new way of thinking. You will discover how to provide data and expected answers for a model to solve a problem, and to let TensorFlow work out the rules for you rather than writing endless, brittle if/else statements. You’ll see how this shift in perspective allows us to solve complex, unpredictable problems that we couldn’t automate before.
Practical Mastery of Foundational AI Areas. The book is built around very practical, real-world development scenarios that reflect the exact challenges you will face on the job, not theoretical abstractions.
Computer Vision: Go beyond basic image editing.
You will learn to build neural networks that can recognize patterns, identify specific features in an image, and classify with high accuracy what the camera is seeing.
Natural Language Processing (NLP): Learn how to turn messy, unstructured human language into clean data that a machine can understand. You'll write code to tokenize text, turn sentences into numerical sequences, and train models to understand sentiment and context.
Sequence Modeling: You’ll learn how to deal with time-series data and sequential patterns, so that you can create models to predict future trends based on historical timelines.
Transitioning from Training to Deployment. If a model only runs in a development sandbox, it’s of no use to a business. A true AI specialist, Moroney argues, needs to know how to take models out of the lab and into the hands of real users. The book's emphasis on deploying in production makes it an indispensable tool for your career pivot.
Deployment Target | Core Technologies | Engineering Value
Mobile Runtimes | TensorFlow Lite | Integrate lightweight models directly into native iOS and Android apps.
Web Platforms | TensorFlow.js | Run inference right in the browser, using the hardware of the client.
Enterprise Cloud | TensorFlow Serving | Build scalable cloud APIs for high-volume model requests with low latency.
You will walk through the engineering processes you need to compress your models, optimize them to run in constrained hardware environments, and serve them over the web or cloud infrastructure.
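The tokenization step described in the NLP item above can be sketched in a few lines. This is a plain-Python stand-in written for this post, not TensorFlow's own tokenizer: the function names, the choice to reserve id 0 for padding, and the decision to skip unknown words are all assumptions made to keep the example small.

```python
def build_word_index(sentences: list[str]) -> dict[str, int]:
    """Give each distinct lowercase word an integer id, starting at 1.

    Id 0 is reserved for padding.
    """
    index: dict[str, int] = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in index:
                index[word] = len(index) + 1
    return index


def to_sequences(sentences: list[str], word_index: dict[str, int],
                 max_len: int) -> list[list[int]]:
    """Turn sentences into fixed-length id sequences.

    Unknown words are skipped, long sentences are truncated,
    and short ones are padded with zeros on the right.
    """
    sequences = []
    for sentence in sentences:
        ids = [word_index[w] for w in sentence.lower().split() if w in word_index]
        ids = ids[:max_len]
        sequences.append(ids + [0] * (max_len - len(ids)))
    return sequences


corpus = ["I love my dog", "I love my cat very much"]
word_index = build_word_index(corpus)
padded = to_sequences(corpus, word_index, max_len=5)
print(word_index)
print(padded)
```

Fixed-length integer sequences like these are the kind of input a neural network can consume; real tokenizers add details such as punctuation handling and a dedicated out-of-vocabulary token.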
The result is a book that teaches you not just how to train a model but also how to deliver a fully functional AI-powered product. We assume you are familiar with writing code and are comfortable with variables, loops, and functions. This guide is for software developers, mobile engineers, and web programmers who want to stop watching the AI revolution from the sidelines and start building it. This book is not going to waste your time explaining what an array is or teaching you basic Python syntax. Instead, it gives you the industrial-strength TensorFlow skills and clean, code-driven frameworks you need to go from a developer writing static logic to an AI specialist who can confidently design intelligent, learning software.
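The rules-versus-answers shift described earlier can be shown in miniature. The sketch below is an illustration written for this post, and it deliberately swaps TensorFlow for ordinary least squares in plain Python so it runs without any dependencies. The function names and sample data are assumptions. The first function is a hand-written rule; the second recovers the same rule from nothing but data and answers.

```python
def rule_based(x: float) -> float:
    """Traditional programming: the rule is written by hand."""
    return 2 * x - 1


def learn_rule(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Machine learning in miniature: infer slope and intercept
    from data (xs) and answers (ys) with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Data plus answers go in; the rule comes out.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]
slope, intercept = learn_rule(xs, ys)
print(slope, intercept)
print(slope * 10 + intercept, rule_based(10))
```

A neural network does the same job for relationships far too tangled to write down or solve in closed form, which is where a framework like TensorFlow comes in.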
Welcome to AI and Machine Learning for Coders, a book that I’ve been wanting to write for many years but that has only really become possible due to recent advances in machine learning (ML) and, in particular, TensorFlow. The goal of this book is to prepare you, as a coder, for many of the scenarios that you can address with machine learning, with the aim of equipping you to be an ML and AI developer without needing a PhD! I hope that you’ll find it useful, and that it will empower you with the confidence to get started on this wonderful and rewarding journey.
Laurence Moroney leads AI Advocacy at Google. His goal is to educate the world of software developers in how to build AI systems with machine learning. He's a frequent contributor to the TensorFlow YouTube channel at youtube.com/tensorflow, a recognized global keynote speaker, and the author of more books than he can count, including several bestselling science fiction novels and a produced screenplay. He's based in Washington state, where he drinks way too much coffee.
Rich. Robust. Rugged. Smooth as the velvet night sky, gritty as the mean streets . . . Timothy Howard Jackson has had a strong interest in creative pursuits all his life in areas such as theater, choir, orchestra, and dance. That said, his original career was in the tech sector for twenty years. When he realized that he really needed to be creative again, he found his outlet in audiobooks. He traded a keyboard for a microphone, and he now works full-time as an audiobook narrator at the foot of the beautiful Wasatch Mountains in Utah. He possesses a rich, robust, rugged voice that lends credibility and interest to all types of materials. Recording sci-fi, mysteries, thrillers, and edgy nonfiction under his name and romance under a pseudonym, he has completed ninety-five books and counting.
ASIN: B08KYN45HF
Publisher: O'Reilly Media
Publication Date: October 1, 2020
Edition: 1st Edition
Language: English
File Size: 29.0 MB
ISBN-13: 978-1492078159
Print Length: 394 Pages
Digital Features
Screen Reader: Supported
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled
Post: 19
We all remember the collective gasp when ChatGPT first came out. Suddenly computers could write code, compose essays, and converse with uncanny human fluency. It seemed like the final stop for artificial intelligence. But once the novelty wore off, the business community learned the hard way that generative chat tools are passive. They sit there waiting for a human to type a prompt, generate one response, and then freeze until the next command. Chatbots are still a big source of fascination for the public, but there’s a much more subtle and significant revolution taking place behind the scenes: the rise of AI agents. AI agents are the next step, from passive text generation to active, autonomous execution.
Forget the search box. Think of an agent as a digital colleague: a system that not only gives you an answer but logs into your software for you, works with other vendors, makes decisions on the fly, and learns from its mistakes to get complicated multi-step jobs done with little human oversight. Global technologists are pointing to the size of this shift. Bill Gates calls it the biggest computing revolution since the graphical user interface, Jensen Huang says the age of agentic AI is officially here, and Satya Nadella notes that it will become the primary way we interact with machines. But beyond the giant wall of corporate excitement, a sober truth is looming. There is a big gap between operational reality and marketing promises. For every story of perfect automation, there are dozens of failed corporate pilots with brittle agents stuck in infinite loops, blowing budgets, or totally alienating customers.
The Executive Roadmap to Agentic AI was created specifically to address that gap. This jargon-free guide is for business leaders, entrepreneurs, and forward-thinking professionals who want to understand how this technology works, how to deploy it safely, and how to unlock its enormous economic potential without getting lost in technical details.
Economics of the Agent Age. Moving from manually operating software to an autonomous agent infrastructure introduces what we call a Compounding Intelligence Advantage. Traditional business models require a linear increase in headcount, training, and software licensing to scale an operation. Agentic systems remove this bottleneck. Because they run in loops of perception, execution, and self-reflection, these models learn from every transaction they process. The more they work, the smarter, faster, and more integrated they become.
[Self-Reflection] > [System Optimization] > [Agent Action] > [Evaluation of Result]
That dynamic is part of the reason for a growing competitive gap. Those who deploy agentic systems early will see their operational costs drop and their speed of execution increase, putting slower movers at a huge disadvantage. This isn't just an incremental efficiency upgrade; this is a whole new economic paradigm.
What You'll Learn. The authors write from raw first-hand experience deploying agentic frameworks across global enterprises and agile startups alike.
This book is about strategy, organizational design, and scaling operations. You will learn to:
Find the High-Value Opportunities: Get a reliable, systematic way to review the processes you already run in your business and identify precisely which ones are good candidates for agentic automation.
Gain Scalable Efficiency: Discover the structural secrets behind transformations that have cut operational overhead by more than 25% while simultaneously increasing customer satisfaction by more than 40%.
Structural Guide to the Shift. The book dissects the impact across the core pillars of modern business operations so you can see how this technology is reshaping the way organizations are structured.
Business Pillar | Standard Approach | The Agentic State
Customer Experience | Linear ticketing, inflexible script-based chatbots | Autonomous, empathetic agents that solve complex problems in the blink of an eye
Data Processing | Manual ingestion, cleaning, and reporting | Continuous background agents that monitor, analyze, and act on data trends
Who This Book Is For. This book is for CEOs, founders, operations leaders, and curious professionals who want to step out of the audience and take a more active role in steering the AI transition in their industries. No software engineering background or data science degree is needed to read it. There is no heavy technical jargon, math, or code. If you have a sharp strategic mind, an entrepreneurial spirit, and want to understand the profound societal, economic, and organizational transformations that are coming our way, this guide is the definitive blueprint you need to lead with purpose and integrity.
Tom Davenport is the President's Distinguished Professor of Information Technology and Management at Babson College, a Fellow of the MIT Initiative on the Digital Economy, and a Senior Advisor to Deloitte's Chief Data and Analytics Officer Program. In 2024-5 he is the Bodily Bicentennial Professor of Analytics at the UVA Darden School of Business. He pioneered the concept of "competing on analytics" with his best-selling 2006 Harvard Business Review article and his 2007 book by the same name. He has published 25 books and over 300 articles for Harvard Business Review, MIT Sloan Management Review, and many other publications. His most recent book is All Hands on Tech: The AI-Powered Citizen Revolution, co-authored with Ian Barkin. He writes columns for Forbes, MIT Sloan Management Review, and the Wall Street Journal. He has been named one of the world's "Top 25 Consultants" by Consulting magazine, one of the top 3 business/technology analysts in the world by Optimize magazine, one of the 100 most influential people in the IT industry by Ziff-Davis magazines, and one of the world's top fifty business school professors by Fortune magazine. He's also been a LinkedIn Top Voice for both the education and tech sectors.
ASIN: B0F1DS36YC
Publisher: Irreplaceable Publishing
Publication Date: March 12, 2025
Language: English
File Size: 11.9 MB
ISBN-13: 979-8992833614
Print Length: 572 Pages
Features
Screen Reader: Supported
Enhanced Typesetting: Enabled
X-Ray: Enabled
Word Wise: Enabled
Page Flip: Enabled
Post: 20
Global employment surveys every year list the software architect among the top ten best jobs in the world. The role offers high salaries, significant influence within the organization, and the intellectual satisfaction of working on the company's toughest technical challenges. Yet ask ten different senior developers how they got the job, and you will get ten different and ambiguous answers. For decades, the road from writing clean code to designing large, enterprise-grade systems has been a mystery. There has been no standard curriculum, no formal roadmap, and no definitive guide for crossing the chasm. Developers were just assumed to accumulate enough keyboard experience to be ready to make architectural decisions through some unspoken corporate osmosis.
The shift from developer to architect is a total mental shift. A developer’s value is determined by the depth of their knowledge of a given programming language, framework, or API. As an architect, your value is suddenly defined by the breadth of your knowledge base. Instead of worrying about how to implement a particular feature, you get paid to decide which technology stack, deployment pattern, and data strategy will keep the whole company agile, stable, and secure five years from now. Fundamentals of Software Architecture, Second Edition, is the definitive blueprint for this career transition. This new edition treats architecture as a disciplined, repeatable engineering practice. It is written by experienced, hands-on practitioners Mark Richards and Neal Ford, who have decades of experience guiding engineering teams and teaching architecture across the globe.
They strip away stack-specific dogma to provide universal principles that apply whether you are building in Java, Rust, or Go, or deploying distributed cloud-native systems.
The Holistic Matrix of Contemporary Architecture. To be a good architect in the modern tech landscape, you have to look far beyond simple system diagrams. Richards and Ford unpack the role into an interlinked matrix of technical mastery, operational engineering, and leadership abilities.
[Business Alignment and Soft Skills] > [System Design and Patterns] > [Operational Engineering]
Core Pillars of the Architectural Blueprint. The book maps the chaotic world of software architecture into distinct, scannable structural zones, providing you with a repeatable framework to analyze and solve any enterprise system challenge:
1. Deconstruction of Patterns and Architectural Styles. Beyond buzzwords, you will dive deep into various structural styles such as microservices, modular monoliths, event-driven architectures, space-based patterns, microkernels, and the classic layered systems. The authors teach you how to analyze the precise trade-offs of each style to help you match the architecture to the business domain without getting caught up in industry hype.
2. Component Determination and Granularity. Learn the systematic mechanics of determining the components of a system. You will understand the engineering behind data partitioning, granularity, coupling, and cohesion. If you can cleanly decompose a monolithic data structure and identify the best boundaries for a microservice, your teams will be able to deploy code independently without creating a distributed nightmare.
3. Architecture as a Rigorous Engineering Discipline. Stop making architectural decisions based on gut feelings or arbitrary preferences. This book provides concrete metrics, fitness functions, and mathematical evaluations to bring scientific rigor to system design. You will learn how to write automated tests that regularly measure architectural characteristics of your system, such as scalability, elasticity, and security, to make sure that your system does not deteriorate over time.
4. Generative AI and Modernism. The cloud ecosystem today is moving at a breakneck pace. This guide discusses the rise of generative AI in automated code generation and system design, cloud cost optimization, and radical changes in operational approaches.
The Operational Trade-Off Framework. The following summary gives an immediate picture of how the book supports complex decision-making by setting core architectural styles against real-world operational trade-offs:
Architectural Style | Key Strengths | Structural Cost / Trade-Off
Modular Monolith | Low operational complexity, high performance | Longer deployment cycles, risk of code getting coupled over time
Microservices | Maximum scalability, independent deployments | Extreme network latency and complex data consistency in distributed environments
A substantial portion of the book is dedicated to the crucial but not-so-glamorous interpersonal skills that set the most successful architects in the business apart. You will learn how to lead teams effectively, how to work with developers, and how to connect with the business. The book goes deep into high-stakes negotiation, handling technical debt discussions with non-technical leaders, crafting compelling architecture presentations, and setting up clear engineering governance without turning into a corporate bottleneck. This book is written for Senior Developers, Technical Leads, and Aspiring or Existing Software Architects who want to move from building isolated applications to designing enterprise-grade ecosystems. We assume you know how to write clean, production-ready code and understand basic software engineering concepts.
You will not waste time with tutorials on how to get started in programming or how to set up a particular database. Instead, you get the deep structural design patterns, analytical metrics, and leadership strategies that will take you from an engineer who follows directions to a confident, visionary architect who shapes an organization’s technological future.
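The automated architectural tests described earlier, often called fitness functions, can be as simple as a check that runs with the rest of your test suite. The sketch below is an illustration written for this post, not an example from the book: the layer names, the dependency map, and the function name are all hypothetical. It verifies that, in a layered system, no layer depends on a layer above it.

```python
# Hypothetical layer order, highest layer first (an assumption for this sketch).
LAYERS = ["presentation", "business", "persistence"]


def find_layer_violations(
    dependencies: dict[str, set[str]],
) -> list[tuple[str, str]]:
    """Return (source, target) pairs where a layer depends on a layer above it."""
    rank = {name: position for position, name in enumerate(LAYERS)}
    violations = []
    for source, targets in dependencies.items():
        for target in sorted(targets):
            if rank[target] < rank[source]:
                violations.append((source, target))
    return violations


# In a real project this map would be extracted from imports or build files.
healthy = {
    "presentation": {"business"},
    "business": {"persistence"},
    "persistence": set(),
}
eroded = {
    "presentation": {"business"},
    "business": {"persistence"},
    "persistence": {"presentation"},  # a lower layer reaching upward
}

print(find_layer_violations(healthy))
print(find_layer_violations(eroded))
```

Failing the build whenever the list of violations is non-empty turns an architectural rule into something that is checked on every commit instead of rediscovered in a review months later.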
Mathematicians create theories based on axioms, assumptions for things indisputably true. Software architects build axioms as well, but the software world is, well, softer than mathematics: fundamental things continue to change at a rapid pace in the software world.
The software development ecosystem exists in a constant state of dynamic equilibrium: while it exists in a balanced state at any given point in time, it exhibits dynamic behavior over the long term. A great modern example of the nature of this ecosystem follows the ascension of containerization and the attendant changes wrought: tools like Kubernetes didn’t exist a decade ago, yet now entire software conferences exist to service its users. The software ecosystem changes fractally: one small change causes another small change; when repeated hundreds of times, it generates a new ecosystem.
Mark Richards is an experienced hands-on software architect involved in the architecture, design, and implementation of microservices architectures, service-oriented architectures, and distributed systems. He's been in the software industry since 1983 and has significant experience and expertise in application, integration, and enterprise architecture. He's the author of numerous O'Reilly technical books and videos, including Fundamentals of Software Architecture and Software Architecture: The Hard Parts (both with Neal Ford), several books on microservices, the Software Architecture Fundamentals video series, and the Enterprise Messaging video series, and was a contributing author to 97 Things Every Software Architect Should Know. A speaker and trainer, he's given talks on a variety of enterprise-related technical topics at hundreds of conferences and user groups around the world.
ASIN: B0F1BWQGYZ
Publisher: O'Reilly Media
Publication Date: March 12, 2025
Edition: 2nd Edition
Language: English
File Size: 57.0 MB
ISBN-13: 978-1098175474
Print Length: 856 Pages
Digital Features
Enhanced Typesetting: Enabled
Page Flip: Enabled
X-Ray: Not Enabled
Word Wise: Not Enabled