Retrieval-Augmented Generation (Gemini)
RAG is an AI framework that enhances large language models (LLMs) by connecting them to external, up-to-date knowledge bases, rather than relying solely on their static training data.
This technique improves the accuracy and contextual relevance of LLM outputs by letting the model retrieve specific information from sources such as internal databases, documents, or the web before generating a response.
RAG also increases trust by enabling responses to cite their sources, and it reduces the need for costly model retraining, making generative AI more adaptable and efficient for domain-specific tasks.
How RAG Works
Query Input: A user enters a query or prompt into the LLM.
Information Retrieval: A retrieval system searches external knowledge bases or documents for relevant information based on the query's meaning.
This information is often converted into vectors (embeddings) and stored in a vector database for efficient semantic search.
Context Augmentation: The retrieved information is then combined with the original user query.
Generation: The LLM receives the augmented context and generates a response that incorporates the newly retrieved, accurate information.