Team Members: Abhay Shashidhara, Chaitanya Kadam, Francisco Sevilla
SEMANTIC SAGES
This project focuses on building a more reliable question answering system for 3GPP RAN specifications, especially 3GPP TS 38.331 (the NR Radio Resource Control protocol specification), where answers must be precise and fully supported by the source text. Because these documents are long, technical, and highly detailed, language models can generate responses that sound correct even when the document does not support them. Our work centers on making the system evidence-aware, grounded, and cautious when the available context is weak.
The pipeline begins by converting the specification PDF into clean, structured text and organizing it into searchable chunks with useful metadata. This processed document then becomes the knowledge base for the system, allowing relevant sections to be retrieved for each user query. The retrieved passages are passed to a generation module that produces answers in a controlled format, along with supporting evidence and a confidence signal.
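The chunking and retrieval steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the clause-heading pattern, the `Chunk` fields, and the term-overlap scoring are all assumptions made for the example (a real system would more likely use an embedding-based or BM25 retriever), and the sample text is invented placeholder content rather than text from TS 38.331.

```python
import re
from dataclasses import dataclass

# Hypothetical heading pattern: a clause number followed by a title,
# e.g. "2.1 Timer start". Body lines that begin with a bare number
# would be misread as headings; a real parser needs stricter rules.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")


@dataclass
class Chunk:
    clause: str  # clause number, kept as metadata for traceability
    title: str   # clause title
    text: str    # body text belonging to the clause


def chunk_by_clause(plain_text):
    """Split already-extracted text into one chunk per numbered clause."""
    chunks = []
    current = None
    for line in plain_text.splitlines():
        stripped = line.strip()
        match = HEADING.match(stripped)
        if match:
            current = Chunk(match.group(1), match.group(2), "")
            chunks.append(current)
        elif current is not None and stripped:
            current.text = (current.text + " " + stripped).strip()
    return chunks


def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, chunks, top_k=2):
    """Rank chunks by the number of query terms they share (a stand-in scorer)."""
    query_terms = tokenize(query)
    scored = []
    for chunk in chunks:
        overlap = len(query_terms & tokenize(chunk.title + " " + chunk.text))
        if overlap:
            scored.append((overlap, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


# Invented placeholder text standing in for the converted specification.
SAMPLE = """1 Scope
This toy document stands in for the converted specification text.
2 Timers
A placeholder paragraph about timer handling.
2.1 Timer start
A placeholder paragraph about when a timer starts.
"""

chunks = chunk_by_clause(SAMPLE)
hits = retrieve("when does a timer start", chunks)
for hit in hits:
    print(hit.clause, hit.title)
```

Keeping the clause number on every chunk is what lets a later answer cite the exact section it was drawn from.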
A major direction of the project is hallucination reduction. Instead of forcing an answer every time, the system is designed to recognize when the evidence is insufficient and abstain when necessary. We are also exploring stronger support checking and verification strategies so that the final responses remain faithful to the retrieved text.
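One way to picture the abstain-when-unsupported behavior is the sketch below. It is illustrative only: the `Answer` fields, the token-overlap support score, and the 0.6 threshold are assumptions chosen for the example, not the project's verification method, and the passages are invented. Token overlap is a crude proxy; a stronger support check (for instance an entailment model) would slot into `support_score` without changing the surrounding logic.

```python
import re
from dataclasses import dataclass, field

ABSTAIN_MESSAGE = "Insufficient evidence in the retrieved text."


@dataclass
class Answer:
    text: str
    evidence: list = field(default_factory=list)  # passages backing the answer
    confidence: float = 0.0                       # support score in [0, 1]
    abstained: bool = False


def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def support_score(draft, passages):
    """Fraction of the draft's tokens that also appear in the evidence."""
    draft_tokens = tokens(draft)
    if not draft_tokens:
        return 0.0
    evidence_tokens = set(tokens(" ".join(passages)))
    supported = sum(1 for tok in draft_tokens if tok in evidence_tokens)
    return supported / len(draft_tokens)


def answer_or_abstain(draft, passages, threshold=0.6):
    """Return the draft only if the evidence supports it; otherwise abstain."""
    score = support_score(draft, passages)
    if not passages or score < threshold:
        return Answer(ABSTAIN_MESSAGE, [], score, True)
    return Answer(draft, list(passages), score, False)


# Invented passage, used only to exercise the logic.
passages = ["the timer starts when the request is sent"]

grounded = answer_or_abstain("the timer starts when the request is sent", passages)
unsupported = answer_or_abstain("the timer is 42 seconds long", passages)
print(grounded.abstained, unsupported.abstained)
```

The threshold trades coverage for reliability: raising it makes the system abstain more often but lowers the chance of returning an unsupported claim.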
Key highlights
Domain: 3GPP RAN question answering
Core document: 3GPP TS 38.331
Main goal: reduce hallucinations in technical Q&A
Approach: document processing, retrieval, controlled generation, and verification
System behavior: answer only when evidence is strong; otherwise abstain
Focus: improving trustworthiness, traceability, and reliability