Overview
Large Language Models (LLMs) have made remarkable progress on many NLP tasks, but they still suffer from factual hallucinations: cases where a model generates incorrect or fabricated information. While this problem has been well studied in English, how it manifests in multilingual and low-resource settings remains underexplored. Our work investigates this phenomenon in Llama 3.1 (8B), using Indonesian (mid-resource) and Javanese (low-resource) as case studies.
What We Did
We designed a multilingual factual recall evaluation using a translated and validated version of the CounterFact dataset. The model was tested in English, Indonesian, and Javanese to measure how well it retrieves the same factual knowledge in each language. We then used Logit Lens analysis to trace internal representations and identify at which layers hallucinations emerge.
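To make the tracing step concrete, below is a minimal logit lens sketch, assuming the meta-llama/Llama-3.1-8B checkpoint on Hugging Face and an illustrative English CounterFact-style prompt (not our exact pipeline). It decodes each layer's hidden state at the final position through the model's final norm and LM head and reports the rank of the correct object token at every layer.

```python
# Minimal logit lens sketch: decode each layer's hidden state through the final
# norm and LM head. Checkpoint name, prompt, and target are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B"  # assumed checkpoint identifier
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.eval()

prompt = "The Eiffel Tower is located in the city of"  # illustrative CounterFact-style query
target_id = tok(" Paris", add_special_tokens=False).input_ids[0]

with torch.no_grad():
    out = model(**tok(prompt, return_tensors="pt"), output_hidden_states=True)

# hidden_states[0] is the embedding output; the rest are the 32 transformer layers.
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.model.norm(h[:, -1, :]))
    rank = (logits.argsort(dim=-1, descending=True)[0] == target_id).nonzero().item()
    print(f"layer {layer:2d}: rank of correct object token = {rank}")
```

Tracking where the object token first becomes highly ranked, separately for English, Indonesian, and Javanese prompts, is the kind of layer-wise signal this analysis relies on.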
Key Findings
Factual accuracy drops significantly from English (44.8%) to Indonesian (37.1%) and Javanese (34.1%); a sketch of this per-language scoring follows the findings below.
Knowledge enrichment hallucinations—where the model fails to retrieve relevant information—occur more frequently in lower-resource languages.
Errors are most common for PERSON and ORGANIZATION entities, highlighting cross-lingual inconsistencies in how models encode and retrieve facts.
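As context for the accuracy numbers above, the sketch below shows one way per-language exact-match scoring over a translated CounterFact-style set could be computed. The item schema ({"prompt", "object", "lang"}), greedy decoding, and substring matching are assumptions rather than our exact protocol; it reuses model and tok from the logit lens sketch.

```python
# Hedged sketch of per-language exact-match scoring over translated CounterFact-style
# items; the {"prompt", "object", "lang"} schema and greedy decoding are assumptions.
from collections import defaultdict

def score_by_language(model, tok, items, max_new_tokens=8):
    hits, totals = defaultdict(int), defaultdict(int)
    for item in items:
        inputs = tok(item["prompt"], return_tensors="pt")
        gen = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        completion = tok.decode(gen[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
        totals[item["lang"]] += 1
        hits[item["lang"]] += int(item["object"].strip().lower() in completion.lower())
    return {lang: hits[lang] / totals[lang] for lang in totals}  # accuracy per language
```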
Future Direction
We plan to extend this study using attribute lens probing, which maps subject representations to factual object predictions across languages. This will help uncover how multilingual LLMs internally encode and link factual knowledge.
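As a rough illustration of that direction, here is a hedged sketch of attribute-lens-style probing under our reading of it: instead of decoding at the end of the prompt, decode the hidden states at the last token of the subject mention and ask whether the factual object is already recoverable from the subject representation at each layer. The function name, the naive subject-span matching, and the reuse of the Llama 3.1 checkpoint are assumptions for illustration.

```python
# Hedged sketch of attribute-lens-style probing: decode the hidden state at the
# last subject token through the LM head at every layer. Reuses model/tok from
# the logit lens sketch; the subject-span matching below is deliberately naive.
import torch

def object_rank_at_subject(model, tok, prompt, subject, target):
    enc = tok(prompt, return_tensors="pt")
    offsets = tok(prompt, return_offsets_mapping=True)["offset_mapping"]
    subj_end = prompt.index(subject) + len(subject)
    # Last token whose character span ends inside the subject mention.
    subj_pos = max(i for i, (s, e) in enumerate(offsets) if e > s and e <= subj_end)
    target_id = tok(target, add_special_tokens=False).input_ids[0]

    with torch.no_grad():
        out = model(**enc, output_hidden_states=True)

    ranks = []
    for h in out.hidden_states:  # embedding output plus one entry per layer
        logits = model.lm_head(model.model.norm(h[:, subj_pos, :]))
        ranks.append((logits.argsort(dim=-1, descending=True)[0] == target_id).nonzero().item())
    return ranks  # rank of the factual object token, layer by layer
```

Comparing the layer at which the object first surfaces for English, Indonesian, and Javanese versions of the same fact would then give a cross-lingual picture of where subject representations carry the attribute.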