Gaurang Sriramanan, Siddhant Bharti, Vinu Sankar Sadasivan, Shoumik Saha, Priyatham Kattakinda, Soheil Feizi
Published in NeurIPS 2025
The paper investigates efficient methods for detecting hallucinations in Large Language Models (LLMs) without requiring additional computational overhead. It proposes **LLM-Check**, a suite of techniques that analyze internal model representations such as hidden activations, attention maps, and output prediction probabilities to identify hallucinated content in a single LLM response. The approach is highly compute-efficient, requiring only a single forward pass of the model, and achieves significant speedups compared to existing methods. The authors demonstrate the effectiveness of LLM-Check across various hallucination detection settings, including when no external references are available or when multiple model responses are provided. Experimental results show that LLM-Check outperforms current baselines, offering improved performance with minimal resource usage.
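To make the idea of scoring a response from its output prediction probabilities concrete, here is a minimal sketch of two generic uncertainty scores: perplexity and mean token entropy. It is an illustration only. The function names and toy probabilities are assumptions, and the paper's exact scoring functions are not given in this summary.

```python
import math

def perplexity(token_probs):
    """Perplexity of a response, given the probability the model assigned to each generated token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

def mean_entropy(distributions):
    """Average Shannon entropy (in nats) of the per-step next-token distributions."""
    def entropy(dist):
        return -sum(p * math.log(p) for p in dist if p > 0)
    return sum(entropy(d) for d in distributions) / len(distributions)

# Hypothetical per-token probabilities for two responses (made-up numbers).
confident = [0.9, 0.8, 0.95]
uncertain = [0.3, 0.2, 0.25]

# A higher perplexity means the model was less sure of what it generated.
print(perplexity(confident) < perplexity(uncertain))
```

Both scores need only quantities that are already available from the forward pass that produced the response, which is what keeps this style of detection cheap.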
Alex Cloud, Minh Le, James Chua, Jan Betley, Anna Sztyber-Betley, Jacob Hilton, Samuel Marks, Owain Evans
ArXiv preprint published in July 2025
This viral paper studies subliminal learning: a phenomenon in which a language model passes behavioral traits to another model during knowledge distillation, even though the training data appears unrelated to those traits. The authors run several experiments with different language models to characterize the phenomenon, and present a theorem that gives it mathematical support.
Dong Li, Xujiang Zhao, Linlin Yu, Yanchi Liu, Wei Cheng, Zhengzhang Chen, Zhong Chen, Feng Chen, Chen Zhao, Haifeng Chen
Published in NeurIPS 2025
The research paper introduces SolverLLM, a novel framework designed to solve diverse optimization problems using Large Language Models (LLMs) without requiring costly, task-specific training. The primary challenge in automating optimization is the "problem formulation" stage—translating a problem from natural language into a precise mathematical model. Existing LLM-based methods often struggle with this, either depending on fragile prompt engineering that doesn't generalize well, or demanding extensive supervised fine-tuning on curated datasets.
SolverLLM overcomes these limitations by treating the formulation process itself as a search problem. Instead of directly outputting a final solution, the framework uses an LLM guided by a Monte Carlo Tree Search (MCTS) algorithm to explore a vast space of potential mathematical formulations. It incrementally constructs a problem's components—such as variables, constraints, and objectives—and then tests the complete formulation by generating code and running it through a standard solver. The success, failure, or quality of the outcome serves as feedback to refine the search.
The framework's novelty lies in three key enhancements to the MCTS process. First, dynamic expansion allows the model to flexibly revisit and correct earlier parts of the formulation. Second, prompt backpropagation feeds rich, contextual feedback from the solver back into the search, guiding the LLM to avoid repeating mistakes. Finally, uncertainty backpropagation gauges the LLM's confidence in its own generations, prioritizing more reliable search paths and improving efficiency.
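For readers unfamiliar with MCTS, the selection and backpropagation steps that SolverLLM builds on can be sketched as follows. This is the standard UCT rule with scalar rewards, not the paper's modified search. The node fields (`name`, `reward`, `visits`) and the exploration constant `c` are assumptions made for illustration.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=1.4):
    """Standard UCT: average reward plus an exploration bonus for rarely visited nodes."""
    if visits == 0:
        return math.inf  # always try an unvisited child first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Pick the child with the highest UCT score."""
    return max(children, key=lambda ch: uct_score(ch["reward"], ch["visits"], parent_visits))

def backpropagate(path, reward):
    """Add the reward from one evaluated formulation to every node on the search path."""
    for node in path:
        node["visits"] += 1
        node["reward"] += reward

# Hypothetical candidate formulation steps under one parent node.
children = [
    {"name": "constraints_v1", "reward": 3.0, "visits": 5},
    {"name": "constraints_v2", "reward": 0.0, "visits": 0},
]
chosen = select_child(children, parent_visits=5)
backpropagate([chosen], reward=1.0)  # e.g. the solver accepted the generated code
print(chosen["name"], chosen["visits"])
```

In the framework described above, the value propagated back is richer than a single number, since it also carries textual solver feedback and an uncertainty estimate. The sketch keeps only the scalar part.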
Zhendong Yang, Zhe Li, Ailing Zeng, Zexian Li, Chun Yuan, Yu Li
Published in CVPR 2024 Workshop
This paper argues that both shallow and deep layers of a ViT matter in distillation, and that they require distinct distillation strategies. The attention maps of a student ViT's shallow layers resemble the corresponding teacher attention maps more closely than those of its deeper layers do. Therefore, logit-based distillation might not be a feasible option for distilling the deeper layers. The paper addresses this by introducing ViTKD, which mimics the teacher's shallow layers through a direct MSE loss and uses generation for the deeper layers. Specifically, for the deeper layers, masked tokens are generated using a small CNN, and an MSE loss then aligns the features of the student and the teacher.
Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek & Hervé Jégou
Published in ECCV 2022
This paper provides three insights based on simple, easy-to-implement variants of vision transformers. First, the residual layers in a transformer can be processed in parallel without compromising accuracy, in contrast to the typical sequential processing. This allows a significant reduction in inference latency. Second, the authors claim that fine-tuning only the weights of the attention layers is sufficient to adapt vision transformers to a higher resolution and to other classification tasks. Doing so reduces peak memory usage and speeds up training. Third, adding MLP-based patch preprocessing layers improves BERT-like self-supervised training based on patch masking.
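The difference between sequential and parallel residual processing can be sketched with two placeholder branches. Here `f` and `g` are hypothetical scalar stand-ins for transformer sub-blocks, not the paper's actual layers. The sketch shows only how the residual connections are wired.

```python
def sequential_blocks(x, f, g):
    """Typical residual stack: the second branch sees the output of the first."""
    x = x + f(x)
    x = x + g(x)
    return x

def parallel_blocks(x, f, g):
    """Parallel variant: both branches read the same input and their outputs are summed."""
    return x + f(x) + g(x)

# Toy branches (assumptions for illustration only).
f = lambda v: 0.5 * v
g = lambda v: 0.1 * v

print(sequential_blocks(1.0, f, g), parallel_blocks(1.0, f, g))
```

Because the parallel branches do not depend on each other, they can be evaluated at the same time, which is where the latency reduction would come from.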
Jinnian Zhang, Houwen Peng, Kan Wu, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan
Published in CVPR 2022
This paper explores weight sharing as an approach to vision transformer compression, and identifies two associated problems: training instability and performance degradation. The authors track the L2 norms of gradients during training and claim that strictly identical weights across different layers are the main culprit behind training instability. Moreover, Centered Kernel Alignment (CKA) drops significantly in the last few layers, indicating that the feature maps generated before and after weight sharing become less correlated, which can be a cause of performance degradation. To address these issues during model compression, the paper proposes weight multiplexing, which consists of weight transformation and weight distillation. Weight transformation inserts a transformation matrix into the weight-sharing scheme so that the weights of different layers are no longer identical. Weight distillation, on the other hand, employs attention-level and hidden-state distillation, as opposed to the typical prediction-level distillation.
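The CKA comparison mentioned above can be illustrated with the linear-kernel form of CKA, written here in plain Python for small matrices. Which kernel the authors used is not stated in this summary, so the linear variant is an assumption, and the feature matrices are made-up toy data (rows are examples, columns are feature dimensions).

```python
def center_columns(m):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(m)
    means = [sum(row[j] for row in m) / n for j in range(len(m[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in m]

def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for matrices with the same number of rows."""
    total = 0.0
    for i in range(len(a[0])):
        for j in range(len(b[0])):
            dot = sum(a[k][i] * b[k][j] for k in range(len(a)))
            total += dot * dot
    return total

def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered features."""
    x, y = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(y, x)
    denominator = (cross_frobenius_sq(x, x) * cross_frobenius_sq(y, y)) ** 0.5
    return numerator / denominator

# Toy features from the "same layer" before and after a change (made-up numbers).
before = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0], [2.0, 2.0]]
after = [[2.0 * v for v in row] for row in before]  # a pure rescaling of the features
print(linear_cka(before, after))
```

A score near 1 means the two sets of features carry essentially the same structure, while a drop toward 0 signals that they have become less correlated, which is the symptom described for the last few layers.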