The widespread adoption of large language models (LLMs) has enabled major advances in Knowledge Extraction (KE), understanding, and reasoning over large-scale unstructured data. At the same time, KE has emerged as a make-or-break bottleneck in the real-world adoption of LLMs: the substantial computational requirements of state-of-the-art models impede their scalability and practical deployment, particularly in resource-constrained environments such as enterprise systems with strict latency requirements. Finance, healthcare, legal technology, and web-scale analytics all need systems that pull structured facts from noisy, heterogeneous data while respecting tight latency and memory budgets, and under such budgets the quality bar set by today's 10B-parameter models is rarely met. The 1st Workshop on Small and Efficient LLMs for Knowledge Extraction (SmaLLEXT) focuses on the development and application of small and efficient LLMs for effective knowledge and information extraction.
Recent progress in model compression, quantization, pruning, retrieval-augmented generation (RAG), and efficient fine-tuning has shown that smaller LLMs can achieve competitive performance on a variety of downstream tasks. However, efforts around scalable KE remain fragmented across sub-fields such as NLP, information retrieval, and knowledge representation.
This workshop closes that gap by bringing the efficiency and extraction sub-communities together at CIKM to focus on data-centric, application-driven method development. It emphasizes real-world industrial and business applications, where large-scale data are abundant but often noisy, dynamic, and difficult to process reliably. The workshop convenes researchers and practitioners from academia and industry to examine strategies for compressing, distilling, specializing, and accelerating LLMs while maintaining extraction accuracy and robustness against hallucination, and it explores how these models can support structured data extraction from the diverse formats encountered on the Web, in enterprise data stores, and across multimodal documents.
Design of compact transformer variants
Sparse and modular networks
Lightweight language models
Quantization, pruning, distillation, and low-rank adaptation methods
Hardware-aware model optimization
Real-time acceleration techniques
Zero-shot vs. few-shot templates
Chain-of-thought prompt design
Iterative prompt refinement
Instruction phrasing best practices
Retrieval-augmented generation (RAG)
Symbolic reasoning to support distilled LLMs
Lightweight memory-augmented models
Domain adaptation and continual fine-tuning
Parameter-efficient tuning strategies for adaptation
Transfer learning approaches for specialized domains
Trade-offs in model capacity for multi-schema coverage
Domain-aware adaptive sparsity patterns
Hybrid breadth-depth pipelines (specialist modules and generalist core)
Evaluation protocols for depth vs. breadth
Unstructured text (e.g., named entity recognition, relation extraction, entity linking)
Semi-structured data (e.g., tables, forms, web pages)
Multimodal data (e.g., images, PDFs, charts, and scans)
Multilingual transfer and cross-lingual prompting
Data augmentation for low-resource languages
Benchmarks and datasets for evaluating small and efficient models
Interpretability and explainability
Challenges of measuring faithfulness and detecting hallucinations
Crowd-sourced annotation with guidelines
LLM-assisted active learning loops
Weak supervision via heuristic labeling
Annotation quality control metrics
Industry experience in building KE pipelines
Real-world deployments
Energy-efficient training and deployment
Drift detection in extracted knowledge over time
Canary deployments and A/B testing
Defense against prompt injections
Data-poisoning mitigation
Robustness to private data memorization (or extraction)
Fairness and bias in specialized small models
Privacy-preserving characteristics of small LLMs in KE
Memorization of public datasets/information
All deadlines are at 11:59pm in the Anywhere on Earth (AoE) time zone.
Paper submission deadline: September 1, 2025 (extended from August 29, 2025)
Paper acceptance notification: September 26, 2025
Paper camera-ready: October 31, 2025
Workshop date: November 14, 2025
All submissions must be PDFs formatted with the standard ACM Conference Proceedings template, as for the main conference.
The workshop invites three submission types:
Long Papers: 8 pages excluding references,
Short Papers: 4 pages excluding references,
Industry Papers: 4 pages excluding references.
Reviews will be double-blind.
Long and short papers will be assessed on quality, impact, novelty, depth, clarity, and generalizability. Industry papers should focus on challenges and practical solutions to significant real-world problems faced by industry practitioners; we do not expect these papers to release industrial datasets. For each accepted paper, at least one author must attend the workshop and present the paper or poster. Submitting papers that are identical (or substantially similar) to versions that have been published, accepted for publication, or submitted in parallel to other conferences (or any venue with published proceedings) is not allowed.
For further details, refer to the Submission Guidelines of the main conference.
Data Science Manager
at Cognism
Principal Data Scientist
at Cognism
Postdoctoral Researcher
at Université Paris-Saclay
Co-founder and CTO of distil labs
Lecturer in NLP at the School of Informatics, University of Edinburgh
Co-founder and CTO of Miniml.AI