Title: Synergizing Multimodal LLMs, Co-Agent Architectures, and Foundation Models in Manufacturing AI
Abstract: This special session focuses on the rapid advancement of Multimodal Large Language Models (MLLMs) and other foundational AI models, and on how industries leverage data for decision-making and process optimization. In smart manufacturing, these models can learn from heterogeneous data sources, ranging from sensor streams and machine logs to visual inspection data and textual process documentation, while adapting across diverse industrial sectors. Their potential spans predictive maintenance, quality control, supply chain optimization, and real-time decision support, ultimately driving efficiency, sustainability, and innovation. Despite this promise, deploying such models in manufacturing settings introduces significant challenges, including data privacy, domain generalization, scalability, explainability, and integration with legacy systems. Addressing these issues requires coordinated efforts across academia, industry, and technology providers to establish best practices, create benchmarks, and explore novel architectures that meet the reliability and safety demands of industrial environments. The special session seeks to convene experts from AI, manufacturing, and Industry 4.0 to examine these opportunities and challenges, fostering dialogue on methods to advance cross-domain generalization and transfer learning for industrial applications. It further aims to encourage the sharing of open datasets, standardized benchmarks, and reproducible research frameworks to accelerate technological progress. By stimulating academic–industrial collaboration and outlining actionable research directions, the special session aspires to set the stage for the development of trustworthy, sustainable, and high-performance AI systems capable of transforming manufacturing into a more intelligent, autonomous, and resilient domain.
Aims and scope: We invite innovative and high-impact research that pushes the boundaries of AI and computer vision in industrial and manufacturing contexts. Areas of interest include, but are not limited to:
1. Multimodal Large Language Models (MLLMs) in Manufacturing, including novel architectures and training strategies enabling predictive maintenance, advanced defect detection, and high-precision quality control.
2. Cross-Domain and Cross-Sector Adaptation of Foundational Models, including methods to adapt vision and language foundation models for diverse industrial processes, ensuring robustness across heterogeneous environments.
3. Connected Learning for Real-Time Optimization & Decision-Making, including integrative AI approaches enabling adaptive process control, dynamic resource allocation, and operational intelligence.
4. Explainable AI for Industry 4.0, including techniques that bring interpretability, trust, and safety to high-stakes industrial AI systems.
5. Co-Agent Architectures for Collaborative Industrial AI, including multi-agent systems and co-agent learning frameworks enabling synergistic decision-making between AI agents, robots, and human operators in dynamic production environments.
Keywords: multimodal large language models, representation learning, cross-domain learning, collaborative industrial AI, connected learning, co-agents.
This special session will be co-funded by the European Commission through the Horizon Europe COGNIMAN Project (grant agreement No. 101058477).
Important Dates
Paper submission deadline: 31 January 2026 (23:59, anywhere on Earth, i.e. UTC-12)
Notification: 15 March 2026
Camera-ready: 15 April 2026
Conference: 21–26 June 2026