Call for Contributions

This workshop provides a platform for discussing the latest advances and trends in the theory, methodologies, and applications of multimodal learning. This year's theme is the use of foundation models. Foundation models such as BERT, T5, LLaMA, and GPT-4, which were trained on massive data collections, have revolutionized the field of natural language processing (NLP). Using such models to solve NLP tasks represents a fundamental paradigm shift in how problems are solved, especially because of their ability to integrate knowledge from other domains such as computer vision (DALL-E, CLIP), retrieval, knowledge graphs, and more. Moreover, foundation models have brought fundamental changes to the multimodal problem setting, especially when integrating text or images with graphs, time-series, and other forms of structured data.

The workshop therefore focuses on utilizing these foundation models and integrating multiple modalities. Although the workshop may also include discussions and papers on general multimodal learning problems, more emphasis will be given to work that utilizes recently developed foundation models. Our goal is to explore and showcase innovative ways in which multimodal learning and data fusion can be employed, with particular emphasis on leveraging the capabilities of foundation models for these purposes. The workshop topics include, but are not limited to:



The workshop seeks to bring together researchers in the machine learning and data mining communities and to provide a unique opportunity for interdisciplinary researchers to explore data interactions between various modalities, such as text, images, graphs, tabular data, time-series, and more, using foundation models. The workshop will feature invited talks, accepted paper presentations, and a panel discussion to encourage knowledge sharing and foster cross-team collaboration within research and industry communities in the fields of Natural Language Processing (NLP), Information Retrieval, Data Mining, Machine Learning, and others.

Important Dates 

(Time: Anywhere on Earth)

Paper Submission

Paper Acceptance Notification

Camera-Ready Submission

Workshop Date

Submission Guidelines

Paper submissions are limited to 9 pages, excluding references. Submissions must be in PDF and use the ACM Conference Proceedings template (two-column format).


Additional supplemental material focused on reproducibility can be provided. Proofs, pseudo-code, and code may also be included in the supplement, which has no explicit page limit. The supplement may be in either single-column or double-column format. The paper should be self-contained, since reviewers are not required to read the supplement.


The Word template guideline can be found here: https://www.acm.org/publications/proceedings-template

The LaTeX/Overleaf template guideline can be found here: https://www.overleaf.com/latex/templates/association-for-computing-machinery-acm-sig-proceedings-template/bmvfhcdnxfty


Submissions will be judged on quality and relevance through single-blind reviewing.


A paper should be submitted in PDF format through EasyChair at the following link: https://easychair.org/my/conference?conf=multimodalkdd2023