FM&LLM&GM2025

1st International Workshop on Foundation, Multimodal Large Language and Generative Models for Face and Gesture Recognition (FM&LLM&GM2025)

19th IEEE International Conference on Automatic Face and Gesture Recognition (FG) - Clearwater, Florida, USA | May 30, 2025

The field of face and gesture recognition has recently experienced a transformative shift with the rise of foundation models, generative models, and multimodal large language models (LLMs), which offer unprecedented capabilities to process and integrate multimodal data (e.g., text, images, video, and audio) in a unified framework. This workshop aims to explore the implications and potential uses of these models specifically for face and gesture recognition tasks. Foundation models (such as CLIP and GPT) enable robust feature extraction and transfer learning. In addition, generative models allow synthetic data generation, privacy-preserving learning, and advanced data augmentation techniques. As LLMs increasingly support multimodal functionalities, they provide a promising avenue to advance the field beyond traditional techniques, facilitating richer, contextually aware, and potentially more accurate recognition systems, and supporting other key aspects of the field such as explainability.

This workshop will foster collaboration among researchers interested in advancing foundation models, multimodal LLMs, and generative models for face and gesture recognition, encouraging interdisciplinary insights and new research that leverages these models for tasks such as real-time emotion recognition, social behavior analysis, and advanced biometrics. Expected outcomes include:

Identification of critical challenges and research gaps for foundation models, multimodal LLMs, and generative models in face and gesture recognition.
Presentation of cutting-edge research methodologies and results that leverage these models.
Engagement in meaningful dialogue regarding ethical considerations, bias mitigation, and deployment challenges for these systems.

Papers are invited on topics including, but not limited to, the following:

Adapting foundation models for face and gesture recognition,
Zero-shot and few-shot learning for gestures and facial analyses using LLMs,
Contextual reasoning and interpretation of gestures and expressions via LLMs,
Ethical, privacy, and robustness challenges in LLM-driven biometric systems,
Biometric system components (beyond recognition) based on foundation models and LLMs,
Generating synthetic datasets for face, body, and gesture analysis,
and many more...

The FM&LLM&GM2025 workshop is organized within the scope of IEEE FG 2025 and all accepted and presented papers will be published within the main "IEEE FG 2025 Proceedings" that will appear on IEEE Xplore.

Important Dates

Paper Submission Deadline: April 20, 2025 (extended from April 9, 2025)

Notification of Acceptance: April 25, 2025

Camera-Ready Submission: May 9, 2025 (same as main conference)

Workshop: May 30, 2025

Submission Guidelines

Submissions to FM&LLM&GM2025 can be up to 8 pages long, with an unlimited number of references, as for the main conference. For paper formatting, please follow the instructions posted on the main IEEE FG 2025 website.

The paper templates are posted here as well:

Kindly note that we are using a CMT instance for paper collection that is distinct from the CMT instance of the main conference. To submit your paper, use the following URL:

Submission link

Then select "First International Workshop on Foundation and Multimodal Large Language and Generative Models for Face and Gesture Recognition".

The reviewing process will be “double blind”, so submitted papers should be appropriately anonymized so as not to reveal the authors or the authors’ institutions. The final decisions will be rendered by the workshop organizers and will take into account the review content as well as the decision recommendations made by the Technical Program Committee members.

Keynote Speaker: Kevin Bowyer

Title: "What makes a good quality face recognition training set?"

Abstract: This talk will start with comments on web-scraped, in-the-wild training sets. And possibly also on CVPR and FG reviewing. Then we will touch on the “identity problem” in training sets of images of persons who don’t exist. Lastly, we will describe an example of a training set with targeted synthetic enhancements as a means of training set augmentation to solve a specific problem. This talk should be entertaining and informative for a broad audience.

Bio: Kevin Bowyer is the Schubmehl-Prein Family Professor of Computer Science and Engineering at the University of Notre Dame. He is a Fellow of the AAAS, IEEE and IAPR, past EIC of the IEEE Transactions on Pattern Analysis and Machine Intelligence and the IEEE Transactions on Biometrics, Behavior, and Identity Science, recipient of a Technical Achievement Award from the IEEE Computer Society, and of the Meritorious Service Award and the Leadership Award from the IEEE Biometrics Council.

Workshop Program

(Friday, May 30, 2025)