Egocentric vision puts a camera on the user’s head and lets algorithms see exactly what the person sees. From this first-person view, we can model the user’s behaviour, forecast the next object or action, and decode multi-step activities, such as cooking a recipe. These capabilities are fundamental building blocks for wearable AI assistants that can understand the user’s intent and provide assistance. For the development of such wearable assistants, the kitchen offers an ideal proving ground, combining dense hand-object interactions, multi-tasking, and procedural goals. In this talk, I will trace how first-person datasets such as EPIC-KITCHENS, Ego4D, and Ego-Exo4D have turned everyday cooking into a laboratory for tasks like action recognition, action forecasting, next-active-object prediction, and task-graph recovery. I’ll highlight our lab’s contributions across those tasks and discuss how the resulting tools can enable recipe guidance, smart-kitchen assistance, and safety monitoring in the kitchen. I’ll close by outlining the open challenges in egocentric vision, including online and streaming inference.
Since 2008, our group has advanced food image recognition, and we have translated these findings into easy-to-use smartphone applications. First, we developed an on-device, low-latency food-recognition app that classifies dishes without cloud support. Building on this, a real-time food detection app estimates the calories in each dish just by holding a smartphone camera over a meal, while a 3-D volume-estimation app reconstructs food shape to refine nutritional values. We also address complex dining scenarios: an app for shared dishes like hot pots or family platters dynamically tracks individual servings, and another system simultaneously recognizes every diner’s plate around a table, creating collective food logs. This talk introduces these apps, outlines the underlying computer-vision techniques, and discusses future directions for personalized, seamless dietary management on mobile devices.
Dr. Keiji Yanai is a Professor in the Department of Informatics at the University of Electro-Communications, Tokyo. His research spans object recognition, web multimedia processing, social media mining, and food-related multimedia computing. Since 2008, he has pioneered food image recognition and released the widely used UEC-FOOD 100/256 datasets. He has served as General Co-Chair for IEEE MIPR 2021, ACM Multimedia Asia 2022, and MMM 2025, and as Technical Program Co-Chair for ACM ICMR 2018. Dr. Yanai also co-founded the MADiMa workshop series.
Stavroula Mougiakakou, Associate Professor at the Faculty of Medicine, University of Bern, holds a Ph.D. in electrical and computer engineering from the National Technical University of Athens. She leads the AI in Health and Nutrition Lab at the ARTORG Center for Biomedical Engineering Research, Bern. Her research focuses on developing and validating AI and ML approaches for analyzing multimodal health data. Her work supports prevention, personalized diagnosis, prognosis, and treatment of acute and chronic diseases. She has successfully led numerous R&D projects, resulting in various publications, patents, and technology transfer activities.
Grocery product detection and recognition pose significant challenges in computer vision due to the fine-grained nature of item classes, frequent packaging variations, and high visual similarity among products. The task was initially approached through hybrid pipelines that integrate traditional computer vision techniques with deep features extracted from convolutional neural networks, yielding interesting results. With the advent of fully deep learning-based architectures, recent approaches have demonstrated improved accuracy and generalization capabilities in complex and highly unconstrained environments. This talk examines the evolution from hybrid to end-to-end deep learning methods, highlighting the benefits of modern approaches for grocery detection and recognition tasks. The discussion covers key challenges in dataset construction, along with strategies to address them, such as data augmentation and synthetic data generation. Finally, emerging research directions are presented, focusing on the development of efficient and scalable solutions for real-world applications, including smart retail environments, automated inventory systems, and assistive technologies.
Food plays a pivotal role in human society, offering vital nutrients while anchoring social interactions and representing cultural identities. Modern technologies have enriched the social experience with food by providing a wealth of information beyond the traditional sensory aspects. Today, computer vision algorithms achieve near-perfect performance, even better than humans, when the data are clear, well curated, and sufficient in quantity. However, there remains a substantial gap when it comes to applying state-of-the-art computer vision algorithms to food data, particularly when dealing with food in its natural, uncontrolled environment, often referred to as “data in the wild.” This gap stems from the inherent challenges of the noisy, watermarked, and low-quality food data readily available on the internet, as well as food data presented in different languages and cultural contexts. Food images present unique challenges for few-shot learning models due to their visual complexity and variability. For instance, a pasta dish might appear with various garnishes, on different plates, and under diverse lighting conditions and camera perspectives. This variability causes models to lose focus on the most important elements when comparing the query with the support images, resulting in misclassification. To address this issue, we proposed Stochastic-based Patch Filtering for Few-Shot Learning (SPFF), which attends to the patch embeddings that show greater correlation with the class representation. The key concept of SPFF is the stochastic filtering of patch embeddings, where patches less similar to the class-aware embedding are more likely to be discarded. We validate our approach through extensive experiments on the few-shot classification benchmarks Food-101, VireoFood-172, and UECFood-256, outperforming existing state-of-the-art methods.
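The filtering idea described in this abstract can be illustrated with a minimal sketch. This is not the SPFF implementation: the cosine similarity, the exponential keep probability, the `temperature` parameter, and all function names below are assumptions chosen only to show how patches that are less similar to a class-aware embedding can be made more likely to be discarded. The sketch assumes a non-empty list of non-zero patch vectors.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keep_probabilities(patches, class_embedding, temperature=0.1):
    """Map each patch's similarity to a keep probability in (0, 1].

    Assumption: probabilities decay exponentially with the gap to the
    most similar patch, so the best-matching patch always gets 1.0.
    """
    sims = [cosine(p, class_embedding) for p in patches]
    top = max(sims)
    return [math.exp((s - top) / temperature) for s in sims]


def stochastic_patch_filter(patches, class_embedding, temperature=0.1, rng=None):
    """Sample which patches survive; dissimilar patches are dropped more often."""
    rng = rng or random.Random()
    probs = keep_probabilities(patches, class_embedding, temperature)
    kept = [i for i, p in enumerate(probs) if rng.random() < p]
    return kept, probs


if __name__ == "__main__":
    # Hypothetical 2-D embeddings: two patches near the class, one far from it.
    patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    kept, probs = stochastic_patch_filter(patches, [1.0, 0.0], rng=random.Random(0))
    print("kept patch indices:", kept)
    print("keep probabilities:", [round(p, 4) for p in probs])
```

Because the sampling is random, repeated calls keep different subsets of patches; a lower `temperature` makes the filter more aggressive toward dissimilar patches.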
Prof. Petia Radeva is a Full Professor at the Universitat de Barcelona (UB) and Head of the Consolidated Research Group “Artificial Intelligence and Biomedical Applications (AIBA)” at UB. Her main interests are in machine/deep learning and computer vision and their applications to health. Specific topics of interest include data-centric deep learning, uncertainty modeling, self-supervised learning, continual learning, learning with noisy labels, multi-modal learning, NeRF, food recognition, and food ontology. She has been UB’s PI in 12 European and international projects and 45+ national projects devoted to applying computer vision and machine learning to real problems such as food intake monitoring. She is Editor in Chief of the Pattern Recognition journal (Q1, IF=7.6). She was a Research Manager of the State Agency of Research (Agencia Estatal de Investigación, AEI) of the Ministry of Science and Innovation of Spain (2019-2024). She has supervised 26 PhD students and published 120+ SCI journal papers and, in total, 400+ international chapters and proceedings papers; her Google Scholar h-index is 58, with 12800+ citations. Petia Radeva is in the top 2% of the Stanford world ranking of scientists with the greatest impact in the field of ICT according to citation indicators. She was also ranked in the top 6% of the most cited Spanish and foreign female researchers from any field according to the CSIC ranking: https://lnkd.in/djx2Yz5p. Moreover, she was awarded the prestigious “Narcis Monturiol” medal in 2024, has been an IAPR Fellow since 2015, received ICREA Academia awards in 2015 and 2022 (assigned to the 30 best scientists in Catalonia for scientific merit), and has received several international and national awards (“Aurora Pons Porrata” of CIARP, the “Antonio Caparrós” Prize for the best technology transfer at UB, etc.). In 2024 she was named ICT Research Woman of the Year by the Catalan Government. She was included in the Forbes Women 2025 list as one of the most influential women in Catalonia.
Food is one of the most powerful levers for positively impacting people's health and global sustainability. However, the predominant diets in current societies worldwide fail to represent nutritionally balanced and environmentally friendly eating patterns. At the same time, health technologies have gained increasing attention given the widespread, daily use of mobile applications, which are a promising tool for promoting healthy food habits. In this context, AI can be useful in many settings to guide food choices, to monitor food consumption and food waste, and to support education processes for healthy and sustainable behaviors and habits in the general population. Several real-life settings in which AI can be applied for these purposes will be presented. Since the scientific literature indicates that our eating habits are neither healthy nor sustainable, there is a need to develop population-tailored intervention strategies to redirect food choices toward a healthier and more sustainable diet. Improving diet adequacy in the population is a necessary strategy, but it requires multidimensional systems involving policy interventions, education, and community support. The integration of AI and local food policy can lead to greater food awareness, better health, and a more sustainable food system.
Francesca Scazzina is Associate Professor of Human Nutrition at the University of Parma, where she is President of the Master’s Degree in Human Nutrition Sciences. She is a senior collaborator of the Need for Nutrition Education/Innovation Programme (NNEdPro), Global Centre for Nutrition and Health, St John’s Innovation Centre, Cambridge, UK. She is a board member of the Italian Society of Human Nutrition (SINU). Her research work started in 2005 with a main focus on complex carbohydrates, dietary fibre, prebiotics, and phytochemicals in foods, and their effects on metabolism and intestinal functions. During her Ph.D. studies, she was a visiting scientist at the School of Food Biosciences, Food Microbial Sciences Unit, University of Reading (UK). She was also involved in population surveys in the Italian section of the European Prospective Investigation into Cancer and Nutrition (EPIC). Since 2009, through her involvement in food education projects implemented in primary schools in the Parma area, she has acquired extensive experience in health education programs and surveys of children. She co-founded MADEGUS, a spin-off of the University of Parma (founded in 2013) focused on developing specific strategies and educational tools to improve nutritional knowledge, dietary habits, and lifestyle. On this topic, she has been and is involved in European Union Horizon 2020 projects and PRIMA projects. Since 2012, she has been working on the “Pappa di Parma” project, which consists of optimizing several recipes developed according to food availability, nutritional needs, technological accessibility, and sensory acceptance, in countries where chronic malnutrition is the main cause of death among children. In this framework, she has been involved in cooperation projects in African countries. Since 2016, she has been working on diet sustainability as a partner in European Union Horizon 2020 projects. In this field, she also participated in the European Food Risk Assessment (EU-FORA) Fellowship Programme (EFSA). She has been appointed by the University of Parma as contact person for the "Territorial Food Policy" project promoted by the Municipality of Parma for food security, food education, and healthy and sustainable diets. Moreover, she is the University of Parma’s reference person for Spoke 7 (Policy, behaviour, and education: smarter behaviours for healthier diets) of the ONFoods PNRR Project. In these research fields, she is co-author of several publications in international peer-reviewed journals and has participated as a speaker and as a scientific committee member in numerous national and international congresses. Moreover, on the same topics, she is a member of scientific committees of research institutions as well as public and private institutions dealing with science dissemination.
The increasing complexity of global food supply chains has raised the demand for rapid, reliable, and transparent food authentication solutions. Multispectral imaging (MSI) has emerged as a powerful tool to detect food fraud, adulteration, and traceability issues by capturing spectral features beyond human vision. However, the use of artificial intelligence (AI) to interpret MSI data introduces new challenges regarding transparency and trust. This talk explores how Explainable AI (XAI) techniques can be integrated with MSI to build interpretable and trustworthy food authentication systems, with a focus on industrial applicability and regulatory compliance.
Sylvio Barbon Junior is an Associate Professor in the Department of Engineering and Computer Science at the University of Trieste and Head of the Machine Learning Laboratory. His research interests include process mining, artificial intelligence applied to food science, explainable AI (XAI), and intelligent systems for quality control and automation. He is actively involved in the industrial PhD program "Applied Data Science and Artificial Intelligence" (ADSAI), developed in collaboration with companies in the agri-food sector across Brazil and Europe. https://www.linkedin.com/in/barbon/