The 5th IEEE/CVF CVPR Precognition Workshop
Vancouver, Canada
June 18th, 2023
Precognition: Seeing through the Future
in conjunction with CVPR 2023
Vancouver, June 18th - 22nd, 2023
Topics of the Workshop
Vision-based detection and recognition have recently achieved highly accurate performance, bridging the gap between research and real-world applications. Beyond these well-explored capabilities of modern algorithms, vision-based forecasting is likely to be one of the next big research topics in computer vision. Prediction is a critical human capability, and success in automatic vision-based forecasting would unlock human-like capabilities in machines and robots.
One important application is autonomous driving, where vision-based understanding of a traffic scene and prediction of the movement of traffic actors is a critical piece of the autonomy puzzle. Sensors such as cameras and lidar serve as the "eyes" of a vehicle, and advanced vision-based algorithms are required to enable safe and effective driving. Another area where vision-based prediction is used is the medical domain, where it enables deep understanding and prediction of a patient's future medical condition. However, despite its potential and relevance for real-world applications, visual forecasting, or precognition, has not been the focus of new theoretical studies and practical applications to the same extent as detection and recognition problems.
Through the organization of this workshop we aim to facilitate further discussion and interest within the research community regarding this nascent topic. The workshop will cover recent approaches and research trends not only in anticipating human behavior from videos but also in precognition across many other visual applications, such as medical imaging, healthcare, human face aging prediction, early event prediction, and forecasting for autonomous driving.
In this workshop, the topics of interest include, but are not limited to:
Early event prediction
Activity and trajectory forecasting
Multi-agent forecasting
Human behavior and pose prediction
Human face aging prediction
Predicting frames and features in videos and other sensor data in autonomous driving
Traffic congestion anomaly prediction
Automated COVID-19 prediction in medical imaging
Visual DeepFake prediction
Short- and long-term prediction and diagnoses in medical imaging
Prediction of agricultural parameters from satellite, drone, and ground imagery
Databases, evaluation, and benchmarking in precognition
This is the fifth Precognition workshop organized at CVPR. It follows the highly successful workshops held annually since 2019, all of which featured talks from researchers across a number of industries, insightful presentations, and strong attendance. For full programs, slides, posters, and other resources, please visit the 2019, 2020, 2021, and 2022 workshop websites.
Important Dates (anywhere on Earth)
Paper submission deadline: March 19, 2023
Notification to authors: April 4, 2023
Camera-ready deadline: April 13, 2023
Video presentation submission: May 26, 2023
Workshop: June 18, 2023
Invited Speakers
Varun Ramakrishna
Director, Perception at Aurora

Weisong Shi
Professor and Chair, IEEE Fellow
Dept. of Computer and Information Sciences
University of Delaware

Juan Carlos Niebles
Research Director, Salesforce Research
Co-Director, Stanford Vision and Learning Lab
Adjunct Professor, CS Dept., Stanford University

Yunzhu Li
Post-doctoral Scholar at Stanford Vision and Learning Lab (SVL)
(Incoming) Assistant Professor, CS Dept., University of Illinois Urbana-Champaign (UIUC)
Program (all times are Pacific Time)
Location: Vancouver Convention Center, West 207
Time: 1:00 PM–5:00 PM, June 18th
11:30AM - Poster session (spots #12-#22 in West Exhibit Hall A), including all full papers listed in the main program as well as the following extended abstracts:
“WalkingDynamicsH36M: a Benchmarking Dataset for Long-term Motion and Trajectory Forecasting”, Cecilia Curreli (Technical University of Munich / National Institute of Informatics Japan), Andreu Girbau (National Institute of Informatics), Shin'ichi Satoh (National Institute of Informatics) [paper] [video] [poster]
“Low-latency Event-based Object detection with Asynchronous Graph Neural Networks”, Daniel Gehrig (University of Zurich & ETH Zurich), Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland) [paper] [video] [poster]
“Latency Matters: Real-Time Action Forecasting Transformer”, Harshayu Girase (University of California, Berkeley), Nakul Agarwal (Honda Research Institute USA), Chiho Choi (Samsung Semiconductor US), Karttikeya Mangalam (UC Berkeley) [paper] [video] [poster]
1:00PM - Main program kick-off
1:10PM - Invited talk: Yunzhu Li, "Learning Structured World Models From and For Physical Interactions"
Abstract: Humans have a strong intuitive understanding of the physical world. Through observations and interactions with the environment, we build a mental model that predicts how the world would change if we applied a specific action (i.e., intuitive physics). My research draws on insights from humans and develops model-based reinforcement learning (RL) agents that learn from their interactions and build predictive models of the environment that generalize widely across a range of objects made with different materials. The core idea behind my research is to introduce novel representations and integrate structural priors into the learning systems to model the dynamics at different levels of abstraction. I will discuss how such structures can make model-based planning algorithms more effective and help robots to accomplish complicated manipulation tasks (e.g., manipulating an object pile, shaping deformable foam into a target configuration, and making a dumpling from the dough using various tools).
1:45PM - Lightning talks (full papers)
“A Unified Model for Continuous Conditional Video Prediction”, Xi Ye (Polytechnique Montreal), Guillaume-Alexandre Bilodeau (Polytechnique Montréal) [open access] [video] [poster]
“Best Practices for 2-Body Pose Forecasting”, Muhammad Rameez Ur Rahman (Sapienza University of Rome), Luca Scofano (Sapienza University of Rome), Edoardo De Matteis (Sapienza University of Rome), Alessandro Flaborea (Sapienza University of Rome), Alessio Sampieri (Sapienza University of Rome), Fabio Galasso (Sapienza University of Rome) [open access] [video] [poster] - BEST PAPER AWARD
“3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes”, Haotian Xue (Georgia Tech), Antonio Torralba (MIT), Joshua Tenenbaum (MIT), Daniel Yamins (Stanford University), Yunzhu Li (Stanford University & University of Illinois at Urbana-Champaign), Hsiao-Yu Tung (Carnegie Mellon University) [open access] [video] [poster]
2:15PM - Invited talk: Varun Ramakrishna, "Perception for Autonomous Trucking: Lessons from the Trenches"
Abstract: Autonomous trucks have the potential to fundamentally change modern logistics and supply chains while improving road safety. In this talk we outline some of the technical challenges of building a perception system for this problem and walk through some of the lessons learned while building this technology to operate in the real world.
2:50PM - Lightning talks (full papers)
“StillFast: An End-to-End Approach for Short-Term Object Interaction Anticipation”, Francesco Ragusa (University of Catania), Giovanni Maria Farinella (University of Catania), Antonino Furnari (University of Catania) [open access] [video] [poster]
“Bush Detection for Vision-based UGV Guidance in Blueberry Orchards: Data Set and Methods”, Vladan Filipovic (BioSense Institute), Dimitrije Stefanovic (BioSense Institute), Nina Pajevic (BioSense Institute), Zeljana Grbovic (BioSense Institute), Nemanja Djuric (BioSense Institute), Marko Panic (BioSense Institute) [open access] [video] [poster]
“DPOSE: Online Keypoint-CAM Guided Inference for Driver Pose Estimation with GMM-based Balanced Sampling”, Yuyu Guo (Alibaba), Yancheng Bai (Alibaba), Daiqi Shi (Alibaba), Yang Cai (Alibaba), Wei Bian (Alibaba) [open access] [video] [poster]
3:20PM - Invited talk: Weisong Shi, "Vehicle Computing: Vision and Challenges"
Abstract: Vehicles have primarily been used for transportation over the last century. With the proliferation of onboard computing and communication capabilities, we envision that future connected vehicles (CVs) will serve as mobile computing platforms in addition to their conventional transportation role in the next century. In this talk, we present the vision of Vehicle Computing: CVs are ideal computation platforms, and connected devices/things with limited computation capacity can rely on surrounding CVs to perform complex computational tasks. We also discuss Vehicle Computing from several aspects, including key enabling technologies, case studies, open challenges, and potential business models.
3:55PM - Lightning talks (full papers)
“CIPF: Crossing Intention Prediction Network based on Feature Fusion Modules for Improving Pedestrian Safety”, Je-Seok Ham (Electronics and Telecommunications Research Institute), Dae Hoe Kim (Electronics and Telecommunications Research Institute), NamKyo Jung (Korea University), Jinyoung Moon (Electronics and Telecommunications Research Institute) [open access] [video] [poster]
“DNA: Deformable Neural Articulations Network for Template-free Dynamic 3D Human Reconstruction from Monocular RGB-D Video”, Khoa Vo (University of Arkansas), Trong Thang Pham (University of Arkansas), Kashu Yamazaki (University of Arkansas), Minh Tran (University of Arkansas), Ngan Le (University of Arkansas) [open access] [video] [poster]
4:15PM - Invited talk: Juan Carlos Niebles, "Procedural Knowledge and Instructional Video Understanding"
Abstract: Assistive technologies of the future will benefit from a detailed understanding of events and actions in the environment, so that they can proactively assist users in a contextualized manner. In particular, when people are performing tasks or learning how to perform a task, it will be important for these systems to exploit knowledge about the task such as the encompassing steps, objects, tools, and other procedural information. In this talk, I’ll discuss some of our efforts around understanding instructional videos, which include extracting and utilizing procedural knowledge to enable models to recognize tasks, perceive the state of the process, and predict the potential next steps given a partially executed task.
4:50PM - Workshop wrap-up
5:00PM - End of workshop
Submission Instructions
All submissions will be assessed on novelty, technical quality, potential impact, insightfulness, depth, clarity, and reproducibility. For each accepted submission, at least one author must attend the workshop and present the paper. Information about formatting and style files is available here. There are two ways to contribute submissions to the workshop:
Extended abstract submissions are single-blind peer-reviewed; author names and affiliations should be listed. Extended abstracts are limited to a total of four pages (including references). Extended abstracts of already published work may also be submitted. Accepted abstracts will not be included in the printed proceedings of the workshop.
Full paper submissions are double-blind peer-reviewed. Submissions are limited to eight pages, including figures and tables, in the CVPR style; additional pages containing only cited references are allowed. Accepted papers will be presented in an oral session, and all accepted full papers will be published in the CVPR workshop proceedings.
Submission website: https://cmt3.research.microsoft.com/Precognition2023
Organizers
For questions please contact the organizers at precognition.organizers@gmail.com.
Program Committee
Apoorv Singh (Motional)
Abhishek Mohta (Aurora)
Boris Ivanovic (Nvidia Research)
Fang-Chieh Chou (DoorDash Labs)
Henggang Cui (Motional)
Joshua Manela (Waymo)
Kha Gia Quach (PDActive)
Li Liu (HKUST)
Meng Fan (Aurora)
Mohana Moorthy
Sebastian Lopez-Cot (Aurora)
Shivam Gautam (Ford)
Shreyash Pandey (Apple)
Vladan Radosavljevic (Spotify)
Yan Xu (CMU)
Zhaoen Su (Meta)
Past Workshops
2022: The workshop was held virtually for the paper presentations, posters, and talks. Google generously sponsored the Best Paper award.
2021: The workshop was held virtually for the paper presentations, posters, and talks. Google generously sponsored the Best Paper award.
2020: The workshop was held virtually for the paper presentations, posters, and talks. Uber ATG generously sponsored the Best Paper and Best Student Paper awards.
2019: About 300 attendees joined the paper presentations, posters, and talks. Uber ATG generously sponsored the Best Paper and Best Student Paper awards.