1st Workshop on Traditional Computer Vision in the Age of Deep Learning (TradiCV)

in conjunction with ICCV 2021, October 16th

Contacts: tradicv-iccv2021@googlegroups.com

NEWS:
* IJCV special issue is out: https://link.springer.com/journal/11263/topicalCollection/AC_e86ef73ac48d2cfba531f759ffe99431


Detailed Program (all times EDT)

1.00pm - Introduction by the organizers
1.10pm - Invited talk #1 by Venu Madhav Govindu
1.45pm - Invited talk #2 by Kathlén Kohn
2.20pm - Spotlight Session #1
3.00pm - Coffee break
3.10pm - Spotlight Session #2
3.50pm - Invited talk #3 by Hongdong Li
4.25pm - Invited talk #4 by David Suter
5.00pm - Closing remarks

Invited speakers

Hongdong Li

Australian National University

Title: Multi-View 3D Reconstruction for Object with Unknown Material: Traditional approach versus deep-learning method

Recovering the 3D shape of an object with unknown (possibly non-Lambertian) material is a traditional problem in computer vision, and is rather challenging to solve. In this talk, I will report two of our most recent methods for tackling this classic computer vision problem. Specifically, our first method follows the traditional approach, which formulates the problem as a complex energy minimization. I will explain how we manage to solve the underlying non-convex optimization via parameter splitting and an efficient randomized search algorithm. Our second method is a new deep-learning approach built on a small, simple recurrent neural network, which is used to over-parameterize the unknown shape and materials to allow for efficient loss minimization. Both methods are tested on synthetic and real images to validate their efficacy and superior performance.

Biography

Dr Hongdong Li is a level-E professor at the Australian National University (ANU). He is a founding Chief Investigator for the Australia ARC Centre of Excellence for Robotic Vision. His research interests include computer vision and machine learning, and his research addresses both fundamental/theoretical and applied problems in computer vision. He is best known for contributions to non-rigid structure from motion, point cloud registration, camera pose and robot navigation (autonomous driving), and 3D computer vision in general. He has won several prestigious paper awards at major international conferences, including the CVPR Best Paper Award and the Marr Prize (Honourable Mention) in 2017, the IEEE ICIP Best Student Paper Award, and the IEEE ICPR Best Biometrics Paper Award. One of his recent works on visual sign language processing (machine translation) was selected among the finalists for the CVPR 2020 Best Paper Award. Professor Li has served on the editorial boards of several major international journals, including the IEEE Transactions on PAMI. He has been an Area Chair for IEEE CVPR, ICCV, and ECCV, a Co-Program Chair for ACCV 2018, and a General Chair for ACCV 2022. During 2008-2010 he was seconded to NICTA to work on the 'Australia Bionic Eyes' project, whose aim is to create an implantable artificial retina to restore vision for visually impaired patients. He created the ANU Master of Machine Learning and Computer Vision Program and served as its first Program Convenor. To date he has supervised or co-supervised over 25 PhD students in the area of computer vision and artificial intelligence. His research projects are funded in part by the ARC (Australian Research Council), CSIRO, and DSTG, as well as by major global technology firms including Microsoft Research, Ford, Baidu, General Motors, Toshiba, OPPO, Optus, and NVIDIA.

Venu Madhav Govindu

Indian Institute of Science

Title: In Defense of Geometric Estimation

Most computer vision problems can be classified as either perceptual or engineering tasks, i.e. involving tacit or explicit models respectively. In many instances, we move from an intuitive understanding of perceptual tasks to complex, engineered pipelines built around explicit models. For instance, while we can speak of the perception of depth from a single image, the problem of large-scale 3D reconstruction is distinctly different. While learning has been successful in addressing problem instances of the tacit variety, arguably we should not jettison our knowledge and understanding of explicit models and their associated methods, especially when accuracy is of importance. In this talk I will argue that, where available, explicit methods offer a richer understanding than merely the ability to provide a reasonable solution. I will illustrate this argument by considering some geometric problems in 3D reconstruction. I will also suggest that learning models can help address ambiguities in such geometric formulations.

Biography

Venu Madhav Govindu is an Associate Professor at the Department of Electrical Engineering, Indian Institute of Science, Bengaluru, India. His primary research interests are in geometric problems in large-scale 3D reconstruction. He also has an interest in modern Indian history, especially the life and work of Mahatma Gandhi.

Kathlén Kohn

KTH Stockholm

Title: Classifying Minimal Problems

Minimal problems are 3D reconstruction problems recovering camera poses or world coordinates (or both) from given images such that random input instances have a finite positive number of solutions. They play an important role in structure from motion, image matching, visual odometry, and visual localization. Many minimal problems have been described and solved, and new minimal problems are constantly appearing. In this talk, we present an overview of how to determine *all* minimal problems in a given setting. Here, a setting refers to a choice of camera model, the type of 3D objects present, assumptions on the visibility, a choice of what to recover (camera poses, the 3D scene, or both), etc. The techniques to find all minimal problems come from algebraic geometry. We also describe symbolic and numerical methods to compute the algebraic degree of a minimal problem (i.e., its number of complex solutions on random input instances). The algebraic degree is an important invariant since it measures the intrinsic difficulty of the minimal problem. As an example, we show that there are exactly 30 minimal problems that recover both camera poses and world coordinates for generic arrangements of points and lines completely observed by calibrated perspective cameras. Moreover, we discuss extensions of this result to partial visibility and rolling shutter cameras. This talk is based on joint work with Timothy Duff (Georgia Tech Atlanta), Marvin Hahn (MPI MiS Leipzig), Anton Leykin (Georgia Tech Atlanta), Orlando Marigliano (KTH Stockholm), and Tomas Pajdla (CIIRC, CTU in Prague).

Biography

Kathlén Kohn has been an assistant professor in Mathematics of Data and AI at KTH Royal Institute of Technology since September 2019. She obtained her Ph.D. from the Technical University of Berlin in 2018. Afterward, she was a postdoctoral researcher at the Institute for Computational and Experimental Research in Mathematics (ICERM) at Brown University and at the University of Oslo. Kathlén’s goal is to understand the intrinsic geometric structures within machine learning, computer vision, and AI systems in general and to provide a rigorous and well-understood theory explaining them. Her areas of expertise are algebraic, differential, and tropical geometry as well as invariant theory. Kathlén believes in the importance of the interaction between different scientific fields. She enjoys collaborating with scientists within and outside of mathematics to tackle applied problems and to discover interesting questions for pure mathematics motivated by applications. At the IEEE International Conference on Computer Vision (ICCV) 2019, her joint work with Tomas Pajdla (CTU Prague), Anton Leykin and Timothy Duff (both at Georgia Tech) received the Best Student Paper Award. The best papers, including the Best Student Paper, were selected by the Award Committee as the top 4 papers from all 4303 submissions.

David Suter

Edith Cowan University

Title: If (deep) learning is the methodology of choice – what do we miss?

Deep learning is now the dominant “paradigm” for computer vision. Its original successes were in problems/areas lacking good mathematical models (what is a cat? What is the structure of an image of a cat?); but it has now impinged heavily even on areas where we thought we had good models (e.g., structure from motion). Of course, a paradigm doesn’t become dominant and successful, at least in an applied or engineering area, unless it successfully produces results. Moreover, if deep learning (and other learning-based) approaches continue to deliver, should we seek anything more? What do we miss if we do everything by deep learning? Well, for one thing, we don’t produce a thorough (some might say meaningful, insightful) understanding of the problem we are tackling. To hazard a sweeping generalisation: one might say that deep learning is characterised by “plumbing” (selecting successful modules for which we only have a vague characterisation of “what they do, and how they do it”), and dreaming up ways to tweak these modules or to interconnect them in useful ways. Gone is the careful problem definition, and the analysis of that problem (and how it might relate to other problems or a useful general framework). On the other hand, “traditional model based” computer vision at least affords the opportunity to discover surprising connections between problems. I will illustrate this for one area I know reasonably well: robust fitting, and more specifically robust fitting using the maximum consensus criterion, a.k.a. MaxCon. This is a problem that has been studied in computer vision for 40 years; and yet only recently has it been shown to connect to a significant body of mathematical theory/structure. It is also linked in surprising ways to some standard “prototypical” computer science problems.

Biography

David Suter is a Research Professor at Edith Cowan University (Perth, Western Australia). Prior to that, he held a Professorship at the University of Adelaide (South Australia) from 2008 to 2017 (including a term as Head of School of Computer Science); a Professorship (and prior to that, positions as Senior Lecturer and Associate Professor) at Monash University (Melbourne, Australia) from 1992 to 2008; and a Lectureship at La Trobe University (Melbourne, Australia) from 1988 to 1992. He was awarded a PhD in Computer Vision from La Trobe University (1991) and a BSc (applied maths and physics) from The Flinders University of South Australia (1977). He has served on the editorial boards of the following journals: Int. J. Computer Vision, Pattern Recognition, IPSJ Trans. Computer Vision and Application, J. Mathematical Imaging and Vision, and Machine Vision and Applications. He has also served as General Chair of major conferences (ACCV and ICIP), and on the Australian Research Council College of Experts. His research interests have included motion estimation, robust fitting, tracking, segmentation, medical image and signal analysis, and topics in machine learning and image processing.

List of Accepted Papers (proceedings)

A Robust End-to-end Method for Parametric Curve Tracing via Soft Cosine-similarity-based Objective Function
Boran Han (Shell); Jeremy Vila (Shell)
A Technical Survey and Evaluation of Traditional Point Cloud Clustering Methods for LiDAR Panoptic Segmentation
Yiming Zhao (Worcester Polytechnic Institute); Xiao Zhang (Worcester Polytechnic Institute); Xinming Huang (Worcester Polytechnic Institute)
Finite Aperture Stereo: 3D Reconstruction of Macro-Scale Scenes
Matthew J Bailey (University of Surrey); Adrian Hilton (University of Surrey); Jean-Yves Guillemaut (University of Surrey)
Robust face frontalization for visual speech recognition
Zhiqi Kang (INRIA); Radu Horaud (Inria); Mostafa Sadeghi (INRIA); Jacob Donley (Facebook); Anurag Kumar (Facebook Research)
Object Detection in Cluttered Environments with Sparse Keypoint Selection
Viktor Seib (University of Koblenz-Landau); Dietrich Paulus (n/a)
Effect of Parameter Optimization on Classical and Learning-based Image Matching Methods
Ufuk Efe (Middle East Technical University); Kutalmis Ince (Middle East Technical University); Aydin Alatan (Middle East Technical University, Turkey)
Building 3D Morphable Models from a Single Scan
Skylar Sutherland (Massachusetts Institute of Technology); Bernhard Egger (Massachusetts Institute of Technology); Joshua Tenenbaum (MIT)
CAFT: Class Aware Frequency Transform for Reducing Domain Gap
Vikash Kumar (Indian Institute of Science); Sarthak Srivastava (Birla Institute of Technology and Sciences - Pilani); Rohit Lal (Indian Institute of Science); Anirban Chakraborty (Indian Institute of Science)
Adapting Deep Neural Networks for Pedestrian-Detection to Low-Light Conditions without Re-training
Vedant K Shah (BITS Pilani, KK Birla Goa Campus); Anmol Agarwal (BITS Pilani K.K. Birla Goa Campus); Tanmay Verlekar (BITS Pilani, KK Birla Goa Campus); Raghavendra Singh (Oyla-Inc)
Towards realistic symmetry-based completion of previously unseen point clouds
Taras Rumezhak (Ukrainian Catholic University); Oles Dobosevych (Ukrainian Catholic University); Rostyslav Hryniv (Ukrainian Catholic University); Vladyslav Selotkin (SoftServe); Volodymyr Karpiv (SoftServe); Mykola Maksymenko (SoftServe Inc.)
A closed form solution for viewing graph construction in uncalibrated vision
Carlo Colombo (DINFO, University of Florence); Marco Fanfani (DINFO)
DC-VINS: Dynamic Camera Visual Inertial Navigation System with Online Calibration
Jason J Rebello (University of Toronto); Chunshang Li (University of Toronto Institute for Aerospace Studies); Steven L Waslander (University of Toronto)
Absolute and Relative Pose Estimation in Refractive Multi View
Xiao HU (Technical University of Denmark); François Lauze (University of Copenhagen, Kopenhagen); Kim Steenstrup Pedersen (University of Copenhagen); Jean Mélou (IRIT)

List of Accepted Papers (presentations)

A Sparse and Locally Coherent Morphable Face Model for Dense Semantic Correspondence Across Heterogeneous 3D Faces
Claudio Ferrari (University of Florence); Stefano Berretti (University of Florence, Italy); Pietro Pala (University of Florence); Alberto Del Bimbo (University of Florence)