Speakers

A/Prof. Wanli Ouyang

A/Prof. Wanli Ouyang received his PhD degree from the Department of Electronic Engineering, Chinese University of Hong Kong. Since 2017, he has been an associate professor at the University of Sydney. His research interests include deep learning and its applications to computer vision, pattern recognition, and image and video processing.


Leveraging Rich Information for Detecting Hard Examples

Hard examples, e.g. small objects, occluded objects, and objects with abnormal deformation, are objects that are difficult for deep models to detect. This talk will introduce our explorations in learning deep models that detect such hard examples, e.g. leveraging context [1], designing better deep models [2], relying on better supervision [3-5], and designing better loss functions [6].

[1] X. Zeng, W. Ouyang, J. Yan, et al., "Crafting GBD-Net for Object Detection", IEEE Trans. Pattern Anal. Mach. Intell. (PAMI), 2017.

[2] W. Ouyang, K. Wang, X. Zhu, et al., "Chained Cascade Network for Object Detection", Proc. ICCV, 2017.

[3] X. Ma, Z. Wang, H. Li, P. Zhang, W. Ouyang, X. Fan, "Accurate Monocular Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving", Proc. ICCV, 2019.

[4] X. Ma, S. Liu, Z. Xia, H. Zhang, X. Zeng, W. Ouyang, "Rethinking Pseudo-LiDAR Representation", Proc. ECCV, 2020.

[5] X. Ma, Y. Zhang, D. Xu, D. Zhou, S. Yi, H. Li, W. Ouyang, "Delving into Localization Errors for Monocular 3D Object Detection", Proc. CVPR, 2021.

[6] Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, J. Yan, W. Ouyang, "Geometry Uncertainty Projection Network for Monocular 3D Object Detection", Proc. ICCV, 2021.

Dr. Emre Akbaş

Dr. Emre Akbaş is an assistant professor at the Department of Computer Engineering, Middle East Technical University (METU). Prior to joining METU, he was a postdoctoral research associate at the Department of Psychological and Brain Sciences, University of California, Santa Barbara. He received his PhD degree from the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. His BS and MS degrees are both from the Department of Computer Engineering, METU. Dr. Akbaş's research received the Beckman Institute's Cognitive Science AI Award, the "METU thesis of the year" award (three times), the Parlar Foundation Research Incentive Award, and the Young Investigator Award of the Science Academy, Turkey. His research interests are in computer vision and deep learning, with a focus on object detection and human pose estimation.

Imbalance in object detection and ranking-based loss functions to the rescue

Object detection is a needle-in-a-haystack problem. While there are usually just a few foreground object instances that you need to detect, the number of background (negative) instances can easily go up to millions, which creates an extreme foreground-background imbalance. Researchers have tackled this problem with a cascaded training approach, widely known as two-stage or multi-stage training, as represented by Faster R-CNN and its successors. In this approach, a small number of object candidates, out of all possible instances, are determined first, and this small set is then further refined through additional classification layers. Another approach to foreground-background imbalance is to weight examples during training by their difficulty, which achieves an effect similar to cascading in a soft-sampling manner. This approach is best exemplified by Focal Loss and RetinaNet. A recent and robust alternative to these traditional methods is the ranking-based approach (e.g. AP Loss, aLRP Loss, RS Loss), where foreground examples are forced to rank above background examples during training. These novel ranking-based loss functions are robust to extreme imbalance and require little to no tuning. They can easily be integrated into existing object detectors and yield consistent improvements on widely used object detection and instance segmentation benchmarks. In this talk, after presenting several different imbalance problems in object detection and the traditional methods used to tackle them, I will introduce ranking-based losses. This talk will essentially cover parts of the following papers:

- K. Oksuz et al. "Imbalance problems in object detection: A review", IEEE TPAMI, 2020.

- K. Chen et al. "AP-loss for accurate one-stage object detection", IEEE TPAMI, 2020.

- K. Oksuz et al. "A ranking-based, balanced loss function unifying classification and localisation in object detection", NeurIPS 2020.

- K. Oksuz et al. "Rank & sort loss for object detection and instance segmentation", ICCV 2021.

- K. Oksuz et al. "One metric to measure them all: Localisation recall precision (LRP) for evaluating visual detection tasks", IEEE TPAMI, 2021.
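To make the two ideas in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementations: a focal-loss-style soft weighting of easy vs. hard examples, and a simplified pairwise hinge ranking loss that stands in for the ranking-based objectives (the actual AP/aLRP/RS losses use error-driven updates because ranking metrics are non-differentiable). Function names and the margin parameter are illustrative choices, not from the papers.

```python
import numpy as np

def focal_loss(scores, labels, gamma=2.0):
    """Soft-sampling view of imbalance: down-weight easy examples.
    `scores` are per-example foreground probabilities in (0, 1);
    `labels` are 1 for foreground, 0 for background."""
    # p_t is the probability assigned to the true class
    p_t = np.where(labels == 1, scores, 1.0 - scores)
    # the (1 - p_t)^gamma factor shrinks the loss of well-classified examples
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + 1e-12)))

def pairwise_ranking_loss(scores, labels, margin=0.0):
    """Ranking view of imbalance (simplified stand-in for AP/RS-style losses):
    penalize every (foreground, background) pair in which the background
    score is not below the foreground score by at least `margin`."""
    fg = scores[labels == 1]
    bg = scores[labels == 0]
    # score differences for all fg/bg pairs, shape (n_fg, n_bg)
    diffs = bg[None, :] - fg[:, None] + margin
    return float(np.mean(np.maximum(0.0, diffs)))

# Toy batch: two foregrounds, two backgrounds, correctly ranked.
scores = np.array([0.9, 0.8, 0.2, 0.1])
labels = np.array([1, 1, 0, 0])
print(focal_loss(scores, labels))             # small: all examples are easy
print(pairwise_ranking_loss(scores, labels))  # 0.0: every fg outranks every bg
```

Note that the ranking loss only depends on the relative ordering of the pair scores, which is why such objectives remain well-behaved even when backgrounds vastly outnumber foregrounds: the loss is averaged over pairs, not dominated by a flood of easy negatives.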