The First MFI 2022 Workshop on

Neuromorphic Event Sensor Fusion Algorithms and their Applications

Workshop date: 1300-1700 UK summer time (GMT+1) on Thursday 22 Sept 2022, online (see MFI 2022 on IEEE Xplore for proceedings)

Many thanks to all the speakers and participants!

See the YouTube workshop playlist or talks below for videos of the talks and final Jeopardy quiz game.

Reason for the workshop

Event-based sensors such as silicon retinas and cochleas have attracted attention from the robotics, IoT, and TinyML communities. Event cameras have the potential to improve current visual odometry and mapping thanks to their low latency, high dynamic range, and high time resolution, and cochleas can provide always-on activity-driven audio inference at low average power consumption.

Fusing data from event sensors and conventional sensors can provide advantages of both modalities. For example, fusing event cameras and IMUs can stabilize visual input while providing continuous IMU self-calibration. Similarly, it is possible to fuse vision and auditory sensors, e.g. for lip reading and auditory-visual scene classification.

The CVPR2021 Workshop for Event-Based Vision and the CapoCaccia and Telluride neuromorphic workshops cover work on event sensors, and a few papers listed on the Event-Based Vision Resources site mention fusion.

However, no workshop has focused on sensor fusion for event sensors. It is necessary to have a wider discussion on this topic to facilitate its use in robotics, human interaction, and IoT.

Fusing retina and cochlea output resolves ambiguity (Kiselev, Neil and Liu, ISCAS 2016)

Recorded talks

Prof. Tobi Delbruck (Sensors Group, UZH-ETH Zurich) (scholar link)

(presented the day before the workshop in MFI 2022)

Neuromorphic Event Sensor Fusion: Neuromorphic event sensors such as Dynamic Vision and Audio Sensors mimic biology’s eyes and ears. They output sparse, quick events rather than regular Nyquist samples, enabling systems that can respond quickly at low average power consumption so that they can beat the usual power-latency tradeoff of frame-based perception. How can this event data be fused, either with conventional sensors or with other event sensors? This talk reviewed progress in this interesting area of multisensor fusion as an introduction to the workshop talks that followed.
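As a rough illustration of how event output differs from regular sampling, the sketch below models a single idealized change-detecting pixel: it emits an ON or OFF event only when log intensity has moved by a fixed contrast step since the last event. This is a minimal sketch and not material from the talk; the function name `dvs_pixel_events` and the threshold value are assumptions chosen for illustration, and real sensors add noise, refractory periods, and per-pixel mismatch.

```python
import math


def dvs_pixel_events(samples, threshold=0.2):
    """Return (timestamp, polarity) events for one idealized pixel.

    samples: iterable of (timestamp, intensity) pairs, intensity > 0.
    threshold: log-intensity contrast step (assumed value, illustration only).
    """
    events = []
    ref = None  # log intensity memorized at the last emitted event
    for t, intensity in samples:
        log_i = math.log(intensity)
        if ref is None:
            ref = log_i  # first sample only initializes the reference
            continue
        # Emit one ON event per threshold step of brightening.
        while log_i - ref >= threshold:
            ref += threshold
            events.append((t, +1))
        # Emit one OFF event per threshold step of darkening.
        while ref - log_i >= threshold:
            ref -= threshold
            events.append((t, -1))
    return events


if __name__ == "__main__":
    # A static scene produces no output; a brightness step produces a burst.
    print(dvs_pixel_events([(0, 1.0), (1, 1.0), (2, 1.0)]))
    print(dvs_pixel_events([(0, 1.0), (1, 2.0)]))
```

An unchanging input yields an empty list, which is the sparsity property the abstract refers to: output activity follows scene activity rather than a fixed sample clock.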

Prof. Boxin Shi (Peking University) (scholar link)

NeurImg: Hybrid Imaging Fusing Neuromorphic and Conventional Cameras: I will take the event camera as an example to introduce a hybrid imaging framework that fuses neuromorphic and conventional cameras from two complementary perspectives. On one hand, conventional images can be used to increase the robustness of neuromorphic events for denoising and super resolution, and their data association problem will be discussed as well. On the other hand, event signals can be employed to guide super resolution and deblurring, and I will also show how effectively they remove rolling shutter effects from a conventional sensor.

Prof. Laurent Kneip (ShanghaiTech University) (scholar link)

Geometric Vision with Event Cameras: Traditional visual SLAM has reached a level of maturity that enables cost-effective and robust real-time mapping and localization on novel hardware devices such as autonomous aerial vehicles and XR headsets. However, common approaches still suffer from robustness issues in highly dynamic scenes or challenging illumination conditions. Event cameras, in turn, are an interesting complementary sensor with high dynamic range and low latency. In this talk, Prof. Kneip will introduce recent works of the Mobile Perception Lab at ShanghaiTech, in which traditional geometric methods are adapted to the novel case of event data, and event-inclusive solutions to a few important practical problems are presented.

Accepted papers

Event-based Driver Distraction Detection and Action Recognition, (lab link)
Yang, Chu*; Chen, Guang; Liu, Peigen; Liu, Zhengfa; Wu, Ya; Knoll, Alois C. (Paper 69)

Global-local Feature Aggregation for Event-based Object Detection on EventKITTI, (lab link)
Liang, Zichen; Cao, Hu; Yang, Chu; Zhang, Zikai; Chen, Guang* (Paper 71)

Enhancing Event-based Structured Light Imaging with a Single Frame,
Wang, Huijiao*; Liu, Tangbo; He, Chu; Li, Cheng; Liu, Jianzhuang; Yu, Lei (Paper 72)

Prof. Shih-Chii Liu (UZH-ETH Zurich) (scholar link)

Fusing Cochlea and Retina Events in a Deep Belief Network: This talk will summarize the highly cited 2013 paper "Real-time classification and sensor fusion with a spiking deep belief network", where DVS and DAS silicon retina and cochlea outputs were fused in a spiking DBN to improve recognition of ambiguous MNIST digits. The talk will also describe the implementation on the Minitaur FPGA accelerator, illustrated at top left here.

Prof. Guillermo Gallego (TU Berlin) (scholar link)

Event-based stereo 3D reconstruction for SLAM: Most stereo methods exploit event simultaneity across cameras to establish matches and estimate depth. Instead, we estimate depth by fusing Disparity Space Images originating from efficient monocular methods. We develop fusion theory and design state-of-the-art multi-camera 3D reconstruction algorithms. (PDF of Guillermo's slides).

Dr. Cornelia Fermüller and PhD student Levi Burner (University of Maryland, College Park) (scholar link)

The EVIMO Dataset and Applications to Motion Segmentation: This talk will introduce EVIMO and EVIMO2, a collection of indoor datasets for Structure-from-Motion tasks gathered with multiple event-based sensors and classic video, with the ground truth obtained from a motion capture system and depth scans. Our resource also provides a toolkit for fusing the different kinds of data to automatically generate annotations for motion, depth, and scene segmentation. Finally, we demonstrate this resource in learning-based motion segmentation algorithms.

Prof. Keigo Hirakawa (University of Dayton, Ohio) (scholar link)

Joint APS-DVS Sensor Optical Flow: The classical optical flow equation describes the relation between the object velocity and the spatial-temporal derivative of the pixels. We propose a novel optical flow method, which we refer to as DAVIS-OF, that takes advantage of the spatial fidelity of APS sensors and the temporal resolution of DVS sensors to improve pixel-level velocity estimation. The DAVIS-OF method yields reliable motion vector estimates while overcoming the fast motion and occlusion problems.
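For reference, the classical optical flow (brightness constancy) constraint mentioned in this abstract is commonly written as below, where I(x, y, t) is the pixel intensity and (u, v) is the image-plane velocity. The notation is the standard textbook form and is an assumption here; it is not taken from the talk itself.

```latex
\frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} = 0
```

This is a single equation in two unknowns per pixel, so the spatial derivatives (well measured by APS frames) and the temporal derivative (well measured by DVS events) both matter for a reliable estimate.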

Event Sensor Fusion Jeopardy Game

Event Sensor Fusion Jeopardy Champion: Guillermo Gallego, TU Berlin

He (almost) correctly responded to the final Jeopardy answer "A centerpiece of the book with this title and author was its cooperative and competitive recurrent neural network that collectively solved the one-dimensional stereo correspondence problem." with the response "Who is David Marr?". The correct response should have been: "What is 'VISION', by David Marr?". The jury decided unanimously to honor him as the Champion. Congratulations!

Workshop organizers:

Min Liu (Postdoc, University of Zurich and ETH Zurich)
Tobi Delbruck (Professor, University of Zurich and ETH Zurich)
Guang Chen (Professor, Tongji University)
Shu Wang (PhD student, University of Zurich and ETH Zurich)

Call for papers/demos

We are inviting researchers to submit papers/demos on related topics, which include but are not limited to:

Event sensors fused with other sensor modalities, such as
Event camera fused with IMU
Event cameras fused with depth sensors
Event cameras fused with auditory sensors
Deep networks for event sensor fusion
Bayes sensor fusion for event sensors
Spiking neural networks and other sparsity-aware networks for sensor fusion
Event sensor fusion applications in robotics
Event sensor modeling
Measurement selection algorithms
Event cameras fused with LIDAR
Sensor synchronization algorithms
Hardware platform/designs for multi event sensor synchronization
Datasets containing multiple sensor sources
Hardware acceleration of event sensor fusion
Multisensor-aware denoising

Contact Information:

Min Liu and Tobi Delbruck are the contact points of the workshop. We are from the Sensors Group at the Institute of Neuroinformatics, University of Zurich and ETH Zurich.

Emails: minliu@ini.uzh.ch, tobi@ini.uzh.ch

We thank Westwell for sponsoring the invited speaker registration fees.