Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)

Matthew Lisondra*^, Junseo Kim*^, Riku Murai**, Kourosh Zareinia*, and Sajad Saeedi*

* Toronto Metropolitan University (TMU)       ** Imperial College London

^ Both authors contributed equally to this research at TMU

*** Accepted for presentation at IEEE ICRA 2024, Yokohama, Japan ***

*** Presented at ICRA 2024: WATCH NOW ***

Paper (on arXiv) / Paper (on IEEE)

Focal-Plane Sensor-Processor Arrays (FPSPs) are an emerging technology that can execute vision algorithms directly on the image sensor. Unlike conventional cameras, FPSPs perform computation on the image plane, at individual pixels, enabling high frame rate image processing while consuming low power, which makes them ideal for mobile robotics. FPSPs, such as the SCAMP-5, use parallel processing and are based on the Single Instruction Multiple Data (SIMD) paradigm. In this paper, we present BIT-VIO, the first Visual Inertial Odometry (VIO) algorithm which utilises SCAMP-5. BIT-VIO is a loosely-coupled iterated Extended Kalman Filter (iEKF) which fuses the visual odometry running at 300 FPS with predictions from 400 Hz IMU measurements to provide accurate and smooth trajectories.

What Is the Project About

Reducing the power consumption and latency of Visual Odometry (VO) and Visual Inertial Odometry (VIO) is becoming increasingly important, as future mobile devices are anticipated to require rich and accurate spatial understanding capabilities. Conventional camera technology typically operates at 30-60 frames per second (FPS) and transfers a non-trivial amount of data from the sensor to the host device (e.g. a desktop PC). Such data transfer is not free in terms of either power or latency, and all of these pixels must then be processed on the host device.


As an alternative, Focal-Plane Sensor-Processor Arrays (FPSPs), such as SCAMP-5, are a new technology that enables computation to occur on the imager’s focal plane before the data is transferred to a host device. By performing early-stage computer vision algorithms such as feature detection on the focal plane, FPSPs compress the image data down to the size of the features. Because only the detected features are transferred, redundant pixel information is not transferred, and potentially not even digitized, as FPSPs such as SCAMP-5 can perform analog computation.
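As a rough, back-of-the-envelope illustration of this compression, the sketch below compares the per-second payload of full frames against a feature-only readout. Every constant is an assumption made for illustration and is not a measurement from the paper: a 256×256 array with 8-bit pixels, 300 FPS, and a hypothetical 500 features per frame at two bytes each (one byte per coordinate).

```python
# Illustrative arithmetic only; every constant below is an assumption.
WIDTH, HEIGHT = 256, 256      # assumed sensor resolution in pixels
BYTES_PER_PIXEL = 1           # assumed 8-bit digitised intensity
FPS = 300                     # frame rate quoted for the visual odometry
FEATURES_PER_FRAME = 500      # hypothetical feature count per frame
BYTES_PER_FEATURE = 2         # hypothetical: one byte each for x and y


def bytes_per_second(bytes_per_frame: int, fps: int) -> int:
    """Payload transferred to the host each second."""
    return bytes_per_frame * fps


full_frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
feature_bytes = FEATURES_PER_FRAME * BYTES_PER_FEATURE

full_rate = bytes_per_second(full_frame_bytes, FPS)
feature_rate = bytes_per_second(feature_bytes, FPS)
reduction = full_rate / feature_rate

print(f"full frames : {full_rate / 1e6:.2f} MB/s")
print(f"features    : {feature_rate / 1e6:.2f} MB/s")
print(f"reduction   : {reduction:.1f}x")
```

The actual saving depends on how many features are detected and how they are encoded, so the ratio printed here should be read only as an order-of-magnitude guide.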


In this work, we extend BIT-VO, a visual odometry algorithm which uses SCAMP-5, and present BIT-VIO, the first 6-Degrees of Freedom (6-DOF) Visual Inertial Odometry (VIO) algorithm to utilize the advantages of the FPSP for vision-IMU-fused state estimation. BIT-VIO achieves a much smoother trajectory estimate than BIT-VO, while retaining all the advantageous properties of BIT-VO such as low latency and high frame rate pose estimation. In short, the contributions of our work are:


(I) Efficient Visual Inertial Odometry based on a loosely-coupled sensor-fusion iterated Extended Kalman Filter (iEKF), which applies visual corrections at 300 FPS to predictions from IMU measurements obtained at 400 Hz.

(II) Uncertainty propagation for BIT-VO’s pose, as it is based on binary-edge-based descriptor extraction and 2D-to-3D re-projection.

(III) Extensive real-world comparison against BIT-VO, with ground-truth obtained using a motion capture system.
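To make the predict/correct structure of contribution (I) concrete, here is a minimal sketch of loosely-coupled multi-rate fusion. It is not the authors' iEKF: it swaps in a plain linear Kalman filter on a 1D [position, velocity] state, and the simulated motion, noise levels, accelerometer bias, and the 1200 Hz propagation grid (chosen only so that the 400 Hz and 300 Hz samples land on integer ticks) are all assumptions for illustration.

```python
import math
import random

IMU_HZ, VO_HZ = 400, 300   # rates quoted in the text
TICK_HZ = 1200             # common grid: IMU every 3 ticks, VO every 4 ticks
DT = 1.0 / TICK_HZ


def predict(x, P, accel, dt, q):
    """Propagate [position, velocity] and its 2x2 covariance with IMU accel."""
    p, v = x
    (a, b), (c, d) = P
    x_new = [p + v * dt + 0.5 * accel * dt * dt, v + accel * dt]
    # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = diag(0, q * dt)
    P_new = [[a + dt * (b + c) + dt * dt * d, b + dt * d],
             [c + dt * d, d + q * dt]]
    return x_new, P_new


def update(x, P, z, r):
    """Correct the state with a position measurement z of variance r."""
    (a, b), (c, d) = P
    s = a + r                      # innovation variance for H = [1, 0]
    k0, k1 = a / s, c / s          # Kalman gain
    y = z - x[0]                   # innovation
    x_new = [x[0] + k0 * y, x[1] + k1 * y]
    # P <- (I - K H) P
    P_new = [[a - k0 * a, b - k0 * b],
             [c - k1 * a, d - k1 * b]]
    return x_new, P_new


def simulate(seconds=2.0, accel_bias=0.5, accel_sigma=0.05,
             vo_sigma=0.02, q=0.5, seed=0):
    """Return (RMS error of fused estimate, RMS error of IMU-only estimate)."""
    rng = random.Random(seed)
    # Assumed true motion: position sin(t), so velocity is 1 at t = 0.
    fused, P = [0.0, 1.0], [[1e-4, 0.0], [0.0, 1e-4]]
    imu_only = [0.0, 1.0]
    accel = accel_bias             # true acceleration is 0 at t = 0, plus bias
    sq_fused = sq_imu = 0.0
    n = int(seconds * TICK_HZ)
    for tick in range(1, n + 1):
        t = tick * DT
        # Prediction: both estimates integrate the latest (held) IMU sample.
        imu_only, _ = predict(imu_only, P, accel, DT, q)  # covariance unused
        fused, P = predict(fused, P, accel, DT, q)
        if tick % (TICK_HZ // IMU_HZ) == 0:      # new IMU sample at 400 Hz
            accel = -math.sin(t) + accel_bias + rng.gauss(0.0, accel_sigma)
        if tick % (TICK_HZ // VO_HZ) == 0:       # VO correction at 300 Hz
            z = math.sin(t) + rng.gauss(0.0, vo_sigma)
            fused, P = update(fused, P, z, vo_sigma ** 2)
        sq_fused += (fused[0] - math.sin(t)) ** 2
        sq_imu += (imu_only[0] - math.sin(t)) ** 2
    return math.sqrt(sq_fused / n), math.sqrt(sq_imu / n)


rms_fused, rms_imu = simulate()
print(f"fused RMS error   : {rms_fused:.4f} m")
print(f"IMU-only RMS error: {rms_imu:.4f} m")
```

In this toy setting the IMU-only estimate drifts because of the uncorrected accelerometer bias, while the frequent position corrections keep the fused estimate bounded. A full 6-DOF filter would additionally handle orientation, sensor biases, and the transforms between the IMU and camera frames.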

VIDEO: BIT-VIO video presentation for IEEE ICRA 2024, Yokohama, Japan.

VIDEO: Matthew Lisondra, live presentation at ICRA 2024 [BIT-VIO] + Q&A.

BITVIO-Poster-LisondraKim.pdf

POSTER: BIT-VIO poster presentation for IEEE ICRA 2024, Yokohama, Japan.

FIGURE 1: Pipeline of BIT-VIO. The multi-sensor fusion is on the left and BIT-VO is on the right. In BIT-VO, the vision sensor is the SCAMP-5 FPSP, highlighted in red. New corner/edge features are detected on the FPSP, which offloads computation by allowing some image and signal processing to be done on the chip before the data is transferred to a host PC or other external device for further processing.

IMU at 400 Hz and BIT-VO at 300 FPS Experiments

FIGURES 2 AND 3: IMU and FPSP frames. The Intel RealSense D435i IMU is used in this work (although any black-box IMU can be assumed), and SCAMP-5 is the FPSP used as the camera sensor. There are four coordinate frames, two of which belong to the SCAMP-5 FPSP (the camera and vision frames). An advantage of the SCAMP-5 FPSP is its ability to track well and alleviate motion blur (right).

FIGURE 4: Comparison of the proposed BIT-VIO algorithm and visual odometry (BIT-VO) overlaid on the reference ground-truth trajectory. The BIT-VIO estimates are closer to the ground-truth trajectory than those of BIT-VO. Notice that BIT-VIO effectively removes the high-frequency noise visible in BIT-VO’s trajectory.

TABLE I: We measure the Absolute Trajectory Error (ATE) with respect to ground truth and report its Root Mean Squared Error (RMSE) and median to compare the accuracy of BIT-VO against that of our BIT-VIO algorithm. TABLE I reports the ATE with the Intel RealSense D435i IMU at 400 Hz and BIT-VO at 300 FPS.
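The two statistics named in this caption can be computed as in the following minimal sketch. It assumes the estimated and ground-truth positions are already time-associated and expressed in the same frame (any trajectory alignment step is left out), and the function name `ate_stats` and the toy trajectories are hypothetical, not taken from the paper or from Table I.

```python
import math
import statistics


def ate_stats(estimated, ground_truth):
    """Return (RMSE, median) of per-pose translational errors.

    Assumes both trajectories are already time-associated and expressed in
    the same frame; any alignment step is outside this sketch.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between each estimated and reference position.
    errors = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    rmse = math.sqrt(sum(err * err for err in errors) / len(errors))
    return rmse, statistics.median(errors)


# Hypothetical toy data in metres, not values from Table I.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.4), (2.0, 1.0, 0.0)]
rmse, median = ate_stats(est, gt)
print(f"ATE RMSE = {rmse:.4f} m, median = {median:.4f} m")
```

Reporting the median alongside the RMSE is useful because the RMSE is dominated by the largest errors, whereas the median reflects the typical error along the trajectory.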

FIGURE 5: Plots of the estimated translational RMSE (left) and rotational RMSE (middle) for Traj. G from Table I. On the far right are the total translational RMSE (top) and total rotational RMSE (bottom). For both translation and rotation, BIT-VIO is much closer to the ground-truth data, and smoother, than IMU-alone and BIT-VO. The drift of IMU-alone is very evident, as is the high-frequency noise of BIT-VO.

FIGURE 6: Projecting the error onto the trajectories of BIT-VO (left) and BIT-VIO (right) for Traj. H of Table I shows that BIT-VO has high-frequency noise, with red tail-ends on its trajectory, whereas BIT-VIO's trajectory is more stable and closer to ground truth, with few regions of large ATE (in red).

Contact

If you have any questions, feel free to reach out to us by email at:

{matthew.lisondra, junseo.kim, kourosh.zareinia, s.saeedi}@torontomu.ca or rm3115@ic.ac.uk

Reference and Bibtex Entry

M. Lisondra, J. Kim, R. Murai, K. Zareinia and S. Saeedi, "Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)," 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 2024, pp. 1661-1668, doi: 10.1109/ICRA57147.2024.10610838.

keywords: {Visualization;Single instruction multiple data;Robot vision systems;Trajectory;Odometry;Kalman filters;Velocity measurement},

@INPROCEEDINGS{10610838,

  author={Lisondra, Matthew and Kim, Junseo and Murai, Riku and Zareinia, Kourosh and Saeedi, Sajad},

  booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)}, 

  title={Visual Inertial Odometry using Focal Plane Binary Features (BIT-VIO)}, 

  year={2024},

  volume={},

  number={},

  pages={1661-1668},

  keywords={Visualization;Single instruction multiple data;Robot vision systems;Trajectory;Odometry;Kalman filters;Velocity measurement},

  doi={10.1109/ICRA57147.2024.10610838}}

Acknowledgements

This research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

We would like to thank Piotr Dudek, Stephen J. Carey, and Jianing Chen at the University of Manchester for kindly providing access to SCAMP-5.