I began building this vision system to gain a better understanding of the challenges associated with autonomous driving. Building it has been a constant exercise in design tradeoffs, balancing functionality with cost, which eventually led to this node-like stereo camera system that passes data over high-speed LAN to my host PC, avoiding the compute limitations of single-board computers (SBCs). The main considerations were:
Power from the vehicle
Signal degradation over distance
Camera resolution and FOV
Compute limitations
Vibration control
Bringing all of these components together, I chose 2040 aluminum extrusion as my build base to enable future modularity in case I ever want to change the system's configuration. As built, I matched the baseline distance of my cameras to the specifications of datasets that were collected with much higher quality components.
I decided to 3D print many of the parts to cut costs. Because of this, I opted not to weatherproof the system, as it's only meant to be a learning platform. This sped up prototype development and let me quickly generate parts modeled specifically for my hardware.
To speed up project development, I am currently building Vision Flow. With several models to work with, and many steps that can go wrong in each task, catching errors and debugging across all the lighting and weather conditions a car can encounter is extremely tedious and time consuming. Vision Flow is a hub that lets me check all of the intermediate processing stages side by side, and it doubles as a tool for efficiently collecting and organizing new data for labeling later on.
Feature Flow, my own take on a sparse optical flow algorithm, tracks FLANN-matched ORB features across frames. Feature detection is already a critical step for other processes, so adding this functionality on top of it reduces the total compute per image. By combining it with object detection and instance segmentation, I hope to have a cheap method of estimating the planar velocity of dynamic objects in the scene.
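A minimal sketch of that matching step, assuming OpenCV's Python bindings; the parameter values and the `feature_flow` helper are illustrative, not Vision Flow's actual implementation:

```python
import cv2

FLANN_INDEX_LSH = 6  # LSH index, suited to binary descriptors like ORB's

def feature_flow(prev_gray, curr_gray):
    """Return (prev_point, curr_point) pairs for FLANN-matched ORB features."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    if des1 is None or des2 is None:
        return []

    flann = cv2.FlannBasedMatcher(
        dict(algorithm=FLANN_INDEX_LSH, table_number=6,
             key_size=12, multi_probe_level=1),
        dict(checks=50),
    )
    # k=2 enables Lowe's ratio test; LSH can return fewer than 2 neighbours
    matches = flann.knnMatch(des1, des2, k=2)

    flows = []
    for pair in matches:
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < 0.7 * n.distance:  # keep only unambiguous matches
            flows.append((kp1[m.queryIdx].pt, kp2[m.trainIdx].pt))
    return flows  # each pair is one feature's frame-to-frame displacement
```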
One of the most important capabilities enabled by the second camera in a stereo system is recovering depth information from a scene. The Semi-Global Block Matching (SGBM) algorithm is used to produce the disparity map (right) from the input image (left).
Tested on Gryphon Dataset: https://www.kaggle.com/datasets/evdoteo/gryphon-dataset?select=camera_calibration.txt
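For reference, a disparity map like this can be produced with OpenCV's SGBM implementation roughly as follows; this is a sketch with illustrative parameters and placeholder file names, assuming an already-rectified grayscale pair:

```python
import cv2

# Placeholder paths; assumes the pair is already rectified (row-aligned)
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

block = 5
stereo = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,        # search range; must be divisible by 16
    blockSize=block,
    P1=8 * block ** 2,         # penalty for small disparity changes
    P2=32 * block ** 2,        # penalty for large disparity changes
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)

# OpenCV returns fixed-point disparities scaled by 16
disparity = stereo.compute(left, right).astype("float32") / 16.0
```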
Camera Manufacturing Inconsistencies:
One issue has been the inconsistency of the low-cost cameras I chose (Sony IMX219-160). The two lenses have very different distortion properties, and it has made calibration extremely challenging. To overcome this, I'm working on a feature within Vision Flow that provides live feedback while I manually adjust the camera orientation within the housing. By doing this, I hope to reduce the amount of rectification the images need and create a more reliable calibration for disparity mapping.
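One cheap live metric, sketched below under the assumption of OpenCV and a roughly aligned pair, is the median vertical offset of matched features: in a well-aligned stereo rig the image rows should agree, so a shrinking offset means the manual adjustment is heading the right way. The `vertical_misalignment` helper is hypothetical, not the actual Vision Flow feature.

```python
import cv2
import numpy as np

def vertical_misalignment(left_gray, right_gray):
    """Median |y_left - y_right| over matched features, in pixels."""
    orb = cv2.ORB_create(nfeatures=500)
    kp1, des1 = orb.detectAndCompute(left_gray, None)
    kp2, des2 = orb.detectAndCompute(right_gray, None)
    if des1 is None or des2 is None:
        return float("nan")

    # Brute-force Hamming matching with cross-checking for reliability
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    dys = [abs(kp1[m.queryIdx].pt[1] - kp2[m.trainIdx].pt[1]) for m in matches]
    return float(np.median(dys)) if dys else float("nan")
```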
Model Training:
As with any pipeline moving from academia to real-world deployment, acquiring the datasets and implementing the proper data processing prior to model training has been frustrating. However, I do get the sense that this is a skill I can develop and improve with practice.
Scene Variance:
I have already seen first hand how difficult everyday driving situations can be. From the auto-exposure lag when entering and exiting a tunnel, to road marking issues and occlusions, even lane detection can be extremely tough to isolate accurately.
Panel gaps are the thin spaces between exterior vehicle body panels. While uniform panel gaps typically indicate high-quality assembly, non-uniform gaps usually point to issues during assembly that may also be present in critical operational components within. I developed a computer vision algorithm that lets a smartphone user quickly inspect the panel gaps around a vehicle and see exactly where issues exist, which may provide better insight when buying a new or used car. Green gaps show regions within a manufacturer's tolerance, red gaps indicate excessive spacing, and blue gaps indicate overly tight spacing. (Project still in development)
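The color coding boils down to a simple tolerance check on each measured gap width; the sketch below uses placeholder tolerance values, not real manufacturer specs.

```python
# Placeholder tolerance band; real specs vary by manufacturer and panel
def classify_gap(width_mm, nominal_mm=4.0, tol_mm=0.5):
    if width_mm > nominal_mm + tol_mm:
        return "red"    # excessive spacing
    if width_mm < nominal_mm - tol_mm:
        return "blue"   # overly tight spacing
    return "green"      # within tolerance
```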