The project aims to create an Assistive Technology, specifically an Assisted Driving Car, with an altruistic goal in mind. NVIDIA’s “End to End Deep Learning” approach was implemented to predict the path to be traversed by the vehicle. Whether control of the vehicle passes to the algorithm is governed by detecting whether the driver’s eyes are on the road; the system thus assists the driver at moments when his/her gaze leaves the road. The algorithm detects obstacles and controls the vehicle’s movements accordingly, fulfilling the goal of preventing road accidents caused by driver negligence.
This GitHub repository contains:
● a folder named Files comprising the libraries that need to be uploaded to Google Colaboratory,
● a folder named Media comprising a few of the videos the Neural Network was trained on, a JSON file containing the data marked with timestamps, and a few demo images that can be used for eye detection in the eye-tracking program,
● a folder named Weights comprising the weights trained following NVIDIA’s End to End Deep Learning paper, along with the Haar Cascade weights used for face detection,
● Python scripts to detect eyes, train the Neural Network, predict the output for an image using the self-trained Neural Network, and detect and display the masks of obstacles using a model trained on the MS COCO dataset.
The driver’s eyes were detected by first tracking the face (using a Haar Cascade classifier) and then using a self-developed method to track the pupils. The position of the pupils is then used to determine whether the person’s gaze is fixed on the road.
Face detection uses a highly efficient Haar Cascade classifier, and the eyes are detected in a similar fashion. Within each detected eye, the pupil is located, and the approximate position of its center is used to determine the driver’s gaze.
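The pupil-tracking method itself is self-developed and not spelled out in this README; a minimal sketch of the idea (treat the darkest blob in the eye crop as the pupil and check how far its centroid sits from the horizontal center) could look like the following. The `margin` and `tolerance` values are hypothetical tuning constants, not taken from the project.

```python
import numpy as np

def pupil_center(eye_gray, margin=30):
    """Approximate the pupil as the darkest blob: select pixels within
    `margin` intensity levels of the darkest pixel and return their centroid.
    `eye_gray` is a 2-D uint8 array cropped around one eye."""
    mask = eye_gray <= int(eye_gray.min()) + margin
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())

def gaze_on_road(eye_gray, tolerance=0.2):
    """Treat the gaze as 'on the road' when the pupil centre lies within
    `tolerance` (as a fraction of eye width) of the horizontal centre."""
    cx, _ = pupil_center(eye_gray)
    width = eye_gray.shape[1]
    return abs(cx / width - 0.5) <= tolerance
```

In the full pipeline, `eye_gray` would be the eye region returned by the Haar Cascade detector rather than a raw frame.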
I. Structure
The structure is a Sequential model consisting of 5 convolutional layers followed by a dense head of 4 layers.
II. Training and Testing
The Neural Network was trained and tested on the Berkeley DeepDrive dataset, a collection of video files labelled with the instantaneous change in the X, Y, Z coordinates of the vehicle (expressed in the car’s own frame of reference).
III. Prediction
Predictions pertaining to steering the vehicle were made with the trained Neural Network. For each frame, the model predicts whether the car should steer left, steer right, or continue moving straight; the extent to which the vehicle goes left or right depends on the number of consecutive frames for which it was steered in that direction.
Detection of obstacles is done using Mask R-CNN, which performs instance segmentation, producing a bounding box and a segmentation mask for each detected object. Mask R-CNN is implemented in Python with Keras and TensorFlow and trained on the MS COCO dataset.
A high-efficiency model is being trained on a self-made simulator delivering 120 frames per second, to provide real-time on-road analysis for the self-driving car.
The End to End deep Neural Network currently being trained shows about 92% accuracy. The accuracy is expected to improve once the algorithm is trained on video data containing more turns and curved paths.
● OpenCV: Haar Cascade classifiers
● Mariusz Bojarski, Ben Firner, Beat Flepp, Larry Jackel, Urs Muller, Karol Zieba and Davide Del Testa. End to End Learning for Self-Driving Cars, 2016. URL: https://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf
● Waleed Abdulla. Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow. GitHub repository, 2017
● Stack Overflow