Our inspiration in this project is to integrate computer vision and natural language processing. This project is about identifying multimodal objects through artificial intelligence with the deep learning algorithm. In this project, the system scans objects with the help of a camera module based on NLP commands and search for them in the dataset by feature extraction. And then the object name is said by the system as voice as output.
Object Detection with the fastest and better accuracy using the latest SSD MobileNet V3.
Train model with a huge amount of data with complex background for better detection.
To make a system that merges computer vision with NLP.
To establish detecting objects as an action in NLP.
Establish a smart system for object tracking.