Previous Iterations:
Wav2Vec2[1] (+ Language Modeling)
HuBERT[2] (+ Language Modeling)
Limitations:
High word error rate (WER).
Sensitive to background noise.
Poor handling of accents.
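The WER metric used above is the word-level edit distance between a reference transcript and a hypothesis, divided by the reference length. A minimal sketch (illustrative only, not the evaluation code used in this work):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])  # substitution
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)  # del, ins
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("turn the light on", "turn light on"))  # 1 deletion / 4 words = 0.25
```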
Current Iteration:
Whisper[3]
Limitations:
Sentence-level transcription.
Registration failure (rare).
Model - YOLOv7[4]
Roboflow - Data annotation and Data augmentation[5]
Accuracy: 96%
Object detection from top view
Small object detection
Fast and accurate
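Detectors in the YOLO family produce many overlapping candidate boxes per object; non-maximum suppression (NMS) keeps the highest-scoring box and drops near-duplicates. A minimal sketch of the standard algorithm (not the YOLOv7 implementation itself):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedily keep the best-scoring boxes, dropping overlaps above `thresh`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate boxes and one distant box: the duplicate is suppressed.
print(nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)],
          [0.9, 0.8, 0.7]))  # [0, 2]
```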
We observed the user’s interaction in the following situations:
When objects are in close proximity
When objects are moving fast
When one object is on top of another
When objects are held in the hand
When detection is sensitive to object color
Collected data during the users' interactions
Applied data augmentation policies
Geometric transformations
Color transformations - to avoid biasing the model toward specific colors
Mosaic - to address the small-object detection problem
Cutout - to address the occlusion problem
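The Cutout policy above masks a random patch of the image so the detector learns to recognize partially occluded objects. A minimal sketch on a nested-list image (the training pipeline used Roboflow's implementation, not this code):

```python
import random

def cutout(image, size, seed=None):
    """Zero out a random size x size patch of `image` (a list of pixel rows),
    simulating occlusion. Returns a new image; the input is left unchanged."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    out = [row[:] for row in image]  # copy so the original stays intact
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = 0
    return out

img = [[1] * 4 for _ in range(4)]
masked = cutout(img, 2, seed=0)  # exactly one 2x2 block is zeroed
```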
[1] Baevski, Alexei, et al. "wav2vec 2.0: A framework for self-supervised learning of speech representations." Advances in Neural Information Processing Systems 33 (2020): 12449-12460.
[2] Hsu, Wei-Ning, et al. "HuBERT: Self-supervised speech representation learning by masked prediction of hidden units." IEEE/ACM Transactions on Audio, Speech, and Language Processing 29 (2021): 3451-3460.
[3] Radford, Alec, et al. "Robust speech recognition via large-scale weak supervision." arXiv preprint arXiv:2212.04356 (2022).
[4] Wang, Chien-Yao, Alexey Bochkovskiy, and Hong-Yuan Mark Liao. "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors." arXiv preprint arXiv:2207.02696 (2022).
[5] Cubuk, Ekin D., et al. "AutoAugment: Learning augmentation strategies from data." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.