Milestone 2

Project Plan and Task Breakdown

We plan to create standalone software that enables people who sign ASL to communicate with English speakers who do not know ASL. A live video feed will use hand-tracking points and a machine-learning algorithm to map signs to English words, and the initial release will translate at least 30 ASL words to English. Our product will be scalable, allowing hundreds more ASL words to be translated moving forward. To accomplish this, our task breakdown is as follows:

  1. Decide what software, programming language, IDE, and ML SDKs would be best to use for our project

  2. Collect ASL datasets to use to train our ML model

  3. Build the backend ML dictionary that maps MediaPipe hand-tracking data from ASL signs to English words (see the sketch after this list)

  4. Build the frontend user interface with design features our customers and stakeholders ask for

  5. Test our prototype live with the help of ASL signers

  6. Improve the backend's word-translation accuracy and the UI features until we can consistently and effectively translate at least 30 ASL words to English. This will be our minimum viable product.
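
As a concrete illustration of step 3, the sketch below extracts MediaPipe hand landmarks from labeled example images and fits a simple classifier that maps a landmark vector to an English word. The data/<word>/ folder layout, the k-nearest-neighbors model, and the saved model filename are assumptions made for the sketch, not committed design decisions; signs that involve motion would need sequences of landmarks rather than single frames.

```python
# Minimal sketch: map MediaPipe hand landmarks to English words.
# Assumes labeled training images under data/<word>/<image>.jpg (hypothetical layout).
import os
import cv2
import numpy as np
import mediapipe as mp
from joblib import dump
from sklearn.neighbors import KNeighborsClassifier

mp_hands = mp.solutions.hands

def landmark_vector(image_bgr, hands):
    """Return a flat (63,) vector of the 21 hand landmarks, or None if no hand found."""
    results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    lm = results.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in lm]).flatten()

features, labels = [], []
with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    for word in os.listdir("data"):                      # one folder per ASL word
        for name in os.listdir(os.path.join("data", word)):
            image = cv2.imread(os.path.join("data", word, name))
            vec = landmark_vector(image, hands)
            if vec is not None:
                features.append(vec)
                labels.append(word)

# A nearest-neighbor classifier is enough for a 30-word proof of concept.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(np.array(features), labels)
dump(model, "landmark_knn.joblib")                       # reused by the runtime sketch later
```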

Design Concepts

People who are vocally or auditorily impaired currently have no way to independently communicate with someone who does not understand American Sign Language (ASL); the options that do exist can be expensive and difficult to use. ASL is used by over 500,000 Americans and is the third most popular language in the US, yet most people do not know how to use it, and there are no mainstream accommodation or accessibility services for people who sign ASL. Without these integrated accessibility services, people in this community are excluded from society, which can lead to an array of harmful ramifications. Thus, this problem must be addressed so that vocally and auditorily impaired persons can communicate with other people spontaneously and effectively, anytime and anywhere.

While prior approaches have used on-demand call services or software-based translation tied to proprietary hardware, we are committed to delivering a novel solution that is as accessible as possible.

Our product is a software application that uses computer vision and machine learning to display real-time translated captions for American Sign Language. We hope this gives the 500,000 people who rely on ASL a way to live more independently and make the world theirs by taking a more active role in society. By the end of our term we aim to have at least 30 ASL words and phrases recognized by our software, with more to come in the future. While this will take considerable effort and we understand that our systems and algorithms will not be infallible, we remain committed to the belief that, even with flaws, our system can change lives for the better.

Concept Selection

  • Have a way for people who are vocally or auditorily impaired to communicate

  • Make the product easy for them to use independently (there is currently no way for them to do so)

  • Provide text-to-speech output so that signed words can be voiced aloud

  • Have a way for large numbers of people to communicate using the third most popular language in America

  • Use machine learning and computer vision to train a product to recognize a language in a unique way


Design Analysis

Hardware Specifications

  • To use this product you will need the following hardware:

      • A laptop with a webcam, or any computer that supports an external webcam


  • Our team's goal is to require as little hardware as possible, making the product as accessible as possible

Software Specifications

  • To use this product you will need the following software:

      • The desktop application, downloaded to your computer

      • An up-to-date version of Windows or macOS capable of running the MediaPipe software

      • Enough disk space on the laptop to hold the model and the application

  • Team's goals for software requirements:

      • Simple and intuitive user interface

      • English captions that appear in real time

      • Precise hand and motion tracking that accurately captures ASL words or phrases

      • Smart mapping library from hand coordinates to English words

      • Software is standalone and can be operated by anyone

      • Simple and effective instructions for use

      • Text-to-speech output, so that when a person signs, the computer can voice the translation aloud to whoever they are speaking with (sketched after this list)

      • Reusability so that it can be implemented across several different applications across the tech industry
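
The requirements above imply a runtime loop that captures webcam frames, tracks the hand with MediaPipe, maps the landmarks to an English word, draws the caption in real time, and voices it. A minimal sketch under those assumptions follows; the saved model path comes from the training sketch earlier, and pyttsx3 is one possible offline text-to-speech library named here as an assumption, not a final choice.

```python
# Sketch of the runtime loop: webcam capture, MediaPipe hand tracking,
# landmark-to-word lookup, on-screen captions, and text-to-speech.
import cv2
import numpy as np
import mediapipe as mp
import pyttsx3
from joblib import load

model = load("landmark_knn.joblib")   # classifier saved by the training sketch
tts = pyttsx3.init()
capture = cv2.VideoCapture(0)         # default laptop webcam
last_word = None

with mp.solutions.hands.Hands(max_num_hands=1) as hands:
    while capture.isOpened():
        ok, frame = capture.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            vec = np.array([[p.x, p.y, p.z] for p in lm]).flatten()
            word = model.predict([vec])[0]
            # Real-time English caption drawn on the video feed.
            cv2.putText(frame, word, (30, 60), cv2.FONT_HERSHEY_SIMPLEX,
                        1.5, (0, 255, 0), 3)
            if word != last_word:     # speak each word once, not every frame
                tts.say(word)
                tts.runAndWait()      # note: this blocks the loop while speaking
                last_word = word
        cv2.imshow("ASL Translator", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):   # press q to quit
            break

capture.release()
cv2.destroyAllWindows()
```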

Test Plan

To test our prototype, we plan on meeting with Phillip Gehman, Director of Disability Services, as he has said he will connect us with vocally and auditorily impaired Stevens students who volunteer to test our product. Mr. Gehman has also offered to connect us with his parents, who are auditorily impaired, and recommended we reach out to local Schools for the Deaf to further test our product. Working with auditorily impaired people and ASL signers will get us the most useful feedback on our software; we will ask volunteers how we can improve our product and then implement those changes or additions.

References

“A communication bridge between Deaf and hearing,” SignAll. [Online]. Available: https://www.signall.us/.

amin07, "GMU-ASL51: Video and pose data collection work for American Sign Language (ASL)," GitHub. [Online]. Available: https://github.com/amin07/GMU-ASL51.

“American Sign Language Lexicon Video Dataset (ASLLVD),” American Sign Language video dataset. [Online]. Available: http://vlm1.uta.edu/~athitsos/asl_lexicon/.

“Video sequences,” CARE: National Center for Sign Language and Gesture Resources. [Online]. Available: http://csr.bu.edu/asl/html/sequences.html.

"Hands," MediaPipe. [Online]. Available: https://google.github.io/mediapipe/solutions/hands.

R. Dias, “American sign language hand gesture recognition,” Medium, 15-May-2021. [Online]. Available: https://towardsdatascience.com/american-sign-language-hand-gesture-recognition-f1c4468fb177.