Assistive Technology Examples

Types of Assistive Technology for ELLs:

  • Word Prediction software

  • Visual aids

  • Speech-to-text

  • Text-to-speech software

  • Audio recorders

  • Translators

  • Speech-enabled language translation (SELT)

Word Prediction Software

If you owned a mobile phone in the late 1990s, you were likely exposed to early predictive-text technology through T9 on numeric keypads.

Source: Wikipedia

Originally an assistive technology to support individuals with physical disabilities, word prediction has since extended its reach. It is software that predicts what the user is trying to write based on the first letters typed. The program compiles a list of recommended words, allowing the user to select the appropriate one to build sentences. Word prediction software lets users make meaning of their writing in a contextualized learning space. This is crucial in emergent literacy development because it highlights phonological awareness, alphabetic principles and orthography. The user becomes a problem solver, exploring different letter combinations, patterns and morphological elements to learn through real writing experiences (Brown & Allmond, 2021). Word prediction can increase word fluency, vocabulary use and the quality of writing, helping writers become more independent, productive and motivated.
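The core mechanism described above, matching typed letters against a ranked word list, can be sketched in a few lines. This is a minimal illustration only; the vocabulary and its frequency counts are invented for the example, and real products also weight predictions by sentence context and the user's own writing history.

```python
def predict(prefix, vocabulary, limit=3):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    matches = [(word, freq) for word, freq in vocabulary.items()
               if word.startswith(prefix.lower())]
    matches.sort(key=lambda pair: -pair[1])     # most frequent words first
    return [word for word, _ in matches[:limit]]

# Toy frequency-ranked vocabulary (invented numbers).
vocabulary = {"the": 500, "they": 120, "there": 90, "then": 80, "thank": 40}

print(predict("th", vocabulary))   # the three most frequent "th" words
print(predict("tha", vocabulary))  # narrows as more letters are typed
```

Note how adding a letter narrows the candidate list, which is the "problem solver" loop the paragraph describes: the learner tests letter combinations and sees the predictions respond.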

Visual Aids

Closed captions are an example of a text-based visual aid that can enhance student learning.

Source: Wikipedia

Visual aids can support learners with comprehending and remembering content. They serve as a mental scaffold and reduce the need for translation. Student-generated visual aids help learners recall and transfer knowledge. In the earlier stages, picto-spelling provides an alternative way of demonstrating understanding of concepts and vocabulary. Examples of visual aids include anchor charts, digital or printed images, document cameras and vocabulary cards (Southern Oklahoma State University, 2021).

Speech-to-Text

This is done with recognition software and “the translation of spoken language into text through computational linguistics” (What is speech to text, n.d.). Speech-to-text technology exists in two main forms: speaker-dependent, which is largely used for dictation software, and speaker-independent, typically used in phone applications.


Speech-to-text is considered a largely time-saving assistive technology. It is also highly cost-efficient, allowing individuals or companies to pay for subscription services rather than hiring human transcribers. It offers the ability to enhance audio and visual content with subtitling and transcription.


Surprisingly, the primary uses of speech-to-text are not educational. It is typically used for call centre analytics and agent assist, allowing insights to be gleaned from individual customer conversations to increase customer service productivity. Media content searching and localized subtitles also account for a large portion of speech-to-text use; this can often be seen on platforms like YouTube, which offer closed captioning/subtitling for the hearing impaired. Finally, there is the classic dictation function, allowing individuals to transcribe data orally with relative ease.

Source: Nordic APIs

The early history of speech-to-text and its current applications

Although the earliest speech-to-text device can be traced back to Bell Telephone Laboratories in 1936, it was only in the 1970s that the United States Department of Defense (DARPA) collaborated with Carnegie Mellon University on the Speech Understanding Research (SUR) program. This produced Harpy, a program able to comprehend 1,011 words and search for logical sentences. Fast forward to today and we have speech technology that can recognize spoken words and sentences, convert them into text, interact with automated customer services and answer callers' questions with varying degrees of accuracy (LDRFA). It is admittedly good, but still imperfect; accents can prove difficult for programs, as can homonyms. Additionally, the program is limited by the quality of the hardware; a bad microphone can be just as detrimental as a bad program.
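The "varying degrees of accuracy" mentioned above is usually quantified with word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the system's transcript into the reference transcript, divided by the reference length. A compact sketch using the classic Levenshtein dynamic program over words (the example sentences are invented):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = edit distance between the first i ref words and first j hyp words
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution or match
    return dist[len(ref)][len(hyp)] / len(ref)

# One substitution ("the" heard as "a") out of six reference words: WER = 1/6
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A misheard homonym or an accent-induced substitution each add one error, which is why those remain the stubborn failure cases the paragraph describes.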


Speech recognition technology includes programs like Amazon Alexa, Apple's Siri and Google Home. Even IBM's Watson Speech to Text uses AI-powered speech recognition and transcription in multiple languages for a variety of purposes, including customer self-service, agent assist and speech analytics. While these all incorporate speech-to-text technology, they are too broad for the purposes of this discussion, which will focus on how speech-to-text applies to language learning education.


The Learning Disability Resources Foundation (LDRFA) explains that speech-to-text tools are important in empowering young learners with dysgraphia and dyslexia: learners who have difficulty writing even simple sentences. Additionally, Shadiev, Wu and Huang (2022) note that speech-to-text recognition (STR) used in English-as-a-medium-of-instruction (EMI) lectures has been recognized as an assistive technology for language learners.

Text-to-Speech (TTS)

Source: Medium.com

Text-to-speech is literally the other side of speech-to-text: the computer takes written input and produces spoken output.

Often referred to as “read aloud” technology, it can take words and convert them into audio. It powers familiar voices like Siri, Alexa and the Google Assistant. Most text files and web pages can be paired with a TTS reader; TTS readers are even available as plugins for web browsers, and The Economist, for example, builds one into its articles. While TTS has been envisioned for educational purposes, helping learners who struggle with literacy, vision or language issues, it is also a very useful assistive technology for convenience. “The technology behind text-to-speech has evolved over the last few decades. Using deep learning, it is now possible to produce very natural-sounding speech that includes changes to pitch, rate, pronunciation, and inflection. Today, computer-generated speech is used in a variety of use cases and is turning into a ubiquitous element of user interfaces. Newsreaders, gaming, public announcement systems, e-learning, telephony, IoT apps & devices and personal assistants are just a few starting points.” (What is Text-to-Speech?, n.d.)
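Before any of that synthesis happens, a TTS front end first normalizes the text: numerals and abbreviations are expanded into the words the voice should actually say. A toy version of that step is sketched below; the tiny lookup tables are invented for illustration, and real systems handle far larger inventories (dates, currencies, ordinals, and so on).

```python
import re

# Invented mini-inventories for the example.
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "etc.": "et cetera"}

def normalize(text):
    """Expand abbreviations, then read digit runs out digit by digit."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    return re.sub(r"\d+", lambda m: " ".join(DIGITS[d] for d in m.group()), text)

print(normalize("Dr. Smith lives at 42 Elm St."))
# Doctor Smith lives at four two Elm Street
```

It is this stage, combined with the deep-learning synthesis described in the quotation, that decides whether "42" comes out as "four two" or "forty-two".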

As mentioned previously, the biggest selling point of text-to-speech is convenience: individuals can consume information without having to focus on a computer screen. This lets them carry out manual tasks that require visual focus while still processing auditory information. Other advantages are discussed in the same source (What is Text-to-Speech?, n.d.).


Audio/Voice Recorders

Source: PC Mag

The familiar microphone icon has become popularized on smartphones to record voices.

Audio recorders were initially invented to support memory processing and are now used for speech purposes (Lewis, 1998). They are digital devices that record sound and save it in a format, such as MP3, that can be transferred to another device. In the classroom, they can help students develop their expressive vocabulary and allow students to hear themselves so they can evaluate their communication skills and clarity. They also amplify student voices by helping them convey their understanding.

Translators

Translation has been around for thousands of years; people were translating even before the Bible was written. Throughout human history, translation has served emotional, trade, and survival purposes. Following the industrial revolution, the economy moved at a much faster pace, giving innovators the opportunity to continuously develop and adapt new translation technologies. The Internet revolutionized the ability to translate and understand text and documents from around the globe.

Speech-Enabled Language Translation (SELT) aka Speech-to-Speech translation

SELT is the process of computer-mediated translation from one language into a second. Input is spoken in the first language and audio output is produced in the target second language. Undoubtedly, many of us have used Google Translate on individual sentences or entire web pages of text. Here, however, we are also considering simultaneous oral speech-to-speech translation.

This works on the traditional model of speech-to-speech translation, involving three basic processes (Toppan, n.d.):

  1. automatic speech recognition: transcribing the spoken words as text

  2. machine translation: translating the transcribed text into the target language, followed by…

  3. text-to-speech synthesis: producing spoken output in the target language

While still imperfect, it has attracted major technological players: Google, Microsoft and Baidu (from China). Some of this technology already exists for taxi passengers in Dubai, allowing them to speak with the driver in their native language (with varying degrees of success).