
ph Minspeak


Published September 1982 Byte Publications Inc
by Bruce Baker, 840 Rolling Rock Rd., Pittsburgh, PA 15234
  • About the Author: Bruce Baker did his undergraduate and graduate work in Greek and Latin at Wabash College, Indiana University, and the University of Paris and has taught widely in the United States and Europe. Currently, he is a doctoral candidate in French and Spanish at Middlebury College and Consulting Linguist to the Prentke Romich Co. in Shreve, Ohio. Last year he was named Contributing Editor to Communication Outlook, a publication of the Artificial Language Laboratory of Michigan State University.

            A semantic compaction system that makes self-expression easier for communicatively disabled individuals.

  • Minspeak
  • Minspeak is a new language prosthesis designed for disabled people who cannot express themselves through speech or hand signs. It is a semantic interface that uses microprocessor technology in a radically new system of communication that reduces the time and effort required for self-expression.
  • A person using a Minspeak board with fewer than 50 keys can produce thousands of clear, spoken sentences in fewer than 7 strokes. Minspeak users don't even have to know how to spell; they can produce complete sentences without selecting letters, phonemes, or words. The unique Minspeak process permits the user to translate thought into speech.
  • Minspeak has a modern linguistic coding system based on general ideas underlying human communication. The coding technique uses sequence to define context, thus exploiting the human mind's ability to process semantic information. Easy-to-understand symbols on each key represent ideas. The meaning of each key image changes according to the sequence in which it is hit. By combining these symbols, whole spoken sentences can be generated. The simplicity or complexity of the symbols will depend on the needs and abilities of the user.
  • The best way to explain how Minspeak can do all of this is to start with the reasons behind its existence.

  • Research and Insights
  • Several years ago, as research for my dissertation, I set out to study the attitudes of able-bodied people toward people with obvious physical disabilities. To do the research, I needed to speak to disabled as well as able-bodied individuals. The most interesting and insightful group of people I met had cerebral palsy. Ironically, the condition which caused them to have these insights also prevented them from being able to express those insights easily. Communication was slow and inconclusive. Unless you have had some personal experience with severe physical communication disabilities, you may not fully realize what slow and inconclusive means in this context.
  • One man I met can communicate only with the aid of an IBM Selectric typewriter. His lack of voluntary muscle control, stemming from a birth injury, precludes not only hand signs and speech but also a reliable eye blink for Morse code. He expresses himself by pushing down on a board with his chin. This signal is in response to the presentation of letters on a revolving metal disk. The disk pauses for two seconds to position each letter in front of a stationary arrow. When he sees the letter he wants, he presses the board with his chin, and the letter is typed. This method is slow and tedious. Creating the word "can" requires two and one half scans of the entire alphabet, and a single sentence often takes 30 minutes to complete.
  • Another man uses a communication system based on eye motion. A movement of the eyes upward and to the left indicates yes, while a movement downward and to the right means no. In this system, the conversational partner performs the functions of the revolving disk. As I slowly recited the alphabet, he signaled his letter choice by making the "yes" eye movement. Although we divided the letters of the alphabet into separate groups of vowels and consonants, and further divided the consonants into those before and after "L" for easier reference, this system is still terribly slow and very limited.
  • For him to ask the simple question "What did you say?" requires a dozen scans through the alphabet and many questions to establish whether a word is ending or a new word is beginning. The degree of concentration that this system demands of the conversational partner is so great that my friend often lets many misunderstandings pass just to get the central message across. I often wonder if I have understood his message correctly or if my friend feels that the correction isn't worth the time and effort required to make the meaning clearer.
  • The inability to express oneself is one of the most widespread and catastrophic disabilities. According to a report from the University of Wisconsin's Trace Research and Development Center for the Severely Communicatively Handicapped, as many as 500,000 people in this country are unable to communicate either vocally or with standard hand signs. The causes are numerous, but among the most common are cerebral palsy, strokes, amyotrophic lateral sclerosis (Lou Gehrig's disease) and vehicular head trauma. One family in four is at some time touched by a serious communication disorder.
  • Because hundreds of thousands of these people have unimpaired cognitive abilities, the need for easy communication methods becomes all the more important. As the realities of physical communication disorders became apparent to me, I decided to focus my research on finding some means of facilitating nonvocal communication.
H. Zukas

Photo 1: 
 Hale Zukas has cerebral palsy and uses a communication board and headstick of his own design. A Phi Beta Kappa graduate in mathematics from the University of California at Berkeley, he is one of a group of highly skilled communication-aid users whose cooperation and insights into Minspeak have been indispensable.

  • A Communications Impasse
    The communicatively disabled constitute a group for whom access to microprocessors could mean a real revolution, and a common assumption is that recent technological advances have produced the necessary communication aids. Unfortunately, this is not the case, but the problem does not lie in the new technology.
    Neurological damage sufficiently extensive to hamper intelligible vocalization is regularly accompanied by difficulty in control of physical movements. To use any communication aid, the user must be able to actuate some type of switch. Consequently, existing communication systems do not solve the basic human-engineering problem of transferring information from the mind of the communicator to the communication aid, because all systems for complete communication, voiced or unvoiced, have been based upon actuating letters, words, word parts, or phonemes (minimal sound units).
    Magnetized or light-sensitive keyboards, new scanning methods, and eye-tracking systems can make the selections easier, but still cannot reduce the number of selections required to communicate whole thoughts.
    A nonspeaking person with cerebral palsy faces the task of accessing between 30 and 40 keys to produce a single sentence. A neurologically impaired person able to make one selection every five seconds requires many minutes of intense concentration and labor to produce a single statement.
    The normal response time in conversation is less than three seconds. If someone is forced to wait 10 seconds for a reply, anxiety results. If a person is forced to wait five minutes, communication falters; conversation becomes impossible.
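    The arithmetic behind these figures can be sketched in a few lines. This is only an illustration using the numbers quoted above (30 to 40 selections per sentence, one selection every five seconds, a normal response time of under three seconds); the function name is hypothetical, and the result is a best case that ignores mis-hits, corrections, and rest.

```python
# Figures taken from the text: 30 to 40 key selections per sentence,
# one selection every five seconds, and a normal conversational
# response time of under three seconds.
SECONDS_PER_SELECTION = 5
NORMAL_RESPONSE_SECONDS = 3


def sentence_time_seconds(selections, seconds_per_selection=SECONDS_PER_SELECTION):
    """Return the best-case time needed to enter one sentence, in seconds."""
    return selections * seconds_per_selection


for selections in (30, 40):
    seconds = sentence_time_seconds(selections)
    ratio = seconds / NORMAL_RESPONSE_SECONDS
    print(f"{selections} selections: {seconds} s ({seconds / 60:.1f} min), "
          f"about {ratio:.0f} times the normal response time")
```

    Even under these generous assumptions, a single sentence takes dozens of times longer than a conversational partner normally expects to wait.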
    If letters are too slow, what about words? Sadly, systems based on actuating words are too extensive and ironically too restrained. The more words there are, the longer it takes to scan through them. Imagine going one by one through 200 words. Even being able to jump through them five at a clip requires an enormous amount of time. And yet 200 words is really a small vocabulary.
    If direct selection is physically possible for the user, imagine a board with 400 words. The huge size of such a board, the smallness of the individual squares, and the intellectual complexity of remembering locations of words present obvious difficulties.
    Coding can reduce the size of a word board and increase the available vocabulary. A three-number sequence can address up to 999 words, but the human memory requirements are staggering. "What is word 643? Is it 'potato'? No, that's 512." The average person uses thousands of different words every day. And even if the word board could contain most of a user's vocabulary, a simple sentence like "Are you going to the store today?" would require the user to select 7 codes by hitting 21 keys. Research has shown that most people who have tried to use fixed-word boards return to alphabet-spelling boards.
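    The counts in this paragraph can be checked with a short sketch. It assumes, as the text does, that every word is entered as one fixed-length numeric code; the function names are hypothetical and exist only for illustration.

```python
CODE_LENGTH = 3  # digits per word code, as in the three-number scheme above


def addressable_words(code_length=CODE_LENGTH):
    """Count the codes 1 .. 10**n - 1 (for three digits, 001 to 999)."""
    return 10 ** code_length - 1


def keystrokes_for(sentence, code_length=CODE_LENGTH):
    """Keys needed when every word is entered as one fixed-length code."""
    words = sentence.split()
    return len(words) * code_length


sentence = "Are you going to the store today?"
print(addressable_words())       # 999 addressable words
print(len(sentence.split()))     # 7 words, hence 7 codes
print(keystrokes_for(sentence))  # 21 keys
```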
    What about a hybrid system that mixes words, letters, and word parts? Photo 1 shows a person using such a system, which he actuates with a headstick. The board has more than 100 squares, each inscribed with a letter, word, or word part. (The word parts are morphemes, un-, -ed, -ly, or frequently used letter combinations, -th, -wh, -tion, -ize.)
    This approach is an improvement but, like the others, is still very slow. An average sentence requires in excess of 20 actuations. To get the number of actuations below 20, the board would have to have more than 400 keys. By combining the demand this would make on human memory with the considerable effort required to make a single key selection, it becomes obvious that communication on these systems demands considerable effort from sender and receiver.
    A system based on letters is not the answer, and one based on words is worse. A mix of words and letters affords some relief, but not enough. People with communications disorders simply need more "bang to the punch" if they are going to be able to exploit the computer's potential for equalizing physical differences.
    The source of the difficulty seems to lie outside the realm of technology. The very nature of the alphabet is at the heart of the problem. The quantity of information borne by a single letter is quite small. Information transfers conducted in such small units will necessarily require many units. Biomedical engineering cannot change this. Perhaps a semantic approach can.

  • Addressing the Need
  • Minspeak began as a simple remedy to a single aspect of nonvocal communication needs: the problem of feedback, called phasis in linguistics. Sentences that check the channel of communication between sender and receiver serve a phatic function.
  • In face-to-face conversation, speakers need to be assured either through verbal or body language that the message is getting through. Because the listener is aware of this, he nods, makes sounds such as "unhuh" or "hmm," or says "yes, I see." If the message is complex or the speaker is anxious, the speaker may request additional phatic signs by saying "you know" or mentioning the receiver's name. When a person has a severe physical communication disorder, phatic problems take on a pressing importance for both conversational partners.
  • Able-bodied speakers have a wide range of vocabulary and syntactical phatic strategies at their disposal. In principle they can generate an infinite number of different phatic sentences, but they do not. Instead, the same phatic utterances are used again and again. A limited number of responses meets the five basic phatic needs most people experience in conversation. They are:
    • 1. To ascertain the quality and quantity of the information being received at the other end of the communication channel. (Am I being heard? Is my meaning comprehended?)
    • 2. To learn whether the information, once understood, is being judged correctly or erroneously. (Am I right, Joe?)
    • 3. To determine how the transmitted information is affecting the emotions of the receiver. (Doesn't he care the article is late?)
    • 4. To estimate how the transaction is affecting the receiver's opinion of the sender. (I won't tell her that; I'll sound so stupid.)
    • 5. To collect information about what's going to happen in the immediate future concerning: (a) the duration of the conversation, (b) possible topic shifts, (c) eventual results of the interaction.

Figure 1: The bulk of Minspeak's memory is erasable programmable read-only memory (EPROM).
The voice synthesizer used in the first prototype was the Votrax SC-01.
Figure 1.

  • I prepared 26 sentences to satisfy these phatic needs. The simplicity of implementation can be illustrated with the rotating-disk communication system. English sentences do not begin with question marks, so I decided to use them to designate the beginning of a phatic comment. Each of the 26 sentences is written on the user's lap tray and marked with a single letter. He can communicate an entire sentence by hitting the ? key and a letter. The receiver then consults the lap tray to see which sentence corresponds to the letter. For example, when the user selects ?C, the receiver can look at the lap tray and read "I'm pleased by what is being said."
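  • The lap-tray scheme amounts to a two-key lookup, which can be sketched as follows. This is an illustration only: the names are hypothetical, and the table holds just the one sentence quoted above, because the other 25 are not listed in the text.

```python
from typing import Optional

PHATIC_PREFIX = "?"

# Only the entry for "C" comes from the article. The lap tray held 26
# sentences, one per letter; the others are not given in the text.
LAP_TRAY = {
    "C": "I'm pleased by what is being said.",
}


def decode(selection: str, tray: dict = LAP_TRAY) -> Optional[str]:
    """Return the sentence for a two-key selection such as "?C".

    Returns None when the selection is not a phatic code or when the
    letter has no sentence assigned in this sketch.
    """
    if len(selection) != 2 or not selection.startswith(PHATIC_PREFIX):
        return None
    return tray.get(selection[1].upper())


print(decode("?C"))
```

  • Two selections stand in for a sentence that would otherwise have to be spelled out one character at a time.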
  • These sentences facilitated conversations on a number of different silent systems and had the potential of being even more effective if they could be generated on voice-synthesis equipment. If phatic sentences could be designed context free and reusable, so could other sentences. The success of the phatic experience could be applied to the rest of the communication process.
  • If users of communication aids had at their disposal a collection of several hundred multipurpose sentences, all sorts of routine but important transactions could be made easier for them and for their associates. If users could access these sentences through short codes, communication could be conducted almost at the speed enjoyed by able-bodied speakers.

Users can easily remember a large number of sentence sequences.

  • Taking It One More Step.
  • The redundant character of daily speech as seen in the phatic project became a primary concept of a new system for communication. I called it Minspeak, a parody on the "new-speak" in George Orwell's 1984, with the Min for minimum. My first task involved constructing thousands of sentences that were reusable and appropriate for most daily situations.
  • I designed short codes to access these sentences through a radical alteration in the representational information of an alphanumeric keyboard. Instead of letters, the keys bear images taken from daily life. These images stand for concepts rather than words. Some symbolize linguistic functions, some the activities of daily life; others denote styles of speech and mood.
  • Most important, each key has a range of significance, including a function, several activities, a style, and a mood. The sense of each key is defined by the order in which it is struck. This multiplicity of meaning is called polysemy and is the way human language works.
  • For example, in the sentence "They will play a tape of the play," no one would confuse the two uses of the word "play." Many of our words in English are polysemous and depend on their context for meaning.
  • Polysemy and redundancy are the foundation of Minspeak. The incorporation of polysemy into the design allows a small number of keys to have hundreds of referents. The amount of information carried by a letter is small; that borne by a word is considerably larger. The information in a visual image is enormous.
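  • A rough count shows why sequence-dependent meaning multiplies what a small keyboard can address. The sketch below is an upper bound under stated assumptions, not a description of the actual Minspeak vocabulary: it assumes 49 keys (the largest count consistent with "fewer than 50"), treats every key as a possible topic, and counts sequences made of a topic key hit twice followed by one or two further keys. The function names are hypothetical.

```python
def sequence_count(num_keys, selector_keys=1):
    """Upper bound on distinct sequences: one topic key (hit twice)
    followed by `selector_keys` further keys."""
    if num_keys < 1 or selector_keys < 0:
        raise ValueError("num_keys must be >= 1 and selector_keys >= 0")
    return num_keys * num_keys ** selector_keys


def strokes(selector_keys=1):
    """Keystrokes in such a sequence: two for the topic plus the selectors."""
    return 2 + selector_keys


NUM_KEYS = 49  # assumption: the largest board with "fewer than 50 keys"
for n in (1, 2):
    print(f"{strokes(n)} strokes: up to {sequence_count(NUM_KEYS, n)} sequences")
```

  • Only a fraction of these sequences would be assigned sentences in practice, but even three-stroke sequences leave room for thousands of them.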

  • Hardware Configuration
  • Minspeak requires a keyboard coupled with a microprocessor. The EPROMs are used to store complete sentences without regard to individual words, phonemes, or letters. In addition, a commercially available speech synthesizer such as the Votrax Speech PAC with an SC-01 voice-synthesizer chip can be used. The output of the voice synthesizer is in turn coupled to a loudspeaker which generates audible synthetic speech. Because the preprogramming is done on the basis of semantic rules, Minspeak will be able to achieve a vocal quality unobtainable with text-to-speech methods. (See figure 1 for a diagram of that configuration.)
  • The keyboard design is illustrated in figure 2, with each circle representing an individual key. Each key has an illustration of a common object or an action. In most Minspeak embodiments the majority of the keys also have identifying sequential numbers, a letter that corresponds to the number, a portion of the human anatomy, and a proper name. The keyboard design shown in figure 2 was intended to be used by someone with a relatively high level of intellectual achievement. (See table 1 for a detailed description of the keys.) Simpler keyboards are designed for users with different intellectual levels.
  • For example, with this keyboard design, key #10 has an illustration of philosopher Bertrand Russell, famous for his paradox, "the set of all sets, not sets of themselves, etc." This key is used to change topics. A simpler board would use the same key for this purpose but would illustrate it with a frog that is jumping. (See figure 3 for examples of other keyboard images.)

Figure 2: The images on Minspeak keys represent neither letters nor words, but concepts.
Because a picture is, indeed, worth a thousand words, the meanings of the symbols can change according
to the order in which the keys are struck. Each image is rich in associations. In short and obvious combinations,
they represent whole thoughts. When such combinations are actuated, sentences are spoken by the synthesizer.
(See table 1 for a description of the information on the keys.
See table 2 for examples of specific sequences.)

Figure 2.

  • Hardware Configuration continued
  • The microprocessor is programmed so that hitting any one key twice designates that key's central image as the topic (see figure 4). All keys hit thereafter designate ideas associated with that topic. This continues until the user signifies a change of topic by hitting key #10.
  • For example, when the user hits key #1 twice, the topic of eating is established. When key #2 is hit, the sentence "Get that food out of my mouth!" is read from memory and spoken through the voice synthesizer and loudspeaker. If key #3 had been hit after the eating topic had been established, the sentence "The position of my chair is not right for eating" would have been generated. Using key #4 would have produced "Look out; the food is getting on my clothes."
  • The programming also recognizes a single keystroke after the establishment of a topic as a request for a negative sentence or expression. This was done because negative sentences are often of an emergency nature and the user needs to be able to convey the message quickly and easily. A positive phrasing of each of the preceding examples can be made by modifying the key sequence. The following sequence, key #1 twice (to set the topic), key #30 once (to denote a positive response), and then key #2 or key #4, would result in "It's okay; I'm not choking" or "It's all right if a little food gets on my clothes."
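  • The key-sequence logic described in the last three paragraphs can be sketched as a small state machine. This is a minimal illustration, not the firmware of the actual device: the class and table names are hypothetical, only the eating topic and the five sentences quoted above are included, and details the article does not specify (such as what happens on an unassigned sequence) are handled by simply returning nothing.

```python
CHANGE_TOPIC_KEY = 10   # key #10 signals a change of topic
POSITIVE_KEY = 30       # key #30 requests the positive phrasing

TOPICS = {1: "eating"}  # hitting key #1 twice sets the eating topic

# (topic key, sentence key, positive?) -> stored sentence, all from the text
SENTENCES = {
    (1, 2, False): "Get that food out of my mouth!",
    (1, 3, False): "The position of my chair is not right for eating.",
    (1, 4, False): "Look out; the food is getting on my clothes.",
    (1, 2, True): "It's okay; I'm not choking.",
    (1, 4, True): "It's all right if a little food gets on my clothes.",
}


class Minspeak:
    def __init__(self):
        self.topic = None      # current topic key, or None if no topic is set
        self.pending = None    # first hit of a possible double hit
        self.positive = False  # True after key #30, until the next sentence key

    def press(self, key):
        """Process one keystroke; return a sentence if one is spoken."""
        if self.topic is None:
            # No topic yet: the same key hit twice in a row selects it.
            if self.pending == key and key in TOPICS:
                self.topic, self.pending = key, None
            else:
                self.pending = key
            return None
        if key == CHANGE_TOPIC_KEY:
            self.topic, self.pending, self.positive = None, None, False
            return None
        if key == POSITIVE_KEY:
            self.positive = True
            return None
        sentence = SENTENCES.get((self.topic, key, self.positive))
        self.positive = False
        return sentence


def run(keys):
    """Feed a key sequence to a fresh device; return the sentences spoken."""
    device = Minspeak()
    spoken = (device.press(k) for k in keys)
    return [s for s in spoken if s is not None]


print(run([1, 1, 2]))      # negative (emergency) sentence in three strokes
print(run([1, 1, 30, 4]))  # positive phrasing in four strokes
```

  • Once a topic is set, each further sentence in that topic costs one stroke (or two for a positive phrasing), which is where most of the savings come from.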

Table 1: Each key may have several functions depicted. The majority of the keys have a number,
a letter, a portion of human anatomy, a name, and an illustration.
The theme of the key is the topic
that is selected when the key is hit twice.
The information in this table corresponds to the keys pictured in figure 2.

Table 1.

Figure 3: Minspeak keyboards designed for people who can read have numbers and letters to aid
in sequencing and lessen any unnecessary memorization. The letter generally stands for a word
associated with the central concept behind the key.
Key #1 prefaces statements dealing with numbers.
The associated word is algebra. This key was designed for a 40-year-old man with
cerebral palsy who is beginning college.
Key #2 deals with cleaning and liquids. Its associated word is bath.
Key #20 deals with transport and is from a keyboard for a person
who does not like the traditional wheelchair symbol.
The associated word is throne. Key #4 is from a keyboard designed for a Minspeak user who does not read.
The associated idea is "call 4 help." Key #6 is for commands.
The associated word is fetch. The names in the upper left area of the keys are of family members and friends.

Figure 3.

  • The Influence of Language
    Language has such a pervasive influence on perceptions and thought processes that so far we've been unable to devise a way to measure the depth or extent of that influence. To say an issue is "just semantic" is a contradiction. A person may as well say "just life or death." Americans of African descent are not nitpicking when they insist that "black" replace "colored." Nor are women being petty when they use "Ms." The way a person is described affects how he or she is treated.
    People with physical disabilities can be isolated by the language used to describe them. I recently formed a small company and one of my two partners uses a communication aid because he has cerebral palsy. For me to call him or even think of him as "afflicted" would be bad for business. To call someone a "victim" of polio or to say a person is "suffering from multiple sclerosis" leaves a negative impression. Most people find it hard to deal with anyone they view as a "suffering victim." To say "He had polio" is easier and clearer.
    "Confined to a wheelchair" is an especially unfortunate phrase. People are not "confined" by wheelchairs; they use them for mobility. Some people are tortured for years by unsuccessful attempts to enable them to walk. Wheelchairs can operate with grace and efficiency. It's harmful to perpetuate prejudices against them.
    Adults with disabilities are often spoken of and hence thought of as children. I know a gray-haired professional with cerebral palsy whose wife was recently asked who the crippled boy with her was.
    On the other hand, try not to let this list of "don'ts" make you feel anxious, because people with disabilities are often isolated by other people's fear of making a faux pas. Be natural. Most people with disabilities are skillful in dealing with all kinds of situations. It's the prejudices of the able-bodied community that are destructive.
    When I am in a quandary about whether to use a certain word or not, I just ask myself, "Would I like my partner described that way?"
    More information is available in a pamphlet, "4 Letter Words in the Dictionary of the Disabled," from United Cerebral Palsy, 66 East 34th St., New York, NY 10016.

Figure 4: The Minspeak algorithm. To select a topic, strike the corresponding key twice.
All sequences then deal with that topic until another topic is selected.
Escapes, though not shown, are available for a variety of emergency situations.

Figure 4.

  • Hardware Configuration continued
  • For a severely disabled person to say these sentences on a text-to-speech or phonemic system, the user would have to select dozens of keys and be able to read and spell very well. Minspeak requires no more than four key selections, and reading and spelling don't matter.
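  • The comparison can be made concrete with one of the example sentences. The sketch assumes a letter board that needs one selection per character, spaces and punctuation included; real spelling aids differ, so the count is indicative only. The function name is hypothetical, and the Minspeak sequence is the four-stroke positive example given earlier.

```python
def spelling_selections(sentence):
    """Selections on a letter board, assuming one per character
    (letters, spaces, and punctuation all counted)."""
    return len(sentence)


# Topic key twice, positive key, sentence key, as in the example above.
MINSPEAK_SEQUENCE = (1, 1, 30, 4)

sentence = "It's all right if a little food gets on my clothes."
spelled = spelling_selections(sentence)
coded = len(MINSPEAK_SEQUENCE)
print(f"Spelled out: {spelled} selections")
print(f"Minspeak:    {coded} selections")
print(f"Ratio:       {spelled / coded:.1f} to 1")
```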
  • Many other variations and combinations of the keys are available to the user and will result in different sentences being output. For examples of other sequences, see table 2. For users with some linguistic sophistication, a series of keys can provide a method for altering existing sentences through insertions and deletions.
  • Other options include changing the person, number, tense, voice, and mood of verbs. Subjects and objects can be modified, eliminated, or reversed. A "fudge-factor" key introduces sequences to produce more than 100 sentences linguistically designed to correct or clarify enunciated sentences that inaccurately represent the user's thoughts. An example of one of these sentences could be "That's not what I meant." Style and context keys can easily alter the vocabulary and social tone of the stored sentences.

Table 2: To generate a sentence, the user must hit a key twice to set the topic,
and then hit one or more keys to select a sentence pertaining to the topic.
For example, if the user hits key #3 twice to set the topic and follows that by hitting key #1,
a sentence pertaining to oiling the chair would be generated.
The information in this table corresponds to the keys pictured in figure 2.

Table 2. 

Photo 2: The Express 3, developed by Prentke Romich Co., is a portable communication aid
powered by internal rechargeable batteries and designed for mounting on a wheelchair.
A special Express 3 is being prepared to implement the Minspeak concept.
The system will use a combination of power-strobed EPROM and CMOS RAM.
A Votrax Speech PAC with an SC-01 voice synthesizer marketed by Vodex
will be coupled to the output of the microprocessor.
It will retain other features of the original Express 3, including a 40-character
upper- and lowercase liquid-crystal display with corresponding
thermal printer and serial ASCII output for connection to
other computers and environmental-control devices.

Photo 2.

  • Considering the Possibilities
  • If you had 1000 sentences carefully constructed to cover most of the typical activities in your day, perhaps 75 percent of your utterances would be included in that group. Imagine adding 3000 more sentences composed to express a wide range of statements and questions concerning emotion and personal goals. If you then added another 1000 sentences which included statements of courtesies, greetings, thank yous, and you're welcomes, you would have enough sentences to cover most of the routine contingencies of life.
  • If communication-aid users could access any of these sentences with a few physical responses, their expressive difficulties would be on the road to resolution. Actual field work has shown that the number of sentences whose sequences can be easily remembered and used is unexpectedly high, perhaps approaching the thousands for a large percentage of potential users.
  • Minspeak is currently under development at the Prentke Romich Co. in Shreve, Ohio. PRC is working on the development of expressive communication aids for the severely physically disabled. A demonstration prototype of Minspeak will be available from the company later this year. Until now, the limited effectiveness of communication aids has caused agencies to question their definition as prostheses, and this has restricted the amount of outside funding available. Because of the advances represented by Minspeak, a coordinated multistate legal campaign has been launched to persuade private and public health care funding agencies to make funding available for purchase of this device.
  • People who hear and cannot speak have an enormous potential for contributing to society through their insights into human communication. It is my sincerest hope that Minspeak will give them access to modern technology that will enable them to make this contribution in an easier and more productive way.