SpeechJammer

SpeechJammer: A System Utilizing Artificial Speech Disturbance with Delayed Auditory Feedback

I won the 2012 Ig Nobel Acoustics Prize for this project. I am very honored to receive the prize. (2012/09/20)


PR movie

Concept
[Key Points]

  • A person's speech can be inhibited effectively by playing the sound of his/her own voice back to his/her ears, slightly delayed. We propose applying this phenomenon to support communication among people.
  • We have created a gun-type prototype combining a directional microphone and directional speaker.
  • Our research is currently in its initial stages, and the speech-inhibiting effect varies greatly between individuals and with how accustomed they are to it, so it is not ready for practical applications and we have no plans for release or sales.
  • On the other hand, we have published a simple, free application that allows users to experience the speech-inhibiting effect easily. The application runs on Windows PCs and Windows Phones.
[Overview]

The goal of this research was to develop "SpeechJammer", a system able to act on a person who is speaking and forcefully inhibit their speech. It is generally known that normal speech production can be interrupted by delaying the sound of the speaker's voice by several hundred milliseconds and feeding it back so that they hear it. This phenomenon can inhibit speech without inflicting physical pain, and has other excellent characteristics: any cognitive effects disappear immediately when the speaker stops speaking, and it affects only the speaker and is harmless to surrounding people. We prototyped a system combining a directional microphone and a directional speaker to inhibit the speech of a specific person from a distance. We are studying applications of this technique, such as controlling manners and rules in conversation or supporting presentation training.
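As a rough illustration of the delayed auditory feedback described above, the following Python sketch models the delay as a simple buffer of audio samples. It is not the code used in the prototype or in the application. The class name DelayLine, the 200 ms delay, and the 16 kHz sampling rate are assumptions chosen only for illustration, and capturing and playing real audio is left out.

```python
from collections import deque


class DelayLine:
    """Hypothetical helper: returns each input sample delay_ms later."""

    def __init__(self, delay_ms, sample_rate_hz):
        # Number of samples that corresponds to the requested delay.
        self.delay_samples = int(round(delay_ms * sample_rate_hz / 1000))
        # Pre-fill with silence so the first outputs are zeros.
        self.buffer = deque([0.0] * self.delay_samples)

    def process(self, samples):
        """Push microphone samples in, get the delayed samples for playback."""
        delayed = []
        for sample in samples:
            self.buffer.append(sample)
            delayed.append(self.buffer.popleft())
        return delayed


# Assumed values: 200 ms of delay at a 16 kHz sampling rate.
feedback = DelayLine(delay_ms=200, sample_rate_hz=16000)
silence_first = feedback.process([0.5] * 160)  # first 10 ms of speech
```

In a real system, process would be called on each block of samples from the microphone and its output sent to the speaker, so the speaker hears their own voice shifted by the chosen delay.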

[Research Background]

We have conducted research for many years on using information technology to support communication among people. One product of this research, "Presentation Sensei", is a self-training support system for giving presentations. It analyzes a speaker's speech using voice recognition technology and displays a warning if they are speaking too quickly. However, the warnings were given as a sound or visual stimulus, and were found not to be powerful enough. This led to the idea that more suitable training support could be achieved by applying a phenomenon based on characteristics of human cognition to forcefully inhibit speech. The principle used to inhibit speech in this instance is well known, and can be experienced at science museums such as the National Museum of Emerging Science and Innovation ("Miraikan"). We conceived this research after experiencing the effect ourselves during a visit to the Miraikan and feeling that it could, at some point, be useful in our research.
Note that this technology is not limited to applications for presentations. It could also be useful to ensure speakers in a meeting take turns appropriately, when a particular participant continues to speak, depriving others of the opportunity to make their fair contribution. Generally, speaking occupies the air, which is a shared resource and the medium that transmits sound, so rules and ethics are needed to ensure smooth dialog. We expect that this research can be used as a technological base to support this.

[Prototype Work]

Prototype work was done over approximately one month, together with a collaborating researcher, Koji Tsukada.

[Current Status and Future Plans]

This research is currently in its initial stages. The speech-inhibiting effect varies greatly between individuals and with how accustomed they are to it, and it is not ready for application in practical situations.
We will continue to make improvements with new prototypes, to achieve a more effective and accurate speech-inhibiting effect. In addition to the gun-type system, we will also test its effectiveness in various other situations, such as a system built into a meeting room.


After receiving many requests, we have released a free, simple speech inhibitor application allowing users to experience the speech-inhibiting effect easily using their own computer. The application runs on a Windows PC or a Windows Phone. See the download web page for details. Using the application with a Bluetooth headset (a small device incorporating wireless headphones and a microphone), remote inhibition of a speaker's speech can be tested easily. The application includes a button to inhibit speech immediately and a timer function. The latter in particular can be used as part of presentation training.

[Concern Regarding Abuse]

When we showed that a third party could, in the future, use this research to inhibit a person's speech accurately and remotely, it was pointed out that the technology could be abused to deprive people of freedom of expression. Our thoughts in response are as follows.
  • We began this research with the idea that freedom of expression should be afforded equally, and with the aim of overcoming the unfairness that arises when speech is monopolized by a particular speaker, as in "the loudest voice wins".
  • This issue arises because the decision regarding what sort of expression is unfair depends on the ethical sense of the person using SpeechJammer. As such, we recognize that if this technology is abused, situations could occur in which people's expression could be suppressed arbitrarily.
  • However, it is important to note that with this technology, compared with conflict using knives or guns, a place for discussion and resolution remains even if both parties use it, rather than just causing destruction. If someone uses SpeechJammer on you against your will to monopolize the discussion, you could also use SpeechJammer and wait until it was possible to continue the discussion calmly.
  • Conversely, some people imagine a dystopia in which organizations or governments install an automatic SpeechJammer everywhere to suppress expression. In such a case, the above type of discussion is not possible. 
  • However, the technology can be circumvented effectively by simply using earplugs, so the possibility of the above scenario should not cause concern.
  • This technology is suited to applications such as personal presentation training and meetings in which all participants have agreed to the rules of interaction and shared space. These are ordinary situations in which those concerned understand that their speech may be inhibited under certain conditions. They would be unlikely to use earplugs or other means to circumvent the system, or would be criticized immediately if they did.

[Researcher Information]

Kazutaka Kurihara, Associate Professor
Department of Computer Science
Tsuda College
email: qurihara [ at ] gmail.com
Bio:
Kazutaka Kurihara is currently working as a senior researcher at AIST. He completed his PhD studies in March 2007 at the University of Tokyo. His primary research topic is rethinking "flexibility" in presentation tools. He is also interested in pen UIs, multi-modal UIs, practical applications of recognition technologies, and UIs for educational fields.

Koji Tsukada, Researcher
Precursory Research for Embryonic Science and Technology (PRESTO)
Japan Science and Technology Agency (JST)
email: tsuka [at] acm.org
He has developed various interactive devices and techniques for ubiquitous computing, including:

- Ubi-Finger: a gesture input device for mobile/ubiquitous environments (2001)
- ActiveBelt: a belt-type tactile navigation system (2004)
- EyeCatcher: a digital camera for capturing a variety of natural looking facial expressions in daily snapshots (2010)
- TagTansu: a wardrobe to support creating a picture database of clothes (2008)
- EaTheremin: a fork-type instrument for dietary entertainment (2011)


Bio: 
Koji Tsukada has been a researcher at the Japan Science and Technology Agency (JST) since 2012. Before joining JST, he worked as an assistant professor at Ochanomizu University for five years and was a researcher at the National Institute of Advanced Industrial Science and Technology (AIST) for three years. He received his Ph.D. from Keio University in 2005. His research interests include human-computer interaction, ubiquitous computing, home applications, and prototyping tools.

[Policy for News Gathering]


We can provide information to the media on this topic through e-mail interviews.
If you would like a telephone interview, we will try to arrange our schedule. Note, however, that live broadcasting would be risky because we are not native English speakers.
If you need a photo session, please first try the higher-resolution photographs, which can be downloaded from here.
Also, the photographs and video published on these web pages and YouTube can be used in the form in which they appear.
Policy for face-to-face interviews with/without video sessions:
  • If you would like a face-to-face interview without a video session, we will try to arrange our schedule. Note, however, that live broadcasting would be risky because we are not native English speakers.
  • Our research is currently in its initial stages, and the speech-inhibiting effect varies greatly between individuals and with how accustomed they are to it, so video sessions with live demonstrations using the gun-type prototype are respectfully declined.
  • However, the simple speech-inhibiting application, which is available free-of-charge, can be used freely in program production or other uses. If you want video sessions including live demonstrations using the simple speech-inhibiting application, we will try to arrange our schedule.


Link to a related video on YouTube
Publications 
  • Kazutaka Kurihara, Koji Tsukada, "SpeechJammer: A System Utilizing Artificial Speech Disturbance with Delayed Auditory Feedback," http://arxiv.org/abs/1202.6106, 2012.

© Kazutaka Kurihara