Google Demonstrates AI Making a Phone Call on Your Behalf

Oren Scheer - June 1st, 2018

“How long is the wait usually to, uh, be seated?”

“Do you have anything between 10 am, and, uh, 12 pm?”

“Hey, um, I wanted to know what are your hours for today.”


The above are all perfectly plausible sentences. They could have been uttered by anyone carrying out a natural conversation, in any of the routine exchanges that everyone must have at some point in their life. Yet these three sentences don’t originate from an ordinary conversation between two people. Instead, these perfectly natural sentences were spoken by the Google Assistant. Yes, the same Google Assistant that is on every recent Android phone, competing with the likes of Siri and Amazon’s Alexa.

Two years ago, Google introduced the Google Assistant, which largely replaced its previous Google Now interface. This virtual assistant could engage in two-way conversation with the user, meaning that it could not only understand the user’s input and provide an answer accordingly, but could also answer context-based questions and carry out more complicated conversations. The Assistant can be used for a variety of purposes, including reading and sending text messages, creating reminders, setting alarms, performing research, and changing device settings. Of course, this technology is always improving: as Google’s algorithms get better at understanding the specific demands of the user, the Assistant gains more functionality.

Yet last month, at this year’s edition of Google’s annual I/O conference, Google unveiled Duplex, an “extension” of Google Assistant and its most ambitious application thus far. The purpose of Duplex is to make calls on behalf of the user for simple tasks such as reserving a table at a restaurant or scheduling an appointment. The idea is that Duplex can engage in conversation with the employee on the other end of the line, asking the right questions to reach its goal - all while sounding completely human and handling any difficulties that the other person’s language might present.

At this point, a healthy bit of skepticism regarding this technology is not a bad thing. The demonstration consisted of a keynote speech by Sundar Pichai, CEO of Google, followed by the playback of two pre-recorded conversations between Duplex and two businesses. In the first, Duplex successfully made a hair salon reservation, navigating the conversation to compromise on a time. The second demonstration was more challenging, with the person on the other end of the line speaking heavily accented English and struggling at times to understand what the Duplex AI was asking. Eventually the call reached a conclusion, with Duplex understanding that, although it had been instructed to make a reservation, none was actually needed based on what the employee said.

On the same day, Google published a blog post with a few other similar recordings. Since there have been no live demonstrations, it’s quite difficult to verify Google’s claim that these are unscripted conversations with legitimate businesses. The technology is supposed to begin public testing in the summer, at which point its actual effectiveness can be gauged.

While most of the crowd of tech enthusiasts and journalists gathered at the event reacted with astonishment (in a good way), the technology raises many ethical questions. After all, to the average listener who has never heard of it, this sounds like two real people talking. In the demos, Duplex never identified itself as a robot, so it’s possible that the employees talking to it thought they were speaking to a human as well. Google later confirmed that, in the actual rollout of this technology, Duplex would identify itself as an AI at the start of the conversation.

Without this identification, telling Duplex and a human apart is quite hard. Google Assistant’s voice in the phone calls sounded mostly normal, with the slight rise in pitch at the end of each sentence that is quite common amongst English speakers. Duplex also frequently uses fillers such as “uh” or “umm” during a conversation. These serve a dual purpose: they make Duplex sound much more natural, ensuring a smooth conversation, and they buy an instant for the more complicated language processing that the AI must perform. However, they also raise a significant ethical concern for some: should we really be making robots that sound just like we do?
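To get a feel for that second purpose, consider how a filler can mask processing latency. The sketch below is purely illustrative (Google has not published Duplex’s internals); `respond`, `slow_model`, and the timing values are all invented for the example. The idea is simply that whenever the reply generator takes too long, a filler is emitted so the line never goes silent.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

FILLERS = ["um", "uh", "hmm"]  # disfluencies like those heard in the demo calls

def respond(generate_reply, patience=0.6):
    """Run a (hypothetical) reply generator in the background and emit a
    filler whenever it takes longer than `patience` seconds to finish."""
    spoken = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(generate_reply)
        while True:
            try:
                reply = future.result(timeout=patience)
                break
            except FutureTimeout:
                spoken.append(random.choice(FILLERS))  # mask the delay
    spoken.append(reply)
    return " ".join(spoken)

# A deliberately slow stand-in for the language model triggers one filler.
def slow_model():
    time.sleep(1.0)
    return "do you have anything between 10 am and 12 pm?"

print(respond(slow_model))  # e.g. "uh do you have anything between 10 am and 12 pm?"
```

In a real voice system the filler would be synthesized and played aloud while generation continues in parallel; the sketch reduces that idea to prepending filler tokens to the eventual reply.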

At the moment, this technology is very much in development, and over the next few months we will undoubtedly see more of it. Some people will appreciate letting Google Assistant take over their mundane reservation calls; others will recoil at the thought of a robot impersonating a human. Regardless, in only two years the Google Assistant has advanced so far that the future ubiquity of AI in our lives could be closer than we once thought.