Software for controlling the android robot

This section describes the dialogue robot system software that is lent to teams at the time of registration. It consists of the following components:

  • Programs that use sensors to recognize the customer's position, voice, facial expression, gender, age, and so on.

    • ①Image recognition program (face_recognition)

      • This program acquires the position of each face rectangle in the camera image, along with each face's expression, age, gender, beauty score, and whether glasses are worn. Face++ is used to recognize this per-face information; for details, refer to the Face++ website. A minimal sketch of the underlying Face++ API call is shown after this list.

    • ②Speech recognition program (GoogleSpeechRecognitionServer)

      • When voice input arrives at the microphone, the program streams back intermediate recognition results and then returns the final recognized text together with its confidence score. Google's Speech-to-Text is used. A minimal streaming-recognition sketch is shown after this list.

  • Programs for controlling Android I and for synthesizing Android I's voice.

    • ③Speech synthesis program (AmazonPollyServer)

      • When text to be spoken is received as input, the synthesized voice is played from the speaker. Playback can be stopped midway, and the emotional parameters of the voice can be changed. Amazon Polly's speech synthesis API is used. A minimal sketch of the underlying Polly call is shown after this list.

    • ④Lip movement generation program (OculusLipSyncServer)

      • When synthesized voice is played, Android I's lip movement is controlled to synchronize with it. Users do not need to access this program directly. Oculus Lipsync Unity is used.

    • ⑤Neck movement, upper body posture generation program (MiracleHuman)

      • Given Android I's emotion, gaze direction, and degree of interest in the object to be looked at (dialogue partner, monitor, etc.), this program automatically generates an appropriate gaze and upper-body posture and controls Android I accordingly. If a predefined gesture such as bowing is specified, that gesture is also played back on Android I. Unity is used.

    • ⑥Facial expression generation & motion integration program (JointMapperPlusUltraSuperFace)

      • Given a facial expression label for Android I as input, this program automatically generates the corresponding facial expression and controls the positions of the facial parts (cheeks, eyebrows, etc.).

  • A simulator program for checking the behavior of Android I with CG.

    • The simulator is included in programs ④, ⑤, and ⑥.

    • While developing a program, you cannot check the resulting movements on Android I itself, but with the simulator you can check in CG the lip shapes, gaze, upper-body posture, gestures, and facial expressions that the actual Android I would generate.
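
As a reference for the image recognition program ①, the following is a minimal sketch of a direct call to the Face++ Detect API, which the provided face_recognition program wraps. The endpoint and the attribute names in return_attributes follow Face++'s public documentation and may differ from what the lent program actually requests; the API key and secret are placeholders.

```python
# Minimal sketch of a direct call to the Face++ Detect API (wrapped by the
# provided face_recognition program).  Credentials below are placeholders.
import requests

FACEPP_DETECT_URL = "https://api-us.faceplusplus.com/facepp/v3/detect"

def detect_faces(image_path: str, api_key: str, api_secret: str) -> list:
    """Return one dict per detected face with its rectangle and attributes."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            FACEPP_DETECT_URL,
            data={
                "api_key": api_key,
                "api_secret": api_secret,
                # Attributes roughly matching what face_recognition exposes;
                # the exact set requested by the lent program may differ.
                "return_attributes": "gender,age,emotion,beauty,eyestatus",
            },
            files={"image_file": f},
            timeout=10,
        )
    resp.raise_for_status()
    return resp.json().get("faces", [])

if __name__ == "__main__":
    for face in detect_faces("camera_frame.jpg", "YOUR_KEY", "YOUR_SECRET"):
        print(face["face_rectangle"], face.get("attributes", {}))
```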
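
For the speech recognition program ②, the following sketch illustrates streaming recognition with interim and final results using Google's Speech-to-Text Python client. It shows only the underlying API, not the interface of GoogleSpeechRecognitionServer itself; the audio format and language code are assumptions.

```python
# Minimal sketch of streaming recognition with Google Speech-to-Text.
# Interim hypotheses and final results (with confidence) are printed as they arrive.
from google.cloud import speech

def recognize_stream(audio_chunks, language_code: str = "ja-JP") -> None:
    """audio_chunks: iterable of raw 16 kHz, 16-bit mono PCM byte strings."""
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code=language_code,
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config,
        interim_results=True,  # emit intermediate hypotheses while the user speaks
    )
    requests = (
        speech.StreamingRecognizeRequest(audio_content=chunk)
        for chunk in audio_chunks
    )
    for response in client.streaming_recognize(streaming_config, requests):
        for result in response.results:
            top = result.alternatives[0]
            if result.is_final:
                print("final:", top.transcript, "confidence:", top.confidence)
            else:
                print("interim:", top.transcript)
```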
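
For the speech synthesis program ③, the following sketch shows the underlying Amazon Polly call with boto3. Playback, interruption, and emotional voice parameters are handled by the provided AmazonPollyServer and are not shown here; the voice ID and output file name are assumptions.

```python
# Minimal sketch of the Amazon Polly synthesis call underlying AmazonPollyServer.
# Only text-to-audio synthesis is shown; playback control is handled by the server.
import boto3

def synthesize(text: str, voice_id: str = "Mizuki", out_path: str = "speech.mp3") -> str:
    """Synthesize `text` to an MP3 file and return its path."""
    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=text,
        VoiceId=voice_id,       # "Mizuki" and "Takumi" are Japanese Polly voices
        OutputFormat="mp3",
    )
    with open(out_path, "wb") as f:
        f.write(response["AudioStream"].read())
    return out_path

if __name__ == "__main__":
    print(synthesize("いらっしゃいませ。ご案内いたします。"))
```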

The overall configuration of the dialogue robot system using the software above, and the PC configuration of the entire system, are as follows. Participating teams will develop the "dialogue control" part; a purely hypothetical sketch of such a control loop is shown below. The image recognition and speech recognition parts (① and ②) may also be developed independently, without using the software above.
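
The following is a purely hypothetical sketch of what the "dialogue control" part might look like: a policy that reacts to recognition results from ① and ② and sends commands toward ③, ⑤, and ⑥. The ports, JSON message format, and command names are invented for illustration only and do not describe the actual interfaces of the lent software.

```python
# Purely illustrative sketch of a "dialogue control" loop tying the provided
# programs together.  The ports and JSON messages below are hypothetical
# placeholders; use the interfaces documented with the lent software (①-⑥).
import json
import socket

def send_command(port: int, payload: dict) -> None:
    """Send one hypothetical JSON command to a local server and close."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall((json.dumps(payload) + "\n").encode("utf-8"))

def handle_user_utterance(text: str, face_info: dict) -> None:
    """Decide Android I's next behavior from the recognized inputs."""
    # Very naive dialogue policy, for illustration only.
    if "こんにちは" in text:
        reply = "こんにちは。本日はどのようなご用件でしょうか。"
        expression, gesture = "smile", "bow"
    else:
        reply = "恐れ入りますが、もう一度お願いできますか。"
        expression, gesture = "neutral", None

    send_command(5001, {"type": "speak", "text": reply})             # toward ③
    send_command(5002, {"type": "expression", "label": expression})  # toward ⑥
    send_command(5003, {"type": "gaze", "target": "partner",
                        "interest": 0.8})                            # toward ⑤
    if gesture:
        send_command(5003, {"type": "gesture", "name": gesture})     # toward ⑤
```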