iCub Tries to Play an Electronic Keyboard

Music performance is one of the most demanding cognitive challenges for the human mind. This project gives the iCub a unique opportunity to interact with its environment: the robot computes the 3D location of each electronic keyboard key it can see, listens to a sequence of musical notes a human plays on the keyboard, and presses the same notes in the same order. The design consists of three modules: a vision module, a pitch detection module, and a motor control module.

The figure shows a keyboard labeled with scientific pitch notation and integer numbers along the pitch axis, with C4 as the origin.
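One plausible encoding of that integer axis (an assumption; the figure's exact numbering is not reproduced here) is the semitone offset from the C4 origin, which can be sketched as:

```python
# Semitone offsets of each pitch class relative to C within one octave.
NOTE_OFFSETS = {'C': 0, 'C#': 1, 'D': 2, 'D#': 3, 'E': 4, 'F': 5,
                'F#': 6, 'G': 7, 'G#': 8, 'A': 9, 'A#': 10, 'B': 11}

def key_index(note):
    """Integer key index with C4 as the origin (C4 -> 0, C5 -> 12, B3 -> -1)."""
    name, octave = note[:-1], int(note[-1])
    return NOTE_OFFSETS[name] + 12 * (octave - 4)

print([key_index(n) for n in ('C4', 'G4', 'C5', 'A4')])  # [0, 7, 12, 9]
```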

Vision Module

The vision module processes the visual input received from the iCub camera and obtains the 3D location of each keyboard key in the robot's sight. We do not use a set of predefined key locations, and no markers are attached to the keys to give visual clues. The 3D location of each key is obtained using geometric relationships.

The image as seen by the robot

Thresholding

The contours of each key and their 2D image coordinates are found

Find the C notes
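The thresholding and contour steps above can be sketched in simplified form. The snippet below is a toy stand-in: it uses a column projection instead of full contour extraction, and a synthetic image in place of the camera frame, but it shows how bright key regions in a thresholded image yield 2D centroids.

```python
import numpy as np

def find_key_columns(image, threshold=128):
    """Locate candidate keys as runs of bright columns in a thresholded image.

    White keys appear as bright vertical bands, so runs of columns whose
    pixels mostly exceed the threshold are treated as keys; the centroid
    column of each run stands in for the key's 2D image coordinate.
    """
    mask = image > threshold
    bright = mask.mean(axis=0) > 0.5          # per-column "is key" flag
    keys, start = [], None
    for x, flag in enumerate(bright):
        if flag and start is None:
            start = x                          # run begins
        elif not flag and start is not None:
            keys.append((start + x - 1) / 2.0)  # run ends: record centroid
            start = None
    if start is not None:                      # run reaches the image edge
        keys.append((start + len(bright) - 1) / 2.0)
    return keys

# Synthetic test image: three white "keys" on a dark background.
img = np.zeros((40, 30))
img[:, 2:6] = 255
img[:, 10:14] = 255
img[:, 20:24] = 255
print(find_key_columns(img))  # [3.5, 11.5, 21.5]
```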

Pitch Detection Module

The pitch detection module detects a musical event and reports which musical notes were played. Given a wave file, onset detection is first performed so that the program knows when a new musical note begins. Then, for each note, the YIN pitch detection algorithm is used to detect its frequency.

The result of the onset detection for a wave file containing C4, D4, E4, F4, and G4: C4 is played at the 50th frame and G4 at the 233rd frame. A frame contains 2048 samples
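A minimal energy-based onset detector over 2048-sample frames, in the spirit of the step described above (the project's exact detector may differ), can be sketched as:

```python
import numpy as np

FRAME = 2048  # samples per frame, matching the figure

def detect_onsets(signal, ratio=4.0, floor=1e-4):
    """Flag frames whose energy jumps well above the previous frame's.

    A note onset is declared when a frame's mean-square energy exceeds
    both the previous frame's energy by `ratio` and a noise floor.
    """
    n_frames = len(signal) // FRAME
    energy = np.array([np.mean(signal[i * FRAME:(i + 1) * FRAME] ** 2)
                       for i in range(n_frames)])
    return [i for i in range(1, n_frames)
            if energy[i] > max(energy[i - 1] * ratio, floor)]

# Synthetic input: silence, then a 440 Hz tone starting at frame 3.
sr = 44100
sig = np.zeros(8 * FRAME)
t = np.arange(5 * FRAME) / sr
sig[3 * FRAME:] = 0.5 * np.sin(2 * np.pi * 440 * t)
print(detect_onsets(sig))  # [3]
```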

The result of the YIN pitch detection for F4. The reported frequency is 1/[(252-126)/44100] = 350 Hz; the true frequency of F4 is 349.23 Hz

Motor Control Module

The vision module computes the 3D location of each key in the robot's view, and the pitch detection module finds which keys were pressed by the human. The motor control module then sends commands to the arm joints so that the iCub can move its arms and hands to the desired positions using inverse kinematics.
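The iCub's own Cartesian controller handles the full multi-DOF arm; as a self-contained illustration of the inverse-kinematics idea only, here is the closed-form solution for a planar two-link arm (the link lengths and target below are made up for the example):

```python
import numpy as np

def two_link_ik(x, y, l1, l2):
    """Closed-form IK for a planar 2-link arm (elbow-down solution).

    Given a target (x, y) and link lengths l1, l2, returns the shoulder
    and elbow angles that place the end effector on the target.
    """
    r2 = x * x + y * y
    c2 = (r2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)  # law of cosines
    if abs(c2) > 1:
        raise ValueError("target out of reach")
    q2 = np.arccos(c2)
    q1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(q2), l1 + l2 * c2)
    return q1, q2

def forward(q1, q2, l1, l2):
    """Forward kinematics, used to verify the IK solution."""
    x = l1 * np.cos(q1) + l2 * np.cos(q1 + q2)
    y = l1 * np.sin(q1) + l2 * np.sin(q1 + q2)
    return x, y

q1, q2 = two_link_ik(0.3, 0.2, l1=0.25, l2=0.2)
print(forward(q1, q2, 0.25, 0.2))  # recovers (0.3, 0.2)
```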

The figure shows the path that the hand of the iCub robot follows when four musical notes, C4, G4, C5, and A4, are played by a human.

Demo

The video below shows the robot pressing C5 after hearing it: