For our third task, we decided that the robot should be able to place the item it has picked up into the palm of a human's hand. To accomplish this, we needed to detect the human's palm and extract the coordinates of its center point to feed into our robot system. We did this using MediaPipe's Hand Landmark Detection and some simple calculations in the palm detection node.
MediaPipe Hand Landmark Detection is a real-time hand tracking solution provided by MediaPipe. It first locates any hands within the camera's field of view. The model then predicts 21 3D landmarks for each hand, representing key points including the fingertips, knuckles, and the base of the palm. These landmarks provide a detailed skeletal representation of the hand's pose. This real-time landmark data was crucial for detecting palms.
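To make the landmark layout concrete, the sketch below lists the 21 landmark indices as a small Python enum. The names and numbering follow MediaPipe's published hand landmark model; the enum itself is a standalone illustration written for this write-up, not code imported from MediaPipe or taken from our node.

```python
from enum import IntEnum


class HandLandmark(IntEnum):
    """Indices of the 21 hand landmarks, following MediaPipe's numbering.

    This is a standalone copy for illustration; it does not import MediaPipe.
    """
    WRIST = 0
    THUMB_CMC = 1
    THUMB_MCP = 2
    THUMB_IP = 3
    THUMB_TIP = 4
    INDEX_FINGER_MCP = 5
    INDEX_FINGER_PIP = 6
    INDEX_FINGER_DIP = 7
    INDEX_FINGER_TIP = 8
    MIDDLE_FINGER_MCP = 9
    MIDDLE_FINGER_PIP = 10
    MIDDLE_FINGER_DIP = 11
    MIDDLE_FINGER_TIP = 12
    RING_FINGER_MCP = 13
    RING_FINGER_PIP = 14
    RING_FINGER_DIP = 15
    RING_FINGER_TIP = 16
    PINKY_MCP = 17
    PINKY_PIP = 18
    PINKY_DIP = 19
    PINKY_TIP = 20


# The five fingertip landmarks, grouped for convenience.
FINGERTIPS = tuple(lm for lm in HandLandmark if lm.name.endswith("_TIP"))
```

Each detected hand yields one point per index, so a lookup such as HandLandmark.WRIST selects the point at the base of the palm.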
Once the hand landmarks had been found, we used them to calculate the center point of the palm. To do so, we took the five landmarks across the top of the palm and the two landmarks at the wrist. We averaged the x and y positions of these landmarks to obtain the palm center point. The x and y coordinates were then passed back to the main node for frame conversion.
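The averaging step can be sketched as follows. This is a minimal illustration rather than the code from our palm detection node: the function names palm_center and to_pixels are hypothetical, and the default index set (wrist, thumb CMC, thumb MCP, and the four finger MCP joints) is only an assumption about which seven landmarks match the description above. The pixel conversion relies on MediaPipe reporting x and y normalized to the range 0 to 1 relative to the image width and height.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]

# ASSUMPTION: one plausible choice of seven palm landmarks in MediaPipe's
# numbering (0 wrist, 1 thumb CMC, 2 thumb MCP, 5/9/13/17 finger MCPs).
# The exact indices used in the original node are not stated.
PALM_INDICES = (0, 1, 2, 5, 9, 13, 17)


def palm_center(landmarks: Sequence[Point],
                indices: Iterable[int] = PALM_INDICES) -> Point:
    """Return the mean (x, y) of the selected landmarks."""
    points = [landmarks[i] for i in indices]
    if not points:
        raise ValueError("at least one landmark index is required")
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return cx, cy


def to_pixels(center: Point, width: int, height: int) -> Tuple[int, int]:
    """Scale a normalized (x, y) point to integer pixel coordinates."""
    return int(round(center[0] * width)), int(round(center[1] * height))
```

With a list of 21 (x, y) pairs for one hand, palm_center(landmarks) gives the normalized center, and to_pixels(center, 640, 480) would map it onto a 640 by 480 image before any further frame conversion.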