In this project, we designed and implemented an object tracking system using a robotic arm. The system uses a 7-DOF Kinova robot manipulator with a stereo-vision Intel RealSense depth camera mounted on the end-effector in an eye-in-hand configuration. The control objective is to keep the center of the object at the center of the camera frame. Using the camera feed, an object is detected in the camera frame and its center is estimated. The camera velocity in Cartesian space is obtained as the product of the pseudoinverse of the Image Jacobian and the error between the object center and the camera center. This is then converted into joint velocities using the inverse of the robot Jacobian, which gives the joint velocities required to track the object. Using the velocity controller, the end-effector is moved in the required direction and the object is kept centered in the frame. Object tracking systems of this kind, based on visual servoing, are used in cinematography, sports filming (player tracking), and manufacturing, e.g., tracking and picking up a moving object on a conveyor belt.
Kinova Gen3 Robot: The Kinova Gen3 is a 7-degree-of-freedom robotic manipulator with a continuous payload capacity of 4 kg and a reach of 902 mm. It has modular hardware, the robust Kinova Kortex application programming interface, and a versatile end-effector interface module. A 2D/3D vision module can optionally be integrated with the robot.
[Figure: Kinova Gen3]
Method:
Color Thresholding: We use color thresholding to select a color range and produce a binary (black-and-white) mask. Color thresholding is generally performed in HSV (Hue-Saturation-Value) space because HSV is more robust to lighting changes and gives consistent results.
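A minimal sketch of this step with OpenCV, assuming a BGR input frame; the HSV bounds shown are illustrative placeholders (here roughly a red hue band) and must be tuned for the actual target color:

```python
import cv2
import numpy as np

def threshold_color(frame_bgr, lower_hsv=(0, 120, 70), upper_hsv=(10, 255, 255)):
    """Return a binary mask of pixels that fall inside the given HSV range."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # inRange produces a black-and-white image: 255 where the pixel lies
    # inside [lower_hsv, upper_hsv], 0 elsewhere.
    return cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
```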
Center Estimation of Blob: Once the blob has been detected and segmented from the environment, we need a point on the blob to track, and the most intuitive point is its center. We estimate the center of the object by calculating image moments, or simply by averaging all the bright pixels after color thresholding. After the calculation, we obtain the center of the object in the pixel frame, i.e., in pixels measured from the upper-left corner of the image.
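A sketch of the moment-based estimate on the binary mask from the previous step (for a binary image, the moment centroid coincides with the average of the bright-pixel coordinates, so both methods mentioned above give the same result):

```python
import cv2

def blob_center(mask):
    """Estimate the blob centroid, in pixels from the upper-left corner."""
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:
        return None  # no bright pixels: nothing to track in this frame
    # Centroid from the zeroth and first image moments; equivalent to
    # averaging the (x, y) coordinates of all bright pixels in the mask.
    return m["m10"] / m["m00"], m["m01"] / m["m00"]
```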
Image Jacobian: The Image Jacobian relates the velocity of a point in the pixel frame to the camera velocity in the world frame; for a single point feature it is a 2 x 6 matrix. If the object moves, then by looking at two consecutive frames we can get the position of the center in both frames and estimate the velocity of the object in the pixel frame. This velocity can then be converted to the camera velocity by multiplying it with the pseudoinverse of the Image Jacobian (the 2 x 6 matrix is not square, so the Moore-Penrose pseudoinverse is used). The defining relation is: Pixel Velocity = Image Jacobian * Camera Velocity.
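For reference, a sketch of the standard point-feature interaction matrix from the visual-servoing literature, in pixel coordinates, assuming a pinhole camera with focal length f (in pixels) and an estimated feature depth Z; the function name and parameters are illustrative:

```python
import numpy as np

def image_jacobian(u, v, Z, f):
    """Standard 2x6 interaction matrix for a single point feature.

    u, v : feature coordinates relative to the principal point (pixels)
    Z    : estimated depth of the feature (meters)
    f    : focal length (pixels)
    """
    return np.array([
        [-f / Z,    0.0, u / Z,          u * v / f, -(f**2 + u**2) / f,  v],
        [   0.0, -f / Z, v / Z, (f**2 + v**2) / f,         -u * v / f, -u],
    ])
```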
Jacobian Inverse: Once we get the camera velocity from the pseudoinverse of the Image Jacobian, we essentially have the end-effector velocity. This can be converted to joint velocities using the inverse of the robot Jacobian (for the 7-DOF arm, the 6 x 7 Jacobian again calls for a pseudoinverse). Multiplying the inverse Jacobian by the end-effector velocity gives the joint velocities required to move the end-effector in that particular direction.
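As a sketch of this step, assuming the 6 x 7 geometric Jacobian J of the arm is available from the robot model (names are illustrative):

```python
import numpy as np

def joint_velocities(J, ee_twist):
    """Map a 6-vector end-effector twist to 7 joint velocities.

    J        : 6x7 geometric Jacobian of the manipulator
    ee_twist : desired end-effector velocity [vx, vy, vz, wx, wy, wz]
    """
    # The pseudoinverse handles the redundant (7-DOF) arm: it returns the
    # minimum-norm joint velocity that achieves the requested twist.
    return np.linalg.pinv(J) @ ee_twist
```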
Implementation:
ROS Kortex: Kinova provides the simulation package ros_kortex, which contains ROS packages for using the Kinova Gen3 arm in simulation as well as for interacting with a real Gen3 arm. The catch was that the URDF file of the Gen3 arm did not have a camera integrated into it. We had to add the camera by making a couple of changes to the URDF file, adding a snippet to spawn the camera on the end-effector of the arm. We added the Intel RealSense camera using the Gazebo plugin libgazebo_ros_openni_kinect.so.
[Figures: View from Camera in Home Position; Simulation Environment; Original Image; Mask Detection after Color Thresholding; Segmented Object]
Velocity Control: We now have the current and the desired locations of the feature point, the desired location being the center of the frame. The error is calculated and multiplied by a gain for fast convergence. This scaled error is then multiplied by the pseudoinverse of the Image Jacobian to obtain the required camera velocity. This camera velocity is in the body frame, so we convert it to the space frame. It is the required camera velocity, but since we can only control the joint velocities, we convert it to joint velocities by multiplying it with the pseudoinverse of the robot Jacobian. This gives the joint velocities, which are then applied to the robot. The kortex driver, which acts as an interface between the robot and ROS, subscribes to the topic "/my_gen3/in/joint_velocity". When a message with all 7 joint velocities is published to this topic, the driver applies them in simulation. This cycle repeats on each new image: the image is thresholded, the center is estimated, and the required joint velocities are applied.
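A condensed sketch of one such control cycle, reusing the helper functions from the earlier snippets. J_robot (the 6 x 7 robot Jacobian) and T_body_to_space (the 6 x 6 twist transform from body to space frame) are placeholders assumed to come from the robot model, and the Base_JointSpeeds/JointSpeed messages are from the ros_kortex kortex_driver package:

```python
import numpy as np
import rospy
from kortex_driver.msg import Base_JointSpeeds, JointSpeed

GAIN = 50.0  # moderate gain, per the results section

rospy.init_node('visual_servo_tracker')
pub = rospy.Publisher('/my_gen3/in/joint_velocity', Base_JointSpeeds, queue_size=1)

def control_step(center, frame_center, Z, f, J_robot, T_body_to_space):
    # Pixel-space error between the frame center (desired) and the
    # detected object center (current), scaled by the gain.
    error = GAIN * (np.asarray(frame_center) - np.asarray(center))

    # Camera twist in the body frame via the pseudoinverse of the
    # 2x6 image Jacobian evaluated at the feature location.
    u, v = center[0] - frame_center[0], center[1] - frame_center[1]
    cam_twist_body = np.linalg.pinv(image_jacobian(u, v, Z, f)) @ error

    # Express the camera twist in the space frame (placeholder transform).
    cam_twist_space = T_body_to_space @ cam_twist_body

    # Joint velocities via the pseudoinverse of the 6x7 robot Jacobian.
    qdot = np.linalg.pinv(J_robot) @ cam_twist_space

    # Publish all 7 joint velocities to the kortex driver.
    msg = Base_JointSpeeds()
    for i, w in enumerate(qdot):
        s = JointSpeed()
        s.joint_identifier = i
        s.value = float(np.degrees(w))  # convert rad/s to the deg/s the driver expects
        msg.joint_speeds.append(s)
    pub.publish(msg)
```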
Results: With the above algorithm we were able to track the object successfully. Performance came down to gain tuning for fast as well as smooth tracking. If the gain was too high (gain > 200), the robot became unstable because the computed joint velocities far exceeded what would be feasible in real life. For moderately high gains (gain > 100), the robot tended to overshoot and trace circles around the goal point; convergence took considerable time, and the motion was somewhat shaky due to constant velocity changes. Moderate gains (roughly 50 to 100) performed well in terms of smooth tracking but took some time to reach the point. For low gains (gain < 50), tracking was very smooth, nearly free of shaking, but took longer.
Conclusion: To keep a moving object continuously framed in the real-time video feed, the object must be tracked while in motion. We describe a system that allows real-time object tracking in a defined workspace using a robot manipulator. From the experiments, we found that tuning the gain according to the error between the camera center and the object center is of crucial importance for smooth camera motion while following the object in real time. Our approach enables the robot to discover by itself the gain required to move smoothly given the detected error. Our results provide evidence that visual servoing (vision-based robot control) is an efficient way to track a moving object in real time.
[Figures: Performance Testing for Initial Position; Performance Testing for a Position Nearer to the Robot; Performance Testing for an Arbitrary Position]