All files related to target tracking live in the track_target package; the main program is track_target.py. The segmentation pipeline processes each camera frame in four stages (a code sketch follows the list):
1. Original image
2. HSV thresholding
3. Contour detection
4. Circle detection
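A minimal OpenCV sketch of those four stages follows; the HSV bounds and the helper name segment_target are illustrative assumptions, not values from track_target.py:

```python
import cv2
import numpy as np

def segment_target(frame):
    """Return ((u, v), radius) of the detected target circle in pixels, or None."""
    # Stage 2: HSV thresholding of the original BGR frame.
    # The bounds below are placeholders; the real node tunes them to the target's color.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array([0, 120, 70]), np.array([10, 255, 255]))

    # Stage 3: contour detection on the binary mask (OpenCV 4 return signature).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None

    # Stage 4: circle detection -- fit the minimum enclosing circle
    # around the largest contour.
    largest = max(contours, key=cv2.contourArea)
    (u, v), radius = cv2.minEnclosingCircle(largest)
    return (u, v), radius
```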
Since we don't use AR tags or depth cameras, we need to rely on proportions instead. First, we need to find the effective height of the image in meters. Since we know the diameter of the target circle in meters, its height in the image in pixels, and the height of the whole image in pixels, we can calculate the effective "height" of the image in meters. This is described by the formula below.
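A reconstruction of that formula from the quantities named above (the symbol names are mine, not the original figure's):

```latex
% H_img     : effective image height in meters
% D_circle  : known diameter of the target circle (m)
% h_circle  : height of the circle in the image (px)
% H_px      : height of the whole image (px)
H_{\mathrm{img}} = \frac{D_{\mathrm{circle}}}{h_{\mathrm{circle}}} \, H_{\mathrm{px}}
```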
Next, we find how far away the target is from the camera, Z. We use the known diameter of the whole target in meters to find the distance at which the target would cover the entire height of the camera image. Since we also know the effective "height" of the current image, as described above, we can use the formula below to find out how far away the target is from the camera.
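One consistent reconstruction of that formula, under the reading that the visible extent of the scene grows linearly with distance: if Z_full is the distance at which the whole target would exactly fill the image height, then

```latex
% Z_full   : distance at which the target fills the image height
% D_target : known diameter of the whole target (m)
% H_img    : effective image height (m), from the previous formula
Z = Z_{\mathrm{full}} \cdot \frac{H_{\mathrm{img}}}{D_{\mathrm{target}}}
```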
Next, we need to find the X and Y positions of the target relative to the camera. To do this, we use the intrinsic camera matrix, shown in the first formula below, which converts 3D coordinates (X, Y, Z) into homogeneous 2D coordinates (u, v, w). Dividing through by w gives the actual x and y pixel coordinates on the image, as in the second formula below. This matrix can be found by querying the topic /cameras/left_hand_camera/camera_info, and it will always remain static.
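In standard pinhole form, those two formulas are (f_x, f_y, c_x, c_y are the entries of the K matrix reported by camera_info):

```latex
\begin{pmatrix} u \\ v \\ w \end{pmatrix} =
\begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix},
\qquad x = \frac{u}{w}, \quad y = \frac{v}{w}
```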
Now, we need to "invert" this equation to find X and Y. Since w = Z, where Z is the depth found in the previous section, we can solve for X and Y in terms of x and y, respectively, plus the constants from the intrinsic camera matrix. The solved equations are shown below.
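Expanding the matrix product gives u = f_x X + c_x Z and v = f_y Y + c_y Z, and since w = Z:

```latex
X = \frac{(x - c_x)\,Z}{f_x}, \qquad Y = \frac{(y - c_y)\,Z}{f_y}
```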
The target position needs to be published so the actuation module can calculate the transform. We use a TransformBroadcaster to publish the target position, as described above, relative to the left hand camera's frame. This ensures that if the camera moves, whether nudged by a team member or repositioned by the robot to avoid colliding with itself, the target position is still published accurately. The segmentation and position calculation above are fast enough that the target's position is published at a high rate. The transform is published as /target_new.
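A minimal sketch of this step, assuming ROS 1 with the tf package; the frame names match the text, but the node name and function signature are illustrative:

```python
import rospy
import tf

rospy.init_node("track_target")
broadcaster = tf.TransformBroadcaster()

def publish_target(X, Y, Z):
    """Broadcast the target position as a child frame of the camera."""
    broadcaster.sendTransform(
        (X, Y, Z),               # translation of the target in the camera frame
        (0.0, 0.0, 0.0, 1.0),    # identity quaternion: we only track position
        rospy.Time.now(),        # see the timestamp caveat below
        "target_new",            # child frame
        "left_hand_camera")      # parent frame: the camera
```

Broadcasting in the camera frame (rather than the robot base) is what keeps downstream consumers correct when the camera moves; tf composes the camera-to-base transform for them.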
One problem we ran into was that the robot published most of its transforms with timestamps about 5 minutes in the future. This caused issues for the actuation module when it tried to look up the transform between the robot and the target. Therefore, we had to add 337 seconds to the header's timestamp to match the other transforms being published.
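In the broadcast sketch above, that fix is a one-line change to the stamp (337 is the empirically measured offset, not a ROS constant):

```python
# Shift the stamp ~5.6 minutes forward to match the robot's other transforms.
stamp = rospy.Time.now() + rospy.Duration(337)
```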