Current self-driving tractors can drive autonomously only in relatively standardized, square-shaped farmland: they drive straight between GPS coordinates recorded at the vertices of the field. If GPS coordinates along the outer boundary of the farmland could be obtained at regular intervals, the same approach could be extended to unstructured farmland.
Development of an algorithm for recognizing the boundary of farmland using a stereo camera
Development of an Occupancy Grid map derivation algorithm using boundary recognition and a GPS-based tractor position estimation system
Derivation and tracking of waypoints for driving inside the boundary of cultivated land
Generation of GPS coordinates along the boundary of farmland using the Occupancy Grid map
The autonomous driving algorithm of the tractor is as follows:
Acquire an RGB image and a depth image using the ZED2 stereo camera
Feed the four-channel RGB-D data into PINet and output information on the boundary of the farmland
Convert the boundary information into a Bird's Eye View (BEV)
Update the Occupancy Grid (OG) map according to the tractor's position, using GPS information and the heading angle
Control the tractor using the target heading angle obtained from the OG map and the tractor's position
Repeat the above process to acquire GPS data along the boundary of the farmland
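The OG-map update and heading-angle steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the grid resolution, map origin, and function names are all assumptions.

```python
import math
import numpy as np

GRID_SIZE = 0.1  # metres per cell (assumed resolution)

def world_to_grid(x, y, origin=(0.0, 0.0), size=GRID_SIZE):
    """Convert a world coordinate (m) to occupancy-grid indices."""
    return int((x - origin[0]) / size), int((y - origin[1]) / size)

def update_og(og, boundary_pts):
    """Mark detected boundary points as occupied cells in the OG map."""
    for x, y in boundary_pts:
        gx, gy = world_to_grid(x, y)
        if 0 <= gy < og.shape[0] and 0 <= gx < og.shape[1]:
            og[gy, gx] = 1
    return og

def target_heading(tractor_xy, waypoint_xy):
    """Heading angle (rad) from the tractor position to the next waypoint."""
    dx = waypoint_xy[0] - tractor_xy[0]
    dy = waypoint_xy[1] - tractor_xy[1]
    return math.atan2(dy, dx)

og = np.zeros((50, 50), dtype=np.uint8)        # 5 m x 5 m map
update_og(og, [(1.0, 2.0), (1.1, 2.0)])        # boundary points from the BEV
theta = target_heading((0.0, 0.0), (1.0, 1.0)) # 45-degree target heading
```

In the real system the boundary points would come from the BEV-transformed PINet output, and the tractor pose from the GPS and heading-angle sensors.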
Since the data were acquired with the ZED2, they form continuous image sequences, so Video Object Segmentation (VOS) was used to make labeling easier: the first frame of each sequence was labeled by hand, VOS generated labels for the remaining frames, and the generated labels were then corrected manually.
With the orange line as the boundary line and the point as the average coordinate of the boundary within each grid cell, the label is set as follows: if a farmland boundary passes through a grid cell, the confidence value is 1, and the x and y offsets of the average boundary coordinate are normalized by the grid size. This can be expressed in one image as follows.
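The label format described above can be sketched as follows. The image resolution and grid size here are assumptions for illustration; PINet's actual output resolution may differ.

```python
import numpy as np

def encode_label(boundary_pts, img_w=512, img_h=256, grid=8):
    """Grid label: per-cell confidence plus x/y offsets of the boundary
    point inside the cell, normalized by the grid size (sizes assumed)."""
    gw, gh = img_w // grid, img_h // grid
    conf = np.zeros((gh, gw), dtype=np.float32)
    offset = np.zeros((2, gh, gw), dtype=np.float32)
    for x, y in boundary_pts:
        cx, cy = int(x // grid), int(y // grid)
        if 0 <= cy < gh and 0 <= cx < gw:
            conf[cy, cx] = 1.0                    # boundary present in this cell
            offset[0, cy, cx] = (x % grid) / grid  # x offset in [0, 1)
            offset[1, cy, cx] = (y % grid) / grid  # y offset in [0, 1)
    return conf, offset

conf, offset = encode_label([(20, 10)])  # one boundary point at pixel (20, 10)
```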
For an autonomous tractor to drive along the boundary of farmland based on image information, a camera is needed. In this study, RGB images and depth images obtained from the ZED2 stereo camera are used together to keep performance robust to environmental changes.
The Point Instance Network (PINet) receives the four-channel RGB-D data as input and outputs information on the boundary of the farmland.
Data augmentation was performed with Albumentations. I used HorizontalFlip, ShiftScaleRotate, noise addition, blur, RGB shift, and similar transforms.
Batch normalization is most commonly used in deep learning, but domain normalization was used in this study. Batch norm normalizes features across the batch and spatial locations; domain norm normalizes features across spatial locations and then across channels, which is intended to make the features less sensitive to domain shift.
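A minimal sketch of that two-stage normalization, following the cited Domain-Invariant Stereo Matching Networks paper (affine parameters omitted; the class name is mine):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainNorm(nn.Module):
    """Normalize each channel over spatial locations (instance norm),
    then L2-normalize across channels at every pixel."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.inst = nn.InstanceNorm2d(channels, affine=False, eps=eps)

    def forward(self, x):
        x = self.inst(x)                   # across spatial locations
        return F.normalize(x, p=2, dim=1)  # across channels

x = torch.randn(2, 8, 16, 16)
y = DomainNorm(8)(x)  # per-pixel channel vectors now have unit L2 norm
```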
Data obtained from rice paddies were used as the training dataset, and data obtained from fields were used as the validation dataset.
Focal loss is based on Cross-Entropy (CE) loss. The two losses are shown below: CE(p_t) = -log(p_t), while FL(p_t) = -(1 - p_t)^γ log(p_t). By multiplying the CE loss by (1 - p_t)^γ, focal loss down-weights well-classified examples and focuses training on the hard, poorly predicted parts.
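A minimal binary focal loss, directly following the formula above (clamping value and function name are mine):

```python
import torch

def focal_loss(pred, target, gamma=2.0):
    """Binary focal loss: CE weighted by (1 - p_t)^gamma, so that
    well-classified cells contribute less to the total loss."""
    p = pred.clamp(1e-6, 1 - 1e-6)               # avoid log(0)
    pt = torch.where(target == 1, p, 1 - p)      # p_t from the formula
    return -((1 - pt) ** gamma * torch.log(pt)).mean()

pred = torch.tensor([0.9, 0.1])    # confident, correct predictions
target = torch.tensor([1.0, 0.0])
fl = focal_loss(pred, target, gamma=2.0)
ce = focal_loss(pred, target, gamma=0.0)  # gamma = 0 reduces to CE
```

With gamma = 0 the weighting term is 1 and the loss reduces to plain CE, which is a convenient sanity check.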
In order for the tractor to drive autonomously, information on the boundary of the farmland must be obtained and waypoints set based on it. For this purpose, the network output is transformed to a Bird's Eye View (BEV) and the farmland information is saved in an Occupancy Grid map (OG). Using the OpenCV Python library, a transformation matrix is created by matching image coordinates to distance information in real space, and the image is then warped with it. Currently a fixed transformation matrix is used, but in the future it needs to change with the inclination of the ZED2.
The prediction results on field data are as follows. The left image is the RGB input, the middle is the ground truth, and the right is the output.
Considering that the boundaries are difficult to recognize even when a person looks at the RGB image, I think the output follows the boundary shape reasonably well. An analysis of the confidence values in the output follows.
Except for the algorithm that sets and follows the waypoints, each process has been completed, and the work of merging them into one system is currently in progress. The BEV conversion algorithm will be complemented to account for the inclination of the tractor, and finally the algorithm will be completed so that the tractor drives autonomously along the boundary of the farmland and stores the GPS data.
Ko, Y., Jun, J., Ko, D., & Jeon, M. (2020). Key Points Estimation and Point Instance Segmentation Approach for Lane Detection.
https://www.youtube.com/watch?v=v-Rm9TUG9LA
Zhang, F., Qi, X., Yang, R., Prisacariu, V., Wah, B., & Torr, P. (2020). Domain-Invariant Stereo Matching Networks. doi:10.1007/978-3-030-58536-5_25.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2018). Focal Loss for Dense Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. doi:10.1109/TPAMI.2018.2858826.