There has been significant work done on the topic of insect detection. These works focus on the successful detection and discrimination of insects from non-insect objects in images. Some of these studies add a subsequent step that classifies successfully detected insects into their respective classes. In the literature, detection and discrimination make use of image processing techniques and feature extraction methods to discriminate with high degrees of accuracy, and most are streamlined into a system for greater efficiency. These systems use visual input ranging from static imagery to video recordings of objects of interest.
Insect detection and classification with images as the main form of input data is very popular due to the accessibility and availability of image-capturing methods. Images are the most abundant data source in insect detection research and the most varied in terms of implementation. Combinations of digital image processing techniques, machine learning, and statistical approaches make up the bulk of the variation among studies in the field.
For detection-focused problems, it is not uncommon to see video footage and tracking used as the main source of data and input for a model. One such system was created by Guoqing and Alam (2016), which detected mosquitoes and tracked their respective positions in a video feed using the frame difference method. Mosquitoes are placed in a mosquito box, and a camera installed inside tracks and analyzes values (pixel value, robot position, coordinates, etc.) that are used for detection. Background motion is removed and the difference frame algorithm is applied to detect each mosquito and its position accurately. A similar study involves camera surveillance inputs for the on-site detection of crop pests. Frames of the surveillance input are converted from RGB into grayscale images. From these images, potential insect locations are extracted based on centroid locations. The locations are processed by applying a connected components algorithm and a quick local conquer-and-merge segmentation strategy, and the likelihood of the detected objects being insects is then classified from the validated images (Bechar, Moisan, Pulsar, & Antipolis, 2010).
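The frame differencing step common to these video-based systems can be sketched as follows. This is a minimal NumPy illustration (the threshold value, toy frames, and centroid step are assumptions for demonstration), not a reproduction of either study's implementation:

```python
import numpy as np

def frame_difference(prev_frame, curr_frame, threshold=25):
    """Flag pixels whose intensity changed by more than `threshold`
    between two consecutive grayscale frames (values 0-255)."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold          # boolean motion mask

def motion_centroid(motion_mask):
    """Return the (row, col) centroid of the moving pixels, or None."""
    ys, xs = np.nonzero(motion_mask)
    if ys.size == 0:
        return None
    return (float(ys.mean()), float(xs.mean()))

# Toy frames: one bright "mosquito" pixel moves from (2, 2) to (5, 6)
prev = np.zeros((8, 8), dtype=np.uint8)
curr = np.zeros((8, 8), dtype=np.uint8)
prev[2, 2] = 255
curr[5, 6] = 255

mask = frame_difference(prev, curr)
print(motion_centroid(mask))  # (3.5, 4.0) - midpoint of the two changed pixels
```

In an actual deployment the mask would be cleaned with morphological operations before the position estimate, since camera noise also produces changed pixels.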
For image-based detection implementations, the approach differs from its video-based counterpart. Due to the static nature of the images, meaningful information must be extracted from datasets of images to detect patterns and features. Image processing techniques alongside CNN (Convolutional Neural Network) architectures are used to solve computer vision problems such as image classification, image localization, and object detection. In 2018, Zhong et al. discussed a system for insect counting that involves both insect detection and classification via the YOLO (You Only Look Once) network architecture, a convolutional neural network object detection system. Acquired surveillance footage is processed per image, and each frame undergoes feature extraction using SVMs (Support Vector Machines). The detections from YOLO were then split into 7 classes of insects, tallied, and evaluated. The study relied on YOLO for a coarse initial count, which was supplemented with a finer counting of objects via SVM.
Convolutional neural network based systems are powerful enough on their own to solve common computer vision tasks; however, image processing techniques remain a powerful tool to support computer vision studies. For example, one of the earlier studies is by Gassoumi, Prasad, and Ellington (1994), wherein they applied a soft computing approach for recognizing and classifying insects as either harmful or beneficial to the growth of cotton plants. Their solution was an artificial neural network for the object recognition process. The input images were enhanced via morphological operations and segmented into individual objects to be used as training data for their ANN. Miranda, Gerardo, and Tanguilig III (2014) provided a purely image processing based approach: they developed an automatic pest detection and extraction system that uses only image processing techniques. Grayscale images of captured insects are compared to each other, one serving as a reference picture and the other as the input. Differences in pixel values determine whether insects have been successfully captured and detected. Pests are then extracted by horizontally and vertically scanning the background-subtracted image of the pests and recording pixel values for the corresponding rows and columns. The resulting values serve as the basis for determining the coordinates of each pest.
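The row-and-column scanning used by Miranda et al. resembles projection profiling. A minimal sketch, assuming a background-subtracted grayscale image and a simple intensity threshold (both of which are illustrative values, not taken from the paper):

```python
import numpy as np

def locate_pest_region(diff_image, threshold=0):
    """Scan a background-subtracted grayscale image horizontally and
    vertically: pixels above `threshold` are counted per row and per
    column, and the nonzero extents give the pest's coordinates."""
    fg = diff_image > threshold
    row_profile = fg.sum(axis=1)        # pest pixels in each row
    col_profile = fg.sum(axis=0)        # pest pixels in each column
    rows = np.nonzero(row_profile)[0]
    cols = np.nonzero(col_profile)[0]
    if rows.size == 0:
        return None                     # nothing captured
    # (top, bottom, left, right) of the region containing pest pixels
    return tuple(int(v) for v in (rows[0], rows[-1], cols[0], cols[-1]))

# Toy background-subtracted image: one pest blob at rows 3-4, cols 5-7
img = np.zeros((10, 10), dtype=np.uint8)
img[3:5, 5:8] = 200
print(locate_pest_region(img))  # (3, 4, 5, 7)
```

With several pests in one frame, the profiles would show multiple separated runs of nonzero values, one bounding range per pest.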
Another image processing based study was done by Thenmozhi and Reddy in 2018. In their study, they proposed methods for preprocessing, segmentation, and feature extraction that detected the shape of insects in images of crops. Their methods made use of calculations and formulas that determined class based on morphological and geometric features. The preprocessed grayscale image is divided into smaller parts to which segmentation techniques are applied. The Sobel edge detection algorithm is used to detect the edges of insects in the segmented images. After segmentation, shape feature analysis is applied to extract nine geometric shape features (area, perimeter, major axis length, minor axis length, eccentricity, circularity, solidity, form factor, and compactness). These features are used to classify the insects into a selection of shapes. Alternatively, Umar, Abbas, and Gulzar (2017) introduced an approach to insect image classification feature extraction using edge detection and histograms. Their system focuses on color, shape, texture, and spatial layout. Once these general characteristics are acquired, a Bayesian network is applied to accurately classify insects and pests such as termites, ticks, and mosquitoes.
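A few of the nine shape features named above can be computed directly from a binary object mask. The sketch below covers only area, perimeter, and form factor, with the perimeter approximated by boundary-pixel counting (an assumption for illustration; the paper's exact formulas are not reproduced here):

```python
import numpy as np

def shape_features(mask):
    """Compute a subset of geometric shape features (area, perimeter,
    form factor) from a binary object mask. Perimeter is approximated
    as the number of foreground pixels that touch the background
    through a 4-connected neighbour."""
    mask = mask.astype(bool)
    area = int(mask.sum())
    padded = np.pad(mask, 1)
    # A pixel is interior if all four of its 4-neighbours are foreground
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:]) & mask
    perimeter = int((mask & ~interior).sum())
    # Form factor (circularity): 4*pi*area / perimeter^2, near 1 for a circle
    form_factor = 4 * np.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "form_factor": form_factor}

# Toy object: a filled 5x5 square
obj = np.zeros((9, 9), dtype=np.uint8)
obj[2:7, 2:7] = 1
feats = shape_features(obj)
print(feats["area"], feats["perimeter"])  # 25 16
```

The remaining features (major/minor axis length, eccentricity, solidity, compactness) are typically derived from image moments and the convex hull in the same fashion.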
In the realm of insect and pest detection studies, mosquito-focused computer vision problems are far from uncommon. Mosquitoes are a vector for disease and infection, so the need to suppress their spread pushes for more research and innovation. This is true of a study by Fuchida et al. (2017), in which they created an automated mosquito classification module to be deployed in mosquito habitats. Color and morphological features of mosquitoes are extracted from the captured images, followed by support vector machine-based classification. The distinct features extracted are used to train the SVM classifiers. The work of Muhammad Danish Gondal (2016) makes use of SVM in a similar manner; their implementation of SVM focuses less on the individual characteristics of objects and more on the calculated properties of the images.
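The SVM classifiers in these studies are trained on extracted feature vectors. As a generic stand-in (not a reproduction of either study's setup), a linear SVM can be trained with the Pegasos sub-gradient method in a few lines; the toy feature vectors and hyperparameters below are assumptions:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Train a linear SVM (no bias term) with the Pegasos sub-gradient
    method. X: (n, d) feature vectors; y: labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, t = np.zeros(d), 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)        # decaying step size
            w *= (1.0 - eta * lam)       # shrink toward zero (regularisation)
            if y[i] * (X[i] @ w) < 1:    # hinge-loss sub-gradient step
                w += eta * y[i] * X[i]
    return w

# Toy "feature vectors": two linearly separable clusters
X = np.array([[2.0, 2.0], [3.0, 2.5], [-2.0, -2.0], [-3.0, -1.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = train_linear_svm(X, y)
print(np.sign(X @ w))  # [ 1.  1. -1. -1.]
```

The published systems typically use kernel SVMs from standard libraries; the point here is only that classification reduces to a decision function over the extracted features.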
Some literature focuses mainly on the complexities and unique characteristics of mosquitoes for detection and discrimination. Elemmi et al. (2017) proposed a system that identifies images of mosquitoes as either dengue-carrying or non-dengue-carrying variants, with dengue-carrying mosquitoes further classified as male or female. An input image of a mosquito is compared to the ideal image of each class based on their respective descriptor values (size, leg patterns, body shape, and color). A study focusing specifically on the Aedes aegypti species of mosquito was done by Reyes et al. back in 2016. The study involved the discrimination between dengue and non-dengue carrying mosquito species via an SVM implementation. Images of the different mosquito parts per species were taken, with image augmentations applied to create appropriate datasets. The mosquito dataset was then used for image texture analysis with an SVM implementation.
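Image texture analysis of this kind is commonly built on gray-level co-occurrence matrices (GLCMs). The sketch below derives two standard GLCM features, contrast and energy; the specific descriptors used by Reyes et al. are not stated here, so this pairing is an illustrative assumption:

```python
import numpy as np

def glcm_features(image, levels=4, dx=1, dy=0):
    """Texture features from a gray-level co-occurrence matrix (GLCM):
    count how often gray level i occurs next to gray level j at offset
    (dx, dy), then derive contrast and energy from the normalised
    counts. `image` holds integer gray levels in [0, levels)."""
    glcm = np.zeros((levels, levels))
    h, w = image.shape
    for yy in range(h - dy):
        for xx in range(w - dx):
            glcm[image[yy, xx], image[yy + dy, xx + dx]] += 1
    glcm /= glcm.sum()                                # joint probabilities
    i, j = np.indices((levels, levels))
    contrast = float(((i - j) ** 2 * glcm).sum())     # local variation
    energy = float((glcm ** 2).sum())                 # uniformity
    return contrast, energy

# A uniform patch: zero contrast, maximal energy
flat = np.zeros((4, 4), dtype=int)
print(glcm_features(flat))      # (0.0, 1.0)

# A striped patch (alternating gray levels 0 and 3): high contrast
stripes = np.tile([0, 3], (4, 2))
print(glcm_features(stripes))   # contrast = 9.0
```

Feature pairs like these, computed over several offsets, form the vectors fed to the SVM.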
Based on the existing literature, studies on insect detection and classification were commonly conducted in laboratory-controlled environments. This is further supported by the study by Martineau et al. (2017), which states that these lab-based setups are performed alongside entomologists and other experts in the field. These types of setups influence which techniques are used in the identification of insects, and the manner in which images are acquired is also directly affected. Constraints such as object orientation, clarity, and quality can be fitted to what the problem requires. The downside of lab-based systems is that they are mostly operated manually, with constrained poses. An unconstrained setting in the open introduces a great deal of variance, not only in the acquisition process but also in the images captured. The use of insect trapping contraptions is apparent in both settings.
Image segmentation in the literature involves input images undergoing morphological operations to remove noise, together with image processing techniques such as thresholding, contour extraction, and background subtraction to isolate the objects of interest. However, some studies choose not to segment the image into relevant objects, relying instead purely on the properties of the input image, such as color histograms and gradient changes, to isolate and detect the object in the image. As for insect feature extraction, combinations of structural, sensory, and mathematical features were extracted and fed to a classifier to detect patterns in the input. Some classifiers rely solely on feature vectors and support vector machines, while others also incorporate deep neural network implementations for classification and discrimination between objects.
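The thresholding and object-isolation pipeline summarized above can be sketched as a fixed threshold followed by 4-connected component labeling (a minimal illustration; real systems would typically apply morphological noise removal first and choose the threshold adaptively):

```python
import numpy as np
from collections import deque

def segment_objects(gray, threshold=128):
    """Threshold a grayscale image and label its 4-connected foreground
    components with a flood fill; returns the label image and the
    number of objects found."""
    fg = gray > threshold
    labels = np.zeros(gray.shape, dtype=int)
    count = 0
    for y, x in zip(*np.nonzero(fg)):
        if labels[y, x]:
            continue                       # pixel already labelled
        count += 1
        queue = deque([(y, x)])
        labels[y, x] = count
        while queue:                       # flood-fill one component
            cy, cx = queue.popleft()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                           (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < fg.shape[0] and 0 <= nx < fg.shape[1]
                        and fg[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

# Two bright blobs on a dark background
img = np.zeros((6, 8), dtype=np.uint8)
img[1:3, 1:3] = 200                        # first object
img[4:6, 5:8] = 220                        # second object
labels, n = segment_objects(img)
print(n)  # 2
```

Each labelled component can then be cropped and passed to the feature extraction and classification stages discussed above.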