Training Deep Learning Models for Industrial Defect Detection with Limited, Imbalanced, and Defect-free Samples

The research team is working closely with VisEra Technologies to develop algorithms for automated optical inspection, including computer vision, camera auto-calibration, and deep learning for defect detection. In addition to developing deep learning models for the classification, detection, and segmentation of product defects on the production line, the team aims to extend its deep learning research toward reducing labeling labor and designing algorithms that work with limited training samples. A defect detection model trained on many samples labeled with defect locations can be used for real-time defect detection on the production line. Although this saves the labor cost of manual inspection and gives more accurate and stable detection, labeling defect samples for every newly opened product line is itself labor-intensive. Additionally, the number of positive samples (with defects) collected at the beginning of a newly opened production line is minimal. When samples are severely limited and the ratio of positive to negative samples is severely imbalanced, the resulting deep learning model will be biased and perform poorly.

Therefore, to reduce labeling costs and address the shortage of samples, the research team is actively developing semi-supervised and weakly supervised learning algorithms and combining them with transfer learning. Leveraging the strength of convolutional neural networks in feature extraction, the team uses feature matching to identify inconsistently labeled samples in the training set for automatic or manual correction, which reduces labeling costs. In addition, self-supervised learning can learn features from the data itself, and domain adaptation can use labeled samples from similar past products to find common features; both help address the shortage of training samples. These research directions are not only strongly motivated by the needs of industrial production lines but also reflect recent trends in deep learning.
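
The feature-matching check for inconsistently labeled samples mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's actual method: the feature vectors are assumed to be embeddings already extracted by a convolutional network (replaced here by toy 2-D points), and the function name `flag_inconsistent_labels`, the nearest-neighbour vote, and the choice `k=3` are all hypothetical.

```python
import math
from collections import Counter


def flag_inconsistent_labels(features, labels, k=3):
    """Return indices of samples whose label disagrees with the majority
    label of their k nearest neighbours in feature space.

    Hypothetical helper: the source does not specify how feature
    matching is performed; a k-NN vote is assumed here.
    """
    suspects = []
    for i, (f_i, y_i) in enumerate(zip(features, labels)):
        # Euclidean distance from sample i to every other sample.
        dists = sorted(
            (math.dist(f_i, f_j), j)
            for j, f_j in enumerate(features)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in dists[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority != y_i:
            suspects.append(i)  # candidate for automatic or manual correction
    return suspects


# Toy stand-ins for CNN embeddings: two tight clusters.
features = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (0.1, 0.1),
            (1.0, 1.0), (0.9, 1.1), (1.1, 0.9), (1.0, 0.9)]
# Sample 2 sits in the "ok" cluster but carries a "defect" label.
labels = ["ok", "ok", "defect", "ok",
          "defect", "defect", "defect", "defect"]

suspects = flag_inconsistent_labels(features, labels)
print(suspects)
```

Flagged samples would then be sent for review or relabeled automatically. With real data, an approximate nearest-neighbour index would likely replace the quadratic loop used here.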


Defect detection technology based on deep learning can be applied flexibly to any item that customers want to inspect, and it helps identify defective products effectively by removing much of the influence of environmental factors and differences between objects. However, training a good detection model requires a large amount of training data. Samples with defects are difficult to collect on existing production lines because yield rates are high, and labeling a large number of samples is labor-intensive. The team therefore uses four techniques that make it possible to train a defect detection model effectively even when few correctly labeled samples are available: Active Learning, Semi-Supervised Learning, Domain Adaptation, and Transfer Learning.
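
As a rough illustration of how two of these techniques can share one pool of unlabeled images, the sketch below routes each sample by the model's predicted defect probability. The most uncertain samples go to a human annotator (active learning), and confidently predicted samples receive pseudo-labels (semi-supervised learning). The function name `split_unlabeled`, the annotation budget, the 0.9 confidence threshold, and the example probabilities are all assumptions made for illustration. The source does not describe the team's actual selection rules.

```python
def split_unlabeled(probs, budget=2, confident=0.9):
    """Route unlabeled samples by predicted defect probability.

    probs     : predicted defect probability for each unlabeled sample
    budget    : how many samples a human can annotate (assumed value)
    confident : threshold for accepting a pseudo-label (assumed value)
    """
    low = 1.0 - confident
    # Active learning: query the samples closest to the 0.5 boundary.
    by_uncertainty = sorted(range(len(probs)),
                            key=lambda i: abs(probs[i] - 0.5))
    to_annotate = by_uncertainty[:budget]

    # Semi-supervised learning: pseudo-label the confident remainder.
    pseudo_labels = {}
    for i in by_uncertainty[budget:]:
        if probs[i] >= confident:
            pseudo_labels[i] = 1  # treated as defective
        elif probs[i] <= low:
            pseudo_labels[i] = 0  # treated as defect-free
        # Samples in between are left unlabeled for a later round.
    return to_annotate, pseudo_labels


# Hypothetical model outputs for six unlabeled images.
probs = [0.02, 0.48, 0.97, 0.55, 0.80, 0.05]
to_annotate, pseudo_labels = split_unlabeled(probs)
print(to_annotate)
print(pseudo_labels)
```

In practice the model would be retrained on the newly annotated and pseudo-labeled samples and the procedure repeated. Domain adaptation and transfer learning act earlier in the pipeline, by supplying the initial model or features from similar past products.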