Multimedia Information Processing Lab

Introduction

To realize a happier future society, the Multimedia Information Processing Lab conducts research in computer vision using deep learning methods. The laboratory has a continuous record of successful industrial projects on computer vision tasks. Current work includes an ongoing NRF project on anomaly detection in fabric products, which combines self-supervised image classification with visualization for interpretability of the predictions, and an artificial intelligence-based parking sign recognition system for persons with disabilities built on deep learning techniques.

Current Research

UzADL: Anomaly detection and localization using graph Laplacian matrix-based unsupervised learning method


Visual inspection is an essential quality control process in industrial businesses, and because the procedure is tedious it is usually automated. An automated visual inspection (AVI) system attempts to detect items with abnormal patterns from image data. Recent developments in computer vision models, especially the introduction of deep convolutional neural networks, have substantially improved the accuracy and speed of AVI systems. However, supervised learning approaches to AVI require a large amount of annotated data, while unsupervised ones lack accuracy and interpretability and require extensive training and inference time. Therefore, in this study we propose UzADL, a computationally inexpensive, efficient, and interpretable unsupervised learning-based model for AVI that addresses these problems. The system has three principal stages. First, unlabeled images are annotated using a pseudo-labeling algorithm. Second, the pseudo-labeled instances are used to train the model. Third, the defective regions of identified abnormal instances are explicitly visualized using an anomaly interpretation technique. Owing to an elaborate unsupervised learning method based on a graph Laplacian matrix pseudo-labeling algorithm, which transforms defect detection into a classification task, the proposed system converges rapidly and significantly outperforms existing deep learning-based AVI methods. In experiments conducted with three real-life fabric material databases, namely NanoTWICE, MVTec anomaly detection (MVTec AD), and DWorld, UzADL outperformed other methods in both accuracy and speed when assessed using several evaluation metrics.
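The paper's exact pseudo-labeling procedure is not reproduced here; the following is a minimal Python sketch of the general idea, assuming precomputed image feature vectors, in which the graph Laplacian of a pairwise-similarity graph is used to split unlabeled samples into pseudo-normal and pseudo-anomalous groups. The function name, the RBF affinity choice, and the smaller-cluster-is-anomalous rule are illustrative assumptions, not the published algorithm.

import numpy as np

def pseudo_labels_from_laplacian(features, gamma=1.0):
    """Assign binary pseudo-labels to unlabeled samples by spectral partitioning.

    features : (n_samples, n_dims) array of image feature vectors (illustrative input).
    Returns an array of 0/1 pseudo-labels derived from the Fiedler vector.
    """
    # Pairwise RBF (Gaussian) affinity matrix W of the similarity graph.
    sq_dists = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * sq_dists)
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # The eigenvector of the second-smallest eigenvalue (Fiedler vector)
    # gives a relaxed two-way partition of the graph.
    eigvals, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]

    # The sign of the Fiedler vector splits the samples into two clusters;
    # the smaller cluster is treated as the pseudo-anomalous one in this sketch.
    labels = (fiedler > 0).astype(int)
    if labels.sum() > len(labels) / 2:
        labels = 1 - labels
    return labels

The resulting pseudo-labels can then be used to train an ordinary supervised classifier, which is what turns defect detection into a classification task.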

Past Research


Medical Image Segmentation based on AEDCN-Net

Image segmentation was significantly enhanced by the emergence of deep learning (DL) methods. In particular, deep convolutional neural networks (DCNNs) have enabled DL-based segmentation models to achieve state-of-the-art performance in fields critical to human beings, such as medicine. However, existing state-of-the-art methods often rely on computationally expensive operations to achieve high accuracy, while lightweight networks often lack precision in medical image segmentation. Therefore, this study proposes an accurate and efficient DCNN model (AEDCN-Net) based on an elaborate preprocessing step and a resource-efficient model architecture. The AEDCN-Net exploits bottleneck, atrous, and asymmetric convolution-based residual skip connections in the encoding path, which reduce the number of trainable parameters and floating point operations (FLOPs) while learning feature representations with a larger receptive field. The decoding path employs nearest-neighbor-based upsampling instead of a computationally expensive transposed convolution operation that would require a large number of trainable parameters. The proposed method attains superior performance in both computational time and accuracy compared with existing state-of-the-art methods. Benchmarking on four real-life medical image datasets specifically shows that the AEDCN-Net converges faster than the computationally expensive state-of-the-art models while using significantly fewer trainable parameters and FLOPs, resulting in a considerable speed-up during inference. Moreover, the proposed method achieves better accuracy on several evaluation metrics than the existing lightweight and efficient methods.
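As an illustration of the building blocks named above, here is a rough PyTorch sketch of a residual block combining a 1x1 bottleneck, an atrous (dilated) convolution, and an asymmetric 1x3/3x1 pair, plus a parameter-free nearest-neighbor upsampling step for the decoder. The channel counts, dilation rate, and layer ordering are assumptions, not the actual AEDCN-Net configuration.

import torch.nn as nn

class BottleneckAtrousAsymBlock(nn.Module):
    """Residual block: 1x1 bottleneck -> dilated 3x3 -> asymmetric 1x3/3x1 pair."""
    def __init__(self, channels, bottleneck, dilation=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False),   # bottleneck
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, bottleneck, kernel_size=3,
                      padding=dilation, dilation=dilation, bias=False),   # atrous conv
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, bottleneck, kernel_size=(1, 3),
                      padding=(0, 1), bias=False),                        # asymmetric 1x3
            nn.Conv2d(bottleneck, channels, kernel_size=(3, 1),
                      padding=(1, 0), bias=False),                        # asymmetric 3x1
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))   # residual skip connection

# Decoder step: parameter-free nearest-neighbor upsampling instead of a
# transposed convolution, followed by a light 3x3 convolution (sizes illustrative).
upsample = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
)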


Figures: general overview of the AEDCN-Net and experimental results.

Multiclass Semantic Segmentation in Autonomous Driving

Considering the importance of autonomous-driving applications for mobile devices, it is imperative to develop semantic segmentation models that are both fast and accurate. Thanks to the emergence of deep learning (DL) techniques, segmentation accuracy has improved greatly. However, the enhanced performance of currently popular DL models for self-driving car applications comes at the cost of time and computational efficiency, while networks with efficient architectures often lack accuracy. Therefore, in this study we propose a robust, efficient, and fast network (REF-Net) that combines carefully formulated encoding and decoding paths. Specifically, the contraction path uses a mixture of dilated and asymmetric convolution layers with skip connections and bottleneck layers, while the decoding path relies on nearest-neighbor interpolation, which requires no trainable parameters to restore the original image size. This carefully formulated architecture considerably reduces the number of trainable parameters, the required memory space, and the training and inference time. In experiments conducted on the Cambridge-driving Labeled Video Database (CamVid) and Cityscapes datasets, the proposed model required on average 90 times fewer trainable parameters and approximately four times less memory space, allowing 3-fold faster training and a 4-fold inference speed-up compared with existing computationally intensive models. Moreover, despite its notable efficiency in memory and time, the REF-Net attained superior results on several segmentation evaluation metrics, yielding average increases of 2%, 4%, and 3% in pixel accuracy, Dice coefficient, and Jaccard index, respectively.
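To make the efficiency argument concrete, the short sketch below (not the actual REF-Net code; the channel sizes are made up) contrasts a learned transposed-convolution upsampling layer with the parameter-free nearest-neighbor interpolation used in the decoding path, and prints their trainable parameter counts.

import torch.nn as nn

in_ch, out_ch = 128, 64   # illustrative channel sizes

# Learned upsampling: a 2x2-stride transposed convolution carries trainable weights.
transposed = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)

# Parameter-free upsampling: nearest-neighbor interpolation restores spatial size,
# leaving channel mixing to an ordinary (cheaper) convolution if needed.
nearest = nn.Upsample(scale_factor=2, mode="nearest")

print(sum(p.numel() for p in transposed.parameters()))   # 128*64*2*2 + 64 = 32832
print(sum(p.numel() for p in nearest.parameters()))      # 0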

Contact

Room 256, IT Convergence Building, Kyungpook National University, 80 Daehak-ro, Buk-gu, Daegu


jhk@knu.ac.kr

+82-53-5551