MiNa: A dataset of Micro- and Nanoplastic particles
Plastic pollution is an escalating global problem, affecting health and environmental systems, with micro- and nanoplastics pervasive across media ranging from potable water to air. Traditional methods for studying these contaminants are labor-intensive and time-consuming, necessitating a shift towards more efficient technologies. In response, our work introduces MiNa, a novel, open-source dataset engineered for the automatic detection and classification of micro- and nanoplastics using object detection algorithms. The dataset, comprising Scanning Electron Microscopy (SEM) images of plastics prepared under simulated realistic aquatic conditions, categorizes plastics by polymer type across a broad size spectrum. In our paper, we demonstrate the application of state-of-the-art detection algorithms on MiNa, assessing their effectiveness and identifying the unique challenges and potential of each method. The dataset not only fills a critical gap in available resources for microplastic research but also provides a robust foundation for future advancements in the field.
The micrographs in the MiNa dataset showcase the distinctive shapes and sizes of Micro- and Nanoplastics (MNPs) derived from expanded polystyrene (EPS), polypropylene (PP), polyethylene (PE), and polyethylene terephthalate (PET). PS MNPs, obtained from EPS, are polygonal with sharp edges and range in size from tens of microns down to around 1 micron. PP MNPs, in contrast, have smoother, spherical shapes and tend to cluster after separation from the bulk polymer. PET MNPs display a wide size range and significant agglomeration, while PE MNPs are mostly found at nano and micro scales with minimal agglomeration. These SEM images confirm our previous findings, highlighting the diverse morphologies of MNPs from different polymers and their changes during degradation.
[Figure: representative SEM micrographs of MNPs for each polymer type: Polyethylene (PE), Polystyrene (PS), Polypropylene (PP), Polyethylene terephthalate (PET)]
The particles in the SEM images are manually annotated using the V7 platform. Because MNP concentrations vary, some micrographs are challenging and time-consuming to annotate. To expedite this process, we employ Segment Anything Model (SAM)-guided annotation and thresholding methods; a minimal thresholding sketch follows the annotation examples below.
[Figure: annotated SEM micrographs for each polymer type: Polyethylene (PE), Polystyrene (PS), Polypropylene (PP), Polyethylene terephthalate (PET)]
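As an illustration of the thresholding-assisted step, the sketch below (a minimal example, not the exact annotation pipeline) uses OpenCV's Otsu thresholding and contour extraction to propose candidate bounding boxes from a single SEM patch; the file name, blur kernel, and minimum-area filter are assumptions chosen for demonstration.

```python
# Illustrative sketch: propose candidate particle bounding boxes from an SEM patch
# via Otsu thresholding and contour extraction (file name and filters are assumptions).
import cv2

def propose_boxes(image_path: str, min_area: float = 50.0):
    """Return candidate (x, y, w, h) boxes for particles that contrast with the background."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Otsu's method picks a global threshold separating particle pixels from background.
    _, mask = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

if __name__ == "__main__":
    print(propose_boxes("sem_patch.png"))  # hypothetical input image
```

In practice, automatically proposed regions of this kind still need manual review and correction on the V7 platform.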
We have trained and evaluated several networks on our dataset, focusing on both the detection and the classification of MNPs. Our experiments included YOLOv10, renowned for its high accuracy in object detection, which makes it particularly suitable for our application. Additionally, we used Faster R-CNN and its extended version, Mask R-CNN, both with a ResNet-50 backbone. These networks are widely used in similar research fields and achieved high accuracy in our tests. For detection, the goal was to identify MNP particles; for classification, we aimed to determine their polymer type.
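To make the model setup concrete, the sketch below shows one way to instantiate a Faster R-CNN detector with a ResNet-50 FPN backbone using torchvision and to resize its box head for the four polymer classes plus background; the pretrained weights, class count, and label mapping are illustrative assumptions rather than the exact training configuration used in the paper.

```python
# Illustrative sketch: a torchvision Faster R-CNN with a ResNet-50 FPN backbone,
# with the box predictor resized for four polymer classes (PE, PS, PP, PET) + background.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 1 + 4  # background + {PE, PS, PP, PET}; assumed label mapping

def build_faster_rcnn(num_classes: int = NUM_CLASSES):
    # COCO-pretrained weights as a starting point (an assumption, not the paper's exact setup).
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

model = build_faster_rcnn()
model.train()  # fine-tuning would follow the standard torchvision detection training loop
```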
YOLOv10 frequently fails to identify partially visible particles. It also struggles to detect overlapping particles in patches containing densely packed MNPs, and it often misses particles when particle sizes vary widely. Additionally, YOLOv10 tends to overlook small particles that have low contrast with the background.
PS shows the lowest recall and F1 score among all classes across all networks, including YOLOv10, as seen in the figures. Its dense, crowded image segments significantly challenge the networks, leading to fewer true positives for PS, as many particles are mistaken for background.
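For reference, the recall and F1 values discussed here follow the standard per-class definitions; the sketch below computes precision, recall, and F1 from true-positive, false-positive, and false-negative counts. The counts shown are placeholders, not results from the MiNa experiments.

```python
# Standard per-class detection metrics computed from matched detections.
# The TP/FP/FN counts below are placeholders, not values reported for MiNa.
def precision_recall_f1(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

example_counts = {"PE": (90, 10, 12), "PS": (60, 15, 40), "PP": (85, 9, 14), "PET": (80, 12, 18)}
for cls, (tp, fp, fn) in example_counts.items():
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"{cls}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Many missed particles (false negatives) depress recall and hence F1 even when precision stays high, which mirrors the behavior described above for the dense PS patches.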
Faster R-CNN performs better with partially visible particles but still misses some. It handles dense and overlapping scenarios more effectively than YOLOv10. This network is also more robust against varying particle sizes and less likely to ignore small particles in the image patches.
Faster R-CNN and Mask R-CNN outperform YOLOv10 in all detection and classification experiments. In terms of precision, Faster R-CNN and Mask R-CNN perform comparably, with their precision differing by no more than 3% in the most extreme cases.
Mask R-CNN is capable of segmenting partially visible particles and dense, overlapping regions. However, it sometimes misses particles, particularly when they are large relative to the patch size. While Mask R-CNN segments small particles effectively, particles that are missed at the detection stage are, naturally, not segmented either. Differentiating occluded or small particles from the background remains a challenge, impacting the overall F1 score for these R-CNN networks.
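To illustrate how such instance masks can be obtained, the sketch below runs a pretrained torchvision Mask R-CNN (ResNet-50 FPN) on a single patch and keeps masks above a confidence threshold; the input file, score cutoff, and pretrained weights are assumptions for demonstration, not the fine-tuned model evaluated in the paper.

```python
# Illustrative sketch: instance segmentation of one SEM patch with a pretrained
# torchvision Mask R-CNN (ResNet-50 FPN). File name and score cutoff are assumptions.
import torch
import torchvision
from torchvision.io import read_image, ImageReadMode
from torchvision.transforms.functional import convert_image_dtype

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Read as 3-channel RGB so the detector's normalization applies (hypothetical file).
image = convert_image_dtype(read_image("sem_patch.png", mode=ImageReadMode.RGB), torch.float)

with torch.no_grad():
    output = model([image])[0]

keep = output["scores"] > 0.5        # assumed confidence cutoff
masks = output["masks"][keep] > 0.5  # binarize the soft instance masks
boxes = output["boxes"][keep]
print(f"kept {int(keep.sum())} instances")
```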
Faster R-CNN exhibits marginally better recall than Mask R-CNN, leading to higher F1 scores across all classes. This improvement may be attributed to the additional computational overhead Mask R-CNN incurs for instance segmentation. The performance gap between Faster R-CNN and Mask R-CNN is particularly noticeable under the AP50 criterion. The complexity of the images within each class influences this discrepancy: in the PS class, which poses the greatest challenge in terms of particle density, the difference reaches 5.34%, whereas it remains around 2% for the PE class.
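The AP50 figures above follow the standard COCO evaluation protocol; a minimal sketch of reading off AP at IoU 0.50 with pycocotools is shown below, assuming ground truth and detections have been exported to COCO-format JSON files (the file names are placeholders).

```python
# Minimal COCO-style evaluation sketch with pycocotools; file names are placeholders.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("mina_val_annotations.json")             # hypothetical COCO-format ground truth
coco_dt = coco_gt.loadRes("mina_val_detections.json")   # hypothetical detection results

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()
ap50 = evaluator.stats[1]  # index 1 of the standard COCO summary is AP at IoU=0.50
print(f"AP50 = {ap50:.4f}")
```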