A key difference between clustering-based approaches and similarity-based approaches is the source of the TPL feature. Similarity-based approaches extract the TPL feature directly from the complete TPL JAR/AAR files, while clustering-based approaches obtain the feature from the decoupled app packages.
Therefore, clustering-based approaches rely heavily on the accuracy of module decoupling, which must be performed without any prior knowledge of the TPL. However, it is quite difficult in practice to find clear and correct boundaries of in-app TPLs. Below are two examples.