Role of similarity measures in data science

Introduction

Layman's explanation

Measuring the similarity or distance between two data points is fundamental to many machine learning algorithms, such as k-nearest neighbors and clustering. This document also describes other use cases.
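
To make the role of distance concrete, here is a minimal sketch of a k-nearest-neighbors classifier built on Euclidean distance. The function names (euclidean_distance, knn_predict), the choice of Euclidean distance, and the toy data are assumptions made for illustration; they do not come from any particular library.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k closest
    training points. `train` is a list of (point, label) pairs."""
    # Sort the training pairs by distance to the query and keep the k nearest.
    neighbors = sorted(
        train, key=lambda item: euclidean_distance(item[0], query)
    )[:k]
    # Count the labels of those neighbors and return the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups of 2-D points.
train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_predict(train, (0.2, 0.1)))
print(knn_predict(train, (5.5, 5.2)))
```

Swapping euclidean_distance for another distance function changes which points count as "near", and that is the sense in which the choice of measure is fundamental to the algorithm.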

Technical explanation

In statistics and related fields, a similarity measure or similarity function is a real-valued function that quantifies the similarity between two objects. Cosine similarity is a commonly used similarity measure for real-valued vectors.
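
As an illustration, cosine similarity is the dot product of two vectors divided by the product of their lengths, which gives a value between -1 and 1. The sketch below is one plain-Python way to compute it; the function name cosine_similarity and the decision to raise an error for zero vectors are choices made for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length real-valued vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # The angle is undefined when either vector has zero length.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 0], [0, 1]))        # perpendicular vectors
print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction
print(cosine_similarity([1, 0], [-1, 0]))       # opposite directions
```

Because the result depends only on the angle between the vectors, scaling either vector by a positive constant leaves the similarity unchanged.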

Similarity learning is an area of supervised machine learning in artificial intelligence. It is closely related to regression and classification, but the goal is to learn a similarity function that measures how similar or related two objects are. It has applications in ranking, recommendation systems, visual identity tracking, face verification, and speaker verification.

Uses of similarity measures

Cosine similarity is used to find and group documents similar to one supplied by the user.
BERT, a Transformer-based language model, is used to measure similarity between sentences.
Convolutional neural networks (CNNs) are used to find images similar to one supplied by the user.
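
The first use case above can be sketched in a few lines: turn each document into a vector of word counts and rank the documents by cosine similarity to the user's text. This is a simplified illustration only. The helper names (bag_of_words, cosine, rank_documents), the whitespace tokenization, and the sample documents are all assumptions, and real systems typically use richer representations such as TF-IDF weights or learned embeddings.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Represent a text as word counts (lowercased, split on whitespace)."""
    return Counter(text.lower().split())

def cosine(c1, c2):
    """Cosine similarity between two word-count vectors."""
    # Only words present in both texts contribute to the dot product.
    dot = sum(c1[w] * c2[w] for w in c1.keys() & c2.keys())
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    if n1 == 0 or n2 == 0:
        return 0.0  # treat an empty text as similar to nothing
    return dot / (n1 * n2)

def rank_documents(query, documents):
    """Return (score, document) pairs, most similar to the query first."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Hypothetical sample documents.
docs = [
    "the cat sat on the mat",
    "stock prices fell sharply",
    "a cat and a dog",
]
for score, doc in rank_documents("cat on mat", docs):
    print(round(score, 3), doc)
```

The same ranking pattern applies to the sentence and image use cases if the word-count vectors are replaced by embeddings produced by a model such as BERT or a CNN.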

References

https://en.wikipedia.org/wiki/Similarity_measure

https://medium.com/analytics-vidhya/semantic-similarity-in-sentences-and-bert-e8d34f5a4677

https://arxiv.org/pdf/1709.08761.pdf

https://en.wikipedia.org/wiki/Similarity_learning

https://dzone.com/articles/machine-learning-measuring

https://images.app.goo.gl/ktps3rhhRxesvWfq8
