Role of similarity measures in data science