5 Types of Tokenizers
Alphabetic tokenizer
Alphanumeric tokenizer
Delimiter tokenizer
Q-gram tokenizer
Whitespace tokenizer
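
To make the five tokenizer types concrete, here is a minimal pure-Python sketch of what each one typically does. The function names, the default delimiter, the ASCII-only character classes, and the unpadded q-gram behavior are all assumptions made for illustration; they are not the API of any particular library, and a real implementation may differ (for example by padding q-grams or returning sets instead of lists).

```python
import re


def alphabetic_tokenize(text):
    """Return maximal runs of ASCII letters."""
    return re.findall(r"[a-zA-Z]+", text)


def alphanumeric_tokenize(text):
    """Return maximal runs of ASCII letters and digits."""
    return re.findall(r"[a-zA-Z0-9]+", text)


def delimiter_tokenize(text, delimiters=(",",)):
    """Split on any of the given delimiter strings, dropping empty tokens."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    return [tok for tok in re.split(pattern, text) if tok]


def qgram_tokenize(text, q=2):
    """Return all overlapping substrings of length q (no padding; an assumption)."""
    return [text[i:i + q] for i in range(len(text) - q + 1)]


def whitespace_tokenize(text):
    """Split on runs of whitespace."""
    return text.split()
```

For the input "data9,science 101", the alphabetic sketch keeps only "data" and "science", while the alphanumeric sketch also keeps "data9" and "101" as tokens. The q-gram sketch with q=2 turns "data" into "da", "at", "ta".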
23 Types of String Similarity Measures
Affine gap
Bag distance
Cosine
Dice
Editex
Generalized Jaccard
Hamming distance
Jaccard
Jaro
Jaro-Winkler
Levenshtein distance
Monge-Elkan
Needleman-Wunsch
Overlap coefficient
Partial ratio
Partial token sort
Ratio
Smith-Waterman
Soft TF/IDF
Soundex
TF/IDF
Token sort
Tversky index
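
The measures above fall into two broad groups: those that compare token collections (such as Jaccard and the overlap coefficient) and those that compare character sequences (such as Hamming and Levenshtein distance). The sketch below shows one representative pair from each group using their textbook definitions. The function names and the handling of empty inputs are assumptions for illustration only, not the interface of any specific library.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity: |A & B| / |A | B| over token sets."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets are identical
    return len(a & b) / len(a | b)


def overlap_coefficient(tokens_a, tokens_b):
    """Overlap coefficient: |A & B| / min(|A|, |B|) over token sets."""
    a, b = set(tokens_a), set(tokens_b)
    if not a or not b:
        return 0.0  # assumed convention when either set is empty
    return len(a & b) / min(len(a), len(b))


def hamming_distance(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def levenshtein_distance(s, t):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn s into t (row-by-row dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]
```

Token-based measures take tokenizer output as input, so a typical pipeline tokenizes both strings first and then passes the token lists to a measure such as jaccard. Sequence-based measures operate on the raw strings directly.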