Tokenization is a way of separating a piece of text into smaller units called tokens. Here, tokens can be either words, characters, or sub-words. Hence, tokenization can be broadly classified into three types: word, character, and sub-word (n-gram character) tokenization.
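To make the three types concrete, here is a minimal sketch in plain Python. The sample string is illustrative, the word split is a naive whitespace approximation, and the sub-word scheme shown is simple character trigrams:

```python
text = "Tokenization is useful"

# Word tokens: naive split on whitespace
words = text.split()            # ['Tokenization', 'is', 'useful']

# Character tokens: every character becomes a token
chars = list(text)              # ['T', 'o', 'k', 'e', ...]

# Sub-word tokens: character trigrams (n = 3) as a simple n-gram scheme
n = 3
subwords = [text[i:i + n] for i in range(len(text) - n + 1)]
# ['Tok', 'oke', 'ken', 'eni', ...]
```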
Sentence Tokenization
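Sentence tokenization splits raw text into individual sentences. A minimal sketch using NLTK's sent_tokenize, assuming NLTK is installed; the sample text is illustrative and the Punkt model download is a one-time setup step:

```python
import nltk
nltk.download('punkt')  # one-time download; newer NLTK versions may need 'punkt_tab'

from nltk.tokenize import sent_tokenize

text = "Tokenization splits text into units. Sentences are one such unit."
print(sent_tokenize(text))
# ['Tokenization splits text into units.', 'Sentences are one such unit.']
```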
Word Tokenization
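Word tokenization splits a sentence into individual words and punctuation marks. A minimal sketch with NLTK's word_tokenize, again with an illustrative sample string; note that punctuation comes back as separate tokens:

```python
import nltk
nltk.download('punkt')  # one-time download; newer NLTK versions may need 'punkt_tab'

from nltk.tokenize import word_tokenize

text = "Tokenization splits text into units."
print(word_tokenize(text))
# ['Tokenization', 'splits', 'text', 'into', 'units', '.']
```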