5.5. Anchors

Anchors are zero-width matches. They don’t match any actual characters in the search string, and they don’t consume any of the search string during parsing. Instead, an anchor dictates a particular location in the search string where a match must occur.

^ \A

Anchor a match to the start of <string>.

When the regex parser encounters ^ or \A, the parser’s current position must be at the beginning of the search string for it to find a match.

Whatever follows ^ or \A must constitute the start of the search string.
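A minimal sketch of the start anchors, assuming Python's built-in re module; the pattern foo and the sample strings are illustrative choices, not taken from the original text:

```python
import re

# Both anchors require 'foo' to sit at the very start of the search string.
start_caret = re.search(r'^foo', 'foobar')
start_a = re.search(r'\Afoo', 'foobar')

# Here 'foo' is present, but not at the start, so the anchor rejects it.
not_at_start = re.search(r'^foo', 'barfoo')

print(start_caret.span())  # (0, 3)
print(start_a.span())      # (0, 3)
print(not_at_start)        # None
```

The anchor itself contributes nothing to the span: the match covers only the three characters of foo.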

$ \Z

Anchor a match to the end of <string>.

When the regex parser encounters $ or \Z, the parser’s current position must be at the end of the search string for it to find a match.

Whatever precedes $ or \Z must constitute the end of the search string.
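The end anchors can be sketched the same way. The pattern bar and the sample strings are illustrative assumptions. The last two lines show one detail worth knowing in Python's re module: $ also matches just before a single trailing newline, whereas \Z matches only at the very end of the string.

```python
import re

# Both anchors require 'bar' to sit at the end of the search string.
end_dollar = re.search(r'bar$', 'foobar')
end_z = re.search(r'bar\Z', 'foobar')

# 'bar' is present, but not at the end, so the anchor rejects it.
not_at_end = re.search(r'bar$', 'barfoo')

# With one trailing newline, $ still matches but \Z does not.
dollar_newline = re.search(r'bar$', 'foobar\n')
z_newline = re.search(r'bar\Z', 'foobar\n')

print(end_dollar.span())      # (3, 6)
print(end_z.span())           # (3, 6)
print(not_at_end)             # None
print(dollar_newline.span())  # (3, 6)
print(z_newline)              # None
```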

\b

Anchor a match to a word boundary.

\b asserts that the regex parser’s current position must be at the beginning or end of a word. A word consists of a sequence of alphanumeric characters or underscores ([a-zA-Z0-9_]), the same as for the \w character class.

> Raw strings do not process escape sequences (\n, \b, etc.), so they are commonly used for regex patterns, which often contain many \ characters. The r prefix marks a string literal as raw, which means backslashes reach the regex engine unchanged. This matters for \b in particular: in an ordinary Python string literal, \b is the backspace character, not a word-boundary anchor.
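A short sketch of \b in practice, again assuming Python's re module with illustrative patterns and strings. The final call omits the r prefix on purpose, to show what goes wrong when the pattern is not a raw string.

```python
import re

# 'bar' starts a word here, so \b before it succeeds.
at_word_start = re.search(r'\bbar', 'foo bar')

# 'foo' ends a word here ('.' is not a word character).
at_word_end = re.search(r'foo\b', 'foo.bar')

# 'bar' sits inside the word 'foobar', so there is no boundary before it.
inside_word = re.search(r'\bbar', 'foobar')

# \b on both sides matches 'bar' only as a whole word.
whole_word = re.search(r'\bbar\b', 'foo(bar)baz')

# Without the r prefix, '\b' is a single backspace character,
# so the pattern looks for backspace + 'bar' and finds nothing.
no_raw_prefix = re.search('\bbar', 'foo bar')

print(at_word_start.span())  # (4, 7)
print(at_word_end.span())    # (0, 3)
print(inside_word)           # None
print(whole_word.span())     # (4, 7)
print(no_raw_prefix)         # None
print(len('\b'), len(r'\b')) # 1 2
```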

\B

Anchor a match to a location that isn't a word boundary.

\B does the opposite of \b. It asserts that the regex parser's current position must not be at the start or end of a word.

> In the above examples, a match happens on lines 3, 4, and 6. On line 4, there's no st that is bounded by a non-word.
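As a separate minimal sketch of \B (the pattern and sample strings are illustrative assumptions and are not the examples the note above refers to), the following requires foo to be surrounded by word characters on both sides:

```python
import re

# 'foo' is embedded in a longer word, so neither edge is a word boundary.
embedded = re.search(r'\Bfoo\B', 'barfoobaz')

# 'foo' is the entire string: both edges are word boundaries, so \B fails.
standalone = re.search(r'\Bfoo\B', 'foo')

# '.' is not a word character, so both edges of 'foo' are boundaries again.
dotted = re.search(r'\Bfoo\B', '.foo.')

print(embedded.span())  # (3, 6)
print(standalone)       # None
print(dotted)           # None
```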

Exercise 5.5

Read the contents of python.txt into a variable text. Then do the following:

  1. Split the text into words
    Split the text on whitespace and save the result in a variable splitted_text.

  2. Find all the words that start with 're'
    In splitted_text, find all the words that start with 're'. Print how many there are.

  3. Find all the words that end with 'ing'
    In splitted_text, find all the words that end with 'ing'. Print how many there are.

  4. Find all 'to', 'of', 'in'
    Using the original text, find all instances of the words 'to', 'of', and 'in'. Print the count of each.