KMP String Search

May 6, 2021


I have done over 1750+ leetcode problems to prepare for software engineer interviews, and my goal is to finish all the leetcode problems by end of this year (there are 1851 in total right now, but it is increasing every week). To me this is an accomplishment to achieve.

I will try to make a few posts regarding some of the hard problems I have done. I found a very interesting one and a brilliant solution as well when doing Leetcode problems - the KMP for string searching.

The problem is easy to understand: given a text string and a pattern string, try to locate the index in the text string where we can find a match between the text and the pattern. For example: we have a text "12123234" and we have the pattern "23", we need to output index 3 and 5 because those are the places where we can find the pattern "23".

The brute-force solution will be checking every character one by one, O(m*n), suppose m is the length of the pattern and n is the length of the text. This is bad. We need a better solution - KMP which gives us O(m + n). The underlying algorithm explanation can be found online, so I will skip the discussion of it.

Step 1: Construct LPS table

We will take the pattern as the input, and generate an array with indexes; each index will be used for resuming a failed match. The idea is to conservatively step back instead of starting from the very beginning again. The LPS table records those indexes where to resume.

Step 2: KMP search

Now we will just use the LPS table and perform our searches.

Step 3: Tests

As a final step we can write up a few tests to make sure our algorithm works fine. Notice that our functions can take both array or string input.

The final output looks like:

Found pattern at index 0

Found pattern at index 3

Found pattern at index 2

And we can verify they are correct!