New regex generator tool online!

posted Dec 19, 2014, 6:32 AM by Alberto Bartoli   [ updated Dec 19, 2014, 6:44 AM by Eric Medvet ]
Our new regex generator is online!

Let's briefly summarize what has happened.

Epoch 1

At ACM GECCO 2012 we presented the paper "Automatic generation of regular expressions from examples with genetic programming". We greatly improved on the existing state of the art, demonstrating a tool capable of synthesizing a regex for text extraction tasks of practical complexity:
  1. automatically,
  2. based only on examples of the desired behavior,
  3. without any external hint about what the target regex should look like.

(We can probably claim this was done "for the first time".)

Epoch 2

We continued to work intensively on this topic. We greatly improved our algorithm and had another paper accepted by IEEE Computer. We made this result publicly available as a webapp.

This tool generates regular expressions for extracting text snippets and attempts to generalize beyond the provided examples, i.e., it attempts to infer the general pattern that the user has in mind.

A side effect of this work was another tool capable of playing regex golf (see here and here) automatically. Our results were published at ACM GECCO 2013, and we were also finalists at the annual Human-Competitive awards. The regex golf tool classifies input strings and overfits the examples, i.e., it performs no extraction and no generalization.
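
To make the difference between the two tasks concrete, here is a minimal Python sketch. It is only an illustration: the regexes are hand-written stand-ins rather than output of the tools, the example strings are made up, and the helper names `extract` and `classify` are hypothetical.

```python
import re

# Illustration only: both regexes are hand-written stand-ins,
# not regexes produced by the tools described in this post.

# Extraction: the regex should pull the desired snippets out of a longer
# text, and it should also work on text that was not among the examples.
date_regex = re.compile(r"\d{4}-\d{2}-\d{2}")


def extract(regex, text):
    """Return every substring of `text` matched by `regex`."""
    return [m.group(0) for m in regex.finditer(text)]


# Regex golf: the regex only has to match every string in one list and
# no string in the other list. It classifies whole strings and is free
# to overfit the two lists.
golf_regex = re.compile(r"oo")


def classify(regex, strings):
    """Return the strings in which `regex` matches somewhere."""
    return [s for s in strings if regex.search(s)]


# Made-up data for the two tasks.
seen_text = "Posted 2014-12-19, updated 2014-12-20."
unseen_text = "The next release is due 2015-01-31."
must_match = ["foo", "book", "moon"]
must_not_match = ["bar", "bake", "man"]

print(extract(date_regex, seen_text))
print(extract(date_regex, unseen_text))
print(classify(golf_regex, must_match + must_not_match))
```

In the extraction case the interesting question is whether the regex still finds the right snippets in `unseen_text`. In the regex golf case nothing beyond the two given lists matters.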

Epoch 3

We kept working intensively and improved our algorithm further in many respects. The new tool made public today is much more powerful than the previous one. That's why we have called it "Regex++".

Next month our IEEE Computer paper will go to press. We are now preparing another submission describing the new tool... stay tuned.