July 15, 2002

Bugfixes in regex

Oops, spoke too soon. expand.lisp was indeed being used (by defregex, which was used by the speed test code). A new version that fixes this is now available.

Posted by: mparker762 at 06:47 PM
Bugfixes in regex

Uploaded a new version of regex.tgz and tputils.tgz that doesn't have the spurious reference to "expand.lisp" in the system file. It wasn't being used by this release.

Posted by: mparker762 at 04:30 PM
July 14, 2002

Improved CMUCL compatibility

Uploaded a new version of regex.tgz and tputils.tgz that doesn't use the "finally return" extensions to LOOP, which apparently CMUCL and ACL don't like very much.
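
For readers unfamiliar with the issue: the ANSI LOOP grammar defines the clause as `finally` followed by one or more compound forms, so a bare `finally return value` is outside the standard clause syntax and only some implementations accept it. The sketch below shows the portable spelling. The function `count-matching` is a made-up illustration and is not taken from regex.tgz or tputils.tgz.

```lisp
;; Hypothetical helper, for illustration only.
;; Non-portable spelling that some LOOP implementations reject:
;;   (loop ... finally return hits)
;; Portable spelling: wrap the RETURN in a compound form.
(defun count-matching (predicate list)
  "Count the elements of LIST that satisfy PREDICATE."
  (loop with hits = 0
        for item in list
        when (funcall predicate item) do (incf hits)
        finally (return hits)))
```

With this definition, `(count-matching #'evenp '(1 2 3 4))` returns 2.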

Posted by: mparker762 at 10:34 AM
June 06, 2002

Regex Thoughts

After skimming Wall's latest Apocalypse on regexes, I think the next version of the CLAWK regex engine will move towards supporting something close to this, although probably as a separate syntax. Several of the features he's talking about are supported by the intermediate representation and the backend compilers, but there's no good way to add them to the surface syntax and remain compatible with AWK regexes.

Besides the changes to the surface parsers, I will also be putting in support for a sexpr surface syntax.

Posted by: mparker762 at 03:55 AM
May 09, 2002

Regex Compiler Experiments

After further testing the new sexpr-generating backend, I think I'm gonna have to abandon it. While it works just fine, the compile times get incredibly long for complicated patterns. The problem seems to be in the LispWorks compiler itself, which has to chew on the large number of internal functions and the sheer size of the code.

Given the relatively small improvements in matching speed, I think my next tack is to rewrite it to simply generate code to build a closure-based matcher.
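
One possible reading of "generate code to build a closure-based matcher" is sketched below: instead of emitting the whole matcher as one large expression, the generator emits a short form made of calls to ordinary constructor functions, and those functions build the closures at load time. Everything here is an assumption made for illustration: the toy pattern format (a list of characters and `:any`), and the names `match-end`, `match-any`, `match-char`, and `emit-builder-form`. None of it is the actual CLAWK code.

```lisp
;; Runtime constructors. Each returns a closure of (text pos) that
;; yields the end index of a match, or NIL on failure.
(defun match-end ()
  (lambda (text pos)
    (declare (ignore text))
    pos))

(defun match-any (next)
  (lambda (text pos)
    (and (< pos (length text))
         (funcall next text (1+ pos)))))

(defun match-char (ch next)
  (lambda (text pos)
    (and (< pos (length text))
         (char= ch (char text pos))
         (funcall next text (1+ pos)))))

;; Code generator: returns a form whose size is linear in the pattern
;; and which contains no internal function definitions of its own.
(defun emit-builder-form (pattern)
  (if (null pattern)
      '(match-end)
      (let ((item (first pattern))
            (rest-form (emit-builder-form (rest pattern))))
        (if (eq item :any)
            `(match-any ,rest-form)
            `(match-char ,item ,rest-form)))))
```

Under these assumptions, `(emit-builder-form '(#\a :any))` returns the form `(match-char #\a (match-any (match-end)))`, which a macro could splice into its expansion. The compiler then only sees a few nested function calls rather than a large body of generated matching code.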

Posted by: mparker762 at 04:58 AM
May 08, 2002

New Regex Compiler Backend

It looks like the new back-end to my Common Lisp regular expression engine is about finished. This one returns Lisp s-exprs instead of closures, so it's suitable for use in macros like deflexer. It still needs a bit more brushing up before I'm ready to make it available, but it looks good so far. At the moment it doesn't seem to match any faster than the closure-based code. Using LispWorks 4.2, turning on inlining seems to improve the speed by 30% or so, but increases compile times exponentially and unacceptably. The old sexpr-generating regex compiler had the same unpleasant exploding-compile-time behavior with the Symbolics compiler, although there it gave an order-of-magnitude improvement in matching speed. At any rate, given the specialized nature of the code involved, I'm pretty sure I can hand-roll a customized inliner that is linear-time.
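
To make the closure versus s-expr distinction concrete, here is a minimal sketch of both styles of backend for a toy pattern language. It is not the engine's real code: the pattern format (a list of literal characters and `:any`) and the names `compile-to-closure` and `compile-to-sexpr` are invented for the example.

```lisp
;; Closure backend: the result is a function object, usable only at run time.
(defun compile-to-closure (pattern)
  "Return a function of (text pos) giving the end index of a match, or NIL."
  (if (null pattern)
      (lambda (text pos) (declare (ignore text)) pos)
      (let ((item (first pattern))
            (rest-matcher (compile-to-closure (rest pattern))))
        (lambda (text pos)
          (and (< pos (length text))
               (or (eq item :any) (char= item (char text pos)))
               (funcall rest-matcher text (1+ pos)))))))

;; S-expr backend: the result is source code (a LAMBDA expression), so it
;; can be handed to COMPILE or returned from a macro.
(defun compile-to-sexpr (pattern)
  "Return a LAMBDA expression that matches PATTERN like the closure version."
  (labels ((emit (items pos-form)
             (if (null items)
                 pos-form
                 (let ((pos (gensym "POS")))
                   `(let ((,pos ,pos-form))
                      (and (< ,pos (length text))
                           ;; :ANY needs no character test, so emit none.
                           ,@(unless (eq (first items) :any)
                               `((char= ,(first items) (char text ,pos))))
                           ,(emit (rest items) `(1+ ,pos))))))))
    `(lambda (text start)
       (declare (ignorable text))
       ,(emit pattern 'start))))
```

A usage sketch under the same assumptions:

```lisp
(funcall (compile-to-closure '(#\a :any #\c)) "abc" 0)              ; => 3
(funcall (compile nil (compile-to-sexpr '(#\a :any #\c))) "abc" 0)  ; => 3
```

The generated expression grows with the pattern, and every step is visible to the compiler, which gives the compiler more to optimize and also more to chew on.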

Posted by: mparker762 at 03:48 AM
