Shoutout to +Jakob G! Thanks to his efforts while interning in Aarhus, the 1.9 release of the Dart VM includes a port of V8’s Irregexp engine, making regular expressions up to 150 times faster than before! We chose a different approach from V8 when integrating Irregexp: we reuse Dart’s existing optimizing compiler and code-generation backend. This reuse reduces maintenance cost and shares optimization effort: optimizations for Dart code benefit regular expressions, and vice versa.
In V8, Irregexp compiles a regular expression by parsing it and converting it into an intermediate automaton representation, which V8 then analyzes and optimizes before generating native machine code directly from it. The V8 implementation therefore requires a native-code backend for each supported host architecture. Indeed, at the time of writing, V8 has seven distinct Irregexp backends.
In Dart, Irregexp starts out just as in V8: it parses the regular expression, converts it to the automaton representation, then analyzes and optimizes it. Instead of emitting machine code directly, however, Dart generates IR (intermediate representation) instructions. This is the same IR used for ordinary Dart code, so the existing Dart optimizing compiler can further optimize it and generate native machine code.
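To make the split between a regex-specific front end and a shared back end concrete, here is a toy sketch in Python. It is not the Dart VM's IR or compiler: the `Instr` type, the opcode names, and the `compile_regex` and `run` functions are all hypothetical, and the supported syntax is limited to literal characters, `.`, and a postfix `*`. The point is only that the front end lowers a pattern to generic instructions, and a single back end handles every program it is given.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    """One instruction of a hypothetical toy IR (not the Dart VM's IR)."""
    op: str                     # "CHAR", "ANY", "SPLIT", "JMP", or "MATCH"
    arg: Optional[str] = None   # the character to match, for CHAR
    x: int = 0                  # jump target (JMP) or first branch (SPLIT)
    y: int = 0                  # second branch (SPLIT)


def compile_regex(pattern: str) -> List[Instr]:
    """Front end: lower literals, '.', and postfix '*' to toy IR."""
    prog: List[Instr] = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        atom = Instr("ANY") if ch == "." else Instr("CHAR", arg=ch)
        if i + 1 < len(pattern) and pattern[i + 1] == "*":
            start = len(prog)
            # Greedy loop: try the body first, otherwise skip past the loop.
            prog.append(Instr("SPLIT", x=start + 1, y=start + 3))
            prog.append(atom)
            prog.append(Instr("JMP", x=start))
            i += 2
        else:
            prog.append(atom)
            i += 1
    prog.append(Instr("MATCH"))
    return prog


def run(prog: List[Instr], text: str, pc: int = 0, sp: int = 0) -> bool:
    """Shared back end: run any IR program, anchored at the start of text."""
    while True:
        ins = prog[pc]
        if ins.op == "MATCH":
            return True
        if ins.op in ("CHAR", "ANY"):
            if sp >= len(text):
                return False
            if ins.op == "CHAR" and text[sp] != ins.arg:
                return False
            pc += 1
            sp += 1
        elif ins.op == "JMP":
            pc = ins.x
        elif ins.op == "SPLIT":
            # Backtracking: explore the first branch, fall back to the second.
            if run(prog, text, ins.x, sp):
                return True
            pc = ins.y


if __name__ == "__main__":
    program = compile_regex("ab*c")
    print(run(program, "abbbc"))  # True
    print(run(program, "abd"))    # False
```

In this sketch, any speedup made inside `run` applies to every compiled pattern at once. That is a loose analogy for how a single shared backend lets improvements carry over between regular expressions and ordinary code.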
The Dart implementation has been tested against the same benchmark suite that was developed for V8’s Irregexp. On this suite, the Dart VM is within a factor of two of V8. For short-running regular expressions, such as those used to parse URLs, Dart is actually faster thanks to a very fast entry into the generated matching code.
There are several reasons we don’t hit the same peak performance as V8 across the board. For example, Dart spends more time compiling regular expressions because, after building the Dart IR, we optimize the code further. Also, V8’s hand-tuned machine-code backends are expertly tailored to executing regular-expression code on each individual platform. The machine code Dart produces is not as efficient because the existing optimizing compiler can’t make the same assumptions about the code (such as what to hold in registers and what not to). We will be looking at these issues, and because Dart has a single shared backend, any improvements we make there benefit the Dart VM as a whole.
We hope you enjoy Dart's new and improved regular expressions. Look for the new implementation starting with Dart 1.9, which is now in the developer channel.