Dynamic Regexes

Overview

Static regexes are dandy, but sometimes you need something a bit more ... dynamic. Imagine you are developing a text editor with a regex search/replace feature. You need to accept a regular expression from the end user as input at run-time. There should be a way to parse a string into a regular expression. That's what xpressive's dynamic regexes are for. They are built from the same core components as their static counterparts, but they are late-bound so you can specify them at run-time.

Construction and Assignment

There are two ways to create a dynamic regex: with the basic_regex<>::compile() function or with the regex_compiler<> class template. Use basic_regex<>::compile() if you want the default locale. Use regex_compiler<> if you need to specify a different locale. In the section on regex grammars, we'll see another use for regex_compiler<>.

Here is an example of using basic_regex<>::compile():

sregex re = sregex::compile( "this|that", regex_constants::icase );

Here is the same example using regex_compiler<>:

sregex_compiler compiler;
sregex re = compiler.compile( "this|that", regex_constants::icase );

basic_regex<>::compile() is implemented in terms of regex_compiler<>.

Dynamic xpressive Syntax

Since the dynamic syntax is not constrained by the rules for valid C++ expressions, we are free to use familiar syntax for dynamic regexes. For this reason, the syntax used by xpressive for dynamic regexes follows the lead set by John Maddock's proposal to add regular expressions to the Standard Library. It is essentially the syntax standardized by ECMAScript, with minor changes in support of internationalization.

Since the syntax is documented exhaustively elsewhere, I will simply refer you to the existing standards, rather than duplicate the specification here.

Internationalization

As with static regexes, dynamic regexes support internationalization by allowing you to specify a different std::locale. To do this, you must use regex_compiler<>. The regex_compiler<> class has an imbue() function. After you have imbued a regex_compiler<> object with a custom std::locale, all regex objects compiled by that regex_compiler<> will use that locale. For example:

std::locale my_locale = /* initialize your locale object here */;
sregex_compiler compiler;
compiler.imbue( my_locale );
sregex re = compiler.compile( "\\w+|\\d+" );

This regex will use my_locale when evaluating the intrinsic character sets \w and \d, which are written as "\\w" and "\\d" in the C++ string literal.

