semi-integrated regex

There are currently 2 forms of regex literals, both for re2. (Other literal types might be added later.)

re/pattern/
RE/pattern/

As with string literals, the lowercase forms, such as re//, interpret langur escape codes and interpolations, and the uppercase forms, such as RE//, do not.

Valid quote mark pairs are the same as for strings, as are interpolations and langur escape codes.

These literals can be passed to regex functions (discussed below), such as match() and replace().

escape code resolution

The escape codes of langur will not always match the escape codes of a given regex flavor. For example, the \P code represents a Unicode paragraph separator to langur, but a negated property class to re2 and other regex flavors. This is not a conflict, since the two are not interpreted together.

With the lowercase forms, langur escape codes are interpreted before the pattern is passed to the regex compiler, so that re/\\P{Lu}/ and RE/\P{Lu}/ produce the same regex.
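The same two-stage resolution can be sketched in Go, whose regexp package implements RE2 syntax. This is an analogy rather than langur code: an interpreted Go string stands in for the lowercase re// form, a raw Go string stands in for the uppercase RE// form, and the helper matchesWhole is a hypothetical name introduced only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// escapedPattern is written in an interpreted string, so the backslash
// must be doubled to reach the regex compiler as a single backslash
// (comparable to re/\\P{Lu}/).
const escapedPattern = "\\P{Lu}"

// rawPattern is written in a raw string, so no escape codes are resolved
// before the regex compiler sees it (comparable to RE/\P{Lu}/).
const rawPattern = `\P{Lu}`

// matchesWhole (hypothetical helper) reports whether the whole of s
// matches pattern under RE2 syntax.
func matchesWhole(pattern, s string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(s)
}

func main() {
	// Both spellings hand the identical text to the regex compiler.
	fmt.Println(escapedPattern == rawPattern) // true

	// To RE2, \P{Lu} is a negated property class: any character that
	// is not an uppercase letter.
	fmt.Println(matchesWhole(rawPattern, "a")) // true
	fmt.Println(matchesWhole(rawPattern, "A")) // false
}
```

The point of the sketch is only that the host language resolves its own escapes first, and the regex compiler then interprets whatever text remains.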

escaping metacharacters

There are 2 ways to escape metacharacters: at the start of an interpolation, or with the reEsc() function (for re2). Using a backslash \ at the opening of an interpolation, such as $re/\{\ .x}/ (note the backslash after the opening curly brace), indicates that you want to escape metacharacters in the interpolated value.

If this is used on a plain string interpolation, it has a different effect: it doubles all backslashes in the interpolated value.

regex functions

The regex functions understand all regex types available in langur.

In place of a string to test, these functions accept anything and convert it to a string if necessary.

in given expressions

In a given expression, a regex used in place of a variable or condition, with no explicit operator, is used to test the corresponding value against the regex.

given .x, .y {
    case re/abc+/: ...      # both match re2 regex pattern abc+
    case _, re/zzz/: ...    # .y matches re2 regex pattern zzz
}

given re/a+/ {
    case "abcd": ...     # "abcd" matches re2 pattern re/a+/
    case re/zzz/: ...    # re/a+/ is the same regex as re/zzz/, which it isn't
}