Regular Expressions
Regular expressions are enclosed in matching pound sign (#) characters with a prepended re: re#...#. KRL regular expressions follow the JavaScript standard, which closely follows the conventions for Perl regular expressions. The following modifiers may appear in any order after the closing character:
i: The i modifier makes the regular expression case insensitive.
g: The g modifier applies the regular expression globally.
For example, the following code replaces the first instance of foo in p with bar:
p.replace(re#foo#, "bar")
In contrast, the following code replaces all instances of foo in p with bar:
p.replace(re#foo#g, "bar")
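Because KRL regular expressions follow the JavaScript standard, the effect of the two modifiers can be sketched with JavaScript's own regular expressions. The TypeScript below is an analogy, not KRL: the sample string and variable names are made up for illustration, and it assumes that KRL's replace operator behaves like JavaScript's String.prototype.replace for these simple patterns.

```typescript
// Hypothetical sample input; any string containing "foo" would do.
const p: string = "foo, Foo and foo";

// No modifier: only the first match is replaced.
const firstOnly: string = p.replace(/foo/, "bar");   // "bar, Foo and foo"

// g modifier: every case-sensitive match is replaced.
const allMatches: string = p.replace(/foo/g, "bar"); // "bar, Foo and bar"

// Both modifiers, in either order: every match, ignoring case.
const anyCase: string = p.replace(/foo/gi, "bar");   // "bar, bar and bar"
const sameThing: string = p.replace(/foo/ig, "bar"); // "bar, bar and bar"
```

The last two lines differ only in the order of the modifiers and give the same result.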
Special characters
As in strings, the only special characters are the terminator (#) and the backslash (\). To use pound signs or backslashes inside regular expressions, escape them with backslashes:
re#\#\\# // '#' followed by '\'
A newline (\n) requires a line break:
re#
#
Other characters can be inserted literally (some text editors are better at this than others); alternatively, build the string with chr() and convert it to a regular expression.
Rationale
KRL uses the hash character to delimit regular expressions instead of the more common (and acceptable) slash (/) because the slash is a frequently used character in URL paths. This removes the need to escape each slash with a backslash, as in re/\/archives\/\d{4}\//. Using an alternate delimiter makes the regex more readable and thus communicates its meaning more clearly: re#/archives/\d{4}/#.
Samples
Some regular expressions found in KRL code "in the wild".
timestamp
select when d1 t1 time re#^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[.]\d{3}Z)$# setting(timestamp)
The regular expression above matches a string such as the one produced by the time:now() function in the time library. Notice that it expects the string to consist entirely, from beginning (^) to end ($), of an ISO 8601 time string, and that the string is captured (note the surrounding pair of parentheses). In this context, a rule controlled by the event expression shown above would be selected when an incoming event had domain d1, type t1, and an event attribute named "time" that satisfied the ISO 8601 pattern specified by the regular expression. Within the rest of the rule, the validated value is available as the value bound to the name timestamp.
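As a rough illustration of what the pattern accepts and rejects, the same expression can be tried in TypeScript, since KRL regular expressions follow the JavaScript standard. This is an analogy rather than KRL: the helper name captureTimestamp and the sample strings are hypothetical, and Date.prototype.toISOString is used only as a stand-in that happens to produce a string of the same shape.

```typescript
// The same pattern as in the event expression, written as a JavaScript literal.
const ISO_TIMESTAMP = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[.]\d{3}Z)$/;

// Hypothetical helper: returns the captured timestamp, or null when the
// whole string is not a timestamp of exactly this shape.
function captureTimestamp(value: string): string | null {
  const m = ISO_TIMESTAMP.exec(value);
  return m ? m[1] : null;
}

const ok = captureTimestamp("2024-05-17T09:30:00.123Z");      // "2024-05-17T09:30:00.123Z"
const noMillis = captureTimestamp("2024-05-17T09:30:00Z");    // null: milliseconds are required
const padded = captureTimestamp(" 2024-05-17T09:30:00.123Z"); // null: the leading space defeats ^
const fromDate = captureTimestamp(new Date(0).toISOString()); // "1970-01-01T00:00:00.000Z"
```

The null results correspond to events for which the rule would not be selected.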
ECI
re#^[A-Za-z0-9]{21,22}$#
The regular expression above matches an event channel identifier, which in the Node.js pico engine is a distributed identifier or DID.
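A small TypeScript sketch of the same check, offered as an analogy under the assumption that the JavaScript form of the pattern behaves identically. The function name looksLikeEci and the sample identifiers are invented for illustration; they are not real channel identifiers.

```typescript
// Same pattern as above: 21 or 22 ASCII letters or digits, nothing else.
const ECI_PATTERN = /^[A-Za-z0-9]{21,22}$/;

// Hypothetical helper; it only checks the shape of the string.
function looksLikeEci(candidate: string): boolean {
  return ECI_PATTERN.test(candidate);
}

const len21 = looksLikeEci("ABCDEFGHIJ0123456789a");   // true  (21 characters)
const len22 = looksLikeEci("ABCDEFGHIJ0123456789ab");  // true  (22 characters)
const len23 = looksLikeEci("ABCDEFGHIJ0123456789abc"); // false (23 characters)
const hyphen = looksLikeEci("ABCDEFGHIJ-123456789a");  // false (21 characters, but "-" is not allowed)
```

A match says only that the string has the right shape, not that a channel with that identifier exists.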
non-negative integer
re#^(\d+)$#
non-empty string of characters
re#(.+)#
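The two patterns above can be tried out in TypeScript as a sketch, again relying on KRL following the JavaScript standard. The constant names and sample inputs are made up for illustration.

```typescript
// Anchored: the whole string must be one or more digits.
const NON_NEGATIVE_INT = /^(\d+)$/;
// Unanchored: any run of one or more characters, so only "" fails.
const NON_EMPTY = /(.+)/;

const intOk = NON_NEGATIVE_INT.exec("42");       // matches, capture is "42"
const intNeg = NON_NEGATIVE_INT.exec("-42");     // null: "-" is not a digit
const intDec = NON_NEGATIVE_INT.exec("4.2");     // null: "." is not a digit
const strEmpty = NON_EMPTY.exec("");             // null: nothing to capture
const strOk = NON_EMPTY.exec("hello world");     // matches, capture is "hello world"
```

The anchors make the first pattern a validation of the whole value, while the second accepts anything that is not empty.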
a literal decimal point
re#[.]#
This appeared in a function to compute the integer part of a non-negative value which might contain a decimal point:
math_int = function(num) {
  val = num.as("String");
  dec = val.match(re#[.]#);
  dec => val.extract(re#(\d*)[.]\d*#)[0].as("Number") | num;
}
...
tens = math_int( x / 10 )
The function uses the string operator extract to isolate the digits before the decimal point in the number computed as x / 10.
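For readers more familiar with JavaScript, the same logic might look like the TypeScript sketch below. It is a hypothetical counterpart, not part of KRL: the name mathInt is invented, and it assumes that extract returns the captured groups as an array, so that [0] in the KRL code corresponds to the first capture group here.

```typescript
// Hypothetical TypeScript counterpart of math_int.
function mathInt(num: number): number {
  const val = String(num);            // like num.as("String")
  const hasPoint = /[.]/.test(val);   // like val.match(re#[.]#)
  if (!hasPoint) {
    return num;                       // no decimal point: return the value unchanged
  }
  const m = /(\d*)[.]\d*/.exec(val);  // like val.extract(re#(\d*)[.]\d*#)
  return m ? Number(m[1]) : num;      // first capture group, converted to a number
}

const tens = mathInt(47 / 10);  // 4
const whole = mathInt(50 / 10); // 5
const small = mathInt(3 / 10);  // 0
```

JavaScript prints very large and very small numbers in exponent notation, which this sketch does not handle.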
Copyright Picolabs | Licensed under Creative Commons.