In Grammars
See primary documentation in context for Named Regexes.
The main ingredient of grammars is named regexes. While the syntax of Raku Regexes is outside the scope of this document, named regexes have a special syntax, similar to subroutine definitions: [1]
my regex number { \d+ [ \. \d+ ]? }
In this case, we have to specify that the regex is lexically scoped using the my keyword, because named regexes are normally used within grammars.
Being named gives us the advantage of being able to easily reuse the regex elsewhere:
say so "32.51" ~~ &number;                         # OUTPUT: «True»
say so "15 + 4.5" ~~ /<number>\s* '+' \s*<number>/ # OUTPUT: «True»
regex isn't the only declarator for named regexes. In fact, it's the least common. Most of the time, the token or rule declarators are used. These are both ratcheting, which means that the match engine won't back up and try again if it fails to match something. This will usually do what you want, but isn't appropriate for all cases:
my regex works-but-slow { .+ q }
my token fails-but-fast { .+ q }
my $s = 'Tokens won\'t backtrack, which makes them fail quicker!';
say so $s ~~ &works-but-slow; # OUTPUT: «True»
say so $s ~~ &fails-but-fast; # OUTPUT: «False»
                              # the entire string is taken by the .+
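As a rough sketch of what ratcheting means, the :ratchet adverb can be placed inside a plain regex to switch off backtracking there, which should make it behave like the token above. The names ratcheted-regex and $text are made up for this illustration.

```raku
# Hypothetical names, for illustration only.
# :ratchet turns off backtracking for the rest of this regex.
my regex ratcheted-regex { :ratchet .+ q }
my $text = 'Tokens won\'t backtrack, which makes them fail quicker!';

# The .+ consumes the rest of the string and is never asked to give
# characters back, so the trailing q cannot match.
say so $text ~~ &ratcheted-regex; # OUTPUT: «False»
```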
Note that non-backtracking applies to individual terms: as the example below shows, once a term has matched, the engine will never back up into it. If the match then fails and there is another alternative introduced by | or ||, that alternative is still tried.
my token tok-a { .* d };
my token tok-b { .* d | bd };
say so "bd" ~~ &tok-a; # OUTPUT: «False»
say so "bd" ~~ &tok-b; # OUTPUT: «True»
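The same should hold for the sequential alternation ||, which the example above does not cover. A minimal sketch, using the made-up name tok-c:

```raku
# Hypothetical name, for illustration only.
# The first branch fails because .* consumes "bd" and never gives the d back;
# the second branch is then tried from the same starting position.
my token tok-c { .* d || bd };
say so "bd" ~~ &tok-c; # OUTPUT: «True»
```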