Known regular expression differences between perl and regldg
Here is an "as-comprehensive-as-possible" list of the regular expression
features in perl which are either different or missing in regldg. I can't make
a fully comprehensive list because I cannot float on carpets in front of the regular
expression altar in a guru trance. If you find other differences between the
regular expressions of perl and those in regldg, please
let me know!
Perl Regex Grammar | Purpose, Status, Explanation, and Workaround |
^ | Purpose: matches the start of a line
Status: not implemented
Workaround: leave no character classes or meta-character classes at the beginning of your regex.
|
$ | Purpose: matches the end of a line
Status: not implemented
Workaround: leave no character classes or meta-character classes at the end of your regex.
|
. | Purpose: matches any character (excluding newline)
Status: Different.
Explanation: This meta-character is implemented, but it matches any character, including newline.
Workaround: If perl's functionality is desired, use the negated character class [^\n].
|
\nnn | Purpose: the ASCII character nnn (in octal), or possibly a backreference
Status: Different.
Explanation: As written, regldg treats this as a backreference to grouping number nnn. (Groupings are counted from their opening parenthesis.)
(regldg will also allow backreferences to be specified as \!{nnn} and \!nnn.)
Specifying octal characters is implemented using the syntax \o{nnn}.
This difference is to avoid the ambiguity of \10 in perl. Is it a backreference? Is it an octal character?
|
Greedy and non-greedy quantifiers | Purpose: Preferring certain matches over others
Status: not implemented.
Explanation: given variable-length quantifiers, regldg will vary from the smallest
to the largest possible values, for the quantifiers from the left to the right.
Basically, non-greedy quantifiers are not implemented, and the "greedy"
quantifiers have the default behavior.
|
\N{name} | Purpose: specify characters by name
Status: not implemented
Workaround: use the ASCII character value in octal, decimal, or hexadecimal format.
|
\cX | Purpose: specify control-X character
Status: not implemented
Workaround: use the ASCII character value in octal, decimal, or hexadecimal format.
|
\l (lowercase EL), \L, \Q, \E | Purpose: specify lowercase characters (\l, \L) and escape meta-characters (\Q)
Status: not implemented
Workaround: Use the lowercase values for \l and \L, and escape meta-characters instead of using \Q.
|
\u, \U, \E | Purpose: specify uppercase characters
Status: Different
Explanation: \u{nnn} and \U{nnn} represent universe character sets.
See the documentation on universe character sets for more information.
Workaround: Write the uppercase characters directly instead of using \u and \U.
|
\pP, \PP, \X, \C | Purpose: match named properties, extended Unicode "combining character sequences," and single C chars
Status: not implemented
Workaround: use the ASCII character values in octal, decimal, or hexadecimal format, or use a pre-defined or custom character class.
|
[:class:] | Purpose: POSIX named character classes
Status: not implemented
Workaround: use the standard character and meta-character classes, or make your own.
|
\b, \B, \A, \Z, \G | Purpose: match word boundaries (\b), string boundaries (\A and \Z), or end-of-previous-match positions (\G)
Status: not implemented
Workaround: Be creative.
|
\z | Purpose: match the end of a string
Status: Different
Explanation: matching the end of a string is not explicitly implemented. \z{nnn}
tells regldg that you are specifying an ASCII character in decimal format. See
individual characters for details.
Workaround: Put the character(s) you want at the end of the regex.
|
(?#comments) (?imsx-imsx) (?:pattern) (?imsx-imsx:pattern) | Purpose: write comments, set inline modifiers, and group without capturing
Status: not implemented
Workaround: Don't make comments, or make them your own format with real text.
|
(?=pattern) (?!pattern) (?<=pattern) (?<!pattern) | Purpose: zero-width look-ahead and -behind assertions
Status: not implemented
Workaround: Weed out unwanted results using grep, or sort a file of regldg's output.
|
(?{ code }) (??{ code }) (?>pattern) (?(cond)y-pattern|n-pattern) (?(cond)y-pattern) | Purpose: various experimental features
Status: not implemented
Workaround: Probably not necessary. If you need them, you might want to look at extending
regldg, or making your own regldg.
|
|
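Two of the workarounds in the table can be tried out without regldg. The sketch below uses Python's re module as a stand-in and does not call regldg. Python's "." excludes newline by default, as perl's does, and the re.DOTALL flag is assumed here to mimic the regldg behavior described in the table. The helper names matched_chars and keep_matching, and the word list sample_output, are hypothetical and exist only for this illustration.

```python
import re

# The negated class from the "." row: any character except newline.
PERL_STYLE_DOT = re.compile(r"[^\n]")
# Assumption: "." with DOTALL stands in for regldg's ".", which also
# matches newline according to the table.
REGLDG_STYLE_DOT = re.compile(r".", re.DOTALL)


def matched_chars(pattern, candidates):
    """Return the single characters that the pattern matches in full."""
    return [c for c in candidates if pattern.fullmatch(c)]


def keep_matching(words, filter_regex):
    """Post-filter a list of generated words, much as grep would."""
    check = re.compile(filter_regex)
    return [w for w in words if check.search(w)]


if __name__ == "__main__":
    chars = ["a", " ", "\t", "\n"]
    print(matched_chars(PERL_STYLE_DOT, chars))    # newline is left out
    print(matched_chars(REGLDG_STYLE_DOT, chars))  # newline is included

    # Hypothetical words, standing in for a file of regldg output.
    sample_output = ["ab", "ac", "bb", "bc"]
    # The negative look-ahead is applied afterwards, not inside regldg.
    print(keep_matching(sample_output, r"^a(?!b)"))
```

The second half mirrors the look-ahead workaround: generate with a simpler regex first, then weed out unwanted words with a tool that does support assertions.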