Current version: 1.0.0

Known regular expression differences between perl and regldg


Here is an "as-comprehensive-as-possible" list of the regular expression features in perl which are either different or missing in regldg. I can't make a fully comprehensive list because I cannot float on carpets in front of the regular expression altar in a guru trance. If you find other differences between the regular expressions of perl and those in regldg, please let me know!


Perl Regex Grammar
Purpose, Status, Explanation, and Workaround
^
Purpose: matches the start of a line
Status: not implemented
Workaround: leave no character classes or meta-character classes at the beginning of your regex.
$
Purpose: matches the end of a line
Status: not implemented
Workaround: leave no character classes or meta-character classes at the end of your regex.
.
Purpose: matches any character (excluding newline)
Status: Different.
Explanation: This meta-character is implemented, but it matches any character (including newline).
Workaround: If perl's functionality is desired, use the negated character class [^\n].
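To make the difference concrete, here is a minimal sketch using Python's re module as a stand-in. Python's default dot behaves like perl's (it skips newline), and its DOTALL flag is used here only to imitate the regldg dot described above. regldg itself is not invoked, so treat this as an illustration rather than a statement about regldg's internals.

```python
import re

text = "a\nb"

# Perl-like dot: refuses the newline between "a" and "b".
perl_like = re.fullmatch(r"a.b", text)

# Dot that also accepts newline, imitating the regldg behavior described above.
regldg_like = re.fullmatch(r"a.b", text, flags=re.DOTALL)

# The workaround class [^\n] refuses the newline again, like perl's dot.
workaround = re.fullmatch(r"a[^\n]b", text, flags=re.DOTALL)

print(perl_like is None, regldg_like is not None, workaround is None)
```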
\nnn
Purpose: the ASCII character nnn (in octal), or possibly a backreference
Status: Different.
Explanation: As written, regldg assumes this is a backreference to grouping number nnn. (Groupings are counted from their opening parenthesis.) regldg also allows backreferences to be specified as \!{nnn} and \!nnn. Octal characters are specified using the syntax \o{nnn}. This difference avoids the ambiguity of \10 in perl: is it a backreference, or is it an octal character?
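The ambiguity is easy to reproduce in other engines that share the \nnn notation. The sketch below uses Python's re module, whose disambiguation rules are its own and not necessarily perl's. The helper regldg_octal is hypothetical; it only builds the \o{nnn} spelling mentioned above, and zero-padding to three digits is an assumption.

```python
import re

# A short number that does not start with 0: a backreference to group 1.
backref = re.fullmatch(r"(a)\1", "aa")

# Three octal digits: the character with octal value 101, which is "A".
octal = re.fullmatch(r"\101", "A")

def regldg_octal(ch):
    """Hypothetical helper: spell a character in the explicit \\o{nnn} form.

    Padding to three digits is an assumption, not something regldg is
    known to require.
    """
    return "\\o{%03o}" % ord(ch)

print(backref is not None, octal is not None, regldg_octal("A"))
```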
Greedy and non-greedy quantifiers
Purpose: Preferring certain matches over others
Status: not implemented.
Explanation: given variable-length quantifiers, regldg will vary each quantifier from its smallest to its largest possible value, working through the quantifiers from left to right. Basically, non-greedy quantifiers are not implemented, and the "greedy" quantifiers have the default behavior.
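A generator has no match to prefer, so every repetition count is produced. The sketch below is a rough model of that idea, not regldg's actual code. The function expand and its (literal, min, max) input format are invented for illustration, and the choice to let the leftmost quantifier vary fastest is an assumption about the ordering.

```python
from itertools import product

def expand(parts):
    """Enumerate every word for a list of (literal, min_count, max_count).

    Assumption: the leftmost quantifier varies fastest. itertools.product
    varies its last argument fastest, so the ranges are reversed going in
    and each tuple of counts is reversed coming out.
    """
    ranges = [range(lo, hi + 1) for _, lo, hi in parts]
    words = []
    for counts in product(*reversed(ranges)):
        counts = counts[::-1]
        words.append("".join(lit * n for (lit, _, _), n in zip(parts, counts)))
    return words

# Models the pattern a{1,2}b{0,1}
words = expand([("a", 1, 2), ("b", 0, 1)])
print(words)
```

Whatever the real ordering is, the point is that both the shortest and the longest repetitions appear in the output, so the greedy versus non-greedy distinction has nothing to select between.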
\N{name}
Purpose: specify characters by name
Status: not implemented
Workaround: use the ASCII character value in octal, decimal, or hexadecimal format.
\cX
Purpose: specify control-X character
Status: not implemented
Workaround: use the ASCII character value in octal, decimal, or hexadecimal format.
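If you don't have an ASCII table at hand, the value of a control character can be computed: control-X is the code of the uppercase letter with bit 6 flipped. The sketch below does that and prints the octal (\o{nnn}) and decimal (\z{nnn}) spellings mentioned elsewhere on this page. Both function names are hypothetical, the three-digit zero-padding is an assumption, and the hexadecimal form is left out because its syntax is not given here.

```python
def control_code(letter):
    """ASCII code of control-<letter>; for example control-A is 1."""
    return ord(letter.upper()) ^ 64

def regldg_forms(code):
    """Hypothetical helper: octal and decimal spellings of one character.

    Zero-padding to three digits is an assumption.
    """
    return {
        "octal": "\\o{%03o}" % code,
        "decimal": "\\z{%03d}" % code,
    }

code = control_code("X")
forms = regldg_forms(code)
print(code, forms["octal"], forms["decimal"])
```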
\l (lowercase EL), \L, \Q, \E
Purpose: specify lowercase characters (\l, \L) and escape meta-characters (\Q)
Status: not implemented
Workaround: Use the lowercase values for \l and \L, and escape meta-characters instead of using \Q.
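If the text comes from somewhere else, lowercasing and escaping can be done before it is pasted into the regex. This is a minimal sketch with a hypothetical helper, literal_piece. It relies on Python's re.escape, which follows Python's idea of which characters are meta-characters; regldg's set may differ, so check the result by eye.

```python
import re

def literal_piece(text, lowercase=False):
    """Hypothetical helper: prepare literal text for use inside a regex.

    lowercase=True stands in for \\L...\\E; re.escape stands in for \\Q...\\E.
    """
    if lowercase:
        text = text.lower()
    return re.escape(text)

piece = literal_piece("A.B*C", lowercase=True)
print(piece)
```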
\u, \U, \E
Purpose: specify uppercase characters
Status: Different
Explanation: \u{nnn} and \U{nnn} represent universe character sets. See the documentation on universe character sets for more information.
Workaround: Use the uppercase characters themselves instead of \u and \U.
\pP, \PP, \X, \C
Purpose: match named properties, extended Unicode "combining character sequences," and single C chars
Status: not implemented
Workaround: use the ASCII character values in octal, decimal, or hexadecimal format, or use a pre-defined or custom character class.
[:class:]
Purpose: POSIX named character classes
Status: not implemented
Workaround: use the standard character and meta-character classes, or make your own.
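"Making your own" can be automated for the common ASCII classes. The sketch below rewrites POSIX names inside a bracket expression into explicit ranges. The table and the function expand_posix are hypothetical, only a few classes are covered, and it assumes that regldg accepts ranges such as a-z inside a character class.

```python
import re

# ASCII-only equivalents for a few POSIX classes (not a complete list).
POSIX_TO_BRACKET = {
    "alpha": "a-zA-Z",
    "digit": "0-9",
    "alnum": "a-zA-Z0-9",
    "upper": "A-Z",
    "lower": "a-z",
    "xdigit": "0-9A-Fa-f",
}

def expand_posix(pattern):
    """Replace each [:name:] with its explicit ranges.

    Raises KeyError for a class name missing from the table above.
    """
    return re.sub(r"\[:(\w+):\]",
                  lambda m: POSIX_TO_BRACKET[m.group(1)],
                  pattern)

expanded = expand_posix("[[:alpha:]][[:digit:]]")
print(expanded)
```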
\b, \B, \A, \Z, \G
Purpose: match word boundaries (\b), string boundaries (\A and \Z), or end-of-previous-match positions (\G)
Status: not implemented
Workaround: Be creative.
\z
Purpose: match the end of a string
Status: Different
Explanation: matching the end of a string is not explicitly implemented. \z{nnn} tells regldg that you are specifying an ASCII character in decimal format. See individual characters for details.
Workaround: Put the character(s) you want at the end of the regex.
(?#comments)
(?imsx-imsx)
(?:pattern)
(?imsx-imsx:pattern)
Purpose: write comments, set inline modifiers, and group without capturing
Status: not implemented
Workaround: Don't make comments, or make them your own format with real text.
(?=pattern)
(?!pattern)
(?<=pattern)
(?<!pattern)
Purpose: zero-width look-ahead and -behind assertions
Status: not implemented
Workaround: Weed out unwanted results using grep, or sort a file of regldg's output.
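The same weeding-out can be done in a short script instead of grep. In the sketch below the word list is made up and stands in for a file of regldg output. The looser pattern is assumed to have been given to regldg, and the assertion is applied afterwards with Python's re module, which does support look-ahead.

```python
import re

# Hypothetical regldg output for a looser pattern without the assertion.
generated = ["cat", "car", "cab"]

# The look-ahead that regldg cannot take: "ca" not followed by "t".
wanted = re.compile(r"ca(?!t).")

kept = [word for word in generated if wanted.fullmatch(word)]
print(kept)
```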
(?{ code })
(??{ code })
(?>pattern)
(?(cond)y-pattern|n-pattern)
(?(cond)y-pattern)
Purpose: various experimental features
Status: not implemented
Workaround: Probably not necessary. If you need them, you might want to look at extending regldg, or making your own regldg.