Regular Expression Syntax
Below is a quick reference for the most common regular expression constructs supported by IxoraRMS. The full supported syntax is that of the java.util.regex.Pattern class, as described in the Java documentation.
Characters
- x The character x 
- \\ The backslash character 
- \0n The character with octal value 0n (0 <= n <= 7) 
- \0nn The character with octal value 0nn (0 <= n <= 7) 
- \0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) 
- \xhh The character with hexadecimal value 0xhh 
- \uhhhh The character with hexadecimal value 0xhhhh 
- \t The tab character ('\u0009') 
- \n The newline (line feed) character ('\u000A') 
- \r The carriage-return character ('\u000D') 
- \f The form-feed character ('\u000C') 
- \a The alert (bell) character ('\u0007') 
- \e The escape character ('\u001B') 
- \cx The control character corresponding to x 
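As a rough illustration of the escapes above, the following sketch checks a few of them with java.util.regex.Pattern. The class name EscapeDemo and the helper matches are made up for this example. The doubled backslashes are required by Java string literals and are not part of the regular expression itself.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class EscapeDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\x41", "A"));     // hexadecimal value 0x41
        System.out.println(matches("\\0101", "A"));    // octal value 0101
        System.out.println(matches("a\\tb", "a\tb"));  // tab character
        System.out.println(matches("\\\\", "\\"));     // a single backslash
        System.out.println(matches("\\cA", "\u0001")); // control character Ctrl-A
    }
}
```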
Character classes
- [abc] a, b, or c (simple class) 
- [^abc] Any character except a, b, or c (negation) 
- [a-zA-Z] a through z or A through Z, inclusive (range) 
- [a-d[m-p]] a through d, or m through p: [a-dm-p] (union) 
- [a-z&&[def]] d, e, or f (intersection) 
- [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction) 
- [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction) 
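The union, intersection, and subtraction forms listed above can be tried out with a small sketch such as the following. CharClassDemo and its matches helper are illustrative names only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class CharClassDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[a-d[m-p]]", "n"));    // union: n is in m-p
        System.out.println(matches("[a-d[m-p]]", "e"));    // e is in neither range
        System.out.println(matches("[a-z&&[def]]", "e"));  // intersection keeps d, e, f
        System.out.println(matches("[a-z&&[^bc]]", "b"));  // subtraction removes b and c
        System.out.println(matches("[a-z&&[^m-p]]", "q")); // q is outside m-p
    }
}
```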
Predefined character classes
- . Any character (may or may not match line terminators) 
- \d A digit: [0-9] 
- \D A non-digit: [^0-9] 
- \s A whitespace character: [ \t\n\x0B\f\r] 
- \S A non-whitespace character: [^\s] 
- \w A word character: [a-zA-Z_0-9] 
- \W A non-word character: [^\w] 
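A minimal sketch of the predefined classes in use, assuming plain Java string handling. The class and method names are invented for the example. It also shows that the dot matches a line terminator only when the DOTALL flag is set.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class PredefinedClassDemo {
    // Collapses every run of whitespace into a single space.
    static String squeeze(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Removes every character that is not a digit.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // True when "." matches a line feed under the given Pattern flags.
    static boolean dotMatchesNewline(int flags) {
        return Pattern.compile(".", flags).matcher("\n").matches();
    }

    public static void main(String[] args) {
        System.out.println(squeeze("cpu \t load\n 42"));       // cpu load 42
        System.out.println(digitsOnly("load: 42%"));           // 42
        System.out.println(dotMatchesNewline(0));              // false
        System.out.println(dotMatchesNewline(Pattern.DOTALL)); // true
    }
}
```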
POSIX character classes (US-ASCII only)
- \p{Lower} A lower-case alphabetic character: [a-z] 
- \p{Upper} An upper-case alphabetic character: [A-Z] 
- \p{ASCII} All ASCII: [\x00-\x7F] 
- \p{Alpha} An alphabetic character: [\p{Lower}\p{Upper}] 
- \p{Digit} A decimal digit: [0-9] 
- \p{Alnum} An alphanumeric character: [\p{Alpha}\p{Digit}] 
- \p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ 
- \p{Graph} A visible character: [\p{Alnum}\p{Punct}] 
- \p{Print} A printable character: [\p{Graph}\x20] 
- \p{Blank} A space or a tab: [ \t] 
- \p{Cntrl} A control character: [\x00-\x1F\x7F] 
- \p{XDigit} A hexadecimal digit: [0-9a-fA-F] 
- \p{Space} A whitespace character: [ \t\n\x0B\f\r] 
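The following sketch exercises a few of the POSIX classes. The names are hypothetical. The last check illustrates the US-ASCII restriction, assuming the default Pattern flags: an accented letter is not matched by \p{Alpha}.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class PosixClassDemo {
    // Removes every ASCII punctuation character.
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(stripPunct("Hello, World!"));         // Hello World
        System.out.println(matches("\\p{XDigit}+", "1aF"));      // true
        System.out.println(matches("\\p{Blank}", "\t"));         // true
        System.out.println(matches("\\p{Alpha}", "\u00e9"));     // false: not US-ASCII
    }
}
```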
Classes for Unicode blocks and categories
- \p{InGreek} A character in the Greek block (simple block) 
- \p{Lu} An uppercase letter (simple category) 
- \p{Sc} A currency symbol 
- \P{InGreek} Any character except one in the Greek block (negation) 
- [\p{L}&&[^\p{Lu}]]  Any letter except an uppercase letter (subtraction) 
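A short sketch of the Unicode block and category classes, with made-up class and helper names. The Greek letter is written as a Java Unicode escape (U+03B1, small alpha) so the source stays ASCII.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class UnicodeClassDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\p{InGreek}", "\u03b1"));      // Greek small alpha
        System.out.println(matches("\\P{InGreek}", "a"));           // Latin a is outside the block
        System.out.println(matches("\\p{Lu}", "A"));                // uppercase letter
        System.out.println(matches("\\p{Sc}", "$"));                // currency symbol
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "a"));   // letter, but not uppercase
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "A"));   // rejected: uppercase
    }
}
```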
Boundary matchers
- ^ The beginning of a line 
- $ The end of a line 
- \b A word boundary 
- \B A non-word boundary 
- \A The beginning of the input 
- \G The end of the previous match 
- \Z The end of the input but for the final terminator, if any 
- \z The end of the input 
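The boundary matchers can be sketched as follows, using illustrative names. In java.util.regex, ^ and $ match at every line only when the MULTILINE flag is set; without it they match at the beginning and end of the whole input. Whether IxoraRMS sets that flag is not stated here, so the example shows both cases.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class BoundaryDemo {
    // Counts how many times the pattern is found in the input.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // True when the pattern is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // \b keeps "cat" inside "concat" from being replaced.
        System.out.println("cat concat cat".replaceAll("\\bcat\\b", "dog"));
        // ^ matches once by default, once per line with MULTILINE.
        System.out.println(count(Pattern.compile("^\\w+"), "one\ntwo"));
        System.out.println(count(Pattern.compile("^\\w+", Pattern.MULTILINE), "one\ntwo"));
        // \Z tolerates a final line terminator, \z does not.
        System.out.println(found("end\\Z", "end\n"));
        System.out.println(found("end\\z", "end\n"));
    }
}
```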
Greedy quantifiers
- X? X, once or not at all 
- X* X, zero or more times 
- X+ X, one or more times 
- X{n} X, exactly n times 
- X{n,} X, at least n times 
- X{n,m} X, at least n but not more than m times 
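A minimal sketch of the quantifiers above, with hypothetical names. The last call shows what "greedy" means in practice: the quantifier takes as much input as it can while still allowing the overall match to succeed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class QuantifierDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Returns the first substring matched by the pattern, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matches("colou?r", "color"));     // u is optional
        System.out.println(matches("colou?r", "colour"));
        System.out.println(matches("\\d{2,4}", "123"));      // between 2 and 4 digits
        System.out.println(matches("\\d{2,4}", "12345"));    // too many digits
        System.out.println(matches("a{2,}", "aaaa"));        // at least two
        System.out.println(firstMatch("a.*b", "aXbXb"));     // greedy: aXbXb
    }
}
```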
Capturing Groups
Capturing groups are created by enclosing parts of the regular expression in parentheses (). The string matched by a capturing group can be referenced later with $n tags, where $1 .. $n refer to capturing groups 1 to n.
The <format> attributes in IxoraRMS accept the standard Java syntax for number and date formatting (DecimalFormat and SimpleDateFormat). For full details, refer to the Java documentation.
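The sketch below shows how $n references behave in plain Java, which is assumed (not confirmed) to mirror how IxoraRMS resolves them. GroupDemo and its methods are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class GroupDemo {
    // Rewrites yyyy-MM-dd as dd/MM/yyyy by reordering the three groups.
    static String reorder(String date) {
        return date.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // Returns the text captured by the given group of a name=value pair,
    // or null when the input does not match.
    static String part(String input, int group) {
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(input);
        return m.matches() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        System.out.println(reorder("2024-03-15")); // 15/03/2024
        System.out.println(part("cpu=42", 1));     // cpu
        System.out.println(part("cpu=42", 2));     // 42
    }
}
```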
Formatting tokens for numbers (symbol, location, meaning)
- 0  Number  Digit  
- #  Number  Digit, zero shows as absent  
- .  Number  Decimal separator or monetary decimal separator  
- -  Number  Minus sign  
- ,  Number  Grouping separator  
- E  Number  Separates mantissa and exponent in scientific notation. Need not be quoted in prefix or suffix.  
- ;  Subpattern boundary  Separates positive and negative subpatterns  
- %  Prefix or suffix  Multiply by 100 and show as percentage  
- \u2030 (‰)  Prefix or suffix  Multiply by 1000 and show as per mille  
- \u00A4 (¤)  Prefix or suffix  Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator.  
- '  Prefix or suffix  Used to quote special characters in a prefix or suffix, for example, "'#'#" formats 123 to "#123". To create a single quote itself, use two in a row: "# o''clock".  
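A few of the number tokens can be tried with java.text.DecimalFormat as sketched below. The class and helper names are invented, and the US locale is fixed explicitly so the separators do not depend on the machine's settings; how IxoraRMS picks the locale is not covered here.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class, not part of IxoraRMS.
class NumberFormatDemo {
    // Formats the value with the given pattern, using US separators.
    static String fmt(String pattern, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(fmt("#,##0.00", 1234567.891)); // 1,234,567.89
        System.out.println(fmt("0.0#", 3.0));             // 3.0
        System.out.println(fmt("0%", 0.256));             // 26%
        System.out.println(fmt("0.00;(0.00)", -5.5));     // (5.50)
        System.out.println(fmt("0.###E0", 1234));         // 1.234E3
        System.out.println(fmt("'#'#", 123));             // #123
    }
}
```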
Formatting tokens for dates
- G  Era designator
- y  Year
- M  Month in year
- w  Week in year
- W  Week in month
- D  Day in year
- d  Day in month
- F  Day of week in month
- E  Day in week
- a  Am/pm marker
- H  Hour in day (0-23)
- k  Hour in day (1-24)
- K  Hour in am/pm (0-11)
- h  Hour in am/pm (1-12)
- m  Minute in hour
- s  Second in minute
- S  Millisecond
- z  Time zone (General)
- Z  Time zone (RFC 822)
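The date tokens above can be combined into patterns as in the following sketch, which uses java.text.SimpleDateFormat. The names are hypothetical, and the locale and time zone are pinned to US and UTC so the output is reproducible; these choices are assumptions made for the example only.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of IxoraRMS.
class DateFormatDemo {
    // Formats the date with the given pattern in the US locale and UTC.
    static String fmt(String pattern, Date date) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        // 1,000,000,000 seconds after the epoch: 2001-09-09 01:46:40 UTC.
        Date sample = new Date(1000000000000L);
        System.out.println(fmt("yyyy-MM-dd HH:mm:ss.SSS", sample)); // 2001-09-09 01:46:40.000
        System.out.println(fmt("EEE, d MMM yyyy", sample));         // Sun, 9 Sep 2001
        System.out.println(fmt("D", sample));                       // 252
        System.out.println(fmt("Z", sample));                       // +0000

        // Midnight shows how the four hour tokens differ.
        Date midnight = new Date(0L);
        System.out.println(fmt("H k K h", midnight));               // 0 24 0 12
    }
}
```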