Appendix

Regular Expression Syntax

Below is a quick reference for the most common regular expression tags supported by IxoraRMS. The full supported syntax is that of the Pattern class, as described in Java documentation.

Characters

x The character x
\\ The backslash character
\0n The character with octal value 0n (0 <= n <= 7)
\0nn The character with octal value 0nn (0 <= n <= 7)
\0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh The character with hexadecimal value 0xhh
\uhhhh The character with hexadecimal value 0xhhhh
\t The tab character ('\u0009')
\n The newline (line feed) character ('\u000A')
\r The carriage-return character ('\u000D')
\f The form-feed character ('\u000C')
\a The alert (bell) character ('\u0007')
\e The escape character ('\u001B')
\cx The control character corresponding to x

Character classes

[abc] a, b, or c (simple class)
[^abc] Any character except a, b, or c (negation)
[a-zA-Z] a through z or A through Z, inclusive (range)
[a-d[m-p]] a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]] d, e, or f (intersection)
[a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]] a through z, and not m through p: [a-lq-z](subtraction)

Predefined character classes

. Any character (may or may not match line terminators)
\d A digit: [0-9]
\D A non-digit: [^0-9]
\s A whitespace character: [ \t\n\x0B\f\r]
\S A non-whitespace character: [^\s]
\w A word character: [a-zA-Z_0-9]
\W A non-word character: [^\w]

POSIX character classes (US-ASCII only)

\p{Lower} A lower-case alphabetic character: [a-z]
\p{Upper} An upper-case alphabetic character:[A-Z]
\p{ASCII} All ASCII:[\x00-\x7F]
\p{Alpha} An alphabetic character:[\p{Lower}\p{Upper}]
\p{Digit} A decimal digit: [0-9]
\p{Alnum} An alphanumeric character:[\p{Alpha}\p{Digit}]
\p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph} A visible character: [\p{Alnum}\p{Punct}]
\p{Print} A printable character: [\p{Graph}]
\p{Blank} A space or a tab: [ \t]
\p{Cntrl} A control character: [\x00-\x1F\x7F]
\p{XDigit} A hexadecimal digit: [0-9a-fA-F]
\p{Space} A whitespace character: [ \t\n\x0B\f\r]

Classes for Unicode blocks and categories

\p{InGreek} A character in the Greek block (simple block)
\p{Lu} An uppercase letter (simple category)
\p{Sc} A currency symbol
\P{InGreek} Any character except one in the Greek block (negation)
[\p{L}&&[^\p{Lu}]] Any letter except an uppercase letter (subtraction)

Boundary matchers

^ The beginning of a line
$ The end of a line
\b A word boundary
\B A non-word boundary
\A The beginning of the input
\G The end of the previous match
\Z The end of the input but for the final terminator, if any
\z The end of the input

Greedy quantifiers

X? X, once or not at all
X* X, zero or more times
X+ X, one or more times
X{n} X, exactly n times
X{n,} X, at least n times
X{n,m} X, at least n but not more than m times

Capturing Groups

Capturing groups are created by enclosing parts of the regular expresion in brackets (). The string matched by a capturing group is accessible later on with the use of $n tags, where $1 .. $n represent capturing groups 1 to n.

Formatting Syntax

The <format> attributes in IxoraRMS accept the standard Java syntax for number and dates formatting (DecimalFormat and SimpleDateFormat). For full information please refer to Java documentation

Formatting tokens for numbers

0 Number Digit
# Number Digit, zero shows as absent
. Number Decimal separator or monetary decimal separator
- Number Minus sign
, Number Grouping separator
E Number Separates mantissa and exponent in scientific notation. Need not be quoted in prefix or suffix.
; Subpattern boundary Separates positive and negative subpatterns
% Prefix or suffix Multiply by 100 and show as percentage
\u2030 Prefix or suffix Multiply by 1000 and show as per mille
\u00A4 Prefix or suffix Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator.
' Prefix or suffix Used to quote special characters in a prefix or suffix, for example, "'#'#" formats 123 to "#123". To create a single quote itself, use two in a row: "# o''clock".

Formatting tokens for dates:

G Era designator
y Year
M Month in year
w Week in year
W Week in month
D Day in year
d Day in month
F Day of week in month
E Day in week
a Am/pm marker
H Hour in day (0-23)
k Hour in day (1-24)
K Hour in am/pm (0-11)
h Hour in am/pm (1-12)
m Minute in hour
s Second in minute
S Millisecond
z Time zone (General)
Z Time zone (RFC 822)

Page updated

Google Sites

Report abuse