Appendix

Regular Expression Syntax

Below is a quick reference for the most common regular expression constructs supported by IxoraRMS. The full supported syntax is that of the java.util.regex.Pattern class, as described in the Java documentation.

Characters

  • x The character x
  • \\ The backslash character
  • \0n The character with octal value 0n (0 <= n <= 7)
  • \0nn The character with octal value 0nn (0 <= n <= 7)
  • \0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
  • \xhh The character with hexadecimal value 0xhh
  • \uhhhh The character with hexadecimal value 0xhhhh
  • \t The tab character ('\u0009')
  • \n The newline (line feed) character ('\u000A')
  • \r The carriage-return character ('\u000D')
  • \f The form-feed character ('\u000C')
  • \a The alert (bell) character ('\u0007')
  • \e The escape character ('\u001B')
  • \cx The control character corresponding to x

Character classes

  • [abc] a, b, or c (simple class)
  • [^abc] Any character except a, b, or c (negation)
  • [a-zA-Z] a through z or A through Z, inclusive (range)
  • [a-d[m-p]] a through d, or m through p: [a-dm-p] (union)
  • [a-z&&[def]] d, e, or f (intersection)
  • [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
  • [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)
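The union, intersection, and subtraction forms above can be checked directly against java.util.regex.Pattern. The following is a minimal sketch; the class name CharClassDemo and the helper matches are made up for this illustration and are not part of IxoraRMS.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Returns true when the entire input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[a-d[m-p]]", "n"));    // union: true, n is in m-p
        System.out.println(matches("[a-z&&[def]]", "e"));  // intersection: true
        System.out.println(matches("[a-z&&[def]]", "g"));  // intersection: false
        System.out.println(matches("[a-z&&[^bc]]", "b"));  // subtraction: false, b is excluded
        System.out.println(matches("[a-z&&[^m-p]]", "q")); // subtraction: true
    }
}
```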

Predefined character classes

  • . Any character (may or may not match line terminators)
  • \d A digit: [0-9]
  • \D A non-digit: [^0-9]
  • \s A whitespace character: [ \t\n\x0B\f\r]
  • \S A non-whitespace character: [^\s]
  • \w A word character: [a-zA-Z_0-9]
  • \W A non-word character: [^\w]

POSIX character classes (US-ASCII only)

  • \p{Lower} A lower-case alphabetic character: [a-z]
  • \p{Upper} An upper-case alphabetic character: [A-Z]
  • \p{ASCII} All ASCII: [\x00-\x7F]
  • \p{Alpha} An alphabetic character: [\p{Lower}\p{Upper}]
  • \p{Digit} A decimal digit: [0-9]
  • \p{Alnum} An alphanumeric character: [\p{Alpha}\p{Digit}]
  • \p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
  • \p{Graph} A visible character: [\p{Alnum}\p{Punct}]
  • \p{Print} A printable character: [\p{Graph}\x20]
  • \p{Blank} A space or a tab: [ \t]
  • \p{Cntrl} A control character: [\x00-\x1F\x7F]
  • \p{XDigit} A hexadecimal digit: [0-9a-fA-F]
  • \p{Space} A whitespace character: [ \t\n\x0B\f\r]

Classes for Unicode blocks and categories

  • \p{InGreek} A character in the Greek block (simple block)
  • \p{Lu} An uppercase letter (simple category)
  • \p{Sc} A currency symbol
  • \P{InGreek} Any character except one in the Greek block (negation)
  • [\p{L}&&[^\p{Lu}]] Any letter except an uppercase letter (subtraction)
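As a quick check of the block and category classes above, the sketch below tests single characters against them. The class name UnicodeClassDemo and its helper are hypothetical and exist only for this example.

```java
import java.util.regex.Pattern;

class UnicodeClassDemo {
    // Returns true when the entire input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String alpha = "\u03B1"; // Greek small letter alpha
        System.out.println(matches("\\p{InGreek}", alpha));       // true
        System.out.println(matches("\\P{InGreek}", alpha));       // false (negation)
        System.out.println(matches("\\p{Lu}", "A"));              // true
        System.out.println(matches("\\p{Sc}", "$"));              // true, the dollar sign is a currency symbol
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "a")); // true
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "A")); // false, uppercase letters are subtracted
    }
}
```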

Boundary matchers

  • ^ The beginning of a line
  • $ The end of a line
  • \b A word boundary
  • \B A non-word boundary
  • \A The beginning of the input
  • \G The end of the previous match
  • \Z The end of the input but for the final terminator, if any
  • \z The end of the input
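The boundary matchers can be illustrated with a short sketch. Note that in plain Java, ^ and $ match at line boundaries only when the pattern is compiled with Pattern.MULTILINE; by default they match at the beginning and end of the whole input. Whether IxoraRMS enables that flag is not stated here, so the example shows both cases. BoundaryDemo and findAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Collects every substring of the input matched by the pattern.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "cat concat\ncat";
        // The word boundaries keep "cat" from matching inside "concat".
        System.out.println(findAll(Pattern.compile("\\bcat\\b"), text));                     // [cat, cat]
        // Default mode: ^ matches only at the beginning of the input.
        System.out.println(findAll(Pattern.compile("^cat"), text).size());                    // 1
        // Multiline mode: ^ also matches after a line terminator.
        System.out.println(findAll(Pattern.compile("^cat", Pattern.MULTILINE), text).size()); // 2
    }
}
```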

Greedy quantifiers

  • X? X, once or not at all
  • X* X, zero or more times
  • X+ X, one or more times
  • X{n} X, exactly n times
  • X{n,} X, at least n times
  • X{n,m} X, at least n but not more than m times
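The quantifiers above behave as follows in a small test. This is a sketch only; QuantifierDemo and its helpers are hypothetical names. The last call shows what "greedy" means in practice: the quantifier consumes as much input as it can while still allowing the overall match to succeed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns true when the entire input matches the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Returns the first substring matched by the expression, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matches("colou?r", "color"));  // true: u is optional
        System.out.println(matches("ab*", "a"));          // true: zero b's allowed
        System.out.println(matches("ab+", "a"));          // false: at least one b required
        System.out.println(matches("\\d{3}", "1234"));    // false: exactly three digits
        System.out.println(matches("\\d{2,}", "1234"));   // true: two or more digits
        System.out.println(matches("\\d{2,3}", "1234"));  // false: at most three digits
        System.out.println(firstMatch("<.+>", "<a><b>")); // <a><b>: greedy, spans both tags
    }
}
```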

Capturing Groups

Capturing groups are created by enclosing parts of the regular expression in parentheses (). Groups are numbered by counting their opening parentheses from left to right. The string matched by a capturing group can be referenced later with $n tags, where $1 .. $n represent capturing groups 1 to n.
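To illustrate group numbering and $n references, here is a minimal sketch using Java's own regex API, whose replacement strings follow the same $n convention. How IxoraRMS resolves $n internally is not shown; the class CaptureGroupDemo, its methods, and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {
    // Returns the text captured by group n of the first match, or null if nothing matches.
    static String group(String regex, String input, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(n) : null;
    }

    // Rewrites "name=value" as "value (name)" using $1 and $2 in the replacement.
    static String swap(String input) {
        return input.replaceAll("(\\w+)=(\\d+)", "$2 ($1)");
    }

    public static void main(String[] args) {
        System.out.println(group("host-(\\d+)\\.(\\w+)", "host-17.prod", 1)); // 17
        System.out.println(group("host-(\\d+)\\.(\\w+)", "host-17.prod", 2)); // prod
        System.out.println(swap("cpu=42"));                                   // 42 (cpu)
    }
}
```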

Formatting Syntax

The <format> attributes in IxoraRMS accept the standard Java syntax for number and date formatting (DecimalFormat and SimpleDateFormat). For full details, refer to the Java documentation.

Formatting tokens for numbers

Each entry lists the symbol, the part of the pattern where it may appear (in parentheses), and its meaning.

  • 0 (number) Digit
  • # (number) Digit; zero shows as absent
  • . (number) Decimal separator or monetary decimal separator
  • - (number) Minus sign
  • , (number) Grouping separator
  • E (number) Separates mantissa and exponent in scientific notation. Need not be quoted in prefix or suffix.
  • ; (subpattern boundary) Separates positive and negative subpatterns
  • % (prefix or suffix) Multiply by 100 and show as percentage
  • \u2030 (prefix or suffix) Multiply by 1000 and show as per mille
  • \u00A4 (prefix or suffix) Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator.
  • ' (prefix or suffix) Used to quote special characters in a prefix or suffix; for example, "'#'#" formats 123 to "#123". To create a single quote itself, use two in a row: "# o''clock".
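The number tokens can be tried out with java.text.DecimalFormat directly. The sketch below assumes US formatting symbols so that the output does not depend on the machine's default locale; which locale IxoraRMS applies to a <format> attribute is not specified here. NumberFormatDemo and its helper are hypothetical names.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class NumberFormatDemo {
    // Formats a value with the given pattern, using US separators (assumption for this demo).
    static String format(String pattern, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format("#,##0.00", 1234567.891));        // 1,234,567.89
        System.out.println(format("0.0%", 0.256));                  // 25.6%
        System.out.println(format("#,##0.00;(#,##0.00)", -1234.5)); // (1,234.50)
        System.out.println(format("0.00E0", 12300));                // 1.23E4
        System.out.println(format("'#'#", 123));                    // #123
    }
}
```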

Formatting tokens for dates

  • G Era designator
  • y Year
  • M Month in year
  • w Week in year
  • W Week in month
  • D Day in year
  • d Day in month
  • F Day of week in month
  • E Day in week
  • a Am/pm marker
  • H Hour in day (0-23)
  • k Hour in day (1-24)
  • K Hour in am/pm (0-11)
  • h Hour in am/pm (1-12)
  • m Minute in hour
  • s Second in minute
  • S Millisecond
  • z Time zone (General)
  • Z Time zone (RFC 822)
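The date tokens combine into patterns as in java.text.SimpleDateFormat. The sketch below pins the locale to US and the time zone to UTC so the results are reproducible; both settings, the class name DateFormatDemo, and the sample timestamp are assumptions made for this illustration, not IxoraRMS defaults.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFormatDemo {
    // Formats an epoch timestamp (milliseconds) with the given pattern in UTC, US locale.
    static String format(String pattern, long epochMillis) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long t = 1_000_000_000_000L; // 2001-09-09 01:46:40 UTC
        System.out.println(format("yyyy-MM-dd HH:mm:ss.SSS", t)); // 2001-09-09 01:46:40.000
        System.out.println(format("h:mm a", t));                  // 1:46 AM
        System.out.println(format("D", t));                       // 252 (day in year)
        System.out.println(format("Z", t));                       // +0000
    }
}
```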