Unicode support


The Rexx Parser includes optional support for TUTOR-flavored Unicode.

API changes

New element categories

  • The Element API defines five new element categories, .EL.BYTES_STRING, .EL.CODEPOINTS_STRING, .EL.GRAPHEMES_STRING, .EL.TEXT_STRING and .EL.UNICODE_STRING, for the new Unicode strings. These are defined in the Globals.cls package, and they will be generated by the parser only when Unicode support has been requested.

In that case, the Parser will examine the contents of string literals and raise the appropriate syntax errors when the strings do not follow the Unicode or TUTOR conventions. P-, G- and T-strings will be checked for UTF-8 well-formedness. In the case of U-strings, codepoints will be range-checked, names will be checked against UnicodeData.txt and NameAliases.txt, and labels will be checked against their respective definitions.
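The kinds of checks described above can be illustrated with a short sketch. This is not the Parser's own code (the Parser is written in Rexx); it is a Python stand-in, and the function names is_well_formed_utf8, codepoint_in_range and is_known_name are hypothetical. It assumes that the range check means "no greater than 10FFFF hexadecimal", and it uses Python's bundled Unicode database in place of reading UnicodeData.txt and NameAliases.txt directly, so the set of known names depends on the Unicode version shipped with the Python interpreter.

```python
import unicodedata

# Assumption: "range checked" means the value does not exceed the
# highest codepoint defined by Unicode.
MAX_CODEPOINT = 0x10FFFF


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True when data is a well-formed UTF-8 byte sequence."""
    try:
        # Strict decoding rejects truncated sequences, overlong forms
        # and encoded surrogates.
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def codepoint_in_range(hex_digits: str) -> bool:
    """Return True when hex_digits denotes a value in 0..10FFFF."""
    try:
        value = int(hex_digits, 16)
    except ValueError:
        return False  # not a hexadecimal number at all
    return 0 <= value <= MAX_CODEPOINT


def is_known_name(name: str) -> bool:
    """Return True when name is a character name or alias known to Python."""
    try:
        # Stand-in for a lookup in UnicodeData.txt and NameAliases.txt.
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    print(is_well_formed_utf8(b"\xc3\xa9"))       # two-byte sequence for U+00E9
    print(is_well_formed_utf8(b"\xc3"))           # truncated sequence
    print(codepoint_in_range("110000"))           # one past the last codepoint
    print(is_known_name("LATIN SMALL LETTER A"))
```

A string literal that fails any of these checks would correspond to the cases where the Parser raises a syntax error. The exact error codes and messages are not covered by this sketch.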

New HTML classes

  • The HTMLClasses routine assigns new HTML classes to the newly defined string classes.

New Parser options

  • The Rexx.Parser class accepts a new boolean Unicode option, which activates Unicode support.

Updated programs and utilities

  • The FencedCode routine accepts TUTOR and Unicode as new attributes in Rexx fenced code blocks. They both activate Unicode support.
  • The Highlighter class accepts TUTOR and Unicode options. They both activate Unicode support.
  • Both the highlight.rex and elements.rex utilities accept new -u, --tutor and --unicode options, which enable Unicode support.