Routine HTMLClasses



Argument

Options., a stem. The following tails are examined:

  • Options.assignment, for (extended) assignment operator characters and character sequences.
  • Options.constant, for strings or symbols taken as a constant.
  • Options.operator, for operator characters and operator character sequences.
  • Options.special, for special characters and special character sequences.
  • Options.classprefix, a prefix prepended to every HTML class. The default is "rx-".

The routine code assigns one or two HTML classes to every element category, and, in the case of symbols taken as a constant, to every element subcategory.

When two classes are assigned, the first one is generic (for instance, "spe" for special characters), and the second one identifies the corresponding element (for example, "comma", or "colon").

The value of the corresponding compound variables determines the class associated with every element category and subcategory, in the following way:

  • When the value is "group", only the generic class is assigned.
  • When the value is "detail", only the detailed class is assigned.
  • When the value is "full", two classes are assigned, the generic one and the detailed one (in this order).
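
The selection rule above can be sketched in a few lines of ooRexx. ClassesFor is a hypothetical helper written only for this illustration (it is not part of the package): it takes a mode, a generic class and a detailed class, and returns the class list that the mode selects. Prefixing is left out of this sketch.

```rexx
-- Illustration only: ClassesFor is a hypothetical helper, not package code.
Do mode Over .Array~of("group", "detail", "full")
  Say mode":" ClassesFor(mode, "spe", "comma")
End
Exit

ClassesFor: Procedure
  Use Strict Arg mode, generic, detailed
  Select Case mode
    When "group"  Then Return generic
    When "detail" Then Return detailed
    When "full"   Then Return generic detailed   -- generic first, then detailed
  End
```

Run as a standalone program, this should print "group: spe", "detail: comma" and "full: spe comma", one per line.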

Returns

A stem ("HTMLClass") mapping element categories and subcategories to HTML classes. The default HTML class is "rexx".
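
As a rough sketch of how such a stem behaves, the following standalone fragment builds a small stem by hand. The tails used here ("SHEBANG", "TAKEN", "METHOD.NAME") are made-up strings chosen for the illustration; the real routine uses the values of the .EL.* categories and of the subcategories as tails.

```rexx
-- Illustration only: hand-built stem with hypothetical tails.
HTMLClass   = .Stem~new
HTMLClass[] = "rexx"                          -- default for unmapped tails

HTMLClass["SHEBANG"]              = "rx-shb"
HTMLClass["TAKEN", "METHOD.NAME"] = "rx-const rx-method"

Say HTMLClass["SHEBANG"]                      -- rx-shb
Say HTMLClass["TAKEN", "METHOD.NAME"]         -- rx-const rx-method
Say HTMLClass["SOMETHING.ELSE"]               -- rexx (the default)
```

Any tail that has not been assigned explicitly falls back to the default value, which is why unmapped categories end up with the class "rexx".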

Program logic

Irrespective of whether we are producing output for ANSI terminals, HTML, or LaTeX, the Highlighter assigns one or more HTML class names to every element category and subcategory. This assignment is done in the HTMLClasses routine, which can of course be customized.

In many cases, an element category is mapped directly to a single HTML class. For example, .EL.SHEBANG, the category that identifies shebang lines, is assigned the class "shb".

In many other cases, an element category is mapped to two HTML classes: the first is a generic one, shared by a whole set of elements, and the second is a more specialized one. For example, .EL.OP.MULTIPLICATION, which identifies the "*" operator, is assigned a generic class of "op", for "operator", and a specialized class of "mul".

These class names are then prefixed with a special prefix (by default, "rx-" is used) to avoid conflicts with classes defined by other programs. In our example, the "*" operator would be assigned classes "rx-op" and "rx-mul".
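
The prefixing step can be sketched as follows. This standalone fragment is modeled on the "Assign" internal routine in the program source, but Prefixed is a hypothetical name used only here, and the prefix value is set by hand rather than taken from Options.classprefix.

```rexx
-- Illustration only: prefix every class in a blank-separated class list.
prefix = "rx-"                                -- set by hand for this sketch
Say Prefixed("shb")                           -- rx-shb
Say Prefixed("op mul")                        -- rx-op rx-mul
Exit

Prefixed:                                     -- no PROCEDURE: shares "prefix"
  value = ""                                  -- An accumulator
  Do Counter c class Over Arg(1)~makeArray(" ")
    If c > 1 Then value ||= " "
    value ||= prefix || class                 -- Prepend the prefix to each class
  End
Return value
```

Each class in the list receives its own copy of the prefix, so a two-class assignment such as "op mul" becomes "rx-op rx-mul".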

Coarse- vs. fine-grained class assignments

The sets of elements that can be assigned two classes instead of one are assignments ("asg" + a specific class), operators ("op" + a specific class), special characters and special character sequences ("spe" + a specific class), and strings or symbols taken as a constant ("const" + a specific class).

Depending on the supplied program options, each run of the Highlighter may assign to the elements belonging to each of these sets the first, generic class, the second, more specialized class, or both.

For example, when using rexx fenced code blocks, the assignment attribute may have a value of "group" (the first class), "detail" (the second class), or "full" (both).

This mechanism lets you choose, for every element set, between fine-grained and coarse-grained class assignments on a run-by-run basis.

~~~rexx {assignment=detail special=group operator=full constant=detail classprefix="P"}

Program source

/******************************************************************************/
/*                                                                            */
/* HTMLClasses.cls - element category/subcategory to html class translation   */
/* ========================================================================   */
/*                                                                            */
/* This program is part of the Rexx Parser package                            */
/* [See https://rexx.epbcn.com/rexx.parser/]                                  */
/*                                                                            */
/* Copyright (c) 2024-2025 Josep Maria Blasco <josep.maria.blasco@epbcn.com>  */
/*                                                                            */
/* License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)  */
/*                                                                            */
/* Version history:                                                           */
/*                                                                            */
/* Date     Version Details                                                   */
/* -------- ------- --------------------------------------------------------- */
/* 20241206    0.1  First public release                                      */
/* 20241208    0.1a c/CLASSIC_COMMENT/STANDARD_COMMENT/                       */
/* 20241209    0.1b Add shebang support                                       */
/*                  New call system                                           */
/* 20241225    0.1d Add doc-comment support                                   */
/* 20250103    0.1e Add TUTOR-flavored Unicode classes                        */
/* 20250116    0.1f Add support for EL.SUBKEYWORD                             */
/*                                                                            */
/******************************************************************************/

::Routine HTMLClasses Public

--
-- ARGUMENT
-- --------
--
--   "Options.", a stem.
--

  Use Strict Arg Options.

--
-- The following tails are examined:
--
-- * Options.assignment,
--     for (extended) assignment operator characters and character sequences.
-- * Options.constant,
--     for strings or symbols taken as a constant.
-- * Options.operator,
--     for operator characters and operator character sequences.
-- * Options.special,
--     for special characters and special character sequences.
-- * Options.classprefix,
--     a prefix prepended to every HTML class. Default is "".
--
-- The code assigns one or two HTML classes to every element category, and,
-- in the case of symbols taken as a constant, to every element subcategory.
--
-- When two classes are assigned, the first one is generic (for instance,
-- "spe" for special characters), and the second one identifies the
-- corresponding element (for example, "comma", or "colon").
--
-- The value of the corresponding compound variables determines the class
-- associated with every element category and subcategory,
-- in the following way:
--
-- * When the value is "group", only the generic class is assigned.
-- * When the value is "detail", only the detailed class is assigned.
-- * When the value is "full", two classes are assigned, the generic
--   one and the detailed one (in this order).
--
-- RETURNS
-- -------
--
-- A stem ("HTMLClass") mapping element categories and subcategories to html
-- classes. The default HTML class is "rexx".
--

  prefix      = options.classprefix

  HTMLClass   = .Stem~new  --
  HTMLClass[] = "rexx"     -- The default HTML class is 'rexx'

--
-- We will assign HTML classes to a number of element categories
-- using the "Assign" internal routine.
--
-- We include it here and skip over its code for readability.
--
--------------------------------------------------------------------------------

  Signal SkipOverAssign

Assign:
  value = ""                            -- An accumulator
  classes = Arg(2)~makeArray(" ")
  Do Counter c class Over classes
    If c > 1 Then value ||= " "
    value ||= prefix || class           -- Prepend the prefix to each class
  End
  HTMLClass[ Arg(1) ] = value
Return

SkipOverAssign:

/******************************************************************************/
/* SHEBANGS                                                                   */
/******************************************************************************/

  Call Assign   .EL.SHEBANG,                     "shb"

/******************************************************************************/
/* SYMBOLS                                                                    */
/******************************************************************************/

--------------------------------------------------------------------------------
-- Keywords                                                                   --
--------------------------------------------------------------------------------
--
-- Directive keywords have a special element category, different from
-- instruction keywords, and, correspondingly, they can be assigned a different
-- HTML class, if so desired.
--

  Call Assign   .EL.KEYWORD,                     "kw"
  Call Assign   .EL.SUBKEYWORD,                  "skw"
  Call Assign   .EL.DIRECTIVE_KEYWORD,           "dkw"

--------------------------------------------------------------------------------
-- Variables                                                                  --
--------------------------------------------------------------------------------
--
-- The parser is able to differentiate exposed (instance) variables from
-- local variables.
--

  Call Assign   .EL.SIMPLE_VARIABLE,             "var"
  Call Assign   .EL.COMPOUND_VARIABLE,           "cmp"
  Call Assign   .EL.STEM_VARIABLE,               "stem"

  Call Assign   .EL.EXPOSED_SIMPLE_VARIABLE,     "xvar"
  Call Assign   .EL.EXPOSED_COMPOUND_VARIABLE,   "xcmp"
  Call Assign   .EL.EXPOSED_STEM_VARIABLE,       "xstem"

--------------------------------------------------------------------------------
-- Environment symbols                                                        --
--------------------------------------------------------------------------------

  Call Assign   .EL.ENVIRONMENT_SYMBOL,          "env"

--------------------------------------------------------------------------------
-- Pure constant symbols                                                      --
--------------------------------------------------------------------------------
--
-- That is, constant symbols that are not environment symbols or numbers
--
-- .EL.PERIOD is used when parsing PARSE templates. By default, its
-- highlighting is the same as other symbol literals.
--

  Call Assign   .EL.SYMBOL_LITERAL,              "lit"
  Call Assign   .EL.PERIOD,                      "lit"

--------------------------------------------------------------------------------
-- Numbers                                                                    --
--------------------------------------------------------------------------------

  Call Assign   .EL.EXPONENTIAL_NUMBER,          "exp"
  Call Assign   .EL.FRACTIONAL_NUMBER,           "frac"
  Call Assign   .EL.INTEGER_NUMBER,              "int"

/******************************************************************************/
/* STRINGS                                                                    */
/******************************************************************************/

  Call Assign   .EL.BINARY_STRING,               "bstr"
  Call Assign   .EL.STRING,                      "str"
  Call Assign   .EL.HEX_STRING,                  "xstr"
  Call Assign   .EL.BYTES_STRING,                "ystr"
  Call Assign   .EL.CODEPOINTS_STRING,           "pstr"
  Call Assign   .EL.GRAPHEMES_STRING,            "gstr"
  Call Assign   .EL.TEXT_STRING,                 "tstr"
  Call Assign   .EL.UNICODE_STRING,              "ustr"

/******************************************************************************/
/* COMMENTS                                                                   */
/******************************************************************************/

-- Doc-comments get their own classes, so that they can be styled separately;
-- by default, we style them the same as non-doc comments

  Call Assign   .EL.LINE_COMMENT,                "lncm"
  Call Assign   .EL.DOC_COMMENT_MARKDOWN,        "doc-lncm"
  Call Assign   .EL.STANDARD_COMMENT,            "cm"
  Call Assign   .EL.DOC_COMMENT,                 "doc-cm"

/******************************************************************************/
/* RESOURCES                                                                  */
/******************************************************************************/
--
-- See also the .RESOURCE.NAME and .RESOURCE.DELIMITER.NAME
-- subcategories below.
--

  Call Assign   .EL.RESOURCE_DATA,               "res-data"
  Call Assign   .EL.RESOURCE_IGNORED_DATA,       "res-ignore"

/******************************************************************************/
/* WHITESPACE                                                                 */
/******************************************************************************/
--
-- Note that elements in the .EL.WHITESPACE category include
-- continuation characters "-" and ","
--

  Call Assign   .EL.WHITESPACE,                  "ws"

/******************************************************************************/
/* OPERATORS                                                                  */
/******************************************************************************/

  Signal SkipOverOperator

Operator:
  Select Case Options.operator
    When "group"  Then Call Assign Arg(1), "op"
    When "full"   Then Call Assign Arg(1), "op" Arg(2)
    When "detail" Then Call Assign Arg(1), Arg(2)
  End
Return

SkipOverOperator:

--------------------------------------------------------------------------------
-- Single-character operators                                                 --
--------------------------------------------------------------------------------

  Call Operator .EL.OP.AND,                      "and"
  Call Operator .EL.OP.DIVISION,                 "div"
  Call Operator .EL.OP.EQUAL,                    "eq"
  Call Operator .EL.OP.GREATER_THAN,             "gt"
  Call Operator .EL.OP.INTEGER_DIVISION,         "idiv"
  Call Operator .EL.OP.LOWER_THAN,               "lt"
  Call Operator .EL.OP.MINUS,                    "sub"
  Call Operator .EL.OP.MESSAGE,                  "msg"
  Call Operator .EL.OP.MULTIPLICATION,           "mul"
  Call Operator .EL.OP.NEGATION,                 "not"
  Call Operator .EL.OP.OR,                       "or"
  Call Operator .EL.OP.PLUS,                     "add"
  Call Operator .EL.OP.REFERENCE.LOWER_THAN,     "ref-lt"
  Call Operator .EL.OP.REFERENCE.GREATER_THAN,   "ref-gt"

--------------------------------------------------------------------------------
-- Prefix operators                                                           --
--------------------------------------------------------------------------------
--
-- "\" is always a prefix operator
--

  Call Operator .EL.OP.PREFIX.MINUS,             "prf-sub"
  Call Operator .EL.OP.PREFIX.PLUS,              "prf-add"

--------------------------------------------------------------------------------
-- Multi-character operator sequences                                         --
--------------------------------------------------------------------------------
--
-- There may be some whitespace, comments and continuations inside
-- the sequence; only the relevant characters will be tagged.
--

  Call Operator .EL.OP.CASCADING_MESSAGE,        "cmsg"
  Call Operator .EL.OP.CONCATENATION,            "cat"
  Call Operator .EL.OP.GREATER_OR_EQUAL,         "ge"
  Call Operator .EL.OP.GREATER_OR_LOWER_THAN,    "glt"
  Call Operator .EL.OP.LOWER_OR_EQUAL,           "le"
  Call Operator .EL.OP.LOWER_OR_GREATER_THAN,    "lgt"
  Call Operator .EL.OP.NOT_GREATER_THAN,         "ngt"
  Call Operator .EL.OP.NOT_LOWER_THAN,           "nlt"
  Call Operator .EL.OP.NOT_EQUAL,                "ne"
  Call Operator .EL.OP.POWER,                    "pow"
  Call Operator .EL.OP.REMAINDER,                "rem"
  Call Operator .EL.OP.XOR,                      "xor"

--------------------------------------------------------------------------------
-- Strict comparison operator sequences                                       --
--------------------------------------------------------------------------------
--
-- All are multi-character
--

  Call Operator .EL.OP.STRICT.LOWER_THAN,        "st-lt"
  Call Operator .EL.OP.STRICT.GREATER_OR_EQUAL,  "st-ge"
  Call Operator .EL.OP.STRICT.GREATER_THAN,      "st-gt"
  Call Operator .EL.OP.STRICT.NOT_EQUAL,         "st-ne"
  Call Operator .EL.OP.STRICT.NOT_LOWER_THAN,    "st-nlt"
  Call Operator .EL.OP.STRICT.NOT_GREATER_THAN,  "st-ngt"
  Call Operator .EL.OP.STRICT.LOWER_OR_EQUAL,    "st-le"
  Call Operator .EL.OP.STRICT.EQUAL,             "st-eq"

--------------------------------------------------------------------------------
-- Blank concatenation                                                        --
--------------------------------------------------------------------------------
--
-- The abuttal operator is zero-length, but blank concatenation is not.
--

  Call Operator .EL.OP.BLANK,                    "blank"

/******************************************************************************/
/* ASSIGNMENTS                                                                */
/******************************************************************************/

  Signal SkipOverAssignment

Assignment:
  Select Case Options.assignment
    When "group"  Then Call Assign Arg(1), "asg"
    When "full"   Then Call Assign Arg(1), "asg" Arg(2)
    When "detail" Then Call Assign Arg(1), Arg(2)
  End
Return

SkipOverAssignment:

--------------------------------------------------------------------------------

  Call Assignment .EL.ASG.EQUAL,                 "asg-equal"

  Call Assignment .EL.ASG.PLUS,                  "asg-add"
  Call Assignment .EL.ASG.MINUS,                 "asg-sub"
  Call Assignment .EL.ASG.MULTIPLY,              "asg-mul"
  Call Assignment .EL.ASG.DIVIDE,                "asg-div"
  Call Assignment .EL.ASG.INTEGER_DIVISION,      "asg-idiv"
  Call Assignment .EL.ASG.AND,                   "asg-and"
  Call Assignment .EL.ASG.OR,                    "asg-or"
  Call Assignment .EL.ASG.REMAINDER,             "asg-rem"
  Call Assignment .EL.ASG.CONCATENATION,         "asg-cat"
  Call Assignment .EL.ASG.XOR,                   "asg-xor"
  Call Assignment .EL.ASG.POWER,                 "asg-pow"

/******************************************************************************/
/* SPECIAL CHARACTERS                                                         */
/******************************************************************************/

  Signal SkipOverSpecial

Special:
  Select Case Options.special
    When "group"  Then Call Assign Arg(1), "spe"
    When "full"   Then Call Assign Arg(1), "spe" Arg(2)
    When "detail" Then Call Assign Arg(1), Arg(2)
  End
Return

SkipOverSpecial:

--------------------------------------------------------------------------------

  Call Special .EL.COMMA,                        "comma"
  Call Special .EL.COLON,                        "colon"
  Call Special .EL.LEFT_PARENTHESIS,             "paren"
  Call Special .EL.RIGHT_PARENTHESIS,            "paren"
  Call Special .EL.LEFT_BRACKET,                 "bracket"
  Call Special .EL.RIGHT_BRACKET,                "bracket"

--
-- Directive start. Although technically this is a sequence of special
-- characters, we assign a new class using "Assign" instead of "Special",
-- since we will probably want to highlight "::" differently from other
-- special characters and special character sequences.
--

  Call Assign  .EL.DIRECTIVE_START,              "dir"

-- The period as a compound variable tail separator is a pseudo-special, ...

  Call Special .EL.TAIL_SEPARATOR,               "period"

-- ... as is the "..." construct at the end of an argument list.

  Call Special .EL.ELLIPSIS,                     "ellipsis"

/******************************************************************************/
/* STRINGS OR SYMBOLS TAKEN AS A CONSTANT                                     */
/******************************************************************************/

  Signal SkipOverConstant

Constant:
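  -- "constant" is also a variable in this routine (it is assigned below),
  -- so "Options.constant" would substitute its value into the tail.
  -- The bracket form uses the literal tail "CONSTANT" instead.
  Select Case Options.["CONSTANT"]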
    When "group"  Then value = prefix"const"
    When "full"   Then value = prefix"const" prefix||Arg(2)
    When "detail" Then value = prefix||Arg(2)
  End
  HTMLClass[ constant, Arg(1) ] = value
Return

SkipOverConstant:

--------------------------------------------------------------------------------

  constant = .EL.TAKEN_CONSTANT

-- Function and subroutine call constants

  Call Constant .INTERNAL.FUNCTION.NAME,           "int-func"
  Call Constant .INTERNAL.SUBROUTINE.NAME,         "int-proc"
  Call Constant .BUILTIN.FUNCTION.NAME,            "bif-func"
  Call Constant .BUILTIN.SUBROUTINE.NAME,          "bif-proc"
  Call Constant .PACKAGE.FUNCTION.NAME,            "pkg-func"
  Call Constant .PACKAGE.SUBROUTINE.NAME,          "pkg-proc"
  Call Constant .EXTERNAL.PACKAGE.FUNCTION.NAME,   "ext-pkg-func"
  Call Constant .EXTERNAL.PACKAGE.SUBROUTINE.NAME, "ext-pkg-proc"
  Call Constant .EXTERNAL.FUNCTION.NAME,           "ext-func"
  Call Constant .EXTERNAL.SUBROUTINE.NAME,         "ext-proc"

-- Other constants

  Call Constant .ANNOTATION.NAME,                  "annotation"
  Call Constant .BLOCK.INSTRUCTION.NAME,           "block"
  Call Constant .CLASS.NAME,                       "class"
  Call Constant .ENVIRONMENT.NAME,                 "environment"
  Call Constant .LABEL.NAME,                       "label"
  Call Constant .METHOD.NAME,                      "method"
  Call Constant .NAMESPACE.NAME,                   "namespace"
  Call Constant .ROUTINE.NAME,                     "routine"
  Call Constant .REQUIRES.PROGRAM.NAME,            "requires"
  Call Constant .RESOURCE.NAME,                    "resource"
  Call Constant .RESOURCE.DELIMITER.NAME,          "res-delimiter"
  Call Constant .USER.CONDITION.NAME,              "user-condition"

  Call Constant .ANNOTATION.VALUE,                 "annotation-value"
  Call Constant .CONSTANT.VALUE,                   "constant-value"

--------------------------------------------------------------------------------

Return HTMLClass