New values for the OPTIONS instruction


/******************************************************************************
 * This file is part of The Unicode Tools Of Rexx (TUTOR)                     *
 * See https://rexx.epbcn.com/TUTOR/                                          *
 *     and https://github.com/JosepMariaBlasco/TUTOR                          *
 * Copyright © 2023-2025 Josep Maria Blasco <josep.maria.blasco@epbcn.com>    *
 * License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)  *
 ******************************************************************************/

OPTIONS DEFAULTSTRING

Diagram for the OPTIONS DEFAULTSTRING instruction

OPTIONS DEFAULTSTRING default, where default can be one of BYTES, CODEPOINTS, GRAPHEMES, TEXT or NONE. This instruction determines the semantics of numbers and of unsuffixed strings, i.e., strings like "string" that carry no explicit B, X, Y, P, G, T or U suffix. If default is NONE, numbers and strings are not converted (i.e., they are handled as default Rexx numbers and strings). In all other cases, numbers and strings are transformed to the corresponding type. For example:

  • If OPTIONS DEFAULTSTRING TEXT is in effect, "string" will automatically be a TEXT string, as if "string"T had been specified: it will be composed of extended grapheme clusters, and it will be automatically normalized to the NFC Unicode normalization form if needed.
  • If OPTIONS DEFAULTSTRING GRAPHEMES is in effect, "string" will automatically be a GRAPHEMES string, as if "string"G had been specified: it will be composed of extended grapheme clusters, with no automatic normalization.
  • If OPTIONS DEFAULTSTRING CODEPOINTS is in effect, 12.3 will automatically be a CODEPOINTS string, as if CODEPOINTS(12.3) had been specified.

By default, RXU works as if OPTIONS DEFAULTSTRING TEXT had been specified.

Note. Currently, OPTIONS DEFAULTSTRING does not apply to variable and constant symbols. This will be fixed in a future release.

Implementation restriction: This is currently a global option. You can change it inside a procedure, and it will apply globally, not only to the procedure scope.
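
The restriction can be illustrated with a minimal sketch. The label SetDefault is hypothetical, and the comments state what the restriction above implies rather than output captured from a run.

```rexx
Call SetDefault                                   -- The option is changed inside a procedure
Say Stringtype("string")                          -- CODEPOINTS: the change is still in effect here
Exit

SetDefault: Procedure
  Options Defaultstring CODEPOINTS                -- Applies globally, not only to this procedure
  Return
```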

Examples.

Say Stringtype("string")                          -- TEXT (the default)
Options Defaultstring CODEPOINTS
Say Stringtype("string")                          -- CODEPOINTS
Say Stringtype(1024)                              -- CODEPOINTS too
Say Stringtype("12"X)                             -- BYTES: X, B and U strings are always BYTES strings
Say Stringtype("string"T)                         -- TEXT (Explicit suffix)

Implementation notes:

RXU translates an unsuffixed string "string" to the following expression:

(U:Default("string"))

Default is a helper routine defined in Unicode.cls, and U is a suitable namespace for Unicode.cls. The Default routine implements the current OPTIONS DEFAULTSTRING setting.

OPTIONS COERCIONS

Diagram for the OPTIONS COERCIONS instruction

OPTIONS COERCIONS behaviour, where behaviour can be one of PROMOTE, DEMOTE, LEFT, RIGHT or NONE. This instruction determines the behaviour of the language processor when a binary operation is attempted in which the operands are of different string types, for example, when a BYTES string is concatenated to a TEXT string, or when a CODEPOINTS number is added to a BYTES number.

  • When behaviour is NONE a Syntax error will be raised.
  • When behaviour is PROMOTE, the result of the operation will have the type of the highest operand (i.e., TEXT when at least one of the operands is TEXT, or else CODEPOINTS when at least one of the operands is CODEPOINTS, or BYTES in all other cases).
  • When behaviour is DEMOTE, the result of the operation will have the type of the lowest operand (i.e., BYTES when at least one of the operands is BYTES, or else CODEPOINTS when at least one of the operands is CODEPOINTS, or TEXT in all other cases).
  • When behaviour is LEFT, the result of the operation will have the type of the left operand.
  • When behaviour is RIGHT, the result of the operation will have the type of the right operand.
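
The five behaviours can be compared on a single concatenation. The following is a minimal sketch: it uses an X string as the BYTES operand and a T string as the TEXT operand, and the type in each comment is what the rules above predict, not output captured from a run.

```rexx
Options Coercions Promote
Say Stringtype( "4C656674"X || "Right"T )         -- TEXT: the highest of BYTES and TEXT
Options Coercions Demote
Say Stringtype( "4C656674"X || "Right"T )         -- BYTES: the lowest of BYTES and TEXT
Options Coercions Left
Say Stringtype( "4C656674"X || "Right"T )         -- BYTES: the type of the left operand
Options Coercions Right
Say Stringtype( "4C656674"X || "Right"T )         -- TEXT: the type of the right operand
Options Coercions None
Say Stringtype( "4C656674"X || "Right"T )         -- Raises a Syntax error: the types differ
```

The NONE case is placed last because the Syntax error it raises ends the program unless it is trapped.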

Currently, OPTIONS COERCIONS is implemented for concatenation, arithmetic and logical operators only.

By default, RXU works as if OPTIONS COERCIONS PROMOTE had been specified.

Note. This variant of the OPTIONS instruction is highly experimental. Its only purpose is to allow experimentation with implicit coercions. Once a decision is taken about the preferred coercion mechanism, it will be removed.

Implementation restriction: This is a global option. You can change it inside a procedure, and it will apply globally, not only to the current scope.

Examples.

Say Stringtype( "Left"Y || "Right"P )             -- CODEPOINTS

Implementation notes

The RXU Rexx Preprocessor for Unicode implements the OPTIONS instruction in the following way: when an OPTIONS instruction is encountered, say

OPTIONS optiona optionb

the preprocessor transforms it into

Do; !Options = optiona optionb; Call !Options !Options; Options !Options; End

!OPTIONS is a routine defined in Unicode.cls.
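
The assignment in the generated code relies on a standard Rexx rule: a symbol that has not been assigned a value evaluates to its own name in uppercase. The following minimal sketch in plain Rexx shows the effect, assuming that no variable named COERCIONS or DEMOTE has been assigned; optionString is a hypothetical name standing in for !Options.

```rexx
-- Source line:    Options Coercions Demote
-- Generated code: Do; !Options = Coercions Demote; Call !Options !Options; Options !Options; End

optionString = Coercions Demote                   -- Two unassigned symbols, joined by one blank
Say optionString                                  -- COERCIONS DEMOTE
```

The resulting string is what the generated code passes first to the !OPTIONS routine and then to the standard OPTIONS instruction.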