Release notes for version 0.1, 20230716


Release notes for version 0.1, 20230716

/******************************************************************************
 * This file is part of The Unicode Tools Of Rexx (TUTOR)                     *
 * See https://rexx.epbcn.com/TUTOR/                                          *
 *     and https://github.com/JosepMariaBlasco/TUTOR                          *
 * Copyright © 2023-2025 Josep Maria Blasco <josep.maria.blasco@epbcn.com>    *
 * License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)  *
 ******************************************************************************/

I've just uploaded the current version of the Unicode Toys to https://github.com/RexxLA/rexx-repository/tree/master/ARB/standards/work-in-progress/unicode/UnicodeToys

Please have a look at https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/unicode/UnicodeToys/UnicodeToys.md to understand the spirit of this set of programs.

Things you can do (after you add ::Requires Unicode.cls to your program and maybe some wrappers to implement new functionality for old BIFs).

  • Continue to use normal "strings". You can explicitly say that a string is a Bytes string by using the new BYTES(string) BIF.
  • Create Runes strings with the new RUNES(string) BIF. Runes strings are composed of codepoints (i.e., they are functionally equivalent to UTF-32 strings). A few BIFs have been implemented, namely LENGTH, SUBSTR, POS, CENTER/CENTRE and the ooRexx-specific [] notation.
  • Convert Runes strings to Bytes strings using BYTES(string).
  • Create Text strings with the new TEXT(string) BIF. Text strings are composed of extended grapheme clusters. A few BIFs have been implemented, namely LENGTH, SUBSTR, POS, CENTER/CENTRE and the ooRexx-specific [] notation.
  • Convert Text strings to Runes using RUNES, and to Bytes using BYTES.
  • ALLRUNES(string) converts a Runes or Text string into a set of blank-delimited hexadecimal codepoints. For example, ALLRUNES('👩‍👨‍👩‍👧') = '1F469 200D 1F468 200D 1F469 200D 1F467'.
  • Given a hexadecimal codepoint, R2N(code) returns the corresponding Name (na) property: R2N('1F385') = "FATHER CHRISTMAS".
  • Given a name, N2R returns its codepoint, if there is a match: N2R("Hangul syllabe GEOL") = "AC78".

You will find numerous documentation details in the links included above and in the source files. You may also want to play with the three included demo programs.

This is a 0.1 release. Comments, feedback and criticism are very welcome.

Josep Maria

P.S. I am copying below the output of Unicode.demo.basic.rex:

Testing basic operations and conversions

Test number 1: a UTF8 Bytes string (.String)

string = 'noël👩‍👨‍👩‍👧🎅'
Length(string) = 34
StringType(string) = 'BYTES'
Elements of 'noël👩‍👨‍👩‍👧🎅':
 1: n ('6E'X) 18: � ('80'X)
 2: o ('6F'X) 19: � ('8D'X)
 3: � ('C3'X) 20: � ('F0'X)
 4: � ('AB'X) 21: � ('9F'X)
 5: l ('6C'X) 22: � ('91'X)
 6: � ('F0'X) 23: � ('A9'X)
 7: � ('9F'X) 24: � ('E2'X)
 8: � ('91'X) 25: � ('80'X)
 9: � ('A9'X) 26: � ('8D'X)
10: � ('E2'X) 27: � ('F0'X)
11: � ('80'X) 28: � ('9F'X)
12: � ('8D'X) 29: � ('91'X)
13: � ('F0'X) 30: � ('A7'X)
14: � ('9F'X) 31: � ('F0'X)
15: � ('91'X) 32: � ('9F'X)
16: � ('A8'X) 33: � ('8E'X)
17: � ('E2'X) 34: � ('85'X)

Press ENTER to continue
Test number 2: a Runes string (composed of codepoints)

string = Runes('noël👩‍👨‍👩‍👧🎅')
Length(string) = 12
StringType(string) = 'RUNES'
Elements of 'noël👩‍👨‍👩‍👧🎅':
 1: n ('6E'X)
 2: o ('6F'X)
 3: ë ('C3AB'X)
 4: l ('6C'X)
 5: 👩 ('F09F91A9'X)
 6: ‍ ('E2808D'X)
 7: 👨 ('F09F91A8'X)
 8: ‍ ('E2808D'X)
 9: 👩 ('F09F91A9'X)
10: ‍ ('E2808D'X)
11: 👧 ('F09F91A7'X)
12: 🎅 ('F09F8E85'X)
AllRunes(string) = '006E 006F 00EB 006C 1F469 200D 1F468 200D 1F469 200D 1F467 1F385'

Press ENTER to continue
Test number 3: a Text string (composed of extended grapheme clusters)

string = Text('noël👩‍👨‍👩‍👧🎅')
Length(string) = 6
StringType(string) = 'TEXT'
Elements of 'noël👩‍👨‍👩‍👧🎅':
 1: n ('6E'X)
 2: o ('6F'X)
 3: ë ('C3AB'X)
 4: l ('6C'X)
 5: 👩‍👨‍👩‍👧 ('F09F91A9E2808DF09F91A8E2808DF09F91A9E2808DF09F91A7'X)
 6: 🎅 ('F09F8E85'X)
AllRunes(string) = '006E 006F 00EB 006C 1F469 200D 1F468 200D 1F469 200D 1F467 1F385'

Press ENTER to continue
Test number 4: converting a Text to Runes

string = Runes('noël👩‍👨‍👩‍👧🎅')  -- Text to Runes
Length(string) = 12

Press ENTER to continue
Test number 5: converting a Runes to Bytes

string = Bytes('noël👩‍👨‍👩‍👧🎅')  -- Runes to Bytes (String)
Length(string) = 34

Press ENTER to continue
Test number 6: converting Text to Bytes

string = Bytes(Text('noël👩‍👨‍👩‍👧🎅')  -- Text to Bytes (String)
Length(string) = 34