Add names for Mathematical Standardized variation sequences #18

Open
mkorje opened this issue Dec 14, 2024 · 0 comments
Labels
multi-character symbols This requires multi-code point symbols


Quoting from https://charlottebuff.com/unicode/charts/standardized-variants/:

A variation sequence consists of some base character, immediately followed by an invisible, zero‐width combining mark called a variation selector. The presence of the variation selector restricts the intended appearance of the base character to one particular subset of all its possible graphical representations. This means that applications that care about these specific differences know to apply a distinct glyph to the base character while other applications can safely ignore the variation selector and display the base character however they see fit. The underlying character remains the same, after all. The meaning doesn’t change; only the presentation does.
Standardized variation sequences – the subject of this page – make use of the generic variation selectors 1 through 14 (U+FE00–U+FE0D) and the Mongolian free variation selectors (U+180B–U+180D, U+180F). They can be defined for pretty much any purpose. There also exist emoji variation sequences which use variation selectors 15 and 16 (U+FE0E, U+FE0F) to select either the text‐style or emoji‐style rendering of an appropriate base character, and ideographic variation sequences which use variation selectors 17 through 256 (U+E0100–U+E01EF) and are exclusive to CJK Unified Ideographs.
Only variation sequences explicitly defined as part of the Unicode Standard, Unicode Emoji, or the Ideographic Variation Database are valid; a variation selector in any other context must not affect the glyphic appearance of the character it applies to.

More information is available in Section 23.4, Variation Selectors, of the Unicode Core Specification.
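To make the quoted definition concrete, here is a minimal Python sketch that builds one of the mathematical sequences by appending variation selector 1 (U+FE00) to a base character. The helper names `with_vs1` and `code_points` are made up for this illustration and are not part of any library.

```python
import unicodedata

# VARIATION SELECTOR-1, the selector used by all mathematical sequences.
VS1 = "\uFE00"


def with_vs1(base: str) -> str:
    """Return the base character immediately followed by VS1 (hypothetical helper)."""
    return base + VS1


def code_points(text: str) -> list:
    """Format every code point of `text` as U+XXXX (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# EMPTY SET followed by VS1: one symbol, two code points.
seq = with_vs1("\u2205")
print(code_points(seq))           # ['U+2205', 'U+FE00']
print(unicodedata.name(seq[1]))   # VARIATION SELECTOR-1
```

Whether the variant glyph is actually displayed depends on the font and the renderer; the string itself always contains both code points.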

The Unicode Consortium maintains a predefined list of valid combinations of a base character and a variation selector. The mathematical sequences of interest are among the standardized variation sequences defined in the file StandardizedVariants.txt in the Unicode Character Database.

Standardized variation sequences under Mathematical in the StandardizedVariants.txt file
# Mathematical

0030 FE00; short diagonal stroke form; # DIGIT ZERO
2205 FE00; zero with long diagonal stroke overlay form; # EMPTY SET
2229 FE00; with serifs; # INTERSECTION
222A FE00; with serifs; # UNION
2268 FE00; with vertical stroke; # LESS-THAN BUT NOT EQUAL TO
2269 FE00; with vertical stroke; # GREATER-THAN BUT NOT EQUAL TO
2272 FE00; following the slant of the lower leg; # LESS-THAN OR EQUIVALENT TO
2273 FE00; following the slant of the lower leg; # GREATER-THAN OR EQUIVALENT TO
# The following two entries were originally defined for Unicode 3.2
# but were determined to be in error and were removed from the list
# of standardized variation sequences. The entries are left commented out
# in the file for the historical record of changes made to the data.
# This change happened in Unicode 4.0, per UTC Consensus 92-C2
# and UTC Action Item 92-A4. See also L2/02-291 and L2/02-126.
#2278 FE00; with vertical stroke; # NEITHER LESS-THAN NOR GREATER-THAN
#2279 FE00; with vertical stroke; # NEITHER GREATER-THAN NOR LESS-THAN
228A FE00; with stroke through bottom members; # SUBSET OF WITH NOT EQUAL TO
228B FE00; with stroke through bottom members; # SUPERSET OF WITH NOT EQUAL TO
2293 FE00; with serifs; # SQUARE CAP
2294 FE00; with serifs; # SQUARE CUP
2295 FE00; with white rim; # CIRCLED PLUS
2297 FE00; with white rim; # CIRCLED TIMES
229C FE00; with equal sign touching the circle; # CIRCLED EQUALS
22DA FE00; with slanted equal; # LESS-THAN EQUAL TO OR GREATER-THAN
22DB FE00; with slanted equal; # GREATER-THAN EQUAL TO OR LESS-THAN
2A3C FE00; tall variant with narrow foot; # INTERIOR PRODUCT
2A3D FE00; tall variant with narrow foot; # RIGHTHAND INTERIOR PRODUCT
2A9D FE00; with similar following the slant of the upper leg; # SIMILAR OR LESS-THAN
2A9E FE00; with similar following the slant of the upper leg; # SIMILAR OR GREATER-THAN
2AAC FE00; with slanted equal; # SMALLER THAN OR EQUAL TO
2AAD FE00; with slanted equal; # LARGER THAN OR EQUAL TO
2ACB FE00; with stroke through bottom members; # SUBSET OF ABOVE NOT EQUAL TO
2ACC FE00; with stroke through bottom members; # SUPERSET OF ABOVE NOT EQUAL TO
FF10 FE00; short diagonal stroke form; # FULLWIDTH DIGIT ZERO
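The entries above follow a simple `code points; description; # name` layout, so they can be read mechanically. The following is a rough Python sketch of such a reader, assuming only the layout visible in the listing; `parse_standardized_variants` is a hypothetical name and the sample text is a small excerpt of the lines above.

```python
def parse_standardized_variants(text):
    """Parse lines such as '2205 FE00; description; # NAME' into tuples.

    Returns a list of (sequence, description, base_name). Blank lines and
    lines starting with '#' (including commented-out entries) are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Everything after the first '#' is the name of the base character.
        data, _, comment = line.partition("#")
        fields = [field.strip() for field in data.split(";")]
        # The first field holds space-separated hexadecimal code points.
        sequence = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        entries.append((sequence, fields[1], comment.strip()))
    return entries


sample = """\
# Mathematical

2205 FE00; zero with long diagonal stroke overlay form; # EMPTY SET
#2278 FE00; with vertical stroke; # NEITHER LESS-THAN NOR GREATER-THAN
222A FE00; with serifs; # UNION
"""

entries = parse_standardized_variants(sample)
for sequence, description, base_name in entries:
    print([f"U+{ord(ch):04X}" for ch in sequence], description, base_name)
```

The commented-out entry for U+2278 is ignored, which matches its status as a sequence removed from the standard.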

Given that these all modify existing glyphs, they would probably be best added as modifiers of their base glyph in Codex.

Adding names for these will require us to support symbols returning a sequence of characters, instead of just a single character.
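As a language-neutral illustration of that requirement, the Python sketch below models a symbol whose variants are strings rather than single characters. It is not Codex's actual data model: the `Symbol` class, its `get` method, and the modifier name `serif` are placeholders invented for this example.

```python
from dataclasses import dataclass, field

VS1 = "\uFE00"  # VARIATION SELECTOR-1


@dataclass
class Symbol:
    """Hypothetical symbol: a default string plus named variants."""

    default: str
    variants: dict = field(default_factory=dict)

    def get(self, modifier=None):
        """Return the string for a modifier, or the default when none is given."""
        if modifier is None:
            return self.default
        return self.variants[modifier]


# UNION, with a placeholder modifier that maps to the two-code-point sequence.
union = Symbol("\u222A", {"serif": "\u222A" + VS1})

print(len(union.get()))          # 1
print(len(union.get("serif")))   # 2
```

The point of the sketch is that any consumer expecting exactly one code point per symbol would need to accept a string of arbitrary length instead.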
