Update to Unicode 16 data
kipcole9 committed Sep 10, 2024
1 parent b3df50b commit c3dd3ef
Showing 34 changed files with 7,068 additions and 547 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,13 @@
# Changelog

## Unicode v1.20.0

This is the changelog for Unicode v1.20.0 released on September 11, 2024. For older changelogs please consult the release tag on [GitHub](https://github.com/elixir-unicode/unicode/tags)

### Enhancements

* Updates to [Unicode 16.0](https://unicode.org/versions/Unicode16.0.0/) data.

## Unicode v1.19.0

This is the changelog for Unicode v1.19.0 released on February 29th, 2024. For older changelogs please consult the release tag on [GitHub](https://github.com/elixir-unicode/unicode/tags)
4 changes: 2 additions & 2 deletions README.md
@@ -20,7 +20,7 @@ The Elixir standard library does not provide introspection beyond that required

### Unicode version

As of [unicode version 1.17.0](https://hex.pm/packages/unicode/1.17.0) published on September 17th, 2023, [Unicode 15.1](https://www.unicode.org/versions/Unicode15.1.0/) forms the underlying data.
As of [unicode version 1.20.0](https://hex.pm/packages/unicode/1.20.0) published on September 11th, 2024, [Unicode 16.0](https://www.unicode.org/versions/Unicode16.0.0/) forms the underlying data.

## Additional Unicode libraries

@@ -187,7 +187,7 @@ The package can be installed by adding `unicode` to your list of dependencies in
```elixir
def deps do
[
{:unicode, "~> 1.19"}
{:unicode, "~> 1.20"}
]
end
```
19 changes: 15 additions & 4 deletions data/blocks.txt
@@ -1,7 +1,8 @@
# Blocks-15.1.0.txt
# Date: 2023-07-28, 15:47:20 GMT
# © 2023 Unicode®, Inc.
# For terms of use, see https://www.unicode.org/terms_of_use.html
# Blocks-16.0.0.txt
# Date: 2024-02-02
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see https://www.unicode.org/reports/tr44/
@@ -217,6 +218,7 @@ FFF0..FFFF; Specials
10500..1052F; Elbasan
10530..1056F; Caucasian Albanian
10570..105BF; Vithkuqi
105C0..105FF; Todhri
10600..1077F; Linear A
10780..107BF; Latin Extended-F
10800..1083F; Cypriot Syllabary
@@ -239,6 +241,7 @@ FFF0..FFFF; Specials
10C00..10C4F; Old Turkic
10C80..10CFF; Old Hungarian
10D00..10D3F; Hanifi Rohingya
10D40..10D8F; Garay
10E60..10E7F; Rumi Numeral Symbols
10E80..10EBF; Yezidi
10EC0..10EFF; Arabic Extended-C
@@ -258,12 +261,14 @@ FFF0..FFFF; Specials
11280..112AF; Multani
112B0..112FF; Khudawadi
11300..1137F; Grantha
11380..113FF; Tulu-Tigalari
11400..1147F; Newa
11480..114DF; Tirhuta
11580..115FF; Siddham
11600..1165F; Modi
11660..1167F; Mongolian Supplement
11680..116CF; Takri
116D0..116FF; Myanmar Extended-C
11700..1174F; Ahom
11800..1184F; Dogra
118A0..118FF; Warang Citi
@@ -274,6 +279,7 @@ FFF0..FFFF; Specials
11AB0..11ABF; Unified Canadian Aboriginal Syllabics Extended-A
11AC0..11AFF; Pau Cin Hau
11B00..11B5F; Devanagari Extended-A
11BC0..11BFF; Sunuwar
11C00..11C6F; Bhaiksuki
11C70..11CBF; Marchen
11D00..11D5F; Masaram Gondi
@@ -288,12 +294,15 @@ FFF0..FFFF; Specials
12F90..12FFF; Cypro-Minoan
13000..1342F; Egyptian Hieroglyphs
13430..1345F; Egyptian Hieroglyph Format Controls
13460..143FF; Egyptian Hieroglyphs Extended-A
14400..1467F; Anatolian Hieroglyphs
16100..1613F; Gurung Khema
16800..16A3F; Bamum Supplement
16A40..16A6F; Mro
16A70..16ACF; Tangsa
16AD0..16AFF; Bassa Vah
16B00..16B8F; Pahawh Hmong
16D40..16D7F; Kirat Rai
16E40..16E9F; Medefaidrin
16F00..16F9F; Miao
16FE0..16FFF; Ideographic Symbols and Punctuation
@@ -308,6 +317,7 @@ FFF0..FFFF; Specials
1B170..1B2FF; Nushu
1BC00..1BC9F; Duployan
1BCA0..1BCAF; Shorthand Format Controls
1CC00..1CEBF; Symbols for Legacy Computing Supplement
1CF00..1CFCF; Znamenny Musical Notation
1D000..1D0FF; Byzantine Musical Symbols
1D100..1D1FF; Musical Symbols
@@ -325,6 +335,7 @@ FFF0..FFFF; Specials
1E290..1E2BF; Toto
1E2C0..1E2FF; Wancho
1E4D0..1E4FF; Nag Mundari
1E5D0..1E5FF; Ol Onal
1E7E0..1E7FF; Ethiopic Extended-B
1E800..1E8DF; Mende Kikakui
1E900..1E95F; Adlam
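
The `START..END; Block Name` records in `data/blocks.txt` can be read with a few lines of standard-library Elixir. The following is a minimal sketch for illustration only: `BlocksSketch` and its functions are hypothetical names, not part of the `unicode` package, and the sample text is a short excerpt of the data shown above.

```elixir
defmodule BlocksSketch do
  # Hypothetical helper for illustration; not the unicode package's parser.
  # Parses lines of the form "START..END; Block Name", skipping
  # comment lines (starting with "#") and blank lines.
  def parse(text) do
    text
    |> String.split("\n", trim: true)
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == "" or String.starts_with?(&1, "#")))
    |> Enum.map(&parse_line/1)
  end

  defp parse_line(line) do
    [range, name] = String.split(line, ";", parts: 2)
    [first, last] = range |> String.trim() |> String.split("..")
    {String.to_integer(first, 16), String.to_integer(last, 16), String.trim(name)}
  end

  # Returns the name of the block containing `codepoint`, or nil.
  def block_for(blocks, codepoint) do
    Enum.find_value(blocks, fn {first, last, name} ->
      if codepoint >= first and codepoint <= last, do: name
    end)
  end
end

sample = """
# Blocks-16.0.0.txt
10D00..10D3F; Hanifi Rohingya
10D40..10D8F; Garay
10E60..10E7F; Rumi Numeral Symbols
"""

blocks = BlocksSketch.parse(sample)
IO.inspect(BlocksSketch.block_for(blocks, 0x10D50))
# prints "Garay"
```

A linear scan is enough for a sketch; a real lookup over the full file would more likely use a sorted structure or compile-time function clauses.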
35 changes: 31 additions & 4 deletions data/case_folding.txt
@@ -1,8 +1,8 @@
# CaseFolding-15.1.0.txt
# Date: 2023-05-12, 21:53:10 GMT
# © 2023 Unicode®, Inc.
# CaseFolding-16.0.0.txt
# Date: 2024-04-30, 21:48:11 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see https://www.unicode.org/reports/tr44/
@@ -603,6 +603,7 @@
1C86; C; 044A; # CYRILLIC SMALL LETTER TALL HARD SIGN
1C87; C; 0463; # CYRILLIC SMALL LETTER TALL YAT
1C88; C; A64B; # CYRILLIC SMALL LETTER UNBLENDED UK
1C89; C; 1C8A; # CYRILLIC CAPITAL LETTER TJE
1C90; C; 10D0; # GEORGIAN MTAVRULI CAPITAL LETTER AN
1C91; C; 10D1; # GEORGIAN MTAVRULI CAPITAL LETTER BAN
1C92; C; 10D2; # GEORGIAN MTAVRULI CAPITAL LETTER GAN
@@ -1240,9 +1241,13 @@ A7C5; C; 0282; # LATIN CAPITAL LETTER S WITH HOOK
A7C6; C; 1D8E; # LATIN CAPITAL LETTER Z WITH PALATAL HOOK
A7C7; C; A7C8; # LATIN CAPITAL LETTER D WITH SHORT STROKE OVERLAY
A7C9; C; A7CA; # LATIN CAPITAL LETTER S WITH SHORT STROKE OVERLAY
A7CB; C; 0264; # LATIN CAPITAL LETTER RAMS HORN
A7CC; C; A7CD; # LATIN CAPITAL LETTER S WITH DIAGONAL STROKE
A7D0; C; A7D1; # LATIN CAPITAL LETTER CLOSED INSULAR G
A7D6; C; A7D7; # LATIN CAPITAL LETTER MIDDLE SCOTS S
A7D8; C; A7D9; # LATIN CAPITAL LETTER SIGMOID S
A7DA; C; A7DB; # LATIN CAPITAL LETTER LAMBDA
A7DC; C; 019B; # LATIN CAPITAL LETTER LAMBDA WITH STROKE
A7F5; C; A7F6; # LATIN CAPITAL LETTER REVERSED HALF H
AB70; C; 13A0; # CHEROKEE SMALL LETTER A
AB71; C; 13A1; # CHEROKEE SMALL LETTER E
@@ -1525,6 +1530,28 @@ FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
10CB0; C; 10CF0; # OLD HUNGARIAN CAPITAL LETTER EZS
10CB1; C; 10CF1; # OLD HUNGARIAN CAPITAL LETTER ENT-SHAPED SIGN
10CB2; C; 10CF2; # OLD HUNGARIAN CAPITAL LETTER US
10D50; C; 10D70; # GARAY CAPITAL LETTER A
10D51; C; 10D71; # GARAY CAPITAL LETTER CA
10D52; C; 10D72; # GARAY CAPITAL LETTER MA
10D53; C; 10D73; # GARAY CAPITAL LETTER KA
10D54; C; 10D74; # GARAY CAPITAL LETTER BA
10D55; C; 10D75; # GARAY CAPITAL LETTER JA
10D56; C; 10D76; # GARAY CAPITAL LETTER SA
10D57; C; 10D77; # GARAY CAPITAL LETTER WA
10D58; C; 10D78; # GARAY CAPITAL LETTER LA
10D59; C; 10D79; # GARAY CAPITAL LETTER GA
10D5A; C; 10D7A; # GARAY CAPITAL LETTER DA
10D5B; C; 10D7B; # GARAY CAPITAL LETTER XA
10D5C; C; 10D7C; # GARAY CAPITAL LETTER YA
10D5D; C; 10D7D; # GARAY CAPITAL LETTER TA
10D5E; C; 10D7E; # GARAY CAPITAL LETTER RA
10D5F; C; 10D7F; # GARAY CAPITAL LETTER NYA
10D60; C; 10D80; # GARAY CAPITAL LETTER FA
10D61; C; 10D81; # GARAY CAPITAL LETTER NA
10D62; C; 10D82; # GARAY CAPITAL LETTER PA
10D63; C; 10D83; # GARAY CAPITAL LETTER HA
10D64; C; 10D84; # GARAY CAPITAL LETTER OLD KA
10D65; C; 10D85; # GARAY CAPITAL LETTER OLD NA
118A0; C; 118C0; # WARANG CITI CAPITAL LETTER NGAA
118A1; C; 118C1; # WARANG CITI CAPITAL LETTER A
118A2; C; 118C2; # WARANG CITI CAPITAL LETTER WI
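
Each record in `data/case_folding.txt` has the form `CODE; STATUS; MAPPING; # NAME`. As a rough sketch of how the new status `C` (common) entries could be applied, the code below builds a codepoint map and folds a string through it. `CaseFoldingSketch` is a hypothetical module for illustration, not the `unicode` package's implementation, and it ignores every status other than `C`.

```elixir
defmodule CaseFoldingSketch do
  # Hypothetical helper for illustration; not part of the unicode package.
  # Builds a map of codepoint => folded codepoint, keeping only
  # status "C" entries. Lines that do not match are skipped.
  def parse(text) do
    for line <- String.split(text, "\n", trim: true),
        not String.starts_with?(line, "#"),
        [code, "C", mapping | _] <- [line |> String.split(";") |> Enum.map(&String.trim/1)],
        into: %{} do
      {String.to_integer(code, 16), String.to_integer(mapping, 16)}
    end
  end

  # Folds each codepoint through the table; unmapped codepoints pass through.
  def fold(string, table) do
    string
    |> String.to_charlist()
    |> Enum.map(&Map.get(table, &1, &1))
    |> List.to_string()
  end
end

sample = """
# CaseFolding-16.0.0.txt
1C89; C; 1C8A; # CYRILLIC CAPITAL LETTER TJE
A7CB; C; 0264; # LATIN CAPITAL LETTER RAMS HORN
10D50; C; 10D70; # GARAY CAPITAL LETTER A
"""

table = CaseFoldingSketch.parse(sample)
folded = CaseFoldingSketch.fold(<<0x10D50::utf8>>, table)
IO.inspect(folded == <<0x10D70::utf8>>)
# prints true
```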