Commit

Update to Unicode 15.1
kipcole9 committed Sep 17, 2023
1 parent 8e13513 commit 96b84ce
Showing 24 changed files with 7,236 additions and 6,318 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,15 @@
# Changelog

+## Unicode v1.17.0
+
+This is the changelog for Unicode v1.17.0 released on September 17th, 2023. For older changelogs please consult the release tag on [GitHub](https://github.com/elixir-unicode/unicode/tags)
+
+### Enhancements
+
+* Updates to [Unicode 15.1](https://unicode.org/versions/Unicode15.1.0/) data.
+
+* Improve the security of the `mix unicode.download` task.
+
## Unicode v1.16.2

This is the changelog for Unicode v1.16.2 released on August 16th, 2023. For older changelogs please consult the release tag on [GitHub](https://github.com/elixir-unicode/unicode/tags)
7 changes: 4 additions & 3 deletions data/blocks.txt
@@ -1,6 +1,6 @@
-# Blocks-15.0.0.txt
-# Date: 2022-01-28, 20:58:00 GMT [KW]
-# © 2022 Unicode®, Inc.
+# Blocks-15.1.0.txt
+# Date: 2023-07-28, 15:47:20 GMT
+# © 2023 Unicode®, Inc.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
@@ -352,6 +352,7 @@ FFF0..FFFF; Specials
2B740..2B81F; CJK Unified Ideographs Extension D
2B820..2CEAF; CJK Unified Ideographs Extension E
2CEB0..2EBEF; CJK Unified Ideographs Extension F
+2EBF0..2EE5F; CJK Unified Ideographs Extension I
2F800..2FA1F; CJK Compatibility Ideographs Supplement
30000..3134F; CJK Unified Ideographs Extension G
31350..323AF; CJK Unified Ideographs Extension H
9 changes: 6 additions & 3 deletions data/case_folding.txt
@@ -1,6 +1,6 @@
-# CaseFolding-15.0.0.txt
-# Date: 2022-02-02, 23:35:35 GMT
-# © 2022 Unicode®, Inc.
+# CaseFolding-15.1.0.txt
+# Date: 2023-05-12, 21:53:10 GMT
+# © 2023 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
@@ -929,6 +929,7 @@
1FCC; S; 1FC3; # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
1FD2; F; 03B9 0308 0300; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARIA
1FD3; F; 03B9 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
+1FD3; S; 0390; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
1FD6; F; 03B9 0342; # GREEK SMALL LETTER IOTA WITH PERISPOMENI
1FD7; F; 03B9 0308 0342; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI
1FD8; C; 1FD0; # GREEK CAPITAL LETTER IOTA WITH VRACHY
@@ -937,6 +938,7 @@
1FDB; C; 1F77; # GREEK CAPITAL LETTER IOTA WITH OXIA
1FE2; F; 03C5 0308 0300; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND VARIA
1FE3; F; 03C5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
+1FE3; S; 03B0; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
1FE4; F; 03C1 0313; # GREEK SMALL LETTER RHO WITH PSILI
1FE6; F; 03C5 0342; # GREEK SMALL LETTER UPSILON WITH PERISPOMENI
1FE7; F; 03C5 0308 0342; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
@@ -1328,6 +1330,7 @@ FB02; F; 0066 006C; # LATIN SMALL LIGATURE FL
FB03; F; 0066 0066 0069; # LATIN SMALL LIGATURE FFI
FB04; F; 0066 0066 006C; # LATIN SMALL LIGATURE FFL
FB05; F; 0073 0074; # LATIN SMALL LIGATURE LONG S T
+FB05; S; FB06; # LATIN SMALL LIGATURE LONG S T
FB06; F; 0073 0074; # LATIN SMALL LIGATURE ST
FB13; F; 0574 0576; # ARMENIAN SMALL LIGATURE MEN NOW
FB14; F; 0574 0565; # ARMENIAN SMALL LIGATURE MEN ECH
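
Reviewer note: the three added `S` rows (1FD3, 1FE3, FB05) sit beside existing `F` rows for the same code points. In the CaseFolding.txt format the second field is a status: `C` (common), `F` (full), `S` (simple) or `T` (Turkic special case). The sketch below is one way a consumer might parse such rows and choose between simple and full folding. It is written in Python for illustration only and is not the library's Elixir implementation; `Folding`, `parse_case_folding` and `fold` are hypothetical names, and the status meanings come from the Unicode data file format rather than from this diff.

```python
from typing import List, NamedTuple, Tuple


class Folding(NamedTuple):
    code: int                 # source code point
    status: str               # "C", "F", "S" or "T"
    mapping: Tuple[int, ...]  # folded code point(s)


def parse_case_folding(text: str) -> List[Folding]:
    """Parse CaseFolding.txt-style rows: `code; status; mapping; # name`."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        code, status, mapping = fields[0], fields[1], fields[2]
        entries.append(
            Folding(int(code, 16), status,
                    tuple(int(cp, 16) for cp in mapping.split()))
        )
    return entries


def fold(entries: List[Folding], code: int, full: bool = False) -> Tuple[int, ...]:
    """Simple folding uses C and S rows; full folding uses C and F rows."""
    wanted = ("C", "F") if full else ("C", "S")
    for entry in entries:
        if entry.code == code and entry.status in wanted:
            return entry.mapping
    return (code,)  # code points without a row fold to themselves


# Rows copied from the hunks above.
SAMPLE = """\
1FD3; F; 03B9 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
1FD3; S; 0390; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
FB05; F; 0073 0074; # LATIN SMALL LIGATURE LONG S T
FB05; S; FB06; # LATIN SMALL LIGATURE LONG S T
"""

entries = parse_case_folding(SAMPLE)
print([hex(cp) for cp in fold(entries, 0xFB05)])
print([hex(cp) for cp in fold(entries, 0xFB05, full=True)])
```

With only the 15.0 rows, a simple fold of U+FB05 would fall through to the identity case in this sketch, which is the behaviour the added `S` rows change.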
22 changes: 12 additions & 10 deletions data/categories.txt
@@ -1,6 +1,6 @@
-# DerivedGeneralCategory-15.0.0.txt
-# Date: 2022-04-26, 23:14:35 GMT
-# © 2022 Unicode®, Inc.
+# DerivedGeneralCategory-15.1.0.txt
+# Date: 2023-07-28, 23:34:02 GMT
+# © 2023 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
@@ -284,13 +284,12 @@
2E9A ; Cn # <reserved-2E9A>
2EF4..2EFF ; Cn # [12] <reserved-2EF4>..<reserved-2EFF>
2FD6..2FEF ; Cn # [26] <reserved-2FD6>..<reserved-2FEF>
-2FFC..2FFF ; Cn # [4] <reserved-2FFC>..<reserved-2FFF>
3040 ; Cn # <reserved-3040>
3097..3098 ; Cn # [2] <reserved-3097>..<reserved-3098>
3100..3104 ; Cn # [5] <reserved-3100>..<reserved-3104>
3130 ; Cn # <reserved-3130>
318F ; Cn # <reserved-318F>
-31E4..31EF ; Cn # [12] <reserved-31E4>..<reserved-31EF>
+31E4..31EE ; Cn # [11] <reserved-31E4>..<reserved-31EE>
321F ; Cn # <reserved-321F>
A48D..A48F ; Cn # [3] <reserved-A48D>..<reserved-A48F>
A4C7..A4CF ; Cn # [9] <reserved-A4C7>..<reserved-A4CF>
@@ -713,7 +712,8 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
2B73A..2B73F ; Cn # [6] <reserved-2B73A>..<reserved-2B73F>
2B81E..2B81F ; Cn # [2] <reserved-2B81E>..<reserved-2B81F>
2CEA2..2CEAF ; Cn # [14] <reserved-2CEA2>..<reserved-2CEAF>
-2EBE1..2F7FF ; Cn # [3103] <reserved-2EBE1>..<reserved-2F7FF>
+2EBE1..2EBEF ; Cn # [15] <reserved-2EBE1>..<reserved-2EBEF>
+2EE5E..2F7FF ; Cn # [2466] <reserved-2EE5E>..<reserved-2F7FF>
2FA1E..2FFFF ; Cn # [1506] <reserved-2FA1E>..<noncharacter-2FFFF>
3134B..3134F ; Cn # [5] <reserved-3134B>..<reserved-3134F>
323B0..E0000 ; Cn # [711761] <reserved-323B0>..<reserved-E0000>
@@ -723,7 +723,7 @@ E01F0..EFFFF ; Cn # [65040] <reserved-E01F0>..<noncharacter-EFFFF>
FFFFE..FFFFF ; Cn # [2] <noncharacter-FFFFE>..<noncharacter-FFFFF>
10FFFE..10FFFF; Cn # [2] <noncharacter-10FFFE>..<noncharacter-10FFFF>

-# Total code points: 825345
+# Total code points: 824718

# ================================================

@@ -2649,11 +2649,12 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
2B740..2B81D ; Lo # [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; Lo # [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; Lo # [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
+2EBF0..2EE5D ; Lo # [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; Lo # [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Lo # [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Lo # [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

-# Total code points: 131612
+# Total code points: 132234

# ================================================

@@ -4092,7 +4093,7 @@ FFE3 ; Sk # FULLWIDTH MACRON
2E80..2E99 ; So # [26] CJK RADICAL REPEAT..CJK RADICAL RAP
2E9B..2EF3 ; So # [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5 ; So # [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
-2FF0..2FFB ; So # [12] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
+2FF0..2FFF ; So # [16] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION
3004 ; So # JAPANESE INDUSTRIAL STANDARD SYMBOL
3012..3013 ; So # [2] POSTAL MARK..GETA MARK
3020 ; So # POSTAL MARK FACE
@@ -4101,6 +4102,7 @@ FFE3 ; Sk # FULLWIDTH MACRON
3190..3191 ; So # [2] IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION REVERSE MARK
3196..319F ; So # [10] IDEOGRAPHIC ANNOTATION TOP MARK..IDEOGRAPHIC ANNOTATION MAN MARK
31C0..31E3 ; So # [36] CJK STROKE T..CJK STROKE Q
+31EF ; So # IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION
3200..321E ; So # [31] PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KOREAN CHARACTER O HU
322A..3247 ; So # [30] PARENTHESIZED IDEOGRAPH MOON..CIRCLED IDEOGRAPH KOTO
3250 ; So # PARTNERSHIP SIGN
@@ -4191,7 +4193,7 @@ FFFC..FFFD ; So # [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
1FB00..1FB92 ; So # [147] BLOCK SEXTANT-1..UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
1FB94..1FBCA ; So # [55] LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK..WHITE UP-POINTING CHEVRON

-# Total code points: 6634
+# Total code points: 6639

# ================================================

12 changes: 7 additions & 5 deletions data/combining_class.txt
@@ -1,6 +1,6 @@
-# DerivedCombiningClass-15.0.0.txt
-# Date: 2022-04-26, 23:14:29 GMT
-# © 2022 Unicode®, Inc.
+# DerivedCombiningClass-15.1.0.txt
+# Date: 2023-07-28, 23:33:58 GMT
+# © 2023 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
@@ -988,7 +988,7 @@
2E80..2E99 ; 0 # So [26] CJK RADICAL REPEAT..CJK RADICAL RAP
2E9B..2EF3 ; 0 # So [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5 ; 0 # So [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
-2FF0..2FFB ; 0 # So [12] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
+2FF0..2FFF ; 0 # So [16] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION
3000 ; 0 # Zs IDEOGRAPHIC SPACE
3001..3003 ; 0 # Po [3] IDEOGRAPHIC COMMA..DITTO MARK
3004 ; 0 # So JAPANESE INDUSTRIAL STANDARD SYMBOL
@@ -1043,6 +1043,7 @@
3196..319F ; 0 # So [10] IDEOGRAPHIC ANNOTATION TOP MARK..IDEOGRAPHIC ANNOTATION MAN MARK
31A0..31BF ; 0 # Lo [32] BOPOMOFO LETTER BU..BOPOMOFO LETTER AH
31C0..31E3 ; 0 # So [36] CJK STROKE T..CJK STROKE Q
+31EF ; 0 # So IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION
31F0..31FF ; 0 # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
3200..321E ; 0 # So [31] PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KOREAN CHARACTER O HU
3220..3229 ; 0 # No [10] PARENTHESIZED IDEOGRAPH ONE..PARENTHESIZED IDEOGRAPH TEN
@@ -1994,6 +1995,7 @@ FFFC..FFFD ; 0 # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
2B740..2B81D ; 0 # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; 0 # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; 0 # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
+2EBF0..2EE5D ; 0 # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; 0 # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; 0 # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; 0 # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
@@ -2003,7 +2005,7 @@ E0100..E01EF ; 0 # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
F0000..FFFFD ; 0 # Co [65534] <private-use-F0000>..<private-use-FFFFD>
100000..10FFFD; 0 # Co [65534] <private-use-100000>..<private-use-10FFFD>

-# The above property value applies to 827393 code points not listed here.
+# The above property value applies to 826766 code points not listed here.
# Total code points: 1113190

# ================================================
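
Reviewer note: the changed totals in `data/categories.txt` and `data/combining_class.txt` can be cross-checked against the ranges shown in the hunks above. Every code point that left `Cn` (unassigned) should have landed in `Lo` or `So`, and the drop in the "not listed here" count for combining class 0 should match. A minimal Python sketch of that arithmetic follows; `range_size` is a hypothetical helper, and all ranges and totals are copied from this diff.

```python
import re

# Matches "XXXX" or "XXXX..YYYY" at the start of a UCD data row.
RANGE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;")


def range_size(row: str) -> int:
    """Number of code points covered by one data row."""
    match = RANGE_RE.match(row)
    if match is None:
        raise ValueError(f"not a data row: {row!r}")
    start = int(match.group(1), 16)
    end = int(match.group(2), 16) if match.group(2) else start
    return end - start + 1


# New Lo range: CJK Unified Ideographs Extension I.
added_lo = range_size("2EBF0..2EE5D ; Lo")

# So grew by the four new ideographic description characters at 2FFC..2FFF
# plus the single new character at 31EF.
added_so = (
    range_size("2FF0..2FFF ; So")
    - range_size("2FF0..2FFB ; So")
    + range_size("31EF ; So")
)

# Totals quoted in the diff (old value minus new value, or new minus old).
removed_cn = 825345 - 824718
lo_delta = 132234 - 131612
so_delta = 6639 - 6634
ccc_unlisted_delta = 827393 - 826766

assert added_lo == lo_delta
assert added_so == so_delta
assert removed_cn == added_lo + added_so
assert ccc_unlisted_delta == removed_cn

# The old reserved range was split around the new block.
assert range_size("2EBE1..2F7FF ; Cn") == (
    range_size("2EBE1..2EBEF ; Cn") + added_lo + range_size("2EE5E..2F7FF ; Cn")
)
print("totals are consistent")
```

All assertions hold for the numbers in this commit, so the visible hunks fully account for the changed totals.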