CLDR-18129 Investigate and fix (where necessary) invalid codes (#4215)
macchiati authored Dec 2, 2024
1 parent fccc7c2 commit db1ba18
Showing 17 changed files with 321 additions and 103 deletions.
4 changes: 2 additions & 2 deletions common/dtd/ldml.dtd
@@ -61,7 +61,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT language ( #PCDATA ) >
<!ATTLIST language type NMTOKEN #REQUIRED >
-<!--@MATCH:validity/locale-->
+<!--@MATCH:validity/locale-for-names-->
<!ATTLIST language alt NMTOKENS #IMPLIED >
<!--@MATCH:literal/long, secondary, short, variant, menu, official-->
<!ATTLIST language draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
@@ -95,7 +95,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT variant ( #PCDATA ) >
<!ATTLIST variant type NMTOKEN #REQUIRED >
-<!--@MATCH:validity/variant-->
+<!--@MATCH:or/validity/variant||literal/AREVELA, AREVMDA, LAUKIKA, VAIDIKA-->
<!ATTLIST variant alt NMTOKENS #IMPLIED >
<!--@MATCH:literal/secondary, variant-->
<!ATTLIST variant draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
26 changes: 13 additions & 13 deletions common/dtd/ldmlSupplemental.dtd
@@ -65,7 +65,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT region ( currency* ) >
<!ATTLIST region iso3166 NMTOKEN #REQUIRED >
-<!--@MATCH:validity/region-->
+<!--@MATCH:validity/region/all-->
<!ATTLIST region draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
<!--@METADATA-->
<!--@DEPRECATED-->
@@ -113,7 +113,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<!ATTLIST group type NMTOKEN #REQUIRED >
<!--@MATCH:validity/region-->
<!ATTLIST group contains NMTOKENS #IMPLIED >
-<!--@MATCH:set/validity/region-->
+<!--@MATCH:set/validity/region/all-->
<!--@VALUE-->
<!ATTLIST group grouping (true | false) #IMPLIED >
<!--@VALUE-->
@@ -284,7 +284,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<!ELEMENT minDays EMPTY >
<!ATTLIST minDays count (1 | 2 | 3 | 4 | 5 | 6 | 7) #REQUIRED >
<!ATTLIST minDays territories NMTOKENS #REQUIRED >
-<!--@MATCH:set/validity/region-->
+<!--@MATCH:set/validity/region/all-->
<!--@VALUE-->
<!ATTLIST minDays draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
<!--@METADATA-->
@@ -297,7 +297,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<!ELEMENT firstDay EMPTY >
<!ATTLIST firstDay day (sun | mon | tue | wed | thu | fri | sat) #REQUIRED >
<!ATTLIST firstDay territories NMTOKENS #REQUIRED >
-<!--@MATCH:set/validity/region-->
+<!--@MATCH:set/validity/region/all-->
<!--@VALUE-->
<!ATTLIST firstDay draft (approved | contributed | provisional | unconfirmed | true | false) #IMPLIED >
<!--@METADATA-->
@@ -702,7 +702,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT languageAlias EMPTY >
<!ATTLIST languageAlias type NMTOKEN #REQUIRED >
-<!--@MATCH:or/validity/locale||literal/aa_saaho, aar, abk, afr, aka, alb, amh, ara, arg, arm, art_lojban, asm, ava, ave, aym, aze, bak, bam, baq, bel, ben, bih, bis, bod, bos, bre, bul, bur, cat, ces, cha, che, chi, chu, chv, cor, cos, cre, cym, cze, dan, deu, div, dut, dzo, ell, eng, epo, est, eus, ewe, fao, fas, fij, fin, fra, fre, fry, ful, geo, ger, gla, gle, glg, glv, gre, grn, guj, hat, hau, hbs, heb, her, hin, hmo, hrv, hun, hye, i_ami, i_bnn, i_hak, i_klingon, i_lux, i_navajo, i_pwn, i_tao, i_tay, i_tsu, ibo, ice, ido, iii, iku, ile, ina, ind, ipk, isl, ita, jav, jpn, kal, kan, kas, kat, kau, kaz, khm, kik, kin, kir, kom, kon, kor, kua, kur, lao, lat, lav, lim, lin, lit, ltz, lub, lug, mac, mah, mal, mao, mar, may, mkd, mlg, mlt, mol, mon, mri, msa, mya, nau, nav, nbl, nde, ndo, nep, nld, nno, no_bokmal, no_nynorsk, no_bok, no_nyn, nob, nor, nya, oci, oji, ori, orm, oss, pan, per, pli, pol, por, pus, que, roh, ron, rum, run, rus, sag, san, scc, scr, sgn_BE_FR, sgn_BE_NL, sgn_CH_DE, sin, slk, slo, slv, sme, smo, sna, snd, som, sot, spa, sqi, srd, srp, ssw, sun, swa, swe, tah, tam, tat, tel, tgk, tgl, tha, tib, tir, ton, tsn, tso, tuk, tur, twi, uig, ukr, urd, uzb, ven, vie, vol, wel, wln, wol, xho, yid, yor, zh_guoyu, zh_hakka, zh_min_nan, zh_xiang, zha, zho, zul, cel_gaulish, i_default, i_enochian, i_mingo, und_aaland, und_bokmal, und_hakka, und_lojban, und_nynorsk, und_saaho, und_xiang, zh_min, en_GB_oed, zh_cmn, zh_cmn_Hans, zh_cmn_Hant, zh_gan, zh_wuu, zh_yue-->
+<!--@MATCH:or/validity/bcp47-wellformed||literal/aa_saaho, art_lojban, cel_gaulish, en_GB_oed, hy_arevmda, i_ami, i_bnn, i_default, i_enochian, i_hak, i_klingon, i_lux, i_mingo, i_navajo, i_pwn, i_tao, i_tay, i_tsu, no_bok, no_bokmal, no_nyn, no_nynorsk, sgn_BE_FR, sgn_BE_NL, sgn_BR, sgn_CH_DE, sgn_CO, sgn_DE, sgn_DK, sgn_ES, sgn_FR, sgn_GB, sgn_GR, sgn_IE, sgn_IT, sgn_JP, sgn_MX, sgn_NI, sgn_NL, sgn_NO, sgn_PT, sgn_SE, sgn_US, sgn_ZA, und_aaland, und_arevela, und_arevmda, und_bokmal, und_hakka, und_hepburn_heploc, und_lojban, und_nynorsk, und_saaho, und_xiang, zh_cmn, zh_cmn_Hans, zh_cmn_Hant, zh_gan, zh_guoyu, zh_hakka, zh_min, zh_min_nan, zh_wuu, zh_xiang, zh_yue-->
<!ATTLIST languageAlias replacement NMTOKEN #REQUIRED >
<!--@MATCH:or/validity/locale||literal/en_x_i_default, nan_x_zh_min, see_x_i_mingo, und_x_i_enochian, xtg_x_cel_gaulish-->
<!--@VALUE-->
@@ -711,7 +711,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT scriptAlias EMPTY >
<!ATTLIST scriptAlias type NMTOKEN #REQUIRED >
-<!--@MATCH:validity/script-->
+<!--@MATCH:validity/script/all-->
<!ATTLIST scriptAlias replacement NMTOKEN #REQUIRED >
<!--@MATCH:validity/script-->
<!--@VALUE-->
@@ -720,9 +720,9 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT territoryAlias EMPTY >
<!ATTLIST territoryAlias type NMTOKEN #REQUIRED >
-<!--@MATCH:set/or/validity/region||regex/[0-9]{3}|[A-Z]{3}||literal/CT, DY, FQ, HV, JT, MI, NH, NQ, PC, PU, PZ, RH, UK, VD, WK-->
+<!--@MATCH:set/or/validity/region/all||regex/[0-9]{3}|[A-Z]{3}||literal/CT, DY, FQ, HV, JT, MI, NH, NQ, PC, PU, PZ, RH, UK, VD, WK-->
<!ATTLIST territoryAlias replacement NMTOKENS #REQUIRED >
-<!--@MATCH:set/validity/region-->
+<!--@MATCH:set/validity/region/all-->
<!--@VALUE-->
<!ATTLIST territoryAlias reason (deprecated | overlong) #IMPLIED >
<!--@VALUE-->
@@ -738,7 +738,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT variantAlias EMPTY >
<!ATTLIST variantAlias type NMTOKEN #REQUIRED >
-<!--@MATCH:or/validity/variant||literal/aaland, polytoni-->
+<!--@MATCH:or/validity/variant/all||literal/aaland, polytoni-->
<!ATTLIST variantAlias replacement NMTOKEN #REQUIRED >
<!--@MATCH:or/validity/variant||validity/region||literal/hy, hyw-->
<!--@VALUE-->
@@ -914,7 +914,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT territoryCodes EMPTY >
<!ATTLIST territoryCodes type NMTOKEN #REQUIRED >
-<!--@MATCH:validity/region-->
+<!--@MATCH:validity/region/all-->
<!ATTLIST territoryCodes numeric NMTOKEN #IMPLIED >
<!--@MATCH:range/1~999-->
<!--@VALUE-->
@@ -962,9 +962,9 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT likelySubtag EMPTY >
<!ATTLIST likelySubtag from NMTOKEN #REQUIRED >
-<!--@MATCH:validity/locale-->
+<!--@MATCH:validity/locale-for-likely-->
<!ATTLIST likelySubtag to NMTOKEN #REQUIRED >
-<!--@MATCH:validity/locale-->
+<!--@MATCH:validity/locale-for-likely-->
<!--@VALUE-->
<!ATTLIST likelySubtag origin NMTOKENS #IMPLIED >
<!--@MATCH:set/literal/sil1, wikidata, special-->
@@ -996,7 +996,7 @@ CLDR data files are interpreted according to the LDML specification (http://unic

<!ELEMENT pluralRules ( pluralRule* ) >
<!ATTLIST pluralRules locales NMTOKENS #REQUIRED >
-<!--@MATCH:set/validity/locale-->
+<!--@MATCH:set/or/validity/locale-for-likely||literal/sh-->
<!ATTLIST pluralRules draft (approved | contributed | provisional | unconfirmed) #IMPLIED >
<!--@METADATA-->
<!--@DEPRECATED-->
5 changes: 1 addition & 4 deletions common/main/en.xml
@@ -139,6 +139,7 @@ annotations.
<language type="cop">Coptic</language>
<language type="cps">Capiznon</language>
<language type="cr">Cree</language>
+<language type="cr" alt="long">Woods Cree</language>
<language type="crg">Michif</language>
<language type="crh">Crimean Tatar</language>
<language type="crj">Southern East Cree</language>
@@ -152,7 +153,6 @@ annotations.
<language type="csw">Swampy Cree</language>
<language type="cu">Church Slavic</language>
<language type="cv">Chuvash</language>
-<language type="cwd">Woods Cree</language>
<language type="cy">Welsh</language>
<language type="da">Danish</language>
<language type="dak">Dakota</language>
@@ -256,7 +256,6 @@ annotations.
<language type="hak">Hakka Chinese</language>
<language type="haw">Hawaiian</language>
<language type="hax">Southern Haida</language>
-<language type="hdn">Northern Haida</language>
<language type="he">Hebrew</language>
<language type="hi">Hindi</language>
<language type="hi_Latn">Hindi (Latin)</language>
@@ -284,7 +283,6 @@ annotations.
<language type="ig">Igbo</language>
<language type="ii">Sichuan Yi</language>
<language type="ik">Inupiaq</language>
-<language type="ike">Eastern Canadian Inuktitut</language>
<language type="ikt">Western Canadian Inuktitut</language>
<language type="ilo">Iloko</language>
<language type="inh">Ingush</language>
@@ -474,7 +472,6 @@ annotations.
<language type="oj">Ojibwa</language>
<language type="ojb">Northwestern Ojibwa</language>
<language type="ojc">Central Ojibwa</language>
-<language type="ojg">Eastern Ojibwa</language>
<language type="ojs">Oji-Cree</language>
<language type="ojw">Western Ojibwa</language>
<language type="oka">Okanagan</language>
2 changes: 1 addition & 1 deletion common/main/fi.xml
@@ -31,7 +31,7 @@ Warnings: All cp values have U+FE0F characters removed. See /annotationsDerived/
<language type="afh">afrihili</language>
<language type="agq">aghem</language>
<language type="ain">ainu</language>
-<language type="ajp">urduni</language>
+<language type="apc">urduni</language>
<language type="ak">akan</language>
<language type="akk">akkadi</language>
<language type="akz">alabama</language>
10 changes: 3 additions & 7 deletions common/main/la.xml
@@ -24,7 +24,6 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<language type="az" draft="provisional">Atropatenica</language>
<language type="be" draft="provisional">Ruthenica Alba</language>
<language type="bg" draft="provisional">Bulgarica</language>
-<language type="bh" draft="provisional">Bihari</language>
<language type="bn" draft="provisional">Bengalica</language>
<language type="bo" draft="provisional">Tibetana</language>
<language type="br" draft="provisional">Britonica</language>
@@ -66,12 +65,12 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<language type="ia" draft="provisional">Interlingua</language>
<language type="ie" draft="provisional">Interlingue</language>
<language type="ig" draft="provisional">Igbonica</language>
-<language type="in" draft="provisional">Indonesia</language>
+<language type="id" draft="provisional">Indonesia</language>
<language type="is" draft="provisional">Islandica</language>
<language type="it" draft="provisional">Italiana</language>
-<language type="iw" draft="provisional">Hebraica</language>
+<language type="he" draft="provisional">Hebraica</language>
<language type="ja" draft="provisional">Iaponica</language>
-<language type="ji" draft="provisional">Iudaeogermanica</language>
+<language type="yi" draft="provisional">Iudaeogermanica</language>
<language type="jv" draft="provisional">Iavensis</language>
<language type="ka" draft="provisional">Georgiana</language>
<language type="kk" draft="provisional">Cazachica</language>
@@ -213,7 +212,6 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<territory type="BR" draft="provisional">Brasilia</territory>
<territory type="BS" draft="provisional">Insulae Bahamenses</territory>
<territory type="BT" draft="provisional">Butania</territory>
-<territory type="BU" draft="provisional">Birmania</territory>
<territory type="BV" draft="provisional">Insula Bouvet</territory>
<territory type="BW" draft="provisional">Botswana</territory>
<territory type="BY" draft="provisional">Ruthenia Alba</territory>
@@ -237,7 +235,6 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<territory type="CX" draft="provisional">Insula Christi Natalis</territory>
<territory type="CY" draft="provisional">Cyprus</territory>
<territory type="CZ" draft="provisional">Cechia</territory>
-<territory type="DD" draft="provisional">Res publica Democratica Germanica</territory>
<territory type="DE" draft="provisional">Germania</territory>
<territory type="DJ" draft="provisional">Gibutum</territory>
<territory type="DK" draft="provisional">Dania</territory>
@@ -421,7 +418,6 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<territory type="XK" draft="provisional">Kosovia</territory>
<territory type="YE" draft="provisional">Iemenia</territory>
<territory type="YT" draft="provisional">Maiotta</territory>
-<territory type="YU" draft="provisional">Iugoslavia</territory>
<territory type="ZA" draft="provisional">Africa Australis</territory>
<territory type="ZM" draft="provisional">Zambia</territory>
<territory type="ZW" draft="provisional">Zimbabua</territory>
1 change: 0 additions & 1 deletion common/main/nl.xml
@@ -31,7 +31,6 @@ Warnings: All cp values have U+FE0F characters removed. See /annotationsDerived/
<language type="afh">Afrihili</language>
<language type="agq">Aghem</language>
<language type="ain">Aino</language>
-<language type="ajp" draft="contributed">Zuid-Levantijns-Arabisch</language>
<language type="ak">Akan</language>
<language type="akk">Akkadisch</language>
<language type="akz">Alabama</language>
2 changes: 1 addition & 1 deletion common/supplemental/supplementalData.xml
@@ -4930,7 +4930,7 @@ XXX Code for transations where no currency is involved
<weekOfPreference ordering="weekOfYear weekOfInterval" locales="zu"/>
<weekOfPreference ordering="weekOfDate" locales="ca es fr gl"/>
<weekOfPreference ordering="weekOfDate weekOfMonth" locales="en bn ja ka"/>
-<weekOfPreference ordering="weekOfDate weekOfMonth weekOfInterval" locales="bg de iw pt ur zh"/>
+<weekOfPreference ordering="weekOfDate weekOfMonth weekOfInterval" locales="bg de he pt ur zh"/>
<weekOfPreference ordering="weekOfDate weekOfYear weekOfMonth" locales="nl"/>
<weekOfPreference ordering="weekOfDate weekOfInterval weekOfMonth" locales="af"/>
<weekOfPreference ordering="weekOfMonth" locales="ar fil gu hu hy id kk ko"/>
5 changes: 4 additions & 1 deletion common/supplemental/supplementalMetadata.xml
@@ -179,7 +179,7 @@ For terms of use, see http://www.unicode.org/copyright.html
<languageAlias type="mof" replacement="xnt" reason="deprecated"/> <!-- Mohegan-Montauk-Narragansett -->
<languageAlias type="mwd" replacement="dmw" reason="deprecated"/> <!-- Mudbura -->
<languageAlias type="nbf" replacement="nru" reason="deprecated"/> <!-- Naxi -->
-<languageAlias type="nbx" replacement="ekc" reason="deprecated"/> <!-- Ngura -->
+<languageAlias type="nbx" replacement="gll" reason="deprecated"/> <!-- Ngura -->
<languageAlias type="nln" replacement="azd" reason="deprecated"/> <!-- Durango Nahuatl -->
<languageAlias type="nlr" replacement="nrk" reason="deprecated"/> <!-- Ngarla -->
<languageAlias type="noo" replacement="dtd" reason="deprecated"/> <!-- Nootka -->
@@ -306,6 +306,9 @@ For terms of use, see http://www.unicode.org/copyright.html
<languageAlias type="him" replacement="srx" reason="macrolanguage"/> <!-- Himachali ⇒ Sirmauri (= Pahari, Himachali) -->
<languageAlias type="mnk" replacement="man" reason="macrolanguage"/> <!-- Mandinka ⇒ Mandingo -->
<languageAlias type="bh" replacement="bho" reason="macrolanguage"/> <!-- Bihari ⇒ Bhojpuri -->
+<languageAlias type="cls" replacement="sa" reason="macrolanguage"/> <!-- Classical Sanskrit ⇒ Sanskrit -->
+<!-- Special case <languageAlias type="nb" replacement="no" reason="macrolanguage"/> <! - - Norwegian Bokmål ⇒ Norwegian -->
+<!-- Special case <languageAlias type="sr" replacement="sh" reason="macrolanguage"/> <! - - Serbian ⇒ Serbo-Croatian -->

<languageAlias type="prs" replacement="fa_AF" reason="overlong"/> <!-- Dari ⇒ Farsi (Afganistan) -->
<languageAlias type="swc" replacement="sw_CD" reason="overlong"/> <!-- Congo Swahili ⇒ Swahili (Congo - Kinshasa) -->
4 changes: 2 additions & 2 deletions common/validity/language.xml
@@ -76,7 +76,7 @@
cia~e cih cik cim~n cip cir ciw ciy
cja cje cjh~i cjk cjm~p cjs cjv cjy
ckb ckh ckl~o ckq~v ckx~z
-cla clc cle clh~m clo cls~u clw cly
+cla clc cle clh~m clo clt~u clw cly
cma cmc cme cmg cmi cml~m cmo cmr~t
cna~c cng~i cnk~l cno~q cns~u cnw~x
co coa~h coj~q cot~x coz
@@ -628,7 +628,7 @@
<id type='language' idStatus='deprecated'> <!-- 296 items -->
aam adp agp ais ajp ajt~u als aoh arb asd aue ayr ayx~y azj
baz bbz bcc bcl bgm bh bhk bic bij bjd bjq bkb blg bmy bpb btb btl bxk bxr bxx byy
-cbe cbh cca ccq cdg cjr cka cld cmk cmn cnr coy cqu cug cum cwd
+cbe cbh cca ccq cdg cjr cka cld cls cmk cmn cnr coy cqu cug cum cwd
daf dap dgo dgu dha dhd dik diq dit djl dkl drh drr drw dud duj dwl
ekc ekk elp emk emo esk
fat fuc
23 changes: 0 additions & 23 deletions exemplars/main/rna.xml

This file was deleted.
