glyphs/case mapping between caps and lowercases #3230

RosaWagner · 2021-04-01T13:28:11Z

For example, a font I am trying to onboard has Ydieresis, but not ydieresis.
We need a case mapping check because if someone capitalises ÿ… then there won't be any Ÿ.
We could have the same with small caps too.

There will be a few exceptions, like Dz, which doesn't have a lowercase relative.


davelab6 · 2021-04-02T01:14:39Z

Agreed, this is EXTREMELY important, and should be quite simple to implement.

felipesanches · 2021-04-02T07:35:44Z

@RosaWagner mentioned "very few exceptions". Can we come up with a list of them all?

chrissimpkins · 2021-04-02T12:05:52Z

Can we come up with a list of them all?

It looks like maybe this Unicode chart is a start? https://www.unicode.org/charts/case/chart_NoCaseMapping.html

Defined as the following:

If characters have a decomposition containing a cased character, but do not have a case mapping (lower, title, upper, or fold), then they are listed in NoCaseMapping.

Also relevant from the Unicode case mapping docs:

There are a number of complications to case mappings that occur once the repertoire of characters is expanded beyond ASCII.

In most cases, the titlecase is the same as the uppercase, but not always. For example, the titlecase of U+01F1 "DZ" capital dz is U+01F2 "Dz" capital d with small z.

Case mappings may produce strings of different length than the original.
For example, the German character U+00DF "ß" small letter sharp s expands when uppercased to the sequence of two characters "SS". This also occurs where there is no precomposed character corresponding to a case mapping, such as with U+0149 "ŉ" latin small letter n preceded by apostrophe.

There are some characters that require special handling, such as U+0345 combining iota subscript.

Characters may also have different case mappings, depending on the context.
For example, U+03A3 "Σ" capital sigma lowercases to U+03C3 "σ" small sigma if it is followed by another letter, but lowercases to U+03C2 "ς" small final sigma if it is not.

Characters may have case mappings that depend on the locale.
For example, in Turkish the letter U+0049 "I" capital letter i lowercases to U+0131 "ı" small dotless i.

Since many characters are really caseless (most of the IPA block, for example) and have no matching uppercase, the process of uppercasing a string does not mean that it will no longer contain any lowercase letters.

It might be possible to pull these data out of the ICU lib using something like Cased or Changes_When_* properties?
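
Several of the complications quoted above can be observed directly with Python's built-in string methods, which apply Unicode's locale-independent full case mappings. The snippet below is only an illustrative sketch; the variable names are made up for this example, and the exact results depend on the Unicode data bundled with the Python interpreter.

```python
# Titlecase is not always the same as uppercase: U+01F1 titlecases to U+01F2.
dz_title = "\u01F1".title()

# Case mappings can change the length of a string.
sharp_s_upper = "\u00DF".upper()   # sharp s expands to two characters
n_apos_upper = "\u0149".upper()    # no precomposed uppercase exists

# Context matters: a capital sigma at the end of a word becomes final sigma,
# while a capital sigma on its own lowercases to the regular small sigma.
word_lower = "\u039F\u0394\u039F\u03A3".lower()
lone_sigma_lower = "\u03A3".lower()

# str.lower() is not locale-aware, so the Turkish dotless i is never produced.
i_lower = "I".lower()

# Some lowercase letters have no uppercase at all (dotless j, U+0237).
dotless_j_upper = "\u0237".upper()

print(repr(dz_title), repr(sharp_s_upper), len(n_apos_upper))
print(repr(word_lower), repr(lone_sigma_lower), repr(i_lower), repr(dotless_j_upper))
```

A font check built on these methods would therefore need to decide what to do with multi-character results and with context- or locale-dependent mappings.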

chrissimpkins · 2021-04-02T12:28:56Z

The Python str.islower(), str.isupper(), and str.istitle() have interesting definitions that involve the presence of Unicode case mapping definitions.

>>> case_str = "A"
>>> nocase_str = "1"
>>> case_str.islower()
False
>>> case_str.isupper()
True
>>> case_str.istitle()
True
>>> nocase_str.islower()
False
>>> nocase_str.isupper()
False
>>> nocase_str.istitle()
False

RosaWagner · 2021-09-13T10:22:33Z

To what @chrissimpkins mentioned I would add another exception:
uni0237 (dotless j) doesn't have a capital counterpart either.

For the small-caps mapping, it logically has to have the same mapping as the uppercase.
IMO severity=10 / FAIL, because if a font shows tofu in caps but not in lowercase then it can be considered broken.
Would it be a problem to have this check implemented prior to having an exhaustive list of exceptions? It is really hard to check case mapping with human eyes, and the exception list could be completed when something comes up.

felipesanches · 2024-01-18T20:54:32Z

I suspect we won't need an exception list. The python unicode methods seem to be enough for the task.
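
A minimal sketch of how the built-in string methods could drive such a check, assuming the font's character coverage is already available as a set of integer codepoints. The function name `missing_case_counterparts` and the sample codepoints are hypothetical, and this is not necessarily how fontbakery implements the check.

```python
def missing_case_counterparts(codepoints):
    """Map each covered codepoint to its absent single-character case swap."""
    present = set(codepoints)
    missing = {}
    for cp in sorted(present):
        char = chr(cp)
        swapped = char.swapcase()
        # Skip caseless characters (swapcase leaves them unchanged) and
        # multi-character mappings such as sharp s -> "SS".
        if swapped == char or len(swapped) != 1:
            continue
        if ord(swapped) not in present:
            missing[cp] = ord(swapped)
    return missing


# Hypothetical coverage: A, a, y with diaeresis, sharp s, digit one.
font_codepoints = {0x0041, 0x0061, 0x00FF, 0x00DF, 0x0031}
result = missing_case_counterparts(font_codepoints)
for cp, counterpart in result.items():
    print(f"U+{cp:04X} lacks U+{counterpart:04X}")
```

Because caseless characters come back unchanged from `swapcase()`, no explicit exception list is needed just to skip them; exceptions would only be a policy decision about which missing pairs to tolerate.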

felipesanches · 2024-01-18T20:55:49Z

Also, I think this check is sufficiently generic to be included as a FAIL in the Universal profile.

felipesanches · 2024-01-18T21:59:42Z

The screenshot below has two different renderings for the results in my initial implementation:

a bullet-list
a table

I think I'll use the table, as it seems more readable, and delete the bullet-list.

felipesanches · 2024-01-18T22:06:22Z

Here's how it will look on a markdown report:

🔥 FAIL: Ensure the font supports case swapping for all its glyphs. (com.google.fonts/check/case_mapping)

Ensure that no glyph lacks its corresponding upper or lower counterpart (but only when unicode supports case-mapping).

🔥 FAIL The following glyphs lack their case-swapping counterparts:

| Glyph present in the font | Missing case-swapping counterpart |
| --- | --- |
| U+00B5: MICRO SIGN | U+039C: GREEK CAPITAL LETTER MU |
| U+0192: LATIN SMALL LETTER F WITH HOOK | U+0191: LATIN CAPITAL LETTER F WITH HOOK |
| U+0394: GREEK CAPITAL LETTER DELTA | U+03B4: GREEK SMALL LETTER DELTA |
| U+03A3: GREEK CAPITAL LETTER SIGMA | U+03C3: GREEK SMALL LETTER SIGMA |
| U+03C0: GREEK SMALL LETTER PI | U+03A0: GREEK CAPITAL LETTER PI |
| U+2126: OHM SIGN | U+03C9: GREEK SMALL LETTER OMEGA |
| U+24CA: CIRCLED LATIN CAPITAL LETTER U | U+24E4: CIRCLED LATIN SMALL LETTER U |

[code: missing-case-counterparts]
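
Rows in the style of the report above could be produced from a mapping of present codepoints to missing counterparts, for instance as sketched here. The helper names `describe` and `report_rows` are assumptions made for illustration, and the input dictionary is sample data taken from two rows of the table.

```python
import unicodedata


def describe(cp):
    """Format a codepoint as 'U+XXXX: UNICODE NAME'."""
    name = unicodedata.name(chr(cp), "<unnamed>")
    return f"U+{cp:04X}: {name}"


def report_rows(missing):
    """Turn {present: missing counterpart} into sorted pairs of descriptions."""
    return [
        (describe(cp), describe(counterpart))
        for cp, counterpart in sorted(missing.items())
    ]


rows = report_rows({0x2126: 0x03C9, 0x00B5: 0x039C})
print("| Glyph present in the font | Missing case-swapping counterpart |")
print("| --- | --- |")
for present, absent in rows:
    print(f"| {present} | {absent} |")
```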

felipesanches · 2024-01-18T22:43:34Z

Unfortunately, the Google Fonts library is mostly in bad shape regarding this new check:

felipesanches · 2024-01-18T22:45:47Z

Here's where perhaps we could see if we want to add exceptions. But for that I think we would need some statistics on which are the most common missing case-mapping counterparts. I'll try to come up with the numbers.

felipesanches · 2024-01-19T00:13:59Z

These are the most common occurrences on the Google Fonts library (the first number indicates how many times fontbakery detected that specific missing case-mapping counterpart):

2281 - U+0192: ƒ - Latin Small Letter F with Hook
2263 - U+00B5: µ - Micro Sign
1612 - U+03C0: π - Greek Small Letter Pi
1272 - U+2126: Ω - Ohm Sign
1162 - U+03BC: μ - Greek Small Letter Mu
970 - U+03A9: Ω - Greek Capital Letter Omega
912 - U+0394: Δ - Greek Capital Letter Delta
407 - U+0251: ɑ - Latin Small Letter Alpha
245 - U+0261: ɡ - Latin Small Letter Script G
167 - U+00FF: ÿ - Latin Small Letter Y with Diaeresis
158 - U+0250: ɐ - Latin Small Letter Turned A
150 - U+025C: ɜ - Latin Small Letter Reversed Open E
149 - U+0252: ɒ - Latin Small Letter Turned Alpha
146 - U+0271: ɱ - Latin Small Letter M with Hook
146 - U+0282: ʂ - Latin Small Letter S with Hook
141 - U+029E: ʞ - Latin Small Letter Turned K
136 - U+0287: ʇ - Latin Small Letter Turned T
134 - U+0127: ħ - Latin Small Letter H with Stroke
132 - U+0140: ŀ - Latin Small Letter L with Middle Dot
124 - U+023F: ȿ - Latin Small Letter S with Swash Tail
121 - U+0240: ɀ - Latin Small Letter Z with Swash Tail
151 - U+026B: ɫ - Latin Small Letter L with Middle Tilde
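
A tally like the one above could be gathered with `collections.Counter`, assuming the per-font results are already available as lists of codepoints whose counterpart is missing. The file names and data below are invented for illustration and are not taken from the Google Fonts library.

```python
from collections import Counter


def tally_missing(per_font_missing):
    """Count how many fonts report each codepoint as lacking its counterpart."""
    counts = Counter()
    for missing in per_font_missing.values():
        # Use a set so a codepoint counts at most once per font.
        counts.update(set(missing))
    return counts


# Hypothetical per-font results.
per_font = {
    "FontA-Regular.ttf": [0x0192, 0x00B5],
    "FontB-Regular.ttf": [0x0192],
    "FontC-Regular.ttf": [0x0192, 0x03C0],
}
counts = tally_missing(per_font)
for cp, n in counts.most_common():
    print(f"{n} - U+{cp:04X}: {chr(cp)}")
```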

felipesanches · 2024-01-19T00:32:02Z

If we list these as exceptions, then the situation improves a bit:

(note: I am running this against all *-Regular.ttf on the full library, instead of all *.ttf, because that was eating up all RAM on my laptop - which sounds like a bug to investigate - but this gives us at least an overall idea of the state of the library)
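
Applying an exception list amounts to filtering the missing-counterpart results before reporting them. The sketch below assumes the results are a dictionary mapping present codepoints to missing counterparts; the set `CASE_MAPPING_EXCEPTIONS` is hypothetical, seeded with a few of the most frequent entries from the tally, and is not the list actually used by fontbakery.

```python
# Hypothetical exception set: codepoints whose missing counterpart is tolerated.
CASE_MAPPING_EXCEPTIONS = frozenset({0x0192, 0x00B5, 0x03C0, 0x2126})


def without_exceptions(missing, exceptions=CASE_MAPPING_EXCEPTIONS):
    """Drop entries whose present codepoint is on the exception list."""
    return {
        cp: counterpart
        for cp, counterpart in missing.items()
        if cp not in exceptions
    }


# f with hook is tolerated; y with diaeresis is still reported.
reported = without_exceptions({0x0192: 0x0191, 0x00FF: 0x0178})
print(reported)
```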

Ensure that no glyph lacks its corresponding upper or lower counterpart (but only when unicode supports case-mapping). com.google.fonts/check/case_mapping (EXPERIMENTAL) Added to the Universal profile. (issue fonttools#3230)

But we need to inspect them more carefully (issue fonttools#3230)

moyogo · 2024-02-23T18:43:14Z

@felipesanches

167 - U+00FF: ÿ - Latin Small Letter Y with Diaeresis

That one as an exception doesn’t make sense.
It’s not a symbol. It’s used in French or German names, sometimes in names of Hungarian origin.

The likely reason the uppercase Ÿ is missing in many fonts is that ÿ (U+00FF) is in the Latin-1 Supplement block, which most Latin fonts cover, while the uppercase Ÿ (U+0178) is in the Latin Extended-A block, which many fonts do not fully cover.

moyogo · 2024-02-29T08:23:14Z

@felipesanches @simoncozens This should be reopened. The exceptions are inconsistent or should raise a WARN.

Roughly, there are orthographic characters, phonetic characters and historical characters. These categories sometimes overlap: for example, the lowercase is phonetic and the uppercase is historical, or the lowercase is phonetic and both lowercase and uppercase are orthographic.

For example:

ÿ 00FF is used in French and German names, Ÿ 0178 should be present.
ß 00DF is used in German and ẞ 1E9E is an alternate uppercase to SS.
ᶎ 1D8E ꞔ A794 (not currently exceptions) are phonetic symbols; their case pairs with Ᶎ A7C6 Ꞔ A7C4 are historical (proposed Hanyu Pinyin used in a few documents). But ȿ 023F ɀ 0240 (currently exceptions) are historical phonetic symbols, and their case pairs are historical-orthographic.
ↄ 2184 and Ↄ 2183 are historical (not currently exceptions).
ɥ 0265 ɦ 0266 ɪ 026A ɬ 026C ʝ 029D are currently not exceptions but ƒ 0192 ɑ 0251 ɐ 0250 ɱ 0271 ħ 0127 ɫ 026B are exceptions; both sets are phonetic or other kinds of symbols with case pairs used in orthographies.

The fontbakery check should likely check if a case-pair is orthographic (for example reported by shaperglot as such), then either FAIL or at least WARN. For the FAIL there could be some heuristic like whether the character is decomposable with unicodedata.normalize("NFD", char).
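
The decomposability heuristic mentioned above could look roughly like this. It is only a sketch: the function names are hypothetical, the FAIL/WARN split is one possible reading of the suggestion, and the shaperglot-based orthography lookup is left out entirely.

```python
import unicodedata


def is_decomposable(char):
    """True if canonical decomposition (NFD) changes the character."""
    return unicodedata.normalize("NFD", char) != char


def suggested_status(char):
    """FAIL for canonically decomposable characters, WARN otherwise."""
    return "FAIL" if is_decomposable(char) else "WARN"


# y with diaeresis decomposes into y + combining diaeresis.
# f with hook has no decomposition, and the micro sign only has a
# compatibility decomposition, which NFD does not apply.
for sample in ("\u00FF", "\u0192", "\u00B5"):
    print(f"U+{ord(sample):04X}", suggested_status(sample))
```

Under this reading, accented letters such as ÿ would fail outright, while symbol-like characters such as ƒ or µ would only warn.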

After being marked as **experimental** for 9 weeks since the v0.11.1 release, these checks are now made effective. For more details, see their previous entries on the changelog.

Made effective on the Open Type profile
- **com.typenetwork/check/varfont/ital_range** (PR #4402)
- **com.google.fonts/check/varfont/family_axis_ranges** (issue #4554)

Made effective on the Universal profile
- **com.google.fonts/check/tabular_kerning** (issue #4440)
- **com.google.fonts/check/case_mapping** (issue #3230)

RosaWagner added the New check proposal We expect new check proposals to include a detailed rationale description and a suggested check-id label Apr 1, 2021

davelab6 assigned felipesanches Apr 2, 2021

davelab6 added the P0 Urgent label Apr 2, 2021

davelab6 added this to the 0.7.35 milestone Apr 2, 2021

felipesanches modified the milestones: 0.7.35, 0.7.36 May 12, 2021

felipesanches modified the milestones: 0.7.37, 0.7.x May 20, 2021

felipesanches modified the milestones: 0.7.x, 0.8.x series Jul 14, 2021

RosaWagner added the Severity 5 (Highest) Font problems that must be addressed urgently! label Sep 13, 2021

felipesanches modified the milestones: 0.8.x series, 0.8.4 Oct 14, 2021

felipesanches modified the milestones: 0.8.4, 0.8.8 Nov 19, 2021

felipesanches modified the milestones: 0.8.8, 0.8.9 Mar 14, 2022

felipesanches modified the milestones: 0.8.9, 0.8.11 Jun 10, 2022

felipesanches modified the milestones: 0.8.11, 0.8.12 Aug 19, 2022

felipesanches modified the milestones: 0.8.12, 0.8.14 Jun 2, 2023

RosaWagner added the GF's priority list List of high priority issues for google/fonts CI label Jun 14, 2023

RosaWagner unassigned felipesanches Jun 14, 2023

felipesanches modified the milestones: 0.10.9, 0.10.10 Jan 12, 2024

felipesanches added a commit to felipesanches/fontbakery that referenced this issue Jan 19, 2024

Some exceptions to check/case_mapping

50ff742

But we need to inspect them more carefully (issue fonttools#3230)

felipesanches mentioned this issue Jan 19, 2024

new check: ensure glyph case mapping #4431

Merged

felipesanches added a commit to felipesanches/fontbakery that referenced this issue Jan 19, 2024

Some exceptions to check/case_mapping

ad4ab8f

But we need to inspect them more carefully (issue fonttools#3230)

felipesanches closed this as completed Feb 1, 2024

djrrb mentioned this issue Feb 22, 2024

Many lowercase characters are mapped to uppercase glyphs canonical/Ubuntu-Sans-fonts#103

Open

felipesanches reopened this Feb 29, 2024

felipesanches mentioned this issue Feb 29, 2024

glyphs/case mapping between caps and lowercases: The exceptions are inconsistent or should raise a WARN. #4564

Open

felipesanches closed this as completed Feb 29, 2024

yanone mentioned this issue Apr 19, 2024

Issue to fix for Google Fonts aminabedi68/Estedad#17

Open

kateliev mentioned this issue Aug 7, 2024

[Audit] FB Report Build 1.08: Ensure the font supports case swapping for all its glyphs. googlefonts/science-gothic#343

Closed

vv-monsalve mentioned this issue Oct 11, 2024

Fail: glyphs swapping TypeTogether/Playpen-Sans#24

Closed

subframe7536 mentioned this issue Jan 23, 2025

Missing glyphs subframe7536/maple-font#318

Closed
