Apply Arabic contextual forms natively #3

Draft: wants to merge 11 commits into base: complex-text-50


@1ec5 1ec5 commented Aug 26, 2024

Stop using mapbox-gl-rtl-text for Arabic character shaping. Instead, use kashida/tatweel (U+0640) to force TinySDF’s canvas to draw each Arabic letter in its initial, medial, and final contextual forms as well as the isolated form.
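
A minimal sketch of the idea, assuming each glyph is rasterized from a short string in which tatweels stand in for the neighboring letters. The names `ArabicForm` and `stringForForm` are hypothetical and are not taken from this branch.

```typescript
// U+0640 ARABIC TATWEEL (kashida) joins on both sides, so placing it next to
// a letter lets a shaping-aware text renderer pick the connected form.
const TATWEEL = '\u0640';

type ArabicForm = 'isolated' | 'initial' | 'medial' | 'final';

// Hypothetical helper: build the string to rasterize for one letter in one form.
// In logical order, a preceding tatweel forces a join to the previous letter
// (medial/final) and a following tatweel forces a join to the next letter
// (initial/medial).
function stringForForm(letter: string, form: ArabicForm): string {
    const joinPrevious = form === 'medial' || form === 'final';
    const joinNext = form === 'initial' || form === 'medial';
    return (joinPrevious ? TATWEEL : '') + letter + (joinNext ? TATWEEL : '');
}
```

For the letter kaf (U+0643), `stringForForm('\u0643', 'medial')` yields the kaf with a tatweel on each side, which is the string that would then be drawn to the canvas.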

Unfortunately, the extra kashidas/tatweels overstrike the surrounding glyphs, making the text look haphazardly crossed out. Either the bitmap that comes out of TinySDF needs to be cropped to just the character of interest, or kashidas/tatweels need to be omitted when the given letter lacks the requested form.
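
The second option could look roughly like the sketch below, which decides whether a letter has a given form before any tatweel is added. The sets are an abbreviated, illustrative sample rather than a complete table (a real table would presumably be generated from Unicode's ArabicShaping.txt), and `hasForm` and `formsToRasterize` are hypothetical names.

```typescript
type ArabicForm = 'isolated' | 'initial' | 'medial' | 'final';

// Illustrative sample of right-joining letters (alef, dal, thal, reh, zain, waw):
// they join only to the preceding letter, so they have no initial or medial form.
const RIGHT_JOINING_SAMPLE = new Set(['\u0627', '\u062F', '\u0630', '\u0631', '\u0632', '\u0648']);
// U+0621 ARABIC LETTER HAMZA is non-joining: isolated form only.
const NON_JOINING_SAMPLE = new Set(['\u0621']);

function hasForm(letter: string, form: ArabicForm): boolean {
    if (form === 'isolated') return true;
    if (NON_JOINING_SAMPLE.has(letter)) return false;
    if (RIGHT_JOINING_SAMPLE.has(letter)) return form === 'final';
    // Assumption: anything outside the samples is treated as dual-joining.
    return true;
}

// List the forms worth rasterizing for a letter, skipping forms it lacks.
function formsToRasterize(letter: string): ArabicForm[] {
    const all: ArabicForm[] = ['isolated', 'initial', 'medial', 'final'];
    return all.filter(form => hasForm(letter, form));
}
```

With this check, a dal (U+062F) would only ever be drawn in its isolated and final forms, so no tatweel would be placed after it.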

Before: ھیندستان and ديربورن.
After: ھیندستان and ديربورن with overstruck tatweels.

If this works, it would go halfway towards removing the mapbox-gl-rtl-text dependency, which is cumbersome and requires a CDN (see maplibre#3215 and mapbox/mapbox-gl-js#4007). bidi-js can likely address the other half, bidirectional text processing.

Depends on #1.

1ec5 added 10 commits August 27, 2024 09:31
If a grapheme cluster begins with a combining diacritical mark or ends with an invisible stacker, combine it with an adjacent grapheme cluster to avoid drawing diacritics over dotted circles or placeholder diacritics where adjacent characters should be ligated instead.
Added a script that fetches the latest Unicode character database’s property file for Indic syllable categories and generates a function for combining graphemes based on it.
Replace zero-width joiners with temporary strip markers to prevent ICU from stripping them.
Preemptively swap combining marks with the characters they modify to visual order, so that the RTL plugin will swap them back to logical order.
Replaced custom word break heuristics when determining line breaks with a word segmenter. Added a simple polyfill for older versions of Firefox.
Fixed an issue where vertical CJK text was shifted upwards by one em.
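
The grapheme-combining rule described in the first commit above can be sketched as follows. This is an illustration under stated assumptions, not the code in this branch: `mergeClusters` is a hypothetical name, the input is assumed to be an array of already-segmented grapheme clusters, and the stacker set is a two-character sample rather than the generated list.

```typescript
// Illustrative sample of Indic_Syllabic_Category=Invisible_Stacker characters
// (Myanmar virama and Khmer coeng); the full list would come from the
// Unicode character database. Both are in the Basic Multilingual Plane.
const INVISIBLE_STACKERS = new Set(['\u1039', '\u17D2']);

// Matches a cluster whose first code point is a combining mark.
const STARTS_WITH_MARK = /^\p{M}/u;

function mergeClusters(clusters: string[]): string[] {
    const merged: string[] = [];
    for (const cluster of clusters) {
        const previous = merged[merged.length - 1];
        // Checking the last UTF-16 code unit is enough for the BMP sample above.
        const previousEndsWithStacker = previous !== undefined &&
            INVISIBLE_STACKERS.has(previous[previous.length - 1]);
        if (previous !== undefined && (STARTS_WITH_MARK.test(cluster) || previousEndsWithStacker)) {
            // Attach to the preceding cluster so the two are drawn together.
            merged[merged.length - 1] = previous + cluster;
        } else {
            merged.push(cluster);
        }
    }
    return merged;
}
```

Merging before rasterization means the text renderer sees the mark together with its base, or the stacker together with the consonant it stacks, instead of drawing either one in isolation.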
@1ec5 1ec5 force-pushed the complex-text-50 branch from 0a42e34 to 29de7cc Compare August 27, 2024 16:32
Stop using mapbox-gl-rtl-text for Arabic character shaping. Instead, use kashida/tatweel to force TinySDF’s canvas to draw Arabic forms in all three contextual forms as well as the isolated form.
@1ec5 1ec5 force-pushed the complex-text-50 branch 2 times, most recently from 979a434 to b3b7359 Compare September 7, 2024 17:59

1ec5 commented Oct 28, 2024

> Unfortunately, the extra kashidas/tatweels overstrike upon surrounding glyphs, making it look like the text is haphazardly crossed out.

In both SVG and HTML, we can simply wrap each kashida/tatweel character in a span that has a transparent text color, but unfortunately that isn’t possible with the canvas API.
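
For the HTML case, the wrapping could be generated along these lines. This is only a sketch: `hideTatweels` and `escapeHtml` are hypothetical helpers, and it assumes the browser keeps shaping letters across the span boundaries.

```typescript
const TATWEEL = '\u0640';

// Escape the characters that are significant in HTML text content.
function escapeHtml(text: string): string {
    return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Wrap every tatweel in a span with a transparent text color so that it still
// takes part in shaping but is not painted.
function hideTatweels(text: string): string {
    return Array.from(text).map(ch => ch === TATWEEL
        ? `<span style="color: transparent">${ch}</span>`
        : escapeHtml(ch)).join('');
}
```

Calling `hideTatweels('\u0640\u0643\u0640')` produces markup for a medial kaf whose two tatweels are wrapped in transparent spans.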

Medial form of the Arabic letter kaf, surrounded by opaque and transparent tatweels

Maybe TinySDF could have a separate mode for Arabic text that builds an SVG, loads it in an Image, and draws that image to a canvas. It would need to subtract the advance of the tatweel on either side after measuring the text.
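
A rough sketch of the two pure pieces of that idea: building the SVG with invisible tatweels and working out which horizontal slice to keep. Nothing here is TinySDF API; `buildGlyphSvg`, `svgToDataUrl`, and `cropRange` are hypothetical names, and the widths are assumed to have been measured elsewhere.

```typescript
const TATWEEL = '\u0640';
const HIDDEN_TATWEEL = `<tspan fill-opacity="0">${TATWEEL}</tspan>`;

// Build a standalone SVG in which the tatweels take part in shaping but are
// drawn with zero fill opacity. The letter is assumed to need no XML escaping.
function buildGlyphSvg(letter: string, joinPrevious: boolean, joinNext: boolean,
    fontFamily: string, fontSize: number, width: number, height: number): string {
    const content = (joinPrevious ? HIDDEN_TATWEEL : '') + letter + (joinNext ? HIDDEN_TATWEEL : '');
    return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
        `<text x="${width}" y="${fontSize}" direction="rtl" ` +
        `font-family="${fontFamily}" font-size="${fontSize}">${content}</text></svg>`;
}

// Encode the SVG as a data URL that an image element can load.
function svgToDataUrl(svg: string): string {
    return 'data:image/svg+xml;charset=utf-8,' + encodeURIComponent(svg);
}

// Given the measured advance of the whole string and of a single tatweel,
// return the horizontal slice that contains only the letter. In right-to-left
// text the following tatweel is drawn on the left and the preceding one on the right.
function cropRange(totalAdvance: number, tatweelAdvance: number,
    joinPrevious: boolean, joinNext: boolean): {x: number; width: number} {
    const x = joinNext ? tatweelAdvance : 0;
    const width = totalAdvance - x - (joinPrevious ? tatweelAdvance : 0);
    return {x, width};
}
```

In a browser, the data URL could be assigned to an `Image`'s `src` and, once loaded, drawn to the canvas with `drawImage`, using the slice from `cropRange` as the source rectangle. That step is left out so the sketch runs outside a browser.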
