Is there a way to over-ride the hyphenation pattern for specific words? #265

Closed
alerque opened this issue Feb 15, 2016 · 1 comment

Comments

@alerque
Member

alerque commented Feb 15, 2016

I have a few words that are not hyphenating correctly, and I don't think the hyphenation patterns themselves can fix them; LaTeX, for example, fails on them as well. A common case is the Turkish possessive "RAB'bin" ("the Lord's"), which comes up frequently when typesetting Bible text.

Linguistically this is an odd word because of the double consonant. Normally a suffix added to a word ending in a consonant would start with a vowel, but then again Turkish words never end in the letter B either.

LaTeX and most other typesetting engines I've tried simply decide that this word cannot be hyphenated. That is probably the best fallback, but technically, if the word really needed to be broken, the break would go at the apostrophe. For example, here is LaTeX's handling:

\documentclass{article}
\usepackage{polyglossia}
\setdefaultlanguage{turkish}
\usepackage{testhyphens}
\begin{document}
\begin{checkhyphens}
    tireleme
    RAB'bin
    RAB’bin
    RABbin
\end{checkhyphens}
\end{document}

ti-re-le-me
RAB’bin
RAB’bin
RAB-bin

SILE does something a little different: it allows breaks on both sides of the apostrophe and treats the letter sequence "rab'bin" as a whole. It then ends up breaking at odd locations such as "ra-b'bin". That break would have been valid for "ra-bin", but the spelling exception that adds the second b also changes the syllabification, so the word should be hyphenated "rab-bin".

>  showHyphenationPoints("tireleme", "tr")                                    
ti-re-leme
> showHyphenationPoints("RAB'bin", "tr")                                
RA-B-'-bin
> showHyphenationPoints("RAB’bin", "tr") 
RA-B-’-bin
>  showHyphenationPoints("RABbin", "tr") 
RAB-bin
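
For comparison, the break I would expect for the apostrophe forms (no pattern-based breaks at all, only one allowed break right after the apostrophe) can be sketched in a few lines. This is only an illustration in Python, not SILE or LaTeX code; the function name `apostrophe_break` is made up for this example.

```python
# Hypothetical helper, not part of SILE or any TeX package.
APOSTROPHES = ("'", "\u2019")  # straight and typographic apostrophe


def apostrophe_break(word):
    """Split a word once, immediately after its first interior apostrophe."""
    for i, ch in enumerate(word):
        # Ignore apostrophes at the very start or end of the word.
        if ch in APOSTROPHES and 0 < i < len(word) - 1:
            return [word[: i + 1], word[i + 1:]]
    # No interior apostrophe: return the word unbroken so that normal
    # pattern hyphenation can deal with it elsewhere.
    return [word]


print("-".join(apostrophe_break("RAB'bin")))   # RAB'-bin
print("-".join(apostrophe_break("tireleme")))  # tireleme
```

The point of the sketch is only that the apostrophe stays attached to the first piece and nothing on either side of it is treated as a separate breakable unit.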

What would be the best way of adding a list of manual exception words with special hyphenation to either a document or a language class? For example this particular word should probably have just one hyphenation point set at RAB’-bin. There are a few other words that come up from time to time (particularly foreign names) that are oddball exceptions to the normal hyphenation patterns. At the very least there should be a way to re-hyphenate an exception word list on a per-document basis, but a language exception
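
To make the request concrete, here is a rough sketch of the kind of exception list I have in mind: a lookup that is consulted first and falls back to the normal pattern hyphenation for every other word. It is written in Python purely for illustration; `EXCEPTIONS`, `add_exceptions`, `hyphenate` and the `pattern_hyphenate` callback are all hypothetical names, not an existing SILE or LaTeX interface.

```python
# All names here are hypothetical; this is not SILE's or LaTeX's API.
EXCEPTIONS = {}


def normalize(word):
    """Lookup key: unify apostrophe style and case (not Turkish-aware)."""
    return word.replace("\u2019", "'").lower()


def add_exceptions(*entries):
    """Register entries such as "rab'-bin"; '-' marks each allowed break."""
    for entry in entries:
        pieces = entry.split("-")
        # Store piece lengths so the input word's own case and
        # apostrophe style survive when it is split later.
        EXCEPTIONS[normalize("".join(pieces))] = [len(p) for p in pieces]


def hyphenate(word, pattern_hyphenate):
    """Use the exception list if the word is in it, else the fallback."""
    lengths = EXCEPTIONS.get(normalize(word))
    if lengths is None or sum(lengths) != len(word):
        return pattern_hyphenate(word)
    pieces, pos = [], 0
    for n in lengths:
        pieces.append(word[pos:pos + n])
        pos += n
    return pieces


add_exceptions("rab'-bin")


def no_breaks(word):
    """Stand-in for the real pattern hyphenator: never breaks."""
    return [word]


print("-".join(hyphenate("RAB\u2019bin", no_breaks)))  # RAB’-bin
```

The same table could be filled per document or per language; the sketch does not address where such a list should live.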

@alerque
Member Author

alerque commented Feb 23, 2016

It seems like the recently suggested issue #277 is a more comprehensive approach to this problem. The only thing that keeps this from being a duplicate is that my examples above aren't even working the way normal TeX hyphenation would. I suspect there is some bad interaction with ICU here. If that were resolved for apostrophe characters inside words, then what remains would be covered by that issue.
