Odd Unicode characters instead of real letters are now used to render texts #10205
It looks like Going further to var unicode = this.toUnicode.get(charcode) || charcode; while in the version not working properly: var unicode = this.toUnicode.get(charcode) || this.fallbackToUnicode.get(charcode) || charcode; There are no other notable differences in that function, so I assume it's this line that causes the problems.
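To make the difference between the two quoted lines concrete, here is a minimal sketch. The two `Map` objects and both function names are hypothetical stand-ins; only the `.get(charcode)` call shape and the `||` chains are taken from the comment above, and the real PDF.js objects are not plain Maps.

```javascript
// Hypothetical stand-ins for the font's mapping tables (assumption: any
// object with a .get(charcode) method behaves the same in these chains).
const toUnicode = new Map([[65, "A"]]);
const fallbackToUnicode = new Map([[66, "B"]]);

// First quoted chain: use toUnicode, otherwise fall back to the raw charcode.
function lookupWithoutFallback(charcode) {
  return toUnicode.get(charcode) || charcode;
}

// Second quoted chain: consult fallbackToUnicode before giving up.
function lookupWithFallback(charcode) {
  return toUnicode.get(charcode) || fallbackToUnicode.get(charcode) || charcode;
}

console.log(lookupWithoutFallback(66)); // raw charcode, no mapping found
console.log(lookupWithFallback(66)); // value from the fallback table
```

The two chains only disagree for char codes that are missing from `toUnicode` but present in the fallback table, so any change in output has to come from those entries.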
From the information here I assume this is an SVG back-end specific issue, is that correct? @brendandahl @Snuffleupagus Do you perhaps know more about what can cause this?
It's not back-end specific. It's easiest to see the consequences when using SVG rendering (which can also be done front-end side). Prior to 2.0.943, SVG used a sane http://projects.wojtekmaj.pl/react-pdf/test/ (this is version based on older PDF.js)
On this version, you'll find "Sampledocument" Now do the same steps using In this version you will see that while the letters appear correct, the HTML rendered is garbage. This has two serious consequences:
I commented in the other bug, this is by design because of #9340. All the char codes are moved into the private use area unicode range. To properly do text selection w/ svg we should do something like the canvas backend and create a text layer from the unicode mappings.
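As a rough illustration of the idea in the comment above, the sketch below moves char codes into the Unicode private use area (U+E000 to U+F8FF) while keeping a reverse table, so that readable text for a text layer can still be recovered. This is not the PDF.js implementation from #9340; `remapToPrivateUseArea`, `toReadableText`, and the sequential assignment strategy are all assumptions made for the example.

```javascript
// Bounds of the Basic Multilingual Plane private use area.
const PUA_START = 0xe000;
const PUA_END = 0xf8ff;

// Hypothetical helper: assigns each char code a private-use code point and
// records which real Unicode string that code point stands for.
function remapToPrivateUseArea(charCodeToUnicode) {
  const charCodeToGlyph = new Map();
  const glyphToUnicode = new Map();
  let next = PUA_START;
  for (const [charCode, unicode] of charCodeToUnicode) {
    if (next > PUA_END) {
      throw new RangeError("Private use area exhausted");
    }
    const pua = String.fromCharCode(next++);
    charCodeToGlyph.set(charCode, pua);
    glyphToUnicode.set(pua, unicode);
  }
  return { charCodeToGlyph, glyphToUnicode };
}

// Hypothetical helper: turns rendered private-use text back into readable
// text, which is what a separate text layer would expose for selection.
function toReadableText(rendered, glyphToUnicode) {
  return Array.from(rendered, (ch) =>
    glyphToUnicode.has(ch) ? glyphToUnicode.get(ch) : ch
  ).join("");
}

const { charCodeToGlyph, glyphToUnicode } = remapToPrivateUseArea(
  new Map([[1, "H"], [2, "i"]])
);
const rendered = charCodeToGlyph.get(1) + charCodeToGlyph.get(2);
console.log(toReadableText(rendered, glyphToUnicode));
```

The rendered string is what ends up in the SVG markup, which is why the source looks unreadable; the readable form only exists if the reverse mapping is kept and used somewhere.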
Note that this is already done in the default viewer, when the
Keep in mind that that will never be a complete solution for text-selection/copying/searching purposes, since the PDF format distinguishes between rendering/text-extraction; hence why e.g. Also, please keep in mind that the status of the SVG back-end is probably, as far as I know, best described as "experimental" and that it's thus not officially supported; #9211 (comment) is probably relevant here as well.
It's not only applicable to SVG though. It's especially harmful for SVGs for the reasons I pointed out, like copying the original text, but that can be worked around using the same text layer that's being used for canvas rendering. I'm using the original fonts to create a text layer over the canvas in my implementation, and using the same font as the original source gave me, in the vast majority of cases, much more accurate results than using some default font. Moving all the char codes into the private use area Unicode range, without leaving them in their default positions, made the fonts completely unusable.
Is there anything I could do to resolve this issue? Perhaps it could be an option, like |
A text-selection implementation that by design breaks a relatively common feature, such as ligatures, should probably not be described as a "good solution" in general; but I digress.
If glyphs are left in their original positions, and are not being re-mapped to a PUA, that is guaranteed to completely break font rendering in a very large number of PDF files; refer to PR #9340 for additional details. Perhaps it may be slightly more acceptable to add an option, |
If we really just want to improve text selection, there are some other things we could try. One option would be to generate a font that has the same width glyphs as the original font, but each glyph would just draw a square or line and it would be assigned to the Unicode value.
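A very rough sketch of the placeholder-font idea from the comment above: each Unicode character keeps the original advance width, but its outline is replaced by a plain rectangle. The input shape (a `Map` from character to advance width), the `buildPlaceholderGlyphs` name, the fixed height, and the use of SVG path strings are all assumptions for illustration; building an actual font file from this table is not shown.

```javascript
// Hypothetical helper: builds a glyph table keyed by the real Unicode
// character, preserving advance widths but drawing only a rectangle.
function buildPlaceholderGlyphs(advanceWidths, height = 100) {
  const glyphs = new Map();
  for (const [unicode, width] of advanceWidths) {
    glyphs.set(unicode, {
      advanceWidth: width,
      // Rectangle outline from (0,0) to (width,height) as an SVG path.
      path: `M0 0H${width}V${height}H0Z`,
    });
  }
  return glyphs;
}

const placeholderGlyphs = buildPlaceholderGlyphs(
  new Map([["A", 600], ["i", 250]])
);
console.log(placeholderGlyphs.get("i").path);
```

Because the glyphs are keyed by their real Unicode values, text set in such a font would stay readable in the markup while occupying the same horizontal space as the original glyphs.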
In this case, it seems that this issue could just be marked as a duplicate of #1914.
Yes, let's close this as such and track the issue there.
Hello,
in v2.0.550 text rendered in SVG rendering mode used normal letters:
The same node in v2.0.943 (after #9192) looks like so:
I don't see how losing the ability to read the source would benefit anyone. Is there a way to get the old behavior back?