I was able to get the PDF files for testing by removing the dev. from the provided URLs.
However, even going back as far as the 0.9.x releases from around October 2013, before this issue was opened, I wasn't able to reproduce any of the issues described. I don't speak or read Greek, but the letters were rendered fine and I didn't see any shifting or unusual spaces.
On the other hand, nothing improved with #257 either, as far as I can see...
So, unless @IoannisLoukeris has anything to add after 6+ years, I'd say it's safe to close this issue.
Hello.
I am using this library to extract info from several PDF files. All English and French files are converted easily and without flaws, but PDFs with Greek characters are hit and miss. Some are perfect, some have a few letters wrong, and some are completely garbled.
You can see two very characteristic specimens at:
http://dev.life-read.gr/pdfs/5RVIZ15j14RjqGJ_3Cy0b.pdf (letters μ Μ Δ δ ω and Ω are not rendered correctly throughout the text)
http://dev.life-read.gr/pdfs/cRDA1T9U9a6m3lt_9dkV9.pdf (every letter after the fifth letter of the alphabet seems to be shifted some places to the left, so letters α β γ δ ε are rendered correctly, along with any accented letters, but everything else is garbled)
According to the metadata extracted with pdfparser, the generators for these files are NITRO PDF and MS Word.
Thank you,
John