Missing characters like "Ň č ľ 付" when converting a fillable pdf form to image #254
Comments
It's probably related to the version of poppler you're using. Here's the discussion on a related issue: https://gitlab.freedesktop.org/poppler/poppler/-/issues/1070
Hello, thank you for your response. I have tried with v20.12.1 and v23.01.0-0, and I am facing the same issue in both versions.
If you share the PDF causing the issue, I can take a deeper dive investigating.
Hello @jjbiggins,
Base PDF: https://drive.google.com/file/d/19GmVt1EzuTrhZS21Xxac74Y_VM31Uwdx/view?usp=share_link
Flattened PDF: https://drive.google.com/file/d/1CnoQbkVtIywK7zTyEd0j-MtNY-C54Tw-/view?usp=share_link
Output from pdf2image (Windows): https://drive.google.com/file/d/1Hwy2UVBhExXIb3cKQM8QflzL0PqhKnTz/view?usp=share_link
Please open the Base PDF and the Flattened PDF outside the Google Drive PDF viewer so that the input fields are visible.
@ae-f Were you able to sort this out? I am facing a similar issue as well.
For anyone who's stuck on this issue: after spending days on it, this is how I sorted out the problem. Print the PDF using Firefox (FYI: I tried Chrome as well, but the characters were jumbled up on Ubuntu):
from time import sleep
from helium import start_firefox
from selenium.webdriver import FirefoxOptions
options = FirefoxOptions()
options.add_argument("--headless")
# Print silently to a file through Firefox's "Save to PDF" printer
options.set_preference("print.always_print_silent", True)
options.set_preference("print.printer_Mozilla_Save_to_PDF.print_to_file", True)
options.set_preference("print_printer", "Mozilla Save to PDF")
# Open the fillable PDF in headless Firefox
driver = start_firefox("file:///path/to/firefox.pdf", options=options)
driver.execute_script("window.print();")
sleep(5)  # A short wait is needed for the print to finish rendering; otherwise the output file is corrupted
driver.quit()
And then use tools like
On running
pdftoppm -r 200 -jpeg x.pdf out
I am getting an error similar to: Syntax Error: AnnotWidget::layoutText, cannot convert U+0147
The font available on the system is DejaVu, and poppler-data is also installed.
OS: Ubuntu
Pdf2image version: 1.16.0
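To see exactly which characters are being dropped, it can help to collect the code points from the error messages. Below is a minimal sketch, not part of pdf2image or poppler: `unconvertible_chars` and `run_pdftoppm` are hypothetical helper names, and the sketch assumes that `pdftoppm` writes the "cannot convert U+XXXX" messages to stderr in the format quoted above.

```python
import re
import subprocess

# Matches the code point in messages such as
# "Syntax Error: AnnotWidget::layoutText, cannot convert U+0147"
_UNCONVERTIBLE = re.compile(r"cannot convert U\+([0-9A-Fa-f]{4,6})")


def unconvertible_chars(stderr_text):
    """Return the distinct characters named in 'cannot convert' messages, in first-seen order."""
    seen = []
    for hex_code in _UNCONVERTIBLE.findall(stderr_text):
        ch = chr(int(hex_code, 16))
        if ch not in seen:
            seen.append(ch)
    return seen


def run_pdftoppm(pdf_path, out_prefix, dpi=200):
    """Run the same pdftoppm command as in the report and list the characters it failed on.

    Assumption: the error messages are written to stderr.
    """
    result = subprocess.run(
        ["pdftoppm", "-r", str(dpi), "-jpeg", pdf_path, out_prefix],
        capture_output=True,
        text=True,
    )
    return unconvertible_chars(result.stderr)
```

For example, `run_pdftoppm("x.pdf", "out")` would return the list of characters that could not be rendered in the form fields, which can then be compared against the glyphs covered by the font the form fields use.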