getText() adds the letter "j" to a lot of words in the PDF content #353

Closed
conkreet opened this issue Oct 7, 2020 · 1 comment · Fixed by #634
Labels
missing or incomplete functionality: For something which is not a bug, but more like an incomplete feature.

Comments

conkreet commented Oct 7, 2020

When I parse the attached PDF, the letter "j" is added to random words when I use getText(). For instance, in some cases the term "BrabantWonen" is changed to "BjrabantWonen" and "aanbrengen" becomes "aanbrengejn". Is this a known issue?

4c011e062aafcc305834aa245734eafc6945c1dc.pdf

k00ni added the "missing or incomplete functionality" label Oct 8, 2020
GreyWyvern (Contributor) commented

Hi @conkreet. Are we able to use your sample PDF 4c011e062aafcc305834aa245734eafc6945c1dc.pdf in the PdfParser test suite? Is the file free to use? Thanks!
