possible bug with error TypeError: 'IndirectObject' object cannot be interpreted as an integer
#2137
Comments
👍 Your change is perfectly valid: you should convert it into a PR. Your test extracting the text from page[0] will cover the validation.
I am actually not a collaborator on pypdf and do not have the privileges to create branches or PRs. I'd be happy to do it if I can apply for those privileges, but I am not sure how or where.
Just use the general GitHub workflow: fork the project into your own account, create a new branch with your changes, including a corresponding test, then create a pull request against the upstream repository. (Once you have created your branch and committed some changes, you should see a dialog to create such a pull request on the upstream repository.) The upstream repository is https://github.com/py-pdf/pypdf/ in this case.
@rchen19
Yes, I should be able to make a PR later today. Thanks.
- a pdf file from arxiv is included
- URL too long
- file name too long
- variable declared but not used
See the description below; this looks like a bug to me. It is solved by making the following edit in the function compute_space_width in _cmap.py (line 19 in the code below, line 412 in the original file):
st = w[0]
->
st = w[0] if isinstance(w[0], int) else w[0].get_object()
Since I am not at all familiar with the lower-level implementation of the PDF format, I am not sure whether this is a bug at all, or whether my fix makes sense.
Environment
Which environment were you using when you encountered the problem?
Code + PDF
This is a minimal, complete example that shows the issue:
The pdf file:
Morris et al. - 2020 - TextAttack A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP.pdf
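The one-line change proposed in the description can be tried in isolation. The following is only a minimal sketch, not pypdf's actual code: FakeIndirectObject and resolve_start are hypothetical names, and the stand-in class assumes nothing more than that an indirect reference exposes get_object(), as the proposed edit does.

```python
class FakeIndirectObject:
    """Hypothetical stand-in for an indirect reference: it wraps a value
    and returns it from get_object(). It is not an int and has no __index__."""

    def __init__(self, target):
        self._target = target

    def get_object(self):
        return self._target


def resolve_start(w0):
    # Same expression as the proposed edit: keep plain ints as they are,
    # resolve anything else through get_object().
    return w0 if isinstance(w0, int) else w0.get_object()


direct = resolve_start(7)
indirect = resolve_start(FakeIndirectObject(7))

# Using the unresolved wrapper where an integer is required raises a
# TypeError of the same kind as the one in the issue title.
try:
    range(FakeIndirectObject(7))
    raised = False
except TypeError:
    raised = True

print(direct, indirect, raised)
```

Both calls to resolve_start yield a plain integer, so later integer arithmetic or range() calls on the result no longer depend on whether the PDF stored the value directly or behind a reference.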
Traceback
This is the complete Traceback I see: