Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: td matrix #1373

Merged
merged 3 commits into from
Oct 9, 2022
Merged

BUG: td matrix #1373

merged 3 commits into from
Oct 9, 2022

Conversation

srogmann
Copy link
Contributor

This PR contains a bugfix which introduces a matrix multiplication [1 0 0 1 tx ty] * tm_matrix when processing operator Td (see 5.3.1 Text-Positioning Operators, p. 406). This is important in cases where the text matrix is not like [1 0 0 1 e f].

I added an ASCII-sample pdf file which contains such a text matrix:

   10 0 0 7 0 0 Tm
   41 30 Td
   (Hello PDF 10 7!) Tj

The test file ST.2019.pdf of #1142 contains such text matrices:

9 -0 0 9 264.36 796.92 Tm
(201)Tj
(9)Tj
/C2_0 1 Tf
2.28 0 Td
<14A414A414D618D50A7A>Tj

Therefore I adjusted test iss_1142.

After generating a svg export of Sample_Td-matrix.pdf I found a problem at the visitor-calls of #1252: One may get the wrong text matrix when an operator like Tm is executed together with not yet written text-output. I didn't want to mix that problem with the Td-bug therefore I created this PR first of all.

@MartinThoma
Copy link
Member

If you execute black . it will fix the test_page.py :-)

@MartinThoma
Copy link
Member

Just a short heads-up: I will only be able to review this properly on 5th of October, as I'm on vacation without my notebook (just phone).

@srogmann
Copy link
Contributor Author

@MartinThoma Black reformatted tests/test_page.py. Have a nice vacation!

@codecov
Copy link

codecov bot commented Sep 28, 2022

Codecov Report

Base: 94.10% // Head: 94.16% // Increases project coverage by +0.06% 🎉

Coverage data is based on head (e5be25b) compared to base (d508c69).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1373      +/-   ##
==========================================
+ Coverage   94.10%   94.16%   +0.06%     
==========================================
  Files          28       28              
  Lines        5071     5074       +3     
  Branches     1052     1052              
==========================================
+ Hits         4772     4778       +6     
+ Misses        177      176       -1     
+ Partials      122      120       -2     
Impacted Files Coverage Δ
PyPDF2/_page.py 92.16% <100.00%> (+0.48%) ⬆️
PyPDF2/_cmap.py 95.11% <0.00%> (+0.02%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@MartinThoma MartinThoma added the soon PRs that are almost ready to be merged, issues that get solved pretty soon label Oct 8, 2022
@MartinThoma MartinThoma changed the title Bugfix td matrix BUG: td matrix Oct 8, 2022
@MartinThoma MartinThoma merged commit 50c1b52 into py-pdf:main Oct 9, 2022
@MartinThoma
Copy link
Member

Wow, that was a nice catch and good fix ❤️

Thank you for adding a test - without it, it would have been super hard for me to see that this does the right 🙏

MartinThoma added a commit that referenced this pull request Oct 10, 2022
Bug Fixes (BUG):
- td matrix (#1373)
- Cope with cmap from #1322 (#1372)

Robustness (ROB):
-  Cope with str returned from get_data in cmap (#1380)

Full Changelog: 2.11.0...2.11.1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
soon PRs that are almost ready to be merged, issues that get solved pretty soon
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants