Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Small refactoring after #788 #830

Merged
merged 3 commits into from
Apr 27, 2022
Merged

MAINT: Small refactoring after #788 #830

merged 3 commits into from
Apr 27, 2022

Conversation

MartinThoma
Copy link
Member

@MartinThoma MartinThoma commented Apr 27, 2022

This refactoring aims at making maintenance easier:

  1. Too long functions make it hard to grasp the overall behavior. Hence the _get_xref_issues function was split out
  2. _get_xref_issues is made a static method of the PdfFileReader to show that it belongs to the reader, but doesn't require any of its attributes
  3. _get_xref_issues makes use of an integer return value instead of raising + catching exceptions. That also seems easier to grasp for me. Also, capturing exceptions is a tiny bit more expensive than just returning an int
  4. _rebuild_xref_table was moved to a method for the same reason.

@codecov
Copy link

codecov bot commented Apr 27, 2022

Codecov Report

Merging #830 (10cca01) into main (904b0df) will increase coverage by 0.02%.
The diff coverage is 84.07%.

@@            Coverage Diff             @@
##             main     #830      +/-   ##
==========================================
+ Coverage   77.18%   77.20%   +0.02%     
==========================================
  Files          12       12              
  Lines        3532     3540       +8     
  Branches      830      830              
==========================================
+ Hits         2726     2733       +7     
- Misses        589      590       +1     
  Partials      217      217              
Impacted Files Coverage Δ
PyPDF2/pdf.py 82.21% <84.07%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 904b0df...10cca01. Read the comment docs.

@MartinThoma MartinThoma merged commit fd775d3 into main Apr 27, 2022
@MartinThoma MartinThoma deleted the 788-refactoring branch April 27, 2022 15:44
@pubpub-zz
Copy link
Collaborator

@MartinThoma
For your information, in the fork https://github.com/pubpub-zz/PyPPDF4.sav I did, I split pdf.py in 3 files for readability and commonality :
pdfcommon.py for a base type that holds all the common parts between the reader and the writer,
pdfreader.py for reader specific (about 1000 lines)
pdfwriter.py for all writing functions (adding bookmarks,...)
merger.py was quite small (most of the functions are in the writer)

This architecture requires a little of rewriting (trailer/xref had to be rewritten) but should ease maintenance. Your opinion?

MartinThoma added a commit that referenced this pull request Apr 28, 2022
This refactoring aims at making maintenance easier:

1. Too long functions make it hard to grasp the overall behavior. Hence the _get_xref_issues function was split out
2. `_get_xref_issues` is made a static method of the PdfFileReader to show that it belongs to the reader, but doesn't require any of its attributes
3. `_get_xref_issues` makes use of an integer return value instead of raising + catching exceptions. 
4. `_rebuild_xref_table` was moved to a method for the same reason.
VictorCarlquist pushed a commit to VictorCarlquist/PyPDF2 that referenced this pull request Apr 29, 2022
This refactoring aims at making maintenance easier:

1. Too long functions make it hard to grasp the overall behavior. Hence the _get_xref_issues function was split out
2. `_get_xref_issues` is made a static method of the PdfFileReader to show that it belongs to the reader, but doesn't require any of its attributes
3. `_get_xref_issues` makes use of an integer return value instead of raising + catching exceptions. 
4. `_rebuild_xref_table` was moved to a method for the same reason.
MartinThoma added a commit that referenced this pull request May 1, 2022
Robustness (ROB):
-  Handle missing destinations in reader (#840)
-  warn-only in readStringFromStream (#837)
-  Fix corruption in startxref or xref table (#788 and #830)

Documentation (DOC):
-  Project Governance (#799)
-  History of PyPDF2
-  PDF feature/version support (#816)
-  More details on text parsing issues (#815)

Developer Experience (DEV):
-  Add benchmark command to Makefile
-  Ignore IronPython parts for code coverage (#826)

Maintenance (MAINT):
-  Split pdf module (#836)
-  Separated CCITTFax param parsing/decoding (#841)
-  Update requirements files

Testing (TST):
-  Use external repository for larger/more PDFs for testing (#820)
-  Swap incorrect test names (#838)
-  Add test for PdfFileReader and page properties (#835)
-  Add tests for PyPDF2.generic (#831)
-  Add tests for utils, form fields, PageRange (#827)
-  Add test for ASCII85Decode (#825)
-  Add test for FlateDecode (#823)
-  Add test for filters.ASCIIHexDecode (#822)

Code Style (STY):
-  Apply pre-commit (black, isort) + use snake_case variables (#832)
-  Remove debug code (#828)
-  Documentation, Variable names (#839)

Full Changelog: 1.27.9...1.27.10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants