PI: Apply improvements to _utils suggested by perflint #993

Merged
merged 2 commits into from
Jun 14, 2022

MartinThoma
Member

No description provided.

MartinThoma merged commit e47e057 into main on Jun 14, 2022
MartinThoma deleted the perflint-2 branch on June 14, 2022, 18:43
MartinThoma mentioned this pull request on Jun 14, 2022
@@ -181,7 +178,7 @@ def read_previous_line(stream: StreamType) -> bytes:
         if not found_crlf:
             # We haven't found our first CR/LF yet.
             # Read off characters until we hit one.
-            while idx >= 0 and block[idx] not in CRLF:
+            while idx >= 0 and block[idx] not in b"\r\n":
Member

Curious why this change was suggested. Is it just to avoid polluting what gets imported when someone does a from ._utils import *?

Member Author

Apparently using global variables has a slight performance penalty, since each use is a name lookup at runtime rather than a constant load.

Typically I would prefer readability over avoiding this minor performance hit, but in this case I think the new code is similarly readable.
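
To make the lookup difference concrete, here is a minimal sketch that inspects CPython bytecode with the standard dis module. The two functions and the helper global_names_loaded are hypothetical stand-ins written for this illustration, not code from _utils, and bytecode details vary between CPython versions.

```python
import dis

# Module-level constant, standing in for the CRLF name the old code referenced.
CRLF = b"\r\n"


def uses_global(block: bytes, idx: int) -> bool:
    # Indexing bytes yields an int; `int in bytes` is a valid membership test.
    return block[idx] not in CRLF


def uses_literal(block: bytes, idx: int) -> bool:
    return block[idx] not in b"\r\n"


def global_names_loaded(func) -> list:
    """Return the names that func's bytecode fetches with LOAD_GLOBAL."""
    return [
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL"
    ]


if __name__ == "__main__":
    print(global_names_loaded(uses_global))
    print(global_names_loaded(uses_literal))
```

On CPython the first list should contain CRLF and the second should be empty, which is the lookup the change avoids. Both functions return the same results for the same input.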

Member

Yeah, that makes sense given how scope lookup for variables works, though this feels like micro-optimization territory that only matters if you're running the code millions upon millions of times; only then would you notice a slight speedup.

Interesting nevertheless as I've never used perflint before. 🤷
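
One way to check how small the effect is on a given machine is a quick timeit comparison. This is a rough sketch under stated assumptions: the two counting functions and time_both are hypothetical and only mimic the membership test from the diff, and the absolute numbers depend heavily on the Python version and hardware.

```python
import timeit

CRLF = b"\r\n"  # module-level constant, as in the old code


def count_non_crlf_global(block: bytes) -> int:
    """Count bytes that are not CR or LF, looking CRLF up as a global."""
    n = 0
    for byte in block:
        if byte not in CRLF:
            n += 1
    return n


def count_non_crlf_literal(block: bytes) -> int:
    """Same count, but with the bytes literal inlined."""
    n = 0
    for byte in block:
        if byte not in b"\r\n":
            n += 1
    return n


def time_both(block: bytes, number: int = 2000) -> dict:
    """Return total seconds taken by `number` runs of each variant."""
    return {
        "global": timeit.timeit(lambda: count_non_crlf_global(block), number=number),
        "literal": timeit.timeit(lambda: count_non_crlf_literal(block), number=number),
    }


if __name__ == "__main__":
    sample = b"some text\r\n" * 100
    print(time_both(sample))
```

Any gap between the two timings is per lookup, so it only adds up in loops that run very many times.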

Member Author

I was just listening to some PyCon talks where it was presented :-) The author also said that it might give several false positives, so one has to use it with a bit of caution.

Member Author

this feels like micro-optimization territory that only matters if you're running the code millions upon millions of times

I definitely agree! If it had made the code less readable, I would not have applied it.

I've just looked at some call graphs (e.g. #984) and we do run some functions many times, even for small PDF documents. Those functions look fine to me though.
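
For readers who want to see how often a function runs, a minimal sketch with the standard cProfile and pstats modules follows. It is not how the call graphs in #984 were produced (the thread does not say), and is_whitespace, scan, and call_counts are hypothetical names used only for illustration.

```python
import cProfile
import pstats


def is_whitespace(byte: int) -> bool:
    # Hypothetical small helper that gets called once per byte.
    return byte in b" \t\r\n"


def scan(data: bytes) -> int:
    """Count whitespace bytes by calling the helper for every byte."""
    return sum(1 for byte in data if is_whitespace(byte))


def call_counts(func, *args) -> dict:
    """Run func under cProfile and map function name -> total call count."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    return {
        name: ncalls
        for (_file, _line, name), (_prim, ncalls, *_rest) in stats.stats.items()
    }


if __name__ == "__main__":
    counts = call_counts(scan, b"a b\r\n" * 1000)
    print(counts["is_whitespace"], counts["scan"])
```

Even a tiny input makes the per-byte helper run once per byte, which is the kind of call count where a cheaper lookup could start to matter.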

MartinThoma added a commit that referenced this pull request Jun 17, 2022
Performance Improvements (PI):
-  Remove b_ calls (#992, #986)
-  Apply improvements to _utils suggested by perflint (#993)

Robustness (ROB):
-  'utf-16-be' codec can't decode (...) (#995)

Documentation (DOC):
-  Remove reference to Scripts (#987)

Developer Experience (DEV):
-  Fix type annotations for add_bookmarks (#1000)

Testing (TST):
-  Add test for PdfMerger (#1001)
-  Add tests for XMP information (#996)
-  reader.get_fields / zlib issue / LZW decode issue (#1004)
-  reader.get_fields with report generation (#1002)
-  Improve test coverage by extracting texts (#998)

Code Style (STY):
-  Apply fixes suggested by pylint (#999)

Full Changelog: 2.2.0...2.2.1