PI: Apply improvements to _utils suggested by perflint #993

MartinThoma · 2022-06-14T18:36:55Z

No description provided.

MasterOdin · 2022-06-14T18:53:00Z

PyPDF2/_utils.py

@@ -181,7 +178,7 @@ def read_previous_line(stream: StreamType) -> bytes:
        if not found_crlf:
            # We haven't found our first CR/LF yet.
            # Read off characters until we hit one.
-            while idx >= 0 and block[idx] not in CRLF:
+            while idx >= 0 and block[idx] not in b"\r\n":


Curious why this change was suggested? Is it just to avoid polluting what gets imported when doing a from ._utils import *?

Apparently using global variables has a slight performance penalty.

Typically I would prefer readability over this minor performance hit, but in this case I think it is similarly readable.

Yeah, makes sense I guess, given how scope lookup for variables works, though this feels like micro-optimization territory that only matters if you're running the code millions upon millions of times; only then would you notice a slight speedup.
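To make the scope-lookup point concrete, here is a minimal sketch, not the PyPDF2 code itself. The names scan_with_global, scan_with_literal, and global_names are hypothetical, and the loop only mimics the one in the diff. It shows that the module-level CRLF is resolved through a global lookup on every iteration, while the literal is stored as a constant of the function. The timing is illustrative only; how large the difference is depends on the Python version and machine.

```python
import dis
import timeit

CRLF = b"\r\n"  # module-level constant, resolved via a global lookup at runtime


def scan_with_global(block: bytes) -> int:
    """Walk backwards until a CR/LF byte is found, using the global name."""
    idx = len(block) - 1
    while idx >= 0 and block[idx] not in CRLF:
        idx -= 1
    return idx


def scan_with_literal(block: bytes) -> int:
    """Same loop, but the byte string is a literal stored in co_consts."""
    idx = len(block) - 1
    while idx >= 0 and block[idx] not in b"\r\n":
        idx -= 1
    return idx


def global_names(func) -> set:
    """Return the names that func loads with the LOAD_GLOBAL instruction."""
    return {
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL"
    }


if __name__ == "__main__":
    block = b"first line\r\n" + b"x" * 1000
    print("globals used:", global_names(scan_with_global))
    print("globals used:", global_names(scan_with_literal))
    for fn in (scan_with_global, scan_with_literal):
        seconds = timeit.timeit(lambda: fn(block), number=200)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

Both functions return the same index, so the change is behavior-preserving. Only the way the byte string is fetched differs.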

Interesting nevertheless as I've never used perflint before. 🤷

I was just listening to some PyCon talks where it was presented :-) The author also said that it might give several false positives, so one has to use it with a bit of caution.

this feels like micro-optimization territory that only matters if you're running the code millions upon millions of times

I definitely agree! If it had made the code less readable, I would not have applied it.

I've just looked at some call graphs (e.g. #984) and we do run some functions quite a lot of times for small PDF documents. Those functions look fine to me though.
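For readers who want to check how often a function runs, the following is a rough sketch using the standard-library cProfile and pstats modules. The functions read_byte and parse are hypothetical stand-ins for a hot parsing loop, not PyPDF2 functions, and call_counts is a helper invented for this illustration.

```python
import cProfile
import pstats


def read_byte(data: bytes, idx: int) -> int:
    """Hypothetical tiny helper that is called once per byte."""
    return data[idx]


def parse(data: bytes) -> int:
    """Hypothetical hot loop: sums bytes by calling read_byte repeatedly."""
    total = 0
    for idx in range(len(data)):
        total += read_byte(data, idx)
    return total


def call_counts(func, *args) -> dict:
    """Profile func(*args) and return {function name: total number of calls}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, name) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    return {name: entry[1] for (_, _, name), entry in stats.stats.items()}


if __name__ == "__main__":
    counts = call_counts(parse, b"abc" * 100)
    print("read_byte calls:", counts["read_byte"])
    print("parse calls:", counts["parse"])
```

A function with a very high call count is where a per-call saving, such as avoiding a global lookup, has the best chance of adding up.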

Performance Improvements (PI):
- Remove b_ calls (#992, #986)
- Apply improvements to _utils suggested by perflint (#993)

Robustness (ROB):
- utf-16-be' codec can't decode (...) (#995)

Documentation (DOC):
- Remove reference to Scripts (#987)

Developer Experience (DEV):
- Fix type annotations for add_bookmarks (#1000)

Testing (TST):
- Add test for PdfMerger (#1001)
- Add tests for XMP information (#996)
- reader.get_fields / zlib issue / LZW decode issue (#1004)
- reader.get_fields with report generation (#1002)
- Improve test coverage by extracting texts (#998)

Code Style (STY):
- Apply fixes suggested by pylint (#999)

Full Changelog: 2.2.0...2.2.1

MartinThoma added 2 commits June 14, 2022 20:36

PI: Apply improvements to _utils suggested by perflint

645d447

Foo

174d7a2

MartinThoma merged commit e47e057 into main Jun 14, 2022

MartinThoma deleted the perflint-2 branch June 14, 2022 18:43

MartinThoma mentioned this pull request Jun 14, 2022

Speed up parser #78

Closed

MasterOdin reviewed Jun 14, 2022
