Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: lazify pytz seqToRE call, trims 35ms from import #28228

Merged
merged 1 commit into from
Aug 30, 2019

Conversation

jbrockmendel
Copy link
Member

Shaves on the order of 5% off of import time.

Unrelated docstring fixups, improve typing on parse_timezone_directive and _calc_julian_from_V.

The important thing here is setting the "Z" key dynamically in __getitem__ instead of in __init__ (since an instance is created in the module namespace at importt)

@simonjayhawkins simonjayhawkins added Performance Memory or execution speed performance Timezones Timezone data dtype labels Aug 30, 2019
@TomAugspurger TomAugspurger added this to the 1.0 milestone Aug 30, 2019
@WillAyd
Copy link
Member

WillAyd commented Aug 30, 2019

How does the init impact import time? Are we instantiating these at import?

@jbrockmendel
Copy link
Member Author

How does the init impact import time? Are we instantiating these at import?

Yes, there is an instance of this in the module namespace.

if key == "Z":
# lazy computation
if self._Z is None:
self._Z = self.__seqToRE(pytz.all_timezones, 'Z')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly OT this should be a module level function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the seqRE i mean

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is based on a class vendored from the stdlib, so probably not

@jreback
Copy link
Contributor

jreback commented Aug 30, 2019

lgtm to me

@WillAyd WillAyd merged commit 51db82d into pandas-dev:master Aug 30, 2019
@WillAyd
Copy link
Member

WillAyd commented Aug 30, 2019

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the fmtspeed branch August 30, 2019 16:43
@jbrockmendel
Copy link
Member Author

@WillAyd IIRC you made the top-level Panel import lazy in py37. have you looked at e.g. https://pypi.org/project/modutil/ for doing this in more places? top-level imports from pandas.io.api look like the next place we can trim

@WillAyd
Copy link
Member

WillAyd commented Aug 30, 2019

Haven't looked a lot at lazy imports, so if you have concrete proposals certainly put them out there.

Just keep in mind as one downside to lazy imports I think you would lose type checking against those libraries, since that import I don't think can be resolved statically

MarcoGorelli pushed a commit to MarcoGorelli/pandas that referenced this pull request Sep 3, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants