Reduce import time #118

Closed
paulromano opened this issue Jun 26, 2023 · 6 comments · Fixed by #121

Comments

@paulromano
Collaborator

Is your feature request related to a problem? Please describe.

First off, thanks a lot for developing this wonderful package! I'm interested in using it as a dependency for other Python projects that I manage. One thing that makes me a little hesitant is that the import time is a bit on the long side. For example, on my system:

> time python -c 'import mendeleev'

real	0m2.076s
user	0m2.404s
sys	0m1.253s

Compare this to other common packages:

> time python -c 'import scipy'

real	0m0.191s
user	0m0.540s
sys	0m1.021s

> time python -c 'import pandas'

real	0m0.333s
user	0m0.695s
sys	0m1.224s

If I pick up mendeleev as a dependency, one unfortunate side effect is that my packages will inherit that slow start up time too.

Describe the solution you'd like

Any change that reduces the import time would be great. I assume this is entirely related to loading the database and so I don't know how much of it is "inherent" and difficult to change. One possible solution is to defer loading the database until the point at which it's needed.

Describe alternatives you've considered

For me as a nuclear scientist/engineer, I am primarily interested in pulling in mendeleev for atomic masses, although I may use it for other pieces of data as well. The main alternative for me is to not use mendeleev and instead have my own cooked up version of an atomic_mass function (along with AME data).

@lmmentel
Owner

Hey Paul, thanks for reporting this and explaining your use case. This is a known issue and a consequence of a few design choices made earlier. The module that causes the long import time is mendeleev.elements, where Element instances are queried for all elements to enable the import shorthand:

from mendeleev import Ag, F
print(Ag.name, F.atomic_mass)

This isn't optimal, since there are now a few relations on the SQL side that need to be traversed on init to make it work.

Maybe you could try commenting out this line

from .elements import *

and checking whether the import time is reduced sufficiently for your use case?

If that works we could move the elements module import to be optional and not the default as it is now.

BTW, are you accessing data in bulk, i.e. reading properties into a dataframe, or element by element?

@paulromano
Collaborator Author

Thanks for the quick response @lmmentel! Commenting out that line definitely does the trick:

> time python -c 'import mendeleev'

real	0m0.416s
user	0m0.739s
sys	0m1.011s

For my use case, I would probably read the atomic masses in bulk into a dictionary internally that I could pull from later as needed. So, that means I'll still pay the price for loading all the data, but I'm OK with that as long as that load time only happens when atomic masses are actually needed.

@lmmentel
Owner

Glad I could help. I think it might be worth making this a permanent change to reduce the import time for all users. That would however necessitate some changes in the docs and probably in the test suite. I'm a bit short on time to look further into this but happy to help if you are interested in giving it a try.

In case you haven't seen it, there are methods for bulk data access that might be worth looking at. Here's a tutorial.

@paulromano
Collaborator Author

Thanks @lmmentel. I may take a stab at this when I get a chance. Your code is well-structured so that definitely helps! 😄

@lmmentel
Owner

Hey @paulromano any chance of reviving this one?

@lmmentel
Owner

Related to #135
