Reduce import time #118
Comments
Hey Paul, thanks for reporting this and explaining your use case. This is a known issue and a consequence of a few design choices made earlier. The module that causes the long import time is mendeleev.elements, which is what allows importing elements directly:
from mendeleev import Ag, F
print(Ag.name, F.atomic_mass)
This isn't optimal, since there are a few relations on the SQL side that need to be traversed on init to make it work. Maybe you could try commenting out this line: mendeleev/mendeleev/__init__.py, line 7 in 3d14699.
If that works we could move the
BTW, are you accessing data in bulk, i.e. reading properties into a dataframe, or element by element? |
Thanks for the quick response @lmmentel! Commenting out that line definitely does the trick:
> time python -c 'import mendeleev'
real 0m0.416s
user 0m0.739s
sys 0m1.011s
For my use case, I would probably read the atomic masses in bulk into a dictionary internally that I could pull from later as needed. That means I'll still pay the price for loading all the data, but I'm OK with that as long as the load only happens when atomic masses are actually needed. |
Glad I could help. I think it might be worth making this a permanent change to reduce the import time for all users. That would however necessitate some changes in the docs and probably in the test suite. I'm a bit short on time to look further into this but happy to help if you are interested in giving it a try. In case you haven't seen it, there are methods for bulk data access that might be worth looking at. Here's a tutorial. |
Thanks @lmmentel. I may take a stab at this when I get a chance. Your code is well-structured so that definitely helps! 😄 |
Hey @paulromano any chance of reviving this one? |
Related to #135 |
Is your feature request related to a problem? Please describe.
First off, thanks a lot for developing this wonderful package! I'm interested in using it as a dependency for other Python projects that I manage. One thing that makes me a little hesitant is that the import time is a bit on the long side. For example, on my system:
Compare this to other common packages:
If I pick up mendeleev as a dependency, one unfortunate side effect is that my packages will inherit that slow startup time too.
Describe the solution you'd like
Any change that reduces the import time would be great. I assume this is entirely related to loading the database and so I don't know how much of it is "inherent" and difficult to change. One possible solution is to defer loading the database until the point at which it's needed.
Describe alternatives you've considered
For me as a nuclear scientist/engineer, I am primarily interested in pulling in mendeleev for atomic masses, although I may use it for other pieces of data as well. The main alternative for me is to not use mendeleev and instead have my own cooked-up version of an atomic_mass function (along with AME data).