Reduce import time #118

paulromano · 2023-06-26T18:00:33Z

Is your feature request related to a problem? Please describe.

First off, thanks a lot for developing this wonderful package! I'm interested in using it as a dependency for other Python projects that I manage. One thing that makes me a little hesitant is that the import time is a bit on the long side. For example, on my system:

> time python -c 'import mendeleev'

real	0m2.076s
user	0m2.404s
sys	0m1.253s

Compare this to other common packages:

> time python -c 'import scipy'

real	0m0.191s
user	0m0.540s
sys	0m1.021s

> time python -c 'import pandas'

real	0m0.333s
user	0m0.695s
sys	0m1.224s

If I pick up mendeleev as a dependency, one unfortunate side effect is that my packages will inherit that slow startup time too.
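For anyone wanting to reproduce these comparisons from Python rather than the shell, here is a minimal sketch. The helper name `cold_import_seconds` is made up for illustration. It starts a fresh interpreter for each measurement, so the number includes interpreter startup, just like `time python -c '...'` does. For a per-module breakdown, CPython also provides `python -X importtime -c 'import mendeleev'`.

```python
import subprocess
import sys
import time


def cold_import_seconds(module_name: str) -> float:
    """Wall-clock seconds for a fresh interpreter to start and import module_name."""
    start = time.perf_counter()
    # A new process guarantees nothing is already cached in sys.modules.
    subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "json" is only a stand-in; substitute any installed package name.
    for name in ("json",):
        print(f"{name}: {cold_import_seconds(name):.3f} s")
```

Single runs are noisy, so repeating the measurement a few times and taking the minimum gives a steadier figure.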

Describe the solution you'd like

Any change that reduces the import time would be great. I assume this is entirely related to loading the database and so I don't know how much of it is "inherent" and difficult to change. One possible solution is to defer loading the database until the point at which it's needed.
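To make the idea of deferred loading concrete, here is a minimal sketch of one way it could work. This is not mendeleev's actual implementation: `LazyElement` and `fake_loader` are hypothetical names, and the loader stands in for whatever database query would really be needed.

```python
class LazyElement:
    """Placeholder object that fetches its data on first attribute access."""

    def __init__(self, symbol, loader):
        self._symbol = symbol
        self._loader = loader  # hypothetical callable: symbol -> dict of properties
        self._data = None

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so _symbol,
        # _loader and _data never reach this method once __init__ has run.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._data is None:
            # First real attribute access: pay the loading cost now.
            self._data = self._loader(self._symbol)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


calls = []  # records loader invocations so the deferral is observable


def fake_loader(symbol):
    """Stand-in for an expensive database query (illustrative data only)."""
    calls.append(symbol)
    return {"Ag": {"name": "Silver", "atomic_number": 47}}[symbol]


Ag = LazyElement("Ag", fake_loader)  # cheap: the loader has not been called yet
print(Ag.name)  # triggers the single load
```

With this shape, creating placeholders for every element at import time costs almost nothing, and the expensive work happens only for elements that are actually used.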

Describe alternatives you've considered

For me as a nuclear scientist/engineer, I am primarily interested in pulling in mendeleev for atomic masses, although I may use it for other pieces of data as well. The main alternative for me is to not use mendeleev and instead have my own cooked up version of an atomic_mass function (along with AME data).


lmmentel · 2023-06-27T22:06:46Z

Hey Paul, thanks for reporting this and explaining your use case. This is a known issue and a consequence of a few design choices made earlier. The module that causes long import times is mendeleev.elements, where Element instances are queried for all elements to enable the import shorthand

from mendeleev import Ag, F
print(Ag.name, F.atomic_mass)

This isn't optimal since there are a few relations on the SQL side that need to be traversed on init to make it work.

Maybe you could try commenting out this line

mendeleev/mendeleev/__init__.py

Line 7 in 3d14699

from .elements import *

and checking if the import time is reduced sufficiently for your use case?

If that works we could move the elements module import to be optional and not the default as it is now.

BTW are you accessing data in bulk, i.e. reading properties into a dataframe, or element by element?

paulromano · 2023-06-28T03:23:02Z

Thanks for the quick response @lmmentel! Commenting out that line definitely does the trick:

> time python -c 'import mendeleev'

real	0m0.416s
user	0m0.739s
sys	0m1.011s

For my use case, I would probably read the atomic masses in bulk into a dictionary internally that I could pull from later as needed. So, that means I'll still pay the price for loading all the data, but I'm OK with that as long as that load time only happens when atomic masses are actually needed.

lmmentel · 2023-06-28T20:46:02Z

Glad I could help. I think it might be worth making this a permanent change to reduce the import time for all users. That would however necessitate some changes in the docs and probably in the test suite. I'm a bit short on time to look further into this but happy to help if you are interested in giving it a try.

In case you haven't seen it, there are methods for bulk data access that might be worth looking at. Here's a tutorial.

paulromano · 2023-06-28T21:16:12Z

Thanks @lmmentel. I may take a stab at this when I get a chance. Your code is well-structured so that definitely helps! 😄

lmmentel · 2024-03-18T20:46:42Z

Hey @paulromano any chance of reviving this one?

lmmentel · 2024-03-18T20:48:06Z

Related to #135

lmmentel added the maintenance label Jun 28, 2023

paulromano mentioned this issue Jun 29, 2023

Defer loading element data until attribute access #121

Merged

lmmentel added the performance label Mar 18, 2024

lmmentel assigned paulromano Mar 18, 2024

lmmentel mentioned this issue May 5, 2024

Reduce import time #145

Closed

lmmentel closed this as completed in #121 May 5, 2024
