Import very slow #533
Hi @skjerns, could you confirm that this is still happening with the latest version? I can't reproduce it. |
On the other hand, the import is a little slower than in other libraries because it initializes some parsers up front to be faster at execution time, so I don't think there is much we can do. |
@noviluni it's faster now (280 ms), but I've also upgraded my PC, so I can't really tell ;-) Maybe you could implement some lazy loading for the parsers? Or are they all needed? |
I tried with the latest version. I will close this. Feel free to comment or reopen it if you think there is something that should be improved 🙂 |
Yes! Seems to be much faster now :) thanks a lot! |
dateparser 1.1.8 takes around 600ms to import here. |
cannot replicate: `Elapsed context: 275 ms`
So, your system is faster than mine.
275 ms import time is still unacceptable (basically anything over 10 ms on an average system is a bug)
|
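For anyone wanting to reproduce the numbers above, here is a minimal way to time a module import in Python. The helper name `timed_import` is just for illustration; it is shown with a stdlib module, so substitute `dateparser` to measure it on your own system:

```python
import importlib
import time


def timed_import(module_name: str) -> float:
    """Import a module and return the elapsed wall-clock time in ms."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return (time.perf_counter() - start) * 1000.0


# Stdlib example; replace "json" with "dateparser" to reproduce the
# timings discussed in this thread. A second import of an already-loaded
# module is nearly free, so run this in a fresh process each time.
print(f"json: {timed_import('json'):.1f} ms")
```

Alternatively, `python -X importtime -c "import dateparser"` prints a per-module breakdown of where the import time goes.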
I have to agree that hundreds of milliseconds are a problem. For me the import is about 250 ms, and about 200 ms of that is the generation of

```python
_search_regex_parts = []
_tz_offsets = list(build_tz_offsets(_search_regex_parts))
_search_regex = re.compile('|'.join(_search_regex_parts))
_search_regex_ignorecase = re.compile(
    '|'.join(_search_regex_parts), re.IGNORECASE)
```

(from that, 140 ms is just …). From my point of view, it would be better not to precompute this list on import, but rather do so on first use. Would it be interesting for you if I attempted this change and sent a pull request? |
OK, I created an MR for this issue. It postpones the compilation of the regexps until first use. Any feedback is welcome. |
this is different from pr scrapinghub#1181. that pr only makes import faster but still incurs cost on the first usage. this one leverages an optional cache. closes scrapinghub#533
this is different from pr scrapinghub#1181. it builds a cache at install time which can be distributed. closes scrapinghub#533
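As a rough illustration of the install-time-cache idea in these commit messages (the function names and file format here are hypothetical, not the PR's actual implementation), one could persist the assembled pattern string at build time and compile it at import:

```python
import json
import os
import re
import tempfile


def build_cache(parts, path):
    """Run at install/build time: persist the expensively assembled pattern."""
    with open(path, 'w') as f:
        json.dump({'pattern': '|'.join(parts)}, f)


def load_compiled(path):
    """Run at import time: read the cached pattern and compile it."""
    with open(path) as f:
        pattern = json.load(f)['pattern']
    return re.compile(pattern), re.compile(pattern, re.IGNORECASE)


cache_path = os.path.join(tempfile.gettempdir(), 'tz_regex_cache.json')
build_cache(['UTC', 'GMT'], cache_path)       # done once, at install time
rx, rx_ignorecase = load_compiled(cache_path)  # done at import time
```

Note that pickling a compiled `re.Pattern` does not avoid recompilation (unpickling calls `re.compile` again), so a realistic cache has to persist something cheaper to rebuild, such as the joined pattern string as sketched here.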
I noticed that the import of this library is very slow (500-1000 ms on my system). Given that the library is probably less complex than e.g. `numpy` or similarly large packages: do you think the import times can be optimized in some way?