Thanks for pointing this out - the new API actually looks considerably faster for the non-parallelized request (region b1). My first guess is that if you're firing a bunch of requests in parallel, you're hitting our rate limiter, which has only existed since v2 (it was implemented because people were firing tons of parallel requests at us and taking our service down).
Is there a verbose mode we can run this in so we can see the exact API calls being made, and maybe their timing too? If this is indeed what is happening, you should be getting some responses with HTTP code 429 and some JSON describing how frequently such requests can be made; we should be able to use this to tune the parallelization to an optimal level. Let me know what you see and we can go from there.
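To illustrate the kind of client-side handling this would enable, here is a minimal sketch (a hypothetical helper, not part of argopy or the Argovis API): on an HTTP 429 it honours the server's `Retry-After` header when present, otherwise it falls back to exponential backoff.

```python
def backoff_delay(status_code, headers, attempt):
    """Seconds to wait before retrying a rate-limited request,
    or None if the response was not HTTP 429 (no retry needed).

    Hypothetical logic: prefer the server's Retry-After header,
    else back off exponentially (1 s, 2 s, 4 s, ...).
    """
    if status_code != 429:
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    return float(2 ** attempt)
```

A parallel fetcher could log each (status code, elapsed time) pair and call this on every 429; that log would show directly how many concurrent chunks the limiter tolerates.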
Actually, another thing worth noting - I am seriously considering re-paginating these responses to something much simpler; currently we limit request sizes temporospatially, which makes it complex and case-dependent to understand how fast the rate limiter will allow requests. A more traditional pagination by simple number of profiles would have a flat and easy-to-understand requests-per-second rate limit. If we can confirm that the slow timings you're seeing from your parallel requests are due to rate limitation, I think that's a good argument for going ahead with this simpler pagination.
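To sketch what client code against that simpler pagination could look like (the `fetch_page` signature, `skip`/`limit` parameters, and rate values here are assumptions for illustration, not the actual API): pages are a flat number of profiles, and requests are throttled to a fixed rate.

```python
import time

def fetch_all_pages(fetch_page, page_size=100, max_rps=2.0):
    """Collect every result from a paginated endpoint.

    `fetch_page(skip, limit)` stands in for one API call returning
    up to `limit` profiles starting at offset `skip`; `max_rps`
    throttles to a flat requests-per-second cap.
    """
    results, skip = [], 0
    min_interval = 1.0 / max_rps
    while True:
        t0 = time.perf_counter()
        page = fetch_page(skip=skip, limit=page_size)
        results.extend(page)
        if len(page) < page_size:  # short page: no more data
            return results
        skip += page_size
        # wait out the remainder of the per-request interval
        leftover = min_interval - (time.perf_counter() - t0)
        if leftover > 0:
            time.sleep(leftover)
```

With a flat cap like this, the optimal parallelization level becomes a simple calculation instead of something case-dependent on the region requested.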
Hi!
Working on the argopy documentation (which was taking an unusually long time to build), I noticed that the latest Argovis API looks much slower than the previous one...
If you compare numbers using API v1 at:
https://argopy.readthedocs.io/en/v0.1.15/performances.html#comparison-of-performances
with those using API v2 at:
https://argopy.readthedocs.io/en/v0.1.16/performances.html#comparison-of-performances
there is a very significant performance loss, going from O(20 sec) to O(11 min) to fetch some regional data
It's hard to be more quantitative, but with recent argopy versions (>v0.1.16) using the API v2 (from argovis-api.colorado.edu)
I can barely get any faster than about 10 minutes for this use case
This is a little worrying to me, because I have always recommended using Argovis for large-domain requests
It looks like the server-side overhead of handling many small requests means the chunking is no longer worth it
I hope this is due to a server-side configuration change that can be fixed, rather than to the API design itself
What do you think of this? And do you have any clue what's going on?
On my side, I'll experiment with chunk sizes to see when the overhead is worth it
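For reference, the timing loop I have in mind is roughly this (a sketch only; `fetch_chunk` is a placeholder for one regional request, not an actual argopy function):

```python
import time

def time_chunks(fetch_chunk, chunks):
    """Time each chunked request and the total, to find the chunk
    size at which per-request overhead outweighs the benefit of
    chunking.

    `fetch_chunk(chunk)` is a placeholder for fetching one
    sub-domain. Returns (total_seconds, per_chunk_seconds).
    """
    per_chunk = []
    for chunk in chunks:
        t0 = time.perf_counter()
        fetch_chunk(chunk)
        per_chunk.append(time.perf_counter() - t0)
    return sum(per_chunk), per_chunk
```

Running this over the same region split into progressively coarser chunks should show where the crossover point sits.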
Best!
Guillaume
poke: @bkatiemills @quai20