[GCP] Refactor the reserved instances cache #2836
Thanks @Michaelvll - LGTM.
@@ -225,11 +161,6 @@ class GCP(clouds.Cloud):
        'https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#google-cloud-platform-gcp'  # pylint: disable=line-too-long
    )

    def __init__(self):
May need to test back-compat.
Just tested executing a new job and launching again on an existing cluster. It works correctly. Added the tests to the PR description. Thanks!
    return [r for r in reservations if r.zone.endswith(f'/{zone}')]


@cachetools.cached(cache=cachetools.TTLCache(maxsize=1,
Any quick test on whether the cache is effective within a process? Asking because the args passed are slightly different than before.
Yes, I added a log statement at L104 and tried removing the decorator. With the decorator, the log message is shown only once; without it, the message is printed multiple times.
Also, I tried changing the ttl to 0.3, and the message was printed three times during the optimization.
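For reference, the manual check described above (count how often the underlying query runs, then shorten the TTL and watch it re-run) can be sketched as a small self-contained script. This is only an illustration under stated assumptions: `ttl_cached` is a hand-rolled, standard-library stand-in for `cachetools.cached` with a `TTLCache`, and `FakeClock` and this `list_reservations` are hypothetical names, not the code in this PR.

```python
import functools
import time


def ttl_cached(ttl, clock=time.monotonic):
    """Cache results per argument tuple for `ttl` seconds.

    Hypothetical stand-in for cachetools.cached(TTLCache(...)); the clock
    is injectable so expiry can be checked without sleeping.
    """
    def decorator(fn):
        state = {}  # maps args -> (timestamp, value)

        @functools.wraps(fn)
        def wrapper(*args):
            now = clock()
            hit = state.get(args)
            if hit is not None and now - hit[0] < ttl:
                return hit[1]
            value = fn(*args)
            state[args] = (now, value)
            return value

        return wrapper

    return decorator


class FakeClock:
    """Manually advanced clock (assumption: used only for this check)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
calls = []  # plays the role of the temporary log statement


@ttl_cached(ttl=0.3, clock=clock)
def list_reservations(project_id):
    calls.append(project_id)  # records every real (uncached) query
    return ['reservation-a']


# Repeated calls inside the TTL window should hit the cache.
for _ in range(5):
    list_reservations('my-project')
calls_within_ttl = len(calls)

# Advancing past the TTL should trigger exactly one more real query.
clock.now += 0.5
list_reservations('my-project')
calls_after_expiry = len(calls)
```

Injecting the clock keeps the check deterministic; with the real decorator the same idea applies, but the test would need to sleep past the TTL.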
We may want to add a comment along these lines: "Lightweight stateless objects. SkyPilot may create multiple such objects; therefore, subclasses should take care to make methods inexpensive to call, and should not store heavy state. If state needs to be queried from the cloud provider and cached, create a module in sky/clouds/utils/ so it can be reused across cloud object creation."
* Refactor the reserved instances cache for GCP
* fix tests
* Address comments
* Add comments
* fix test name
* fix test
The previous implementation of GCP's reserved instance cache kept a separate cache for each GCP() cloud object. This can cause long optimization times if multiple resources are specified with different GCP() objects.
This PR refactors the cache out to a module to avoid querying the reservation list multiple times.
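A minimal sketch of the idea, assuming a simplified setup: the cache lives at module level, so every cloud object shares it and the expensive query runs once. All names here (`_query_reservations`, `list_reservations`, `FakeCloud`, the 300-second TTL) are hypothetical, and the dictionary-based TTL cache is a standard-library stand-in for the `cachetools.TTLCache` used in the PR.

```python
import time
from typing import Dict, List, Tuple

# Counts real queries; stands in for an expensive cloud API call.
QUERY_COUNT = 0


def _query_reservations(project_id: str) -> List[str]:
    """Hypothetical expensive call that lists reservations."""
    global QUERY_COUNT
    QUERY_COUNT += 1
    return [f'projects/{project_id}/zones/us-central1-a/reservations/r1']


_TTL_SECONDS = 300.0  # assumed value, for illustration only
# Module-level cache: shared by every object that calls list_reservations.
_cache: Dict[str, Tuple[float, List[str]]] = {}


def list_reservations(project_id: str) -> List[str]:
    """Return the cached reservations, refreshing after the TTL expires."""
    now = time.monotonic()
    entry = _cache.get(project_id)
    if entry is not None and now - entry[0] < _TTL_SECONDS:
        return entry[1]
    result = _query_reservations(project_id)
    _cache[project_id] = (now, result)
    return result


class FakeCloud:
    """Lightweight stateless object: it stores no cache of its own."""

    def reservations(self, project_id: str) -> List[str]:
        return list_reservations(project_id)


# Two separate objects, as the optimizer might create, share one query.
first, second = FakeCloud(), FakeCloud()
first_result = first.reservations('my-project')
second_result = second.reservations('my-project')
```

With a per-object cache, each `FakeCloud()` would have triggered its own query; here the second object reuses the first object's result.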
It reduces the time for optimization from 13 seconds to 4 seconds for the following yaml:
sky launch -c test test.yaml
Tested (run the relevant ones):
bash format.sh
sky launch --cloud gcp -t n2-standard-2 echo hi (with reservation for n2-standard-2)
sky launch -c test-gcp --cloud gcp echo hi on master; sky exec test-gcp echo hi on current PR
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh