[serve] Use `pickle.dumps` for proxy->replica messages #49539
Signed-off-by: Edward Oakes <[email protected]>
…es/pickle-for-msgs
LGTM!
python/ray/serve/_private/common.py
Outdated
@property
def receive_asgi_messages(self) -> Callable[[RequestMetadata], Awaitable[bytes]]:
    if self._receive_asgi_messages is None:
        proxy_actor = ray.get_actor(self._proxy_actor_name)
This is probably unlikely to happen, and there's nothing we can do about it, but I think there is a case where this `get_actor` call might now fail to find the proxy actor, leaving the replica unable to call `receive_asgi_messages`, if the node or the proxy actor has died.
What happens if the GCS goes down before the proxy sends its first request to replica A? Will `ray.get_actor(self._proxy_actor_name)` go through?
Good question. I'm not actually sure what the behavior is. It should be essentially the same as what happens currently when the actor handle is deserialized, but now it might block the event loop (not good). Let me investigate.
Turns out the |
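For the event-loop concern raised above, here is a minimal sketch of one common way to keep a synchronous lookup from stalling other coroutines: run it in a worker thread with `run_in_executor`. This is not necessarily what the PR ended up doing. `blocking_lookup` is a hypothetical stand-in for a synchronous call such as `ray.get_actor(name)`, and the sleep duration is invented for illustration.

```python
import asyncio
import time


def blocking_lookup(name: str) -> str:
    """Hypothetical stand-in for a synchronous call such as ray.get_actor(name)."""
    time.sleep(0.05)  # simulate a slow control-plane round trip
    return f"handle-for-{name}"


async def heartbeat(ticks: list) -> None:
    """Appends timestamps; it only makes progress while the loop is free."""
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.005)


async def resolve_without_blocking(name: str):
    ticks: list = []
    hb = asyncio.create_task(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    # Offload the blocking call to the default thread pool so the
    # heartbeat coroutine keeps running while the lookup is in flight.
    handle = await loop.run_in_executor(None, blocking_lookup, name)
    ticks_during_lookup = len(ticks)
    await hb
    return handle, ticks_during_lookup


if __name__ == "__main__":
    handle, ticks_seen = asyncio.run(resolve_without_blocking("proxy"))
    print(handle, "heartbeat ticks during lookup:", ticks_seen)
```

Calling `blocking_lookup` directly inside the coroutine would instead freeze the heartbeat for the whole duration of the lookup.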
…9539)

Fully skips the Ray cloudpickle serialization path for proxy -> replica communication.

The main change required is passing the proxy actor name rather than handle to the replica and fetching the actor handle on the replica side instead. I've done this inside of `StreamingHTTPRequest` to avoid widespread changes.

Before:

```
Concurrency Level:      100
Time taken for tests:   12.873 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1910000 bytes
HTML transferred:       120000 bytes
Requests per second:    776.81 [#/sec] (mean)
Time per request:       128.732 [ms] (mean)
Time per request:       1.287 [ms] (mean, across all concurrent requests)
Transfer rate:          144.89 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0       3
Processing:     6  128  33.2    128     268
Waiting:        4  126  33.1    127     265
Total:          6  128  33.1    129     268

Percentage of the requests served within a certain time (ms)
  50%    129
  66%    140
  75%    147
  80%    152
  90%    167
  95%    187
  98%    209
  99%    222
 100%    268 (longest request)
```

After:

```
Concurrency Level:      100
Time taken for tests:   11.489 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1910000 bytes
HTML transferred:       120000 bytes
Requests per second:    870.40 [#/sec] (mean)
Time per request:       114.890 [ms] (mean)
Time per request:       1.149 [ms] (mean, across all concurrent requests)
Transfer rate:          162.35 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0       3
Processing:     6  114  27.9    116     212
Waiting:        4  113  27.9    115     208
Total:          6  114  27.8    117     213

Percentage of the requests served within a certain time (ms)
  50%    117
  66%    123
  75%    130
  80%    136
  90%    150
  95%    161
  98%    174
  99%    182
 100%    213 (longest request)
```

---------

Signed-off-by: Edward Oakes <[email protected]>
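To put the before/after figures in the commit message side by side, the short sketch below computes the relative change for a few of them. The numbers are copied from the benchmark output above; the dictionary keys and the `pct_change` helper are names invented for this illustration.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the "Before" and "After" benchmark output.
before = {"rps": 776.81, "mean_ms": 128.732, "p50_ms": 129, "p99_ms": 222}
after = {"rps": 870.40, "mean_ms": 114.890, "p50_ms": 117, "p99_ms": 182}

changes = {name: pct_change(before[name], after[name]) for name in before}

for name, change in changes.items():
    print(f"{name:>8}: {before[name]:>8} -> {after[name]:>8}  ({change:+.1f}%)")
```

On these figures, throughput rises by roughly 12% and mean time per request falls by roughly 11%. Both come from a single run each, so they are indicative rather than precise.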
…9539) Same commit message as above. Signed-off-by: Edward Oakes <[email protected]> Signed-off-by: Roshan Kathawate <[email protected]>
…9539) Same commit message as above. Signed-off-by: Edward Oakes <[email protected]> Signed-off-by: Puyuan Yao <[email protected]>
Why are these changes needed?
Fully skips the Ray cloudpickle serialization path for proxy -> replica communication.
The main change required is passing the proxy actor name rather than handle to the replica and fetching the actor handle on the replica side instead. I've done this inside of `StreamingHTTPRequest` to avoid widespread changes.
Before:
After: