Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition between ping and stream_response #55

Merged
merged 7 commits into from
May 3, 2023

Conversation

ejlangev
Copy link
Contributor

@ejlangev ejlangev commented May 1, 2023

The bad sequence of events was as follows:

  1. Coroutine for stream_response calls send to close the request and blocks somewhere within on actually doing the sending which allows other coroutines to run. At this point the
  2. Coroutine for ping wakes up from its sleep and also calls send. This raises an exception because the stream_response call had tried to close the request but had not yet been able to return in order to cancel everything else.

This is relatively unlikely to happen but we were observing it regularly on our API which is doing a lot of SSE responses.

@sysid
Copy link
Owner

sysid commented May 2, 2023

Thank you @ejlangev for your PR. It makes a lot of sense.

I am about merging it. However I am not sure what exactly the test_ping_concurrency is demonstrating.

May I ask you to:

  1. either add some assert statements
  2. or to add some comments to demonstrate what is actually being proven by the test

I also merged the latest PR into your branch. Hope this is ok.

Regards,
sysid

@ejlangev
Copy link
Contributor Author

ejlangev commented May 2, 2023

I'll add some comments to the test, basically it would throw a WouldBlock error if it were to exhibit the race conditions and the sleep lengths are designed to have the ping task wake up while the stream_response is sending. Can definitely make it more clear.

@ejlangev
Copy link
Contributor Author

ejlangev commented May 2, 2023

Comments added explaining the sequencing and how this tests prevents it because it would raise a WouldBlock error if there were concurrent access to send.

@sysid sysid merged commit 4769f6d into sysid:master May 3, 2023
@sysid
Copy link
Owner

sysid commented Sep 23, 2023

@ejlangev , due to a new discussion 77 I tried to understand your use-case better and reproduce race conditions on my side. Unfortunately I could not.

My understanding is that the send coroutine when using FastAPI/Starlette should be safe to be called from different coroutines in parallel. Is this a wrong assumption?

@ejlangev
Copy link
Contributor Author

My understanding is that the send coroutine when using FastAPI/Starlette should be safe to be called from different coroutines in parallel. Is this a wrong assumption?

I think this is true except when one of the coroutines is calling send with more_body=False to end the response. The problem this is trying to fix is that the _ping coroutine can wake up and call send while the stream_response coroutine is in the middle of sending. This is normally fine because only one coroutine can execute at once but breaks if the stream_response coroutine is closing the connection because then the _ping coroutine will try to send data on a closed connection.

I think a simple performance improvement would be to get rid of safe_send and just directly claim the lock in stream_response but only for the send call that has more_body=False. The _ping coroutine would also need to claim the lock but it would have far less contention.

@sysid
Copy link
Owner

sysid commented Sep 23, 2023

@ejlangev , I am baffled by your cycle times, chapeau!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants