-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(app,api): Add opt-in ping/pong monitoring to RPC websocket #2083
Conversation
Codecov Report
@@ Coverage Diff @@
## edge #2083 +/- ##
==========================================
+ Coverage 33.03% 33.11% +0.07%
==========================================
Files 452 452
Lines 7196 7224 +28
==========================================
+ Hits 2377 2392 +15
- Misses 4819 4832 +13
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested as best I could on wandering-👩 and VS. Could only confirm via calibration screens not run as wandering-👩can not simulate a run properly.
In my testing on Sunset, the ping/pong routine seems to work just fine during a protocol run. Additionally and most importantly, if a robot exists as both an ethernet and a wifi discovered robot, you connect to RPC over ethernet, and then you yank the cable, ping/pong will notify you that the socket has gone down and prompt you to reconnect. There's still some additional polish to be done around this experience (e.g. what do we do if connected to a robot over wifi and ethernet becomes available?), but could use UX guidance and additional ticket(s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐙 Looking forward to sunsetting GET /health poll!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🏋️ BE code looks good
overview
This PR adds opt-in ping/pong monitoring to the RPC websocket. It also improves/adds WebSocket closure handling to surface socket closure to the UI more reliably.
Closes #2052
changelog
review requests
Checking on the currently connected robot via polling
GET /health
has proven unreliable (socket can fail even when the health endpoint is fine) and is kind of a mess in terms of app state. Also, now that the app can combine multiple IP addresses into one robot (if$OT_APP_DISCOVERY__ENABLED
is set), the existing health-check implementation doesn't know what to do if the currently connected IP goes down but the robot is still accessible via another IP. For the purposes of switching to new discovery by default, this is blocking.Rather than getting crazy with the existing health-check stuff, this PR implements a very simple (and battle-tested) ping/pong routine where the client sends a ping and the server responds with a pong on an interval. If too many pings go by without a pong, the client will close the socket and signal the UI.
If the RPC server the app connects to supports ping/pong, the app will not spin up a
GET /health
poll routine for that robot. We can build auto-reconnect attempts on top of this, and eventually remove thehttp-api-client/health-check
module entirely in favor of the/health
polling indiscovery-client
and the socket ping/pong here.I didn't even attempt to build unit tests for this because the server-side RPC tests are completely unworkable (it seems like raises in the code cause indefinite hangs rather than test failures) and the client-side code isn't much better. I'm sorry.
test plan
Tests can be run with real robot or virtual smoothie, provided at least a few people test on real robots.
regression testing
On a robot without ping/pong RPC:
GET /health
poll on connectGET /health
poll on disconnectnew feature testing
On a robot with ping/pong RPC (
sunset
has been loaded with this branch):GET /health
poll on connectapi/opentrons/server/rpc.py