feat(api, app): Check Robot Deck Transform #5845
Conversation
```diff
@@ -36,10 +37,16 @@ const CALIBRATE_DECK_DESCRIPTION =

const CHECK_ROBOT_CAL_DESCRIPTION = "Check the robot's calibration state"

const ROBOT_CAL_ERROR = 'Bad robot calibration data detected'
const ROBOT_CAL_ERROR_DESCRIPTION =
  'Do not run any protocols or check calibration as robot is likely to experience a crash.'
```
My understanding was that the robot calibration check looks at the tip pickup position and makes sure it's sane, specifically to guard against insane-matrix crashes like this? But maybe that's not good enough to stop singular or identity matrices from causing crashes?
There are a few different checkpoints for determining whether your calibration is bad or not. This checkpoint is meant to tell a user that their deck calibration is so bad, they might not be able to even complete a calibration check without fixing their robot deck calibration (i.e. limit switches triggering and such).
or out of the bounds greater than 3 for XY

3 mm seems too tight to me. From my notes on #5022, one recent robot's clean-slate, healthy, good calibration had a y-offset of 4.99. As things are, I don't know how we would find a sensible range for that check. We can see wild robots' calibrations through logging telemetry, but we can't trust that those are good.
Sorry, I don't know why GitHub wouldn't let me comment directly above. 5 mm is pretty large for a translation across the deck (which is purely meant to encapsulate tolerance ranges for the physical deck). I wonder if we need to use a different nominal value moving forward, or whether something else weird is happening.
I'm also happy to take this check out for now until we come up with a better way to determine something being crazy out of bounds (aside from singularity or identity)
There are a few different checkpoints for determining whether your calibration is bad or not. This checkpoint is meant to tell a user that their deck calibration is so bad, they might not be able to even complete a calibration check without fixing their robot deck calibration (i.e. limit switches triggering and such).
Dumb question, sorry, but where do these messages show up? Is it:
- on a robot's page
- and, intended to prevent people from clicking the Check calibration button?
The message for this PR shows up on the robot's page (currently under the check robot calibration button)
The tip rack pick up check happens during calibration check after a user has successfully picked up a tip. If a user goes into cal check w/ the use-cases from this PR they might experience a weird error (such as limit switch triggering) before ever seeing the message around them having bad deck calibration data.
I spoke to hardware about the values that you saw for a "working" deck calibration, and they are currently investigating. They should be able to give us a good idea of what the tolerancing should be for the deck and whether the nominal values we're currently using as a measurement should be changed.
The message for this PR shows up...currently under the check robot calibration button
If a user goes into cal check w/ the use-cases from this PR they might experience a weird error...before ever seeing the message around them having bad deck calibration data.
Ah, that makes sense.
In that case, I think we should think about a different UX.
Checking deck calibration should always be safe. We shouldn't have an inviting button, and then also have an error message competing with that button trying to convince you that the button is dangerous. We should just make the button not dangerous.
For example, we could duplicate this check at the beginning of the calibration check flow.
(Semantically, I don't see this condition as an error that prevents the robot from checking its calibration. I see it as the robot having already successfully diagnosed a problem with its calibration.)
I spoke to hardware about the values that you saw for a "working" deck calibration, and they are currently investigating. They should be able to give us a good idea of what the tolerancing should be for the deck and whether the nominal values we're currently using as a measurement should be changed.
Sounds great. Thanks!
@SyntaxColoring I think we should just disable cal check when the error is there.
@SyntaxColoring I think we should just disable cal check when the error is there.
We could do that. I think I softly prefer my suggestion, though.
If the calibration check is available unconditionally, it's simpler to point users to it. Otherwise, we might elicit confusion like, "My robot is moving wrong, and you Support people told me to click Check calibration to troubleshoot, but the software is telling me it can't do that...because it might make my robot move wrong?"
My thoughts below! TL;DR: I don't think this should be in the health endpoint.
Not sure if we should even include this feature when we release deck calibration check.
If we don't have clear UX goals with this feature, should we consider removing the UI from this PR, and instead trigger an analytics event if the data is bad? That way, we could have better data to decide if UI is necessary
```python
    def valid_transform(self) -> bool:
        return self._valid_transform

    def validate_calibration(self):
```
Why doesn't this method return a boolean? It seems a little complicated to have a call-site understand that it needs to trigger an update to a class property and then read that class property instead of just asking for the value
Yeah, we should fuse `valid_transform` and `validate_transform` into one method that returns its outcome (which I think should maybe be an enum that encompasses "singular", "identity", "bad", and "ok") and `lru_cache` it, with invalidation on altering deck calibration.
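Roughly what I have in mind, as a sketch only (the names, enum values, and the hypothetical setter below are illustrative, not the actual API):

```python
from enum import Enum
from functools import lru_cache


class DeckTransformState(Enum):
    OK = "OK"
    IDENTITY = "IDENTITY"
    SINGULARITY = "SINGULARITY"
    BAD_CALIBRATION = "BAD_CALIBRATION"


class API:
    @lru_cache(maxsize=1)
    def validate_calibration(self) -> DeckTransformState:
        # Classify the current deck transform; the result is cached so that
        # frequent callers (e.g. the health poll) do no matrix math.
        # (classification logic elided in this sketch)
        return DeckTransformState.OK

    def set_deck_calibration(self, matrix) -> None:
        # Hypothetical setter: whenever deck calibration changes, clear the
        # cached classification so the next call recomputes it.
        self.validate_calibration.cache_clear()
```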
```jsx
</LabeledButton>

{health && !health.valid_calibration && (
  <div className={styles.cal_check_error_wrapper}>
```
This is a lot of presentational concerns to put into a component (`ControlsCard`) that is not currently presentational, and this logic isn't covered in `ControlsCard.test.js`.

I would create a new component for displaying the calibration warning text (and tests to ensure those warnings are displayed properly) and use that component here. If I were implementing it from the top down, I would do something like:

- Write a test for `ControlsCard` that says "if there is a problem with the deck calibration data, a `DeckCalibrationWarning` (or whatever name) component is shown"
- Make a dummy `DeckCalibrationWarning` component and add the logic to `ControlsCard` to get that test to pass
- Add test(s) to check that `DeckCalibrationWarning`:
  - Displays an "alert-circle" `Icon`
  - Has some title that mentions deck calibration
  - Has some description that also mentions deck calibration
- Implement a purely presentational `DeckCalibrationWarning` component that passes those tests
```diff
@@ -94,3 +94,25 @@
    margin-right: auto;
  }
}

.cal_check_error {
```
Mostly for demo purposes for our meeting on Wednesday: instead of `<div className={styles.cal_check_error}>`, you could do:

```js
// note: COLOR_ERROR would need to be added to components/src/styles/colors.js
import { Box, FONT_SIZE_BODY_1, COLOR_ERROR, SPACING_AUTO, SPACING_1 } from '@opentrons/components'
// ...
<Box
  color={COLOR_ERROR}
  fontSize={FONT_SIZE_BODY_1}
  marginRight={SPACING_AUTO}
  paddingRight={SPACING_1}
>
```
```css
}

.cal_check_error_wrapper {
```
Instead of `<div className={styles.cal_check_error_wrapper}>`, this could be:

```js
import { Flex, ALIGN_CENTER } from '@opentrons/components'
// ...
<Flex alignItems={ALIGN_CENTER}>
```
```diff
@@ -37,7 +37,9 @@ async def get_health(
    else:
        max_supported = APIVersion(1, 0)

    hardware.validate_calibration()
```
Do we have to add this to the health endpoint? Health endpoints are for checking "is the server up" and should be as lightweight as possible. This introduces matrix math into the health endpoint, which is called a lot.
IMO this belongs in a calibration-specific GET endpoint
I think that the `/health` endpoint is and should be about the health of the robot, of which "the server is up" is an inescapable part.
That said, you're right about this being too much work to do in a frequently-polled endpoint - we should `lru_cache` this function and invalidate it when the deck calibration changes.
A bad deck calibration doesn't mean the robot is unhealthy; it means the calibration data needs to be changed. The robot can still be interacted with, and you could still run a protocol (it's a bad idea, but you could).
You could contrast the calibration data with the version numbers reported in the health endpoint, which are important in determining how / whether the robot can be communicated with successfully by a given client.
This data is really specific to the implementation of the calibration system. I strongly feel like it should be in a calibration-specific endpoint/resource. The health endpoint is part of the connectivity system, and I would like to avoid overloading it with other concerns.
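For illustration, a calibration-specific resource could look something like this. This is a rough sketch assuming the same pydantic/FastAPI-style models as the Health model in this PR; the route matches the /calibration/status endpoint mentioned in the follow-up commit below, but the field name and enum values are placeholders:

```python
from enum import Enum

from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI()


class DeckTransformState(str, Enum):
    OK = "OK"
    IDENTITY = "IDENTITY"
    SINGULARITY = "SINGULARITY"
    BAD_CALIBRATION = "BAD_CALIBRATION"


class CalibrationStatus(BaseModel):
    deckTransform: DeckTransformState = Field(
        ...,
        description="Validity of the robot's deck transform")


@app.get("/calibration/status", response_model=CalibrationStatus)
async def get_calibration_status() -> CalibrationStatus:
    # The real handler would ask the hardware controller for its cached
    # validate_calibration() result; hard-coded here for the sketch.
    return CalibrationStatus(deckTransform=DeckTransformState.OK)
```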
Maybe obvious, but: If we do this, I suggest including the robot's calibration data—ideally everything, but at least the deck calibration matrix—in the analytics. Otherwise, we just see a bunch of binary pass/fails, which is kind of unactionable.
Inline change requests, minor ones though
```diff
@@ -209,6 +211,53 @@ def is_simulator(self):
        """ `True` if this is a simulator; `False` otherwise. """
        return isinstance(self._backend, Simulator)

    def validate_calibration(self):
```
definitely need a return type annotation here
```python
        return self._calculate_valid_calibration()

    @lru_cache(maxsize=1)
    def _calculate_valid_calibration(self):
```
need a return type annotation
```python
        z = abs(curr_cal[2][-1])

        outofrange = z < 20 or z > 30
```
this tolerance is a little tight for my comfort, especially since we can't really test this on as many robots as normal - could we bump it to (16, 34) or something? ±9 from the -25 nominal should still be within our 10 mm safety bound for moving to the tiprack
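Concretely, something like this (the matrix below is made up purely to show the relaxed bound; the -25 nominal is the one mentioned above):

```python
import numpy as np

# Hypothetical 4x4 deck transform whose z translation sits near the -25 nominal
curr_cal = np.array([
    [1.0, 0.0, 0.0,   0.5],
    [0.0, 1.0, 0.0,  -1.2],
    [0.0, 0.0, 1.0, -25.3],
    [0.0, 0.0, 0.0,   1.0],
])

z = abs(curr_cal[2][-1])
# Relaxed bound suggested above: roughly ±9 mm around the -25 nominal
outofrange = z < 16 or z > 34   # False for this example matrix
```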
```diff
@@ -253,6 +254,9 @@ def fw_version(self) -> Optional[str]:
    async def update_fw_version(self):
        pass

    def validate_calibration(self):
        return DeckTransformState.IDENTITY
```
What's this function for? It doesn't appear to be called
```diff
@@ -22,6 +22,11 @@ class Health(BaseModel):
            description="The robot's name. In most cases the same as its "
                        "mDNS advertisement domain name but this can get out"
                        " of sync. Mostly useful for user-facing titles.")
    calibration: str = \
```
this really should be an enum type brought up from hardware so it shows up as such in the api docs, and the same in the app-side flow types
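Roughly like this sketch, if the field stays on the health model at all (the enum would ideally be imported from the hardware package rather than redefined here, and the values and field name are placeholders):

```python
from enum import Enum

from pydantic import BaseModel, Field


class DeckTransformState(str, Enum):
    # Subclassing str keeps the JSON value a plain string while letting the
    # generated API docs (and app-side Flow types) enumerate the options.
    OK = "OK"
    IDENTITY = "IDENTITY"
    SINGULARITY = "SINGULARITY"
    BAD_CALIBRATION = "BAD_CALIBRATION"


class Health(BaseModel):
    # ...existing fields elided...
    calibration: DeckTransformState = Field(
        ...,
        description="Whether the robot's deck transform is usable")
```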
It does work great though!
Force-pushed from 0d5bdec to 625f433
LGTM!
I want to re-iterate my concerns with adding deck calibration data to the health endpoint. In addition to overloading the health endpoint with non-server-health concerns, it has a functional problem: by design, the app UI does not control when the health endpoint of a given robot is hit, because it is part of the discovery-client's connectivity check. If you rely on the health endpoint being updated out of band (rather than using a specific calibration endpoint that can be hit when the UI needs that information), you run the risk of displaying stale data.
lgtm!
* add /calibration/status endpoint + tavern test
* fill in instrumentCalibration and data matrix
* slight model changes
overview
This PR adds a check during the health poll to determine whether a robot has a valid deck transformation or not.
The method checks whether the matrix is singular, is the identity matrix, or is out of bounds: greater than 3 for the X/Y offsets, or outside [-20, -30] for the z value. All of these cases are rare/improbable, but if a user somehow has this data on their robot then they will definitely be experiencing major movement issues.
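Roughly the shape of the check (an illustrative sketch only; the real method lives in `api.py` and returns an enum rather than a string, and the bounds are the ones described above):

```python
import numpy as np


def classify_deck_transform(deck_cal: np.ndarray) -> str:
    """Classify a 4x4 affine deck-calibration matrix.

    Returns a label; the real code would return the DeckTransformState enum
    discussed in the review comments.
    """
    if np.array_equal(deck_cal, np.identity(4)):
        return "identity"      # never calibrated / wiped calibration
    if np.isclose(np.linalg.det(deck_cal), 0.0):
        return "singular"      # not invertible; moves would be nonsense
    x, y, z = deck_cal[0][-1], deck_cal[1][-1], deck_cal[2][-1]
    if abs(x) > 3 or abs(y) > 3:
        return "bad"           # X/Y translation beyond the 3 mm bound
    if not (20 <= abs(z) <= 30):
        return "bad"           # z translation outside the [-20, -30] window
    return "ok"
```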
changelog
- `valid_calibration` (open to other names)
- A check in `api.py` that checks a few different properties/values of the matrix (see overview section)
- An error shown on the robot's page when `valid_calibration` is false

Note
review requests
Put this on your robot, turn your deck calibration into an identity matrix (easiest check), restart your robot, then see the screen displayed here.
risk assessment
Medium? Not sure if we should even include this feature when we release deck calibration check. It might scare our users, even though it's checking only rare cases.