
feat(api): Limited support for programmatically running PAPIv≥2.14 #12970

Merged: 33 commits from pe_based_execute into edge, Jul 25, 2023

Conversation

@SyntaxColoring SyntaxColoring commented Jun 23, 2023

Overview / Changelog

These interfaces now support executing PAPIv≥2.14 protocols:

  • The opentrons_execute CLI
  • opentrons.execute.execute()
  • opentrons.execute.get_protocol_api()

This works towards, but does not close, RSS-268.

Known limitations:

  • Simulating PAPIv≥2.14 protocols will still raise a NotImplementedError.
  • Executing or simulating JSONv6+ protocols will still raise a NotImplementedError.
  • It will not print the run log. (Ticketed as RSS-283.)

Test Plan

Regression testing

  • opentrons_execute with a PAPIv≤2.13 file
    • Only standard labware
    • Custom labware with -L
    • Custom labware in the current working directory
  • opentrons.execute.execute() with a PAPIv≤2.13 file
    • Only standard labware
    • Custom labware with custom_labware_paths=...
    • Custom labware in the Jupyter Notebook directory
  • opentrons.execute.get_protocol_api() with PAPIv≤2.13
    • Only standard labware
    • Custom labware with extra_labware=...
    • Custom labware in the Jupyter Notebook directory

New feature testing

  • opentrons_execute with a PAPIv≥2.14 file
    • Only standard labware
    • Custom labware with -L
    • Custom labware in the current working directory
  • opentrons.execute.execute() with a PAPIv≥2.14 file
    • Only standard labware
    • Custom labware with custom_labware_paths=...
    • Custom labware in the Jupyter Notebook directory
  • opentrons.execute.get_protocol_api() with PAPIv≥2.14
    • Only standard labware
    • Custom labware with extra_labware=...
    • Custom labware in the Jupyter Notebook directory
  • Do at least one of these on an OT-3

Review requests

I hate several things about this.

Risk assessment

Medium. This is messy, highly duplicated, difficult to test automatically, and public-facing.

Much of this code is sensitive to differences between None and {} for bundled_labware, extra_labware, and bundled_data.
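To make that pitfall concrete, here is a minimal, hypothetical sketch (not the actual Opentrons code) of why None and {} must stay distinct for an argument like extra_labware:

```python
def resolve_extra_labware(extra_labware=None):
    """Illustrative only: None means "unspecified"; {} means "explicitly none"."""
    if extra_labware is None:
        # None: the caller said nothing, so fall back to a default search
        # location (represented here by a placeholder dict).
        return {"labware_from_default_dir": "placeholder definition"}
    # {}: the caller explicitly asked for no extra labware; don't fall back.
    return extra_labware
```

Collapsing None into {} (or vice versa) silently changes whether the fallback search happens, which is why the distinction is easy to break during refactors.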

Next steps

We should try executing some Flex hardware testing scripts. From @ryanthecoder:

This is how we call our scripts: python3 -m hardware_testing.gravimetric --photometric --pipette 1000 --channels 96 --tip 200 --trials 5. The scripts themselves load a Python protocol to fetch the type and location of labware, but then just execute Python directly, talking to the protocol context.

After that, we should continue working on RSS-268 until these interfaces are good enough to launch the Flex with.

@SyntaxColoring SyntaxColoring added api Affects the `api` project protocol-engine Ticket related to the Protocol Engine project and associated HTTP APIs labels Jun 23, 2023
@SyntaxColoring SyntaxColoring requested review from andySigler, ryanthecoder and a team June 23, 2023 19:02
codecov bot commented Jun 23, 2023

Codecov Report

Merging #12970 (6543c2c) into edge (11fa4eb) will decrease coverage by 0.39%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##             edge   #12970      +/-   ##
==========================================
- Coverage   72.55%   72.16%   -0.39%     
==========================================
  Files        2390     1565     -825     
  Lines       65903    51711   -14192     
  Branches     7306     3261    -4045     
==========================================
- Hits        47814    37319   -10495     
+ Misses      16344    13876    -2468     
+ Partials     1745      516    -1229     
Flag             Coverage Δ
g-code-testing   96.44% <ø> (ø)
notify-server    89.13% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files Changed Coverage Δ
api/src/opentrons/execute.py 54.83% <ø> (-6.14%) ⬇️
api/src/opentrons/protocol_engine/__init__.py 100.00% <ø> (ø)
...pentrons/protocol_engine/create_protocol_engine.py 100.00% <ø> (ø)
api/src/opentrons/util/entrypoint_util.py 90.69% <ø> (-0.22%) ⬇️

... and 825 files with indirect coverage changes

* Fix a confusing assert with an incorrect message.
* Comment and refactor test_loop_lifetime() for clarity.
* Rename loop_queue to loop_mailbox to reflect what we're using it for.
* Promote this test helper to production code. We don't really do this often, but it seems nice for keeping the test function names short while still leaving room for testing the other functions in this file.

@SyntaxColoring changed the title from "feat(api): Limited support for programmatically running PAPIv2.15" to "feat(api): Limited support for programmatically running PAPIv≥2.14" on Jul 7, 2023
@SyntaxColoring SyntaxColoring added papi-v2 Python API V2 robot-svcs Falls under the purview of the Robot Services squad (formerly CPX, Core Platform Experience). labels Jul 7, 2023
@SyntaxColoring SyntaxColoring marked this pull request as ready for review July 8, 2023 00:56
@SyntaxColoring SyntaxColoring requested a review from a team as a code owner July 8, 2023 00:56
Comment on lines -53 to -61
_PYTHON_TOO_NEW_MESSAGE = (
"Python protocols with apiLevels higher than 2.13"
" cannot currently be executed with"
" the opentrons_execute command-line tool,"
" the opentrons.execute.execute() function,"
" or the opentrons.execute.get_protocol_api() function."
" Use a lower apiLevel"
" or use the Opentrons App instead."
)
A reviewer (Member) commented:
Maybe we can modify this message to talk about the limitations of the current implementation and log it as a warning instead. Just a thought

SyntaxColoring (author) replied:
Discussed in-person, I think:

We'll raise fine-grained errors when, and only when, you try to do the specific things that the implementation doesn't yet support. I think we should keep them NotImplementedErrors for now, instead of warnings.

except protocol_api.ProtocolEngineCoreRequiredError as e:
    raise NotImplementedError(_PYTHON_TOO_NEW_MESSAGE) from e  # See Jira RCORE-535.
else:
    if bundled_labware is not None:
A reviewer (Member) commented:
I don't know the context of bundled labware. Is it there for future implementation/completion or was it a part of some old version that we don't plan on supporting for newer versions? If it's the latter, we should update the error message to indicate that it's not supported anymore.

SyntaxColoring (author) replied:
Yeah, I also don't know. Let's ask around and see if we can remove it.

Merging in case PR #13008 affects error handling.
Merging for test and doc improvements in PR #13107.
This was just noise. I think it's okay to not document this explicitly, for now. We don't document other potential reasons why an execute would fail, and it's weird to treat this specially.
Comment on lines +517 to +533
def to_stderr_string(self) -> str:
    """Return a string suitable as the stderr output of the `opentrons_execute` CLI.

    This summarizes from the full error details.
    """
    # It's unclear what exactly we should extract here.
    #
    # First, do we print the first element, or the last, or all of them?
    #
    # Second, do we print the .detail? .errorCode? .errorInfo? .wrappedErrors?
    # By contract, .detail seems like it would be insufficient, but experimentally,
    # it includes a lot, like:
    #
    #     ProtocolEngineError [line 3]: Error 4000 GENERAL_ERROR (ProtocolEngineError):
    #     UnexpectedProtocolError: Labware "fixture_12_trough" not found with version 1
    #     in namespace "fixture".
    return self._error_occurrences[0].detail
SyntaxColoring (author) commented:
@TamarZanzouri @sfoster1 Is there a better thing for me to do here, now that stuff like #13008 is merged?

Here's what I'm trying to solve:

When we execute a protocol through Protocol Engine, we get any kind of fatal error—bugs, hardware errors, Protocol Engine validation errors, non-protocol-related Python mistakes, everything—through Protocol Engine's state_summary.errors. This is a list of ErrorOccurrences.

From that, we need to decide:

  1. How do we propagate that to callers of opentrons.execute.execute()? It needs to be an exception, but what do we put in it?

    I currently have the list of ErrorOccurrences as its only member. In stack traces, Python will stringify that as the exception message. This seems...okay, I guess? It's structured and lossless, at least. But it's nonstandard. I guess I'd really prefer if it were a normal Python exception chain all the way down.

  2. What do we print to stderr for callers of the opentrons_execute CLI? Stringifying the ErrorOccurrence, or parts of it, seems plausible, but I don't know the details of how to do that properly.
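For point 1, the "normal Python exception chain all the way down" idea can be sketched like this. The helper and exception names are illustrative, not part of the actual PR:

```python
class ProtocolExecutionError(Exception):
    """Illustrative stand-in for an exception wrapping one ErrorOccurrence."""


def chain_from_details(details):
    """Turn an outermost-first list of error details into a __cause__ chain."""
    cause = None
    for detail in reversed(details):
        exc = ProtocolExecutionError(detail)
        exc.__cause__ = cause  # emulate `raise exc from cause`
        cause = exc
    return cause


err = chain_from_details(["run failed", "load step failed", "labware not found"])
# err wraps "run failed"; err.__cause__ wraps "load step failed"; and so on.
```

Python's traceback machinery then renders the familiar "The above exception was the direct cause of the following exception" output for free, instead of a stringified list.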

@sanni-t (Member) left a review comment:
🙌 Awesome! I know it has a few hacks and still needs to be implemented fully, but this is an amazing start. This ticket is quite a complex puzzle and I think you've approached it quite carefully, covering a lot of edge cases. I might have missed a couple of things while reviewing too (re: it's a complex change), but unit tests and on-robot testing have been successful, so I think this is merge-worthy!


# TODO(mm, 2023-06-30): This will home and drop tips at the end, which is not how
# things have historically behaved with PAPIv2.13 and older or JSONv5 and older.
result = await protocol_runner.run(protocol_source)
A reviewer (Member) commented:
Oof, another problem with default/ forced tip dropping.

@SyntaxColoring (author) replied on Jul 24, 2023:
Yeah, the situation is actually not quite as bad as I thought, though. I thought we didn't have a way to clean up an engine without dropping tips. We do: we just have to do ProtocolEngine.finish(drop_tips_and_home=False). The trouble is just at the ProtocolRunner level, which doesn't propagate that functionality up.
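What "propagating that functionality up" could look like, as a hedged sketch with illustrative stand-in classes rather than the real ProtocolRunner and ProtocolEngine:

```python
class FakeEngine:
    """Illustrative stand-in for ProtocolEngine; records its cleanup call."""

    def __init__(self) -> None:
        self.finish_args: dict = {}

    def finish(self, drop_tips_and_home: bool = True) -> None:
        self.finish_args["drop_tips_and_home"] = drop_tips_and_home


class FakeRunner:
    """Illustrative stand-in for a runner that forwards the flag upward."""

    def __init__(self, engine: FakeEngine) -> None:
        self._engine = engine

    def run(self, drop_tips_and_home_after: bool = True) -> None:
        # ...execute the protocol's commands here...
        self._engine.finish(drop_tips_and_home=drop_tips_and_home_after)
```

The point is only that the runner's public entry point needs to accept and forward the flag; the engine-level capability already exists.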

Comment on lines +684 to +685
# opentrons.protocol_api.core.engine, that would incorrectly make
# ProtocolContext.is_simulating() return True.
A reviewer (Member) commented:
Oo! I did not realize that's how we are checking for simulating engine. Looking at that code, maybe there's scope to reduce redundancy in some configs (virtual pipettes/ modules/ pause ignore) and pooling them all together into an is_simulating check (not for this PR). But I might be forgetting a reason why it was split into multiple flags.

SyntaxColoring (author) replied:
Yeah, it seems like we should either split it even further and have a dedicated flag for something like tell_python_protocol_it_is_simulating, or merge them all. We're in a weird in-between right now.
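One way to picture the "merge them all" option, with illustrative field names rather than the real Protocol Engine config:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineConfig:
    """Illustrative stand-in, not the real Protocol Engine Config."""

    use_virtual_pipettes: bool
    use_virtual_modules: bool
    ignore_pause: bool

    @property
    def is_simulating(self) -> bool:
        # One possible merge: the engine counts as "simulating" only when
        # every hardware-facing behavior is virtualized.
        return (
            self.use_virtual_pipettes
            and self.use_virtual_modules
            and self.ignore_pause
        )
```

A derived property like this keeps the fine-grained flags available while giving Python protocols one coherent answer, though it assumes no caller needs a partially virtual configuration to report "simulating."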

create_protocol_engine_in_thread(
    hardware_api=hardware_api.wrapped(),
    config=_get_protocol_engine_config(),
    drop_tips_and_home_after=False,
A reviewer (Member) commented:
So, am I understanding this correctly that live protocols like those over Jupyter will not drop tips and home at the end of the 'protocol session', but python/json protocol execution will?

SyntaxColoring (author) replied:
Yep, exactly.

SyntaxColoring commented Jul 24, 2023

This PR is causing pipenv run pytest tests/opentrons/test_execute.py to hang on exit when run on Python 3.10. Looking into it.

Here's why it was hanging:

A Python thread can be configured to be "daemon" or "non-daemon." The difference is that when the process is shutting down, the Python interpreter won't wait for daemon threads to be joined.

More precisely, the Python interpreter will only start to call our @atexit handlers after all non-daemon threads have joined. This creates a clear deadlock. A non-daemon thread will only exit after our @atexit handler tells it to close, but our @atexit handler will never get a chance to run because there's still a pending thread, so the process will hang on shutdown.

We're seeing this on Python 3.10 and not 3.7 because 3.9 changed ThreadPoolExecutor's threads from daemon to non-daemon. (python/cpython#83993).

My workaround is to ditch ThreadPoolExecutor in favor of a manually-managed Thread so we can set daemon=True. Here's another workaround that's less deeply intrusive, but is more horrifying to me in other ways.

I think the real solution is to let get_protocol_api() users call some kind of .close() method. Then, we don't need those @atexit handlers, and we don't need daemon=True. I agree with this Stack Overflow answer in that the interpreter shutdown phase is not a happy place to be coding. Per prior discussion, I'm not adding that .close() method in this line of PRs. But I view all of this as more reason why we need to do that at some point.
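The daemon-thread workaround can be sketched with stdlib pieces only. A daemon thread doesn't block interpreter shutdown, so the @atexit handler still gets a chance to run and stop the worker cleanly:

```python
import atexit
import queue
import threading

# A "mailbox" the @atexit handler uses to tell the worker to stop.
commands: "queue.Queue[str]" = queue.Queue()


def worker() -> None:
    # Stand-in for the thread that would run the engine's event loop.
    while True:
        if commands.get() == "stop":
            return


# daemon=True: the interpreter won't wait for this thread at shutdown, so it
# can't deadlock against the @atexit handler that's supposed to stop it.
thread = threading.Thread(target=worker, daemon=True)
thread.start()


@atexit.register
def clean_up() -> None:
    commands.put("stop")
    thread.join(timeout=5)
```

With a non-daemon thread, the same code hangs: the interpreter waits for the thread before running atexit handlers, and the thread waits for an atexit handler, exactly the deadlock described above.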

I have a verbal 👍 from @TamarZanzouri on the above change to fix the shutdown hanging.

@SyntaxColoring SyntaxColoring merged commit 63f94d9 into edge Jul 25, 2023
@SyntaxColoring SyntaxColoring deleted the pe_based_execute branch July 25, 2023 15:33
SyntaxColoring added a commit that referenced this pull request Oct 11, 2023
Notable changes:

* Rework `simulate.py` to support PAPIv≥2.14 protocols, via Protocol Engine. This rework mirrors what we did to `execute.py` in #12970.
* Rework the `adapt_protocol_source()` helper to cope with protocols for which we don't know the filename. This is necessary to support `opentrons.simulate.simulate()`, whose `filename` argument is optional.

  Unfortunately, this requires modifications to `opentrons.protocols.parse` to make the filename more pass-through, so `adapt_protocol_source()` can tell when it wasn't provided.

Plus various smaller refactors for type safety, deduplication, and consistency.
SyntaxColoring added a commit that referenced this pull request Oct 19, 2023
(Same notable changes as the Oct 11, 2023 commit above.)