bug: Update from 4.3.1 -> 4.4.0 adds ThreadManagerException to software #7959

Closed
Koeng101 opened this issue Jun 18, 2021 · 16 comments · Fixed by #8009

@Koeng101

Koeng101 commented Jun 18, 2021

4.3.1 -> 4.4.0

Overview

I currently maintain the software package opentronsfastapi (https://github.com/Koeng101/opentronsfastapi), which allows you to easily build fastapis that run on opentrons themselves. Basically, integrating opentrons with other hardware is way too clunky currently, so we've built software to present the opentrons as a REST server that you can pass parameters to (NOT passing python scripts).

The last update, 4.4.0, has caused our software to stop working, as the following exception is raised when run on real opentrons hardware:

INFO:     192.168.1.187:49982 - "POST /api/test_competent_cells?version_only=false HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "usr/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi
  File "usr/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
  File "usr/lib/python3.7/site-packages/fastapi/applications.py", line 149, in __call__
  File "usr/lib/python3.7/site-packages/starlette/applications.py", line 102, in __call__
  File "usr/lib/python3.7/site-packages/starlette/middleware/errors.py", line 181, in __call__
  File "usr/lib/python3.7/site-packages/starlette/middleware/errors.py", line 159, in __call__
  File "usr/lib/python3.7/site-packages/starlette/exceptions.py", line 82, in __call__
  File "usr/lib/python3.7/site-packages/starlette/exceptions.py", line 71, in __call__
  File "usr/lib/python3.7/site-packages/starlette/routing.py", line 550, in __call__
  File "usr/lib/python3.7/site-packages/starlette/routing.py", line 227, in handle
  File "usr/lib/python3.7/site-packages/starlette/routing.py", line 41, in app
  File "usr/lib/python3.7/site-packages/fastapi/routing.py", line 197, in app
  File "usr/lib/python3.7/site-packages/fastapi/routing.py", line 148, in run_endpoint_function
  File "./opentronsfastapi/__init__.py", line 198, in inner
    ctx = opentrons_env.get_protocol_api(apiLevel)
  File "usr/lib/python3.7/site-packages/opentrons/execute.py", line 93, in get_protocol_api
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/thread_manager.py", line 130, in __init__
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/thread_manager.py", line 135, in managed_thread_ready_blocking
opentrons.hardware_control.thread_manager.ThreadManagerException: Failed to create Managed Object

The machine appears to get upset at this line - https://github.com/Koeng101/opentronsfastapi/blob/5cbed98c755123051887f38f1966a185664c8b2d/opentronsfastapi/__init__.py#L198 where I establish a new context to be used.

This new code was added by @ahiuchingau in 475ee25 (475ee25#diff-daa3772563c5ebc4de4a12da9e046e1612d382a5b7adcf6d1b0eb8a841304b64; great commit message, btw).

For context in opentronsfastapi: we basically have a global lock mechanism for protocols that are currently running, backed by an SQLite database (which also acts as an activity log). I know that opentrons has a local lock built into the software, but it isn't very easy to hook that onto data to, say, tell a user which protocol is currently running while maintaining a historical log of the activity.

Once the protocol gets a lock, it simulates the protocol (this works just fine). It then gets a protocol context, passes this into a thread that runs the protocol itself, and returns to the user a confirmation that the protocol has begun, along with a process ID and a version hash for the protocol's function. The step of get_protocol_api is what is failing here.
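The flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual opentronsfastapi code: `make_db`, `try_lock`, `release`, and `start_protocol` are hypothetical names, the table layout is invented, and `get_context` and `simulate` are placeholder callables standing in for the real context and simulation calls. The version hash here is simply a hash of the function's bytecode.

```python
import hashlib
import sqlite3
import threading
import uuid

_db_guard = threading.Lock()  # serializes access to the shared connection


def make_db():
    """Create an in-memory stand-in for the lock / activity-log database."""
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute(
        "CREATE TABLE runs (run_id TEXT PRIMARY KEY, protocol TEXT, "
        "version TEXT, finished INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def try_lock(db, run_id, protocol, version):
    """Log a new run and return True only if no other run is unfinished."""
    with _db_guard, db:
        if db.execute("SELECT 1 FROM runs WHERE finished = 0").fetchone():
            return False
        db.execute(
            "INSERT INTO runs (run_id, protocol, version) VALUES (?, ?, ?)",
            (run_id, protocol, version),
        )
        return True


def release(db, run_id):
    """Mark a run as finished; the row stays behind as history."""
    with _db_guard, db:
        db.execute("UPDATE runs SET finished = 1 WHERE run_id = ?", (run_id,))


def start_protocol(db, protocol_fn, get_context, simulate):
    """Lock, simulate, acquire a context, then run in a background thread."""
    run_id = uuid.uuid4().hex
    # Hypothetical version hash: a digest of the protocol function's bytecode.
    version = hashlib.sha256(protocol_fn.__code__.co_code).hexdigest()[:12]
    if not try_lock(db, run_id, protocol_fn.__name__, version):
        return {"started": False, "reason": "another protocol is running"}
    try:
        simulate(protocol_fn)  # dry run first
        ctx = get_context()    # the step that fails in this report
    except Exception:
        release(db, run_id)    # do not leave a stale lock behind
        raise

    def worker():
        try:
            protocol_fn(ctx)
        finally:
            release(db, run_id)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # The thread is returned only so a caller (or a test) can join it.
    return {"started": True, "run_id": run_id, "version": version, "thread": thread}
```

In this sketch a failure while acquiring the context releases the lock before re-raising, so a failed request does not block later ones.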

Any ideas of how I can fix this? I'm not too familiar with the low-downs of how this is implemented or ways I could get around it. Thanks!

Steps to reproduce

You can run our example app on a real machine to raise this error.

Current behavior

Currently, I get the above error.

Expected behavior

I expect to be able to use this ctx in our software.

@Koeng101 Koeng101 added the bug label Jun 18, 2021
@amitlissack
Contributor

Sorry that you've run into this. We are currently in a long holiday weekend, but can put time into this next week.

A couple questions:

  1. Do you run into the same issue when using a simulator?
  2. Sadly the ThreadManagerException is not super helpful. The ThreadManager is simply calling this method API.build_hardware_controller(). For the sake of seeing the actual exception, could you call API.build_hardware_controller() in your code and share the exception raised?
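One way to try suggestion (2) might look like the sketch below. It assumes the builder is a coroutine function (the traceback later in this thread shows it being run through an asyncio event loop) and that `API` can be imported from `opentrons.hardware_control`; both are assumptions to check against the installed version. `surface_build_error` and `fake_builder` are hypothetical names, and `fake_builder` exists only so the sketch runs without a robot.

```python
import asyncio
import traceback


def surface_build_error(builder):
    """Run an async builder directly; return the traceback text if it fails."""
    try:
        asyncio.run(builder())
    except Exception:
        return traceback.format_exc()
    return None


async def fake_builder():
    # Stand-in for the real builder so the sketch runs anywhere.
    raise AssertionError("not connected")


if __name__ == "__main__":
    # On the robot, assuming this import path is valid for your version:
    #     from opentrons.hardware_control import API
    #     print(surface_build_error(API.build_hardware_controller))
    print(surface_build_error(fake_builder))
```

Note that `asyncio.run` refuses to start inside an already running event loop, so this is meant for a plain script or Python shell rather than a notebook cell.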

@Koeng101
Author

Hey @amitlissack meant to run this today, but was taking a long holiday weekend in the other direction. Will test (2) early tomorrow morning.

I do not get the same issue when running in a simulator.

@d-raith

d-raith commented Jun 22, 2021

I can probably confirm the issue from another perspective. I'm currently using jupyter notebooks to implement custom remote control, similar to how it is described in the documentation section regarding jupyter notebooks. We use a gen 2 temperature module, and since the latest update we are unable to obtain a protocol context through the get_protocol_api method. It fails with the same error mentioned in the initial post, but in addition I have a stack trace regarding the temperature module communication:

Exception in Thread Manager build
Traceback (most recent call last):
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/thread_manager.py", line 152, in _build_and_start_loop
  File "usr/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/api.py", line 191, in build_hardware_controller
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/module_control.py", line 36, in build
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/module_control.py", line 116, in register_modules
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/module_control.py", line 63, in build_module
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/modules/utils.py", line 37, in build
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/modules/tempdeck.py", line 79, in build
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/modules/tempdeck.py", line 241, in _connect
  File "usr/lib/python3.7/site-packages/opentrons/drivers/temp_deck/driver.py", line 258, in get_device_info
  File "usr/lib/python3.7/site-packages/opentrons/drivers/temp_deck/driver.py", line 353, in _get_info
  File "usr/lib/python3.7/site-packages/opentrons/drivers/temp_deck/driver.py", line 299, in _send_command
AssertionError: not connected

The assertion error is triggered due to the temperature module's lock not being available / initialized.

However, control and access of the module using the opentrons app is working. With the module being powered off, I'm able to obtain a ProtocolContext and control the robot. @amitlissack, do you have any additional modules connected to your test setup?

Info:
Server Version: 4.4.0
Firmware Version: v1.1.0-25e5cea

Jupyter Cell:

import opentrons.execute
protocol = opentrons.execute.get_protocol_api('2.11')
protocol.home()
Failed to initialize character device, will not be able to control gpios (lights, button, smoothiekill, smoothie reset). Only one connection can be made to the gpios at a time. If you need to control gpios, first stop the robot server with systemctl stop opentrons-robot-server. Until you restart the server with systemctl start opentrons-robot-server, you will be unable to control the robot using the Opentrons app.
Exception in Thread Manager build
Traceback (most recent call last):
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/thread_manager.py", line 152, in _build_and_start_loop
  File "usr/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/api.py", line 191, in build_hardware_controller
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/module_control.py", line 36, in build
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/module_control.py", line 116, in register_modules
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/module_control.py", line 63, in build_module
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/modules/utils.py", line 37, in build
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/modules/tempdeck.py", line 79, in build
  File "usr/lib/python3.7/site-packages/opentrons/hardware_control/modules/tempdeck.py", line 241, in _connect
  File "usr/lib/python3.7/site-packages/opentrons/drivers/temp_deck/driver.py", line 258, in get_device_info
  File "usr/lib/python3.7/site-packages/opentrons/drivers/temp_deck/driver.py", line 353, in _get_info
  File "usr/lib/python3.7/site-packages/opentrons/drivers/temp_deck/driver.py", line 299, in _send_command
AssertionError: not connected
---------------------------------------------------------------------------
ThreadManagerException                    Traceback (most recent call last)
<ipython-input-4-03058a0ad874> in <module>()
      1 import opentrons.execute
      2 import time
----> 3 protocol = opentrons.execute.get_protocol_api('2.11')
      4 
      5 

/usr/lib/python3.7/site-packages/opentrons/execute.pyc in get_protocol_api(version, bundled_labware, bundled_data, extra_labware)

/usr/lib/python3.7/site-packages/opentrons/hardware_control/thread_manager.pyc in __init__(self, builder, *args, **kwargs)

/usr/lib/python3.7/site-packages/opentrons/hardware_control/thread_manager.pyc in managed_thread_ready_blocking(self)

ThreadManagerException: Failed to create Managed Object

@amitlissack
Contributor

amitlissack commented Jun 22, 2021

@d-raith thanks for all that info.

I am looking into the issue and find this quite perplexing.

By any chance is the environment variable ENABLE_VIRTUAL_SMOOTHIE defined in your jupyter setup?

@d-raith

d-raith commented Jun 22, 2021

I've already downgraded the system to 4.3.1 in order to continue development. The variable is not set in this configuration. Since our code does not use it, the same should hold for 4.4.0 unless the update changed the default behavior.

@amitlissack
Contributor

@d-raith ENABLE_VIRTUAL_SMOOTHIE was a long shot but an easy explanation. I do not have a tempdeck to test with. I'll continue digging.

Can you verify that you have only one tempdeck connected?

@d-raith

d-raith commented Jun 22, 2021

@amitlissack Yes, there is only one temp deck connected.

@amitlissack
Contributor

@Koeng101 are you using any modules? Does @d-raith 's issue seem the same as yours? I'd like to make sure this isn't two different bugs.

@Koeng101
Author

I have a temp deck connected as well. I can disconnect it and try again

@sfoster1
Member

If you stop the opentrons robot server on the raspberry pi with systemctl stop opentrons-robot-server does that help things?

@d-raith

d-raith commented Jun 22, 2021

@sfoster1 I tried it, but it didn't help at least for my part. Given that @Koeng101 also uses a temp module, I'm pretty certain that this is the same or at least a closely related issue.

@Russell-Tran

Russell-Tran commented Jun 22, 2021

Following up on behalf of @Koeng101 (same lab):

  • @amitlissack: Temp module is related to the issue. "If you disconnect the temp module everything works again." Our custom software, opentronsfastapi, is unsuccessful (fails with ThreadManagerException) in any instance where a Temp module is on and connected to the OT-2. That is, fails regardless of whether the particular protocol actually declares the Temp module in code (ctx.load_module("temperature module", 4)). Disconnect the Temp module, and opentronsfastapi for protocols which don't need a Temp module regains functionality.
  • @sfoster1: systemctl stop opentrons-robot-server does not help things

@amitlissack
Contributor

We have been able to reproduce the problem in our lab. We will update you all when we know more.

@Koeng101
Author

@Laura-Danielle Great to know this is fixed! Will there be an update release soon to fix this bug? Thanks

@Laura-Danielle
Contributor

Hey @Koeng101, we will be doing our next release around the end of July.

@d-raith

d-raith commented Jun 29, 2021

Glad to hear that the issue has been resolved, looking forward to the release.
Thanks a lot for the quick response and great soft- and hardware!
