-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dorado plugin #344
Dorado plugin #344
Conversation
…t_pyguppy_client_lib to ont_pybasecaller_client_lib
In addition this request needs to know if it should send read_id or read name - it has not yet been updated to use the read name. |
This should close #347 - which will start to appear more as more people update. I think we should merge #324 before we merge this. I also think that making this a new plugin is probably the right way to go! It's going to be heavy duplication of the guppy.py plugin - but I suggest we add a deprecation warning when running the @mattloose could we take the |
Uhoh spaggetios ==================================================================================== test session starts ====================================================================================
platform linux -- Python 3.11.0, pytest-7.4.3, pluggy-1.3.0
rootdir: /home/adoni5/Projects/readfish
configfile: pyproject.toml
testpaths: tests/*test.py, src/readfish
collected 65 items / 1 error
========================================================================================== ERRORS ===========================================================================================
______________________________________________________________________ ERROR collecting src/readfish/plugins/guppy.py _______________________________________________________________________
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/runner.py:341: in from_call
result: Optional[TResult] = func()
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/runner.py:372: in <lambda>
call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/doctest.py:567: in collect
module = import_path(
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/pathlib.py:567: in import_path
importlib.import_module(module_name)
../../miniforge3/envs/readfish/lib/python3.11/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1206: in _gcd_import
???
<frozen importlib._bootstrap>:1178: in _find_and_load
???
<frozen importlib._bootstrap>:1149: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:690: in _load_unlocked
???
<frozen importlib._bootstrap_external>:940: in exec_module
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
src/readfish/plugins/guppy.py:16: in <module>
from pyguppy_client_lib.helper_functions import package_read
../../miniforge3/envs/readfish/lib/python3.11/site-packages/pyguppy_client_lib/helper_functions.py:17: in <module>
from pyguppy_client_lib.client_lib import GuppyClient
E ImportError: NumPy: dtype is already registered
================================================================================== short test summary info ==================================================================================
ERROR src/readfish/plugins/guppy.py - ImportError: NumPy: dtype is already registered
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
===================================================================================== 1 error in 0.35s ====================================================================================== Seems installing |
Yes - we need some kind of conditional install? IT's a bit of a pain this for sure. |
This is so pytest collections (The only time both are imported into the same interperter runtime) doesn't crash. This requires all tests to only test one, so I've moved all tests to test dorado
This is due to the fact that pytest imports all files, the only time that pyguppy-client-lib and pybasecall-client-lib are imported at the same time In a real experiment, the plugin choice for base-caller imports the correct module and leaves the other one out of the interpreter runtime
…s we can't really cover them They require things link a read_until_api or live base caller to properly test
Needs the documentation updating - this is confirmed working with latest Minknow /Dorado 7.3 plus, and should work with <= 7.2. Final steps:
|
Updated to pull sample rate from MinKNOW. But also have found an issue with dorado returning two reads with the same id in the same batch. |
* Initial validation of minknow version and guppy dorado versions * MinKNOW compatibility check function Base-caller compatibiity check * Compatibility checks * Remove rogue comment * edit insanely long string multiline string example of sub tags * Local updates * Error checking and minor corrections to address compatibility. We no longer raise RuntimeException when a version issue is encountered. This means that readfish will continue to function with future versions of minKNOW if compatible. * Add default when popping `sample_rate` from guppy params Prevents `KeyError` from being raised if `sample_rate`isn't listed as a `kwarg` * Remove unused basecaller compatibilty function We now just check in `dorado.py` and warn if there is a mismatch * Correct docstring, add doctests * Add more complete docstring to `DIRECTION` enum Only return Enum variant from `check_compatibility`, not tuple of (bool, variant) * Suggest opening an issue if a suitable version of readfish doesn't exist --------- Co-authored-by: Adoni5 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM let's goooooo
Create a new
dorado
plugin to handle the latestdorado_basecall_server
version (7.3.9
), which only accepts connections fromont-pybasecall-client-lib
, deprecatingont-pyguppy-client-lib
.This pull request is to trigger discussion about the best way to switch to the new ONT pybasecaller client.
Rather than introduce this in the guppy plugin I have created a distinct dorado plugin.
This is due to changes in how the reads need to be packaged for the client, which expects some surprising data (e.g sample rate). Given the switch from guppy to dorado at ONT it may make sense in the future for people to be loading a dorado plugin instead of a guppy plugin. Within a few months many users may have forgotten about (or never known about) guppy in the first place.
Related to the above, we need to find a way of providing the sample rate to the basecall client. Currently the base caller has no way of accessing this?
Any suggestions?
Until this is resolved the PR is not fit for merging.