-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MRG: adjust protein ksize for record/manifest #3019
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## latest #3019 +/- ##
=======================================
Coverage 86.70% 86.71%
=======================================
Files 135 135
Lines 15395 15396 +1
Branches 2624 2624
=======================================
+ Hits 13349 13350 +1
Misses 1745 1745
Partials 301 301
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
bluegenes
changed the title
MRG: adjust protein ksizes
MRG: adjust protein ksize for record/manifest
Feb 21, 2024
ready for review @sourmash-bio/devs |
luizirber
approved these changes
Feb 21, 2024
4 tasks
ctb
added a commit
that referenced
this pull request
Feb 23, 2024
) Updates `src/core/CHANGELOG.md` with release notes; see below. ## Release checklist from #2987 (comment): - [x] verify minimum supported rust version (MSRV) for writing release notes (see #2988 for an example); MSRV is checked by CI in `.github/workflows/rust.yml`, `minimum_rust_version` - [x] write release notes using `git log --oneline r0.12.0..latest src/core | cut -d\ -f2- > /tmp/out.txt` - [x] verify that the ChangeLog is up to date: https://github.com/sourmash-bio/sourmash/blob/latest/src/core/CHANGELOG.md - [x] bump version in `src/core/Cargo.toml` ## Release notes for r0.13.0 MSRV: 1.65 Changes/additions: * Calculate all gather stats in rust; use for rocksdb gather (#2943) * adjust protein ksize for record/manifest (#3019) * Allow changing storage location for a collection in RevIndex (#3015) * make core Manifest booleans python compatible (core) (#3007) Updates: * Bump roaring from 0.10.2 to 0.10.3 (#3014) * Bump histogram from 0.9.0 to 0.9.1 (#3002) * Bump chrono from 0.4.33 to 0.4.34 (#3000) * Bump web-sys from 0.3.67 to 0.3.68 (#2998) * Bump num-iter from 0.1.43 to 0.1.44 (#2997) * Bump wasm-bindgen-test from 0.3.40 to 0.3.41 (#2996)
bluegenes
added a commit
to sourmash-bio/sourmash_plugin_branchwater
that referenced
this pull request
Feb 23, 2024
This PR updates manifest generation in `manysketch` to use `Record` and `Manifest` utils from core. This also makes the code significantly simpler, since we can generate manifest information directly from the signature, rather than saving sketch parameters alongside sketches. Required `0.13.0` core due to these fixes: - sourmash-bio/sourmash#3007 - sourmash-bio/sourmash#3019 * Fixes #203
ctb
added a commit
that referenced
this pull request
Mar 21, 2024
# Release notes for sourmash v4.8.7 Note: This release changes the way `sourmash multigather` names output files in some situations. Please see #2722 for details. Minor new features: * support proper manifest creation with `--relpath` for `sig check` and `sig collect` (#3054) * fix `multigather` output by adding md5sum along with `-U/--output-add-query-md5sum` (#2722) * enable loading lineages from annotated gather with match_name instead of name (#3078) Bug fixes: * fix output for `sketch ... --singleton` (#3066) * fix `calculate_gather_stats` `threshold=0` bug (#3052) Cleanup and documentation updates: * adjust protein ksize for record/manifest (#3019) * Resolve `sourmash gather --help` issue (#3032) * rework the manifest documentation; do misc cleanup (#3027) * add branchwater web to docs (#3018) Developer updates: * make core Manifest booleans python compatible (core) (#3007) * safer ksize selection while still accommodating k=k*3 (#3028) * fix clippy beta issues (#3088) * tell dependabot to ignore upgrades to `byteorder`, `chrono`, `once_cell`, and `wasm-bindgen` (#3065) * update rust changelog for r0.13.0 in preparation for release (#3033) * Allow changing storage location for a collection in RevIndex (#3015) * Fix tox and nix configs so all tox tests execute correctly (#2992) * Calculate all gather stats in rust; use for rocksdb gather (#2943) * bump screed req to 1.1.3 (#3067) * bump to v4.8.7-dev (#2989) Dependabot updates: * Bump DeterminateSystems/magic-nix-cache-action from 1 to 3 (#2994) * Bump DeterminateSystems/magic-nix-cache-action from 3 to 4 (#3085) * Bump DeterminateSystems/nix-installer-action from 4 to 9 (#2995) * Bump DeterminateSystems/nix-installer-action from 9 to 10 (#3083) * Bump chrono from 0.4.33 to 0.4.34 (#3000) * Bump conda-incubator/setup-miniconda from 3.0.1 to 3.0.2 (#3046) * Bump conda-incubator/setup-miniconda from 3.0.2 to 3.0.3 (#3057) * Bump histogram from 0.9.0 to 0.9.1 (#3002) * Bump itertools from 0.12.0 to 0.12.1 (#3043) * Bump log from 0.4.20 to 0.4.21 (#3062) * Bump num-iter from 0.1.43 to 0.1.44 (#2997) * Bump pypa/cibuildwheel from 2.16.5 to 2.17.0 (#3084) * Bump rayon from 1.8.1 to 1.9.0 (#3058) * Bump roaring from 0.10.2 to 0.10.3 (#3014) * Bump serde from 1.0.196 to 1.0.197 (#3045) * Bump serde_json from 1.0.113 to 1.0.114 (#3044) * Bump tempfile from 3.10.0 to 3.10.1 (#3059) * Bump thiserror from 1.0.56 to 1.0.57 (#2999) * Bump thiserror from 1.0.57 to 1.0.58 (#3082) * Bump wasm-bindgen from 0.2.91 to 0.2.92 (#3060) * Bump wasm-bindgen-test from 0.3.40 to 0.3.41 (#2996) * Bump wasm-bindgen-test from 0.3.41 to 0.3.42 (#3063) * Bump web-sys from 0.3.67 to 0.3.68 (#2998) * Bump web-sys from 0.3.68 to 0.3.69 (#3061) * Revert "Bump wasm-bindgen from 0.2.91 to 0.2.92 (#3060)" (#3064) * Update asv to 0.6.2 (#3025) * Update pytest requirement from <8.1.0,>=6.2.4 to >=6.2.4,<8.2.0 (#3075)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Protein k-mer sizes are k=k*3 internally, but k in a manifest.
Record is used within manifest, so should reflect k.
This should not impact selection at sig level. In
signature::Select
:So here we match exact ksize or k=k*3, regardless of ksize or molecule type. This is b/c we don't have access to
is_protein
or any other property unless we load the minhash.we may want to be more explicit at some point, but this solves the immediate problem