-
Notifications
You must be signed in to change notification settings - Fork 80
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[MRG] Adjust
Index.find
search protocol to support selective collec…
…tion of matches (#1477) * begin refactoring 'categorize' * have the 'find' function for SBTs return signatures * fix majority of tests * comment & then fix test * torture the tests into working * split find and _find_nodes to take different kinds of functions * redo 'find' on index * refactor lca_db to use new find * refactor SBT to use new find * comment/cleanup * refactor out common code * fix up gather * use 'passes' properly * attempted cleanup * minor fixes * get a start on correct downsampling * adjust tree downsampling for regular minhashes, too * remove now-unused search functions in sbtmh * refactor categorize to use new find * cleanup and removal * remove redundant code in lca_db * remove redundant code in SBT * add notes * remove more unused code * refactor most of the test_sbt tests * fix one minor issue * fix jaccard calculation in sbt * check for compatibility of search fn and query signature * switch tests over to jaccard similarity, not containment * fix test * remove test for unimplemented LCA_Database.find method * document threshold change; update test * refuse to run abund signatures * flatten sigs internally for gather * reinflate abundances for saving * fix problem where sbt indices coudl be created with abund signatures * more * split flat and abund search * make ignore_abundance work again for categorize * turn off best-only, since it triggers on self-hits. * add test: 'sourmash index' flattens sigs * add note about something to test * fix typo; still broken tho * location is now a property * move search code into search.py * remove redundant scaled checking code * best-only now works properly for two tests * 'fix' tests by removing v1 and v2 SBT compatibility * simplify (?) downsampling code * require keyword args in MinHash.downsample(...) * fix bug with downsample * require keyword args in MinHash.downsample(...) * fix test to use proper downsampling, reverse order to match scaled * add test for revealed bug * remove unnecessary comment * flatten subject MinHash, too * add testme comment * clean up sbt find * clean up lca find * add IndexSearchResult namedtuple for search and gather results * add more tests for Index classes * add tests for subj & query num downsampling * tests for Index.search_abund * refactor a bit * refactor make_jaccard_search_query; start tests * even more tests * test collect, best_only * more search tests * remove unnec space * add minor comment * deal with status == None on SystemExit * upgrade and simplify categorize * restore test * merge * fix abundance search in SBT for categorize * code cleanup and refactoring; check for proper error messages * add explicit test for incompatible num * refactor MinHash.downsample * deal with status == None on SystemExit * fix test * fix comment mispelling * properly pass kwargs; fix search_sbt_index * add simple tests for SBT load and search API * allow arbitrary kwargs for LCA_DAtabase.find * add testing of passthru-kwargs * re-enable test * add notes to update docstrings * docstring updates * fix test * better tests for gather --save-unassigned * remove unnecessary check-me comment * clear out docstring * SBT search doesn't work on v1 and v2 SBTs b/c no min_n_below * fix my dumb mistake with gather * have the JaccardSearch.collect function take the matching signature * adjust search protocol to permit ignoring after finding * comment me * update comments, threshold setting Co-authored-by: Luiz Irber <[email protected]>
- Loading branch information
Showing
6 changed files
with
157 additions
and
15 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters