-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the dice metric to UMAP #5140
base: branch-23.02
Are you sure you want to change the base?
Add the dice metric to UMAP #5140
Conversation
Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4656
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Answers rapidsai#4586 Authors: - Victor Lafargue (https://github.com/viclafargue) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4596
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Depends on rapidsai/raft#583 and rapidsai/cumlprims_mg#86 This PR includes a few fixes to support source-only builds: 1. Use `CPMFindPackage` to get `cumlprims_mg` a. This enables building against a local `cumlprims_mg` build (`-D cumlprims_mg_ROOT=/cumlprims_mg/cpp/build`) 2. Explicitly call `find_package(Threads)` to be more resilient to RAFT changes Authors: - Paul Taylor (https://github.com/trxcllnt) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Robert Maynard (https://github.com/robertmaynard) URL: rapidsai#4649
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Goals: - Python documentation should be built for PRs - Checks should fail upon sphinx warnings - Existing warnings are fixed in this PR This PR covers only the python documentation. C++ docs are out of scope. Authors: - Rory Mitchell (https://github.com/RAMitchell) - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4585
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Continuation of rapidsai#2975 to change docstring to a doctest format. I'm also adding a pytest file to execute doctest, similar to what's being done in [cudf](https://github.com/rapidsai/cudf/blob/branch-22.04/python/cudf/cudf/tests/test_doctests.py) Authors: - Micka (https://github.com/lowener) - Yuqiong Li (https://github.com/yuqli) - Dante Gama Dessavre (https://github.com/dantegd) - Michael Demoret (https://github.com/mdemoret-nv) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4618
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
… single GPU (rapidsai#4645) Fixes single GPU build by dividing decomposition pxd headers so that cumlprims files are not included in single GPU cython files. Authors: - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#4645
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Replace `cudf.logical_not` with `~`. Authors: - Andy Adinets (https://github.com/canonizer) Approvers: - Bradley Dice (https://github.com/bdice) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4669
- removed default `T = float` from multi-sum - tests with float64 for multi-sum and `child_index()` - refactored multi-sum tests to reduce shared memory usage - this is needed for the tests with float64 to compile This is pull request 1 of 3 to integrate rapidsai#4646. This pull request is partly based on the work by @levsnv. Authors: - Andy Adinets (https://github.com/canonizer) Approvers: - Divye Gala (https://github.com/divyegala) - William Hicks (https://github.com/wphicks) URL: rapidsai#4648
This small PR : * Rectifies/clarifies the wrong parameter description for `max_depth` parameter in `RandomForestRegressor` and `RandomForestClassifier` in RF and dask-RF. * Corrects a syntax error in dask-rf docs * Corrects the `max_depth` check in RF-python layer resolves rapidsai#4623 Authors: - Venkat (https://github.com/venkywonka) Approvers: - Rory Mitchell (https://github.com/RAMitchell) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4666
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
We don't have arm64 packages for `rapids-doc-env`, so only run the doc builds for x86_64 Authors: - Ray Douglass (https://github.com/raydouglass) Approvers: - AJ Schmidt (https://github.com/ajschmidt8)
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Templatized functions related to FIL inference in preparation of `float64` support. Instantiations of templates with `float64`, or tests for `float64`, _are not included_; they will be provided in a future pull request. This is pull request 2 of 3 to integrate rapidsai#4646. This pull request is partly based on the work by @levsnv. Authors: - Andy Adinets (https://github.com/canonizer) - Levs Dolgovs (https://github.com/levsnv) - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Divye Gala (https://github.com/divyegala) - William Hicks (https://github.com/wphicks) URL: rapidsai#4655
Fix rapidsai#4638. Authors: - Rory Mitchell (https://github.com/RAMitchell) Approvers: - Dante Gama Dessavre (https://github.com/dantegd)
[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci]
Add a `libcuml-tests` package to the `libcuml` recipe: - This is a prerequisite for removing "Project Flash" from our build/CI scripts. - The `libcuml-tests` package was added as an additional output to the existing `libcuml` recipe (which was renamed to `libcuml-split`) Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4635
`float64` support in FIL core and tests. Added `float64`-based instantiations of FIL types and functions, together with `float64`-based FIL tests. Treelite and Python layers _are not covered_ here, but will be in separate pull requests. This is pull request 3 of 3 to enable `float64` support in FIL core. This is partly based on the work by @levsnv. Authors: - Andy Adinets (https://github.com/canonizer) - Levs Dolgovs (https://github.com/levsnv) Approvers: - Divye Gala (https://github.com/divyegala) URL: rapidsai#4646
Authors: - Jiwei Liu (https://github.com/daxiongshu) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#4692
[REVIEW] Unpin `dask` & `distributed` for development
This PR fixes a minor typo in the cuML intro documentation. Authors: - Christopher Akiki (https://github.com/cakiki) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#4695
CMake 3.23.1 contains the bug fixes that we need to make use of CMake 3.23, so now we can update the pinnings to just avoid 3.23.0. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Jordan Jacobelli (https://github.com/Ethyling) - Robert Maynard (https://github.com/robertmaynard) URL: rapidsai#4698
`float64` support in treelite->FIL import and Python layer Authors: - Andy Adinets (https://github.com/canonizer) - Levs Dolgovs (https://github.com/levsnv) Approvers: - Philip Hyunsu Cho (https://github.com/hcho3) - William Hicks (https://github.com/wphicks) URL: rapidsai#4690
Due to some limitations on the number of reusable workflows that can be used in a given GH Action workflow, I need to switch the nightly workflow files (`build.yaml` and `test.yaml`) to use `workflow_dispatch` instead of `workflow_call`. Because of this switch, we can also remove the hardcoded `repo` field in `{build,test}.yaml`.
…5108) This is a small cleanup to remove the outdated `PROJECT_FLASH` environment variable from the libcuml recipe. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5108
This PR adds a missing dependency for the `pr-builder` job in the `pr.yaml` workflow. Authors: - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - AJ Schmidt (https://github.com/ajschmidt8)
…i#5113) Authors: - Dante Gama Dessavre (https://github.com/dantegd) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Divye Gala (https://github.com/divyegala) - Ray Douglass (https://github.com/raydouglass) URL: rapidsai#5113
closes rapidsai#4986 Authors: - Divye Gala (https://github.com/divyegala) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#5102
Answers rapidsai#5106 Authors: - Victor Lafargue (https://github.com/viclafargue) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5116
…#5117) Authors: - William Hicks (https://github.com/wphicks) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5117
rapids-cmake 23.02 is deprecating the magic value of `ALL` since it doesn't cleanly map to the cmake magic value of `all`. Instead we use `RAPIDS` which better represents the architectures we are building for. Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5124
…5125) Small cleanup to remove `MACOSX_DEPLOYMENT_TARGET` references from a build script (macOS is not supported by RAPIDS). Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#5125
It looks like there are still some attributes with `ptr` in their name that are being serialized with the hdbscan object and that's causing some segfaults when serialization is performed on a separate process. Closes rapidsai#4986 Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5128
This PR adds pip wheel CI to the Conda CI, instead of having them work separately. Authors: - Sevag H (https://github.com/sevagh) - AJ Schmidt (https://github.com/ajschmidt8) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#5109
Authors: - Carl Simon Adorf (https://github.com/csadorf) Approvers: - William Hicks (https://github.com/wphicks) URL: rapidsai#5026
@@ -572,12 +572,12 @@ def test_fuzzy_simplicial_set(n_rows, | |||
@pytest.mark.parametrize('metric', ['l2', 'euclidean', 'sqeuclidean', 'l1', | |||
'manhattan', 'minkowski', 'chebyshev', | |||
'cosine', 'correlation', 'jaccard', | |||
'hamming', 'canberra']) | |||
'hamming', 'canberra', 'dice']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can see if CI passes on this but I don't believe our dense pairwise distance API is currently able to compute Dice distance. This is a really trivial computation to do w/ any primitive that's capable of computing euclidean, though. It just requires the dot product and l1 norms:
value_t q_r_union = q_norm + r_norm;
value_t dice = (2 * dot) / q_r_union;
bool both_empty = q_r_union == 0;
return 1 - ((!both_empty * dice) + both_empty);
Let me know if this is something we should prioritize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test passed for me locally before I put this PR up, but your point is well taken as the dense pairwise distance API doesn't support this currently. The existing machinery seems to handle some but not all of the behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly still scratching my head at how this could be working w/ dense inputs, which is making me wonder if we're getting a false positive in the passing test. The FAISS brute-force API, which we're still using at the present time for non-L2 (and non-haversine) distances, doesn't support dice distance at all.
Sounds like we should table this PR until we better understand what's happening. Unless others disagree, will close this PR. |
@cjnolet I suggest we schedule this for 23.04. |
3bc1de0
to
e7fd6cc
Compare
@dantegd @beckernick Can we close this in favor of #5943? |
This PR:
metric="dice"
to UMAPCloses #5129