[FEA] Expose UMAP embedding graph #4292
Comments
This issue has been labeled |
FWIW, this is of continued interest for use cases around security, fraud, genomics, and visualizing embeddings in general. We are discussing explainable AI approaches that build on this with the umap team. Meanwhile, we're doing a k-nn edge-recovery dance as a workaround for the RAPIDS flavors.
This is of increasing relevance fwiw :)
We are still interested in this. We're adding some autoumap bits to pygraphistry, and have to work around this when users switch from umap_learn to cuml.umap...
Still of interest :) We are getting ready for the |
@lmeyerov this shouldn't be too hard to do. Just to clarify: what you want is the fuzzy simplicial set graph here?
Yep, with priority for the one in the original space, not the embedding, and with the normalized/undirected weights. Our intuition is that, for explainability, the original space's weighted 1-simplexes are already interpretable and more precise. Likewise, it enables graph layout with the same initial seed. Getting the embedding's simplexes is interesting too, mostly so we can highlight which 1-simplexes were added vs. lost, but that's priority 2.
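For readers unfamiliar with the "normalized/undirected weights" mentioned above, here is a minimal sketch of one way directed membership strengths can be symmetrized. It assumes the probabilistic fuzzy union used by umap-learn's default settings (A + Aᵀ − A∘Aᵀ); the helper name `fuzzy_union` and the toy matrix are made up for illustration, and this is not a statement about how cuML implements it.

```python
import numpy as np
from scipy import sparse


def fuzzy_union(directed):
    """Symmetrize directed membership strengths with a probabilistic union.

    Assumption: weights are in [0, 1]; the result for an edge (i, j) is
    a + b - a * b, where a and b are the two directed strengths.
    """
    directed = sparse.csr_matrix(directed)
    transpose = directed.T.tocsr()
    product = directed.multiply(transpose)  # element-wise product
    return (directed + transpose - product).tocsr()


# Hypothetical directed strengths for three points.
directed = sparse.csr_matrix(np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.2],
    [0.0, 0.0, 0.0],
]))

undirected = fuzzy_union(directed)
print(undirected.toarray())
```

The result is symmetric, so each 1-simplex carries a single weight regardless of direction, which is the form that is convenient for graph layout and visualization.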
ping :) we're about to release |
@taureandyernv we are going to try and aim for 22.06. There's a PR open (#4711) to expose the simplicial set functions. That work required exposing sparse objects in Python that are populated by the C++ layer (e.g. the number of nonzeros isn't known ahead of time), which should make exposing the connectivities graph from a trained model even easier.
This PR closes issues #3123, #4704 and #4292
Authors:
- Victor Lafargue (https://github.com/viclafargue)
Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- AJ Schmidt (https://github.com/ajschmidt8)
URL: #4711
@lmeyerov, https://github.com/rapidsai/cuml/pull/4756/files added the attribute |
Excellent, is this for the 22.06 release (and in the nightlies already)?
@lmeyerov, yep, the feature made it into 22.06 (and the nightlies).
Is your feature request related to a problem? Please describe.
We would like access to UMAP's computed graph for downstream tasks like visualization and other ML methods.
See additional use cases discussed in #4228
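As a rough illustration of the downstream use described here, the sketch below turns a weighted, symmetric sparse graph into an edge list that a visualization or graph library could consume. The helper name `graph_to_edge_list` and the four-point graph are hypothetical; no particular UMAP implementation is assumed to produce exactly this object.

```python
import numpy as np
from scipy import sparse


def graph_to_edge_list(graph):
    """Return (src, dst, weight) arrays, one entry per undirected edge.

    Assumes `graph` is a symmetric sparse adjacency matrix, so only the
    strict upper triangle is kept to avoid listing each edge twice.
    """
    upper = sparse.triu(graph, k=1).tocoo()
    return upper.row, upper.col, upper.data


# Hypothetical weighted graph over four points (symmetric by construction).
rows = np.array([0, 1, 0, 2, 2, 3])
cols = np.array([1, 0, 2, 0, 3, 2])
vals = np.array([0.9, 0.9, 0.4, 0.4, 0.7, 0.7])
graph = sparse.coo_matrix((vals, (rows, cols)), shape=(4, 4)).tocsr()

src, dst, weight = graph_to_edge_list(graph)
edges = sorted(zip(src.tolist(), dst.tolist(), weight.tolist()))
for s, d, w in edges:
    print(f"{s} -- {d}  weight={w}")
```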
Describe the solution you'd like
When computing an embedding, have an option to also expose the weighted graph / cover tree, such as via a numpy sparse matrix (how umap_learn does it) or a cugraph weighted graph.
Ex:
Describe alternatives you've considered
umap_learn has umap.UMAP(transform_mode='graph', ...), except using that might mean having to call umap() twice. An explicit flag to expose the graph as part of the output may be more in line with expected use.
Currently, when using umap_learn, we do the above. When using cuML, we manually run knn to try to infer the graph from the embedding, but that's awkward and less accurate.
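To make the k-nn workaround concrete, here is a minimal CPU sketch of recovering edges from an embedding. It uses SciPy's `cKDTree` rather than any cuML call, and the helper name `knn_graph_from_embedding` and the toy points are assumptions for illustration only.

```python
import numpy as np
from scipy import sparse
from scipy.spatial import cKDTree


def knn_graph_from_embedding(embedding, n_neighbors):
    """Build a directed k-nn graph whose weights are Euclidean distances.

    Assumes the points are distinct, so each point's nearest hit is itself
    and can be dropped by skipping the first result column.
    """
    n_points = embedding.shape[0]
    tree = cKDTree(embedding)
    # Ask for one extra neighbor because the query point matches itself.
    distances, indices = tree.query(embedding, k=n_neighbors + 1)
    rows = np.repeat(np.arange(n_points), n_neighbors)
    cols = indices[:, 1:].ravel()
    vals = distances[:, 1:].ravel()
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n_points, n_points))


# Hypothetical 2-D embedding with two well-separated pairs of points.
embedding = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
recovered = knn_graph_from_embedding(embedding, n_neighbors=1)
print(recovered.toarray())
```

The recovered weights are distances in the embedding, not the membership strengths UMAP computed in the original space, which is one reason this approach is less accurate than reading the graph directly.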