Implement unravel_index for row-major array. #631
Conversation
rerun tests
Hmm? I thought libcu++ was available in raft by default.
libcu++ is pulled in by cucollections, which is now only enabled when either building tests or the optional "distance" component. This was done to reduce the dependency burden for projects downstream who aren't using the sparse distances API (which is what depends on cucollections).
Ah, thank you for sharing. I will try to switch to Thrust instead.
I have changed `cuda::std::tuple` to `thrust::tuple`. The PR is ready.
}

/**
 * \brief Turns linear index into coordinate. Similar to numpy unravel_index. This is not
I think it should be okay to expose this publicly, as we are doing with `flatten` and `reshape` in #601?
If others think it's public-ready then I'm happy to expose it. For me these are the concerns:
- what if one day mdspan decides to accept `std::extents` as an index?
- what if we use other implementations of tuple in the future, like `cuda::std::tuple` or just `std::tuple` instead of `thrust::tuple`?
@cjnolet had suggested that it's better we rely less on Thrust as time goes on. I see value in using `cuda::std::tuple` over `thrust::tuple`, as I have a feeling the latter is going to be replaced by the former. But maybe Corey feels differently about introducing libcu++ back as a core dependency.
@trivialfis I prefer to use `std::tuple` here. I'd like to keep both `thrust` and `libcu++` out of public APIs. C++ stdlib is fine.
@cjnolet but it doesn't always work in CUDA kernels. Also, it's hidden in the `detail` namespace.
The challenge w/ header-only here is that the dependencies for all included headers are required at compile-time by consumers downstream. That includes things pulled in transitively from the `detail` namespace. The goal here is to draw the line and make it so that all the public APIs in the `core/` directory can be safely exposed by our consumers through their own public APIs while not imposing any additional dependencies on their users outside of RMM and the CTK libs. Currently, our mdarray header in `detail` pulls in Thrust, and I'd like to eventually separate that out so that Thrust isn't a hard requirement just for including `mdarray`.

If we want to allow this function to be included by files in `core/`, I would propose we find or create some other object to contain the unraveled indices. Though, if this is truly internal code and meant to stay that way, maybe we should consider separating this function out into a different header for now which is documented accordingly (for example, mentioning that it uses Thrust, so it shouldn't be included by any headers in `core/`). Maybe something like `detail/mdarray_internal_utils.hpp` just to be very explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will hide this function as an internal function. Closing this PR for now and will submit it again when it's actually used.
template <typename LayoutPolicy, std::size_t... Exts>
MDSPAN_INLINE_FUNCTION auto unravel_index(size_t idx,
                                          detail::stdex::extents<Exts...> shape,
                                          LayoutPolicy const&)
Why is this function arg needed? To stay compatible with NumPy args? I think it's acceptable that we can get this information directly from the template type, and thus remove this arg.
Depends on how you like to call the function:
`unravel_index<stdex::layout_right>(idx, shape)`
or
`unravel_index(idx, shape, stdex::layout_right{})`
I chose the second one as it feels more aligned with the mdspan design:
`submdspan(array, std::full_extent_t{})  // use constructor of full_extent_t here`
Hmmm, that is fair. I see value in deduced template types as compared to explicit ones.
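For context, the row-major unravel itself is just repeated division by the trailing extents. A minimal host-only sketch, assuming a plain `std::array` shape and a hypothetical fixed-rank `unravel3` rather than RAFT's variadic `extents`-based signature:

```cpp
#include <array>
#include <cstddef>
#include <tuple>

// Sketch: unravel a linear index into 3-D coordinates for a row-major
// (C-contiguous) shape. Peel off the fastest-varying (last) dimension first.
std::tuple<std::size_t, std::size_t, std::size_t> unravel3(
    std::size_t idx, std::array<std::size_t, 3> shape)
{
  std::size_t k = idx % shape[2];  // last dimension varies fastest
  idx /= shape[2];
  std::size_t j = idx % shape[1];
  idx /= shape[1];
  std::size_t i = idx;             // first dimension varies slowest
  return {i, j, k};
}
```

For a shape of (2, 3, 4), linear index 13 corresponds to coordinate (1, 0, 1), matching `numpy.unravel_index(13, (2, 3, 4))`.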
template <typename Fn,
          typename Tup,
          std::size_t kTupSize = thrust::tuple_size<std::remove_reference_t<Tup>>::value>
MDSPAN_INLINE_FUNCTION auto constexpr apply(Fn&& f, Tup&& t) -> decltype(auto)
The trailing return type here is redundant
It's required to denote that the return type might be a reference.
`auto&&` for universal references?
Actually, I'm not sure if this will work. I'm okay with the former.
}

/**
 * C++ 17 style apply for thrust tuple.
Very much like this, and that it just works. I wager this would be very useful in accessing and assigning for COOs.
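For reference, a C++17-style `apply` is usually written as an index-sequence expansion. A std-only sketch of the pattern (using `std::tuple` instead of `thrust::tuple`; the `sketch` namespace and names are illustrative, not RAFT's), with `decltype(auto)` preserving reference-ness of the callable's result:

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// Expand the tuple elements into a call to f. decltype(auto) keeps the
// reference category of f's result (a plain auto would decay references).
template <typename Fn, typename Tup, std::size_t... Idx>
constexpr decltype(auto) apply_impl(Fn&& f, Tup&& t, std::index_sequence<Idx...>)
{
  return std::forward<Fn>(f)(std::get<Idx>(std::forward<Tup>(t))...);
}

template <typename Fn, typename Tup>
constexpr decltype(auto) apply(Fn&& f, Tup&& t)
{
  constexpr std::size_t kTupSize = std::tuple_size<std::remove_reference_t<Tup>>::value;
  return apply_impl(std::forward<Fn>(f), std::forward<Tup>(t),
                    std::make_index_sequence<kTupSize>{});
}

}  // namespace sketch
```

The Thrust version in the PR follows the same shape, swapping in `thrust::tuple_size` and `thrust::get`.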
rerun tests
Revised version of #631 with `thrust::tuple` replaced by `std::tuple` and the custom `apply` function replaced by `std::apply`. Related: #629

The implementation is mostly copied from XGBoost: https://github.com/dmlc/xgboost/blob/fdf533f2b9af9c068cddba50839574c6abb58dc3/include/xgboost/linalg.h#L523

In the tests, I have used both `__host__ __device__` and `__device__` lambdas to make sure `std::tuple` and `std::apply` work correctly in CUDA kernels.

Authors:
- Jiaming Yuan (https://github.com/trivialfis)

Approvers:
- Divye Gala (https://github.com/divyegala)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #723
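A small host-side illustration of why `std::apply` removes the need for the custom `apply`: once the coordinates live in a `std::tuple`, they can be forwarded straight into a multi-argument callable. (The `roundtrip` helper and its 2-D shape are hypothetical, just to show the pattern.)

```cpp
#include <cstddef>
#include <tuple>

// Unravel a linear index into (row, col) for a row-major 2-D layout, then
// use std::apply to pass the coordinate tuple to a callable that
// re-flattens it, demonstrating the round trip.
std::size_t roundtrip(std::size_t idx, std::size_t ncols)
{
  std::tuple<std::size_t, std::size_t> coord{idx / ncols, idx % ncols};
  return std::apply(
      [ncols](std::size_t i, std::size_t j) { return i * ncols + j; }, coord);
}
```

With `thrust::tuple` the standalone `std::apply` overload set does not apply, which is why #631 carried its own implementation.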
Related: #629

Only row-major is currently implemented in this PR. I can work on column-major support if needed. The function is hidden in the `detail` namespace right now because the return type (`thrust::tuple`) cannot be directly used in `mdspan`, which might change in the future. Please see the documentation and tests for example usage.

The implementation is mostly copied from XGBoost (I'm the author): https://github.com/dmlc/xgboost/blob/fdf533f2b9af9c068cddba50839574c6abb58dc3/include/xgboost/linalg.h#L523