Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Span implementation. #399

Merged
merged 6 commits into from
Feb 3, 2022
Merged

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Nov 23, 2021

The implementation is largely ported from xgboost with some simplification and cleanups.

The one in XGBoost was modeled after the core guideline support library instead of std, which
was still a draft back then. The one in this PR uses a plain pointer as the iterator and doesn't
have bound check in release mode. ISO span uses size_type for index operator, this PR defines
a template parameter to allow indexing using other integer types.

  • Implement host_span
  • Implement device_span

@trivialfis trivialfis requested review from a team as code owners November 23, 2021 08:13
@trivialfis trivialfis added the feature request New feature or request label Nov 23, 2021
@cjnolet cjnolet added the non-breaking Non-breaking change label Nov 23, 2021
@cjnolet cjnolet changed the base branch from branch-21.12 to branch-22.02 November 23, 2021 20:06
@seunghwak
Copy link
Contributor

Any reason to adopt the XGBoost based one instead of the cudf one (https://github.com/rapidsai/cudf/blob/branch-22.02/cpp/include/cudf/utilities/span.hpp)?

AFAIK, this cudf based implementation will eventually go to libcu++.

@trivialfis
Copy link
Member Author

Hi, we haven't decided which implementation to use. The one from xgboost is closer to std::span defined in c++ 20. This PR is just to push the discussion.

@seunghwak
Copy link
Contributor

Hi, we haven't decided which implementation to use. The one from xgboost is closer to std::span defined in c++ 20. This PR is just to push the discussion.

I see. I guess the benefit of the cudf one is that it has host_span & device_span so allows us to specify whether the pointer is for host memory or device memory. I guess this is beneficial unless CUDA's unified memory model advances to the level that this distinction is unnecessary.

@trivialfis
Copy link
Member Author

trivialfis commented Dec 3, 2021

Yup, having safeguards is certainly useful. But the one in cudf doesn't support static extend https://github.com/rapidsai/cudf/blob/7ac8aac8d9174ecfcb06f56199c641bf3d869961/cpp/include/cudf/utilities/span.hpp#L42 . If we were to adopt it static extend must be integrated (not sure what's the best way yet).

Please note that static extend can be used in both host and device.

@trivialfis trivialfis changed the title [WIP] Span implementation. [REVIEW] Span implementation. Dec 14, 2021
@trivialfis
Copy link
Member Author

Added a template parameter to distinguish host and device span.

@trivialfis
Copy link
Member Author

I should rename span_t to span_base and similarly for other variants.

@trivialfis
Copy link
Member Author

trivialfis commented Dec 17, 2021

Made lint changes to some other files.

The implementation is largely ported from xgboost with some simplification and cleanups.

The one in XGBoost was modeled after core guideline support library instead of std, which
was still a draft back then.  The one in this PR uses plain pointer as iterator and don't
have bound check in release model.

Tests.

Distinguish host/device.

More tests.

Rename.

Rename + flexiable indexing.

lint
@github-actions
Copy link

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

@hcho3
Copy link
Contributor

hcho3 commented Jan 20, 2022

Any progress on this one? TreeExplainer in cuML (rapidsai/cuml#4473) can benefit from a Span implementation. (Right now, it has its own small Span class.) Or is this going to be part of the upcoming MDSpan (#437) ?

@trivialfis
Copy link
Member Author

One can use mdspan to define span equivalence, but the underlying procedures are lot more complicated.

@hcho3
Copy link
Contributor

hcho3 commented Jan 20, 2022

Got it. In that case, TreeExplainer could potentially use mdspan instead.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor things at this point but wanted to provide the review promptly so we can get both span and mdspan into 22.04.

cpp/include/raft/common/span.hpp Outdated Show resolved Hide resolved
cpp/include/raft/common/span.hpp Outdated Show resolved Hide resolved
* - Otherwise, dynamic_extent.
*/
template <std::size_t Extent, std::size_t Offset, std::size_t Count>
struct extent_value_t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a side thought- it would be super useful to be able to implicitly convert between a _mdspan_vector_t and a _span_t. For example, this would allow the RAFT API to accept an _mdspan_vector_t and pass it a _span_t

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjnolet We have some integration work to do. For instance, the dynamic_extent in this PR is duplicated with the one in stdex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can polish after having the base implementation and make a list of additional features we want, like conversion between span and 1-dim mdspan, padding for arrays, resize for arrays.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trivialfis, I've looked over your changes and it's looking like this is about ready to merge. What do you think? Can you capture the follow-on items in Github issues so we don't lose them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpp/include/raft/common/span.hpp Outdated Show resolved Hide resolved
cpp/include/raft/common/span.hpp Outdated Show resolved Hide resolved
Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. Just one comment, and love the test coverage. Just another question - why so many direct calls to cudart in the tests?

cpp/include/raft/common/span.hpp Outdated Show resolved Hide resolved
@trivialfis
Copy link
Member Author

I would actually not want to name it host_span_t, because I think only STL types should use that _t extension. raft::host_span seems okay to me personally

Got it, I will remove all the postfix for span and mdspan.

why so many direct calls to cudart in the tests?

My apologies, it was ported from xgboost, which supports enumerating devices. Will remove those calls.

* Update license.
* Add doc strings.
* Move to top level namespace.
* Remove set device.
@trivialfis trivialfis changed the base branch from branch-22.02 to branch-22.04 January 26, 2022 11:29
@trivialfis
Copy link
Member Author

Changed base to 22.04.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpucibot merge

@trivialfis
Copy link
Member Author

Hey, is there anything else I need to do for this PR? Maybe retrigger the CI?

@cjnolet
Copy link
Member

cjnolet commented Feb 3, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 7568d4a into rapidsai:branch-22.04 Feb 3, 2022
}

private:
size_type size_{0};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is missing a principle characteristic of span where a span with a static extent has only a single pointer data member. This is usually accomplished by doing something like:

template <typename T, std::size_t Extent>
struct span_storage{
   span_storage(T* p, std::size_t /*unused*/) : data_{p} {}
   constexpr std::size_t size(){ return Extent; };
   T* data_ = nullptr;
   // no size member
};
template <typename T>
struct span_storage<T, dynamic_extent>{
   span_storage(T* p, std::size_t s) : data_{p}, size_{s} {}
   constexpr std::size_t size() { return size_; };
private:
   T* data_ = nullptr;
   std::size_t size_;
};
...

template <typename T, std::size_t Extent = dynamic_extent>
struct span{
   ...
private:
   span_storage<T, Extent> storage_;
}

This further guarantees the compiler will optimize according to the static extent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion, will follow up on this #511 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants