Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] migrating cumlHandle_impl (and related dependencies) into raft #3

Merged

Conversation

teju85
Copy link
Member

@teju85 teju85 commented Apr 28, 2020

@dantegd @JohnZed @cjnolet @afender for review.

Notable changes from cuML:

  1. a base class called raft::Allocator to reduce code duplication between deviceAllocator and hostAllocator
  2. handle is called raft::handle_t, but I'm open to suggestions for a better name
  3. disabling of logging features, as we need to see how to expose the current spdlog wrapper in cuml, without actually needing users of raft (cugraph and cuml currently) worry about underlying spdlog library.
  4. raft::handle_t doesn't contain the comms interface. It is better tackled in a separate PR.
  5. Enable clang-tidy checks to enforce consistent naming convention.

This PR is still a WIP, but filing it early to solicit feedback about the interface.

@seunghwak
Copy link
Contributor

I see multiple styles in naming classes,

e.g.
class Allocator (started with upppercase)
class deviceAllocator (start with_lowercase)
class buffer_base (use _ in the middle)
class handle_t (use _t at the end)

And when do we use which style?

@cjnolet
Copy link
Member

cjnolet commented Apr 28, 2020

raft::handle_t doesn't contain the comms interface. It is better tackled in a separate PR.

This is perfect. I've been planning to move the comms over right after the handle is moved over.

cpp/include/raft/allocator.hpp Outdated Show resolved Hide resolved
cpp/include/raft/buffer_base.hpp Outdated Show resolved Hide resolved
cpp/include/raft/cudart_utils.h Outdated Show resolved Hide resolved
cpp/include/raft/cudart_utils.h Show resolved Hide resolved
cpp/include/raft/cudart_utils.h Outdated Show resolved Hide resolved
cpp/include/raft/handle.hpp Outdated Show resolved Hide resolved
cpp/include/raft/host_buffer.hpp Outdated Show resolved Hide resolved
Copy link
Member

@afender afender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! That looks in pretty good shape to me. I added a few notes.
I'd agree that a consistent naming convention from the beginning will help in the long run.

cpp/include/raft/cudart_utils.h Outdated Show resolved Hide resolved
cpp/include/raft/allocator.hpp Outdated Show resolved Hide resolved
cpp/include/raft/cudart_utils.h Outdated Show resolved Hide resolved
@dillon-cullinan
Copy link
Contributor

rerun tests

@GPUtester
Copy link
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@teju85
Copy link
Member Author

teju85 commented May 11, 2020

Hey @dantegd @cjnolet @JohnZed @afender @seunghwak ,
I think the PR is now ready to be looked at again with all the comments from above addressed. Can you folks check if this now looks good to be merged or not?

@teju85 teju85 changed the title [WIP] migrating cumlHandle_impl (and related dependencies) into raft [REVIEW] migrating cumlHandle_impl (and related dependencies) into raft May 11, 2020
Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a concern about using preceding _ (over trailing _) in class private member variable names (see my last comment), but besides this, this PR looks great to me :-)

cpp/include/raft/buffer_base.hpp Outdated Show resolved Hide resolved
@dillon-cullinan
Copy link
Contributor

rerun tests

Copy link
Member

@dantegd dantegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@teju85 the one thing missing now that CI is totally up and running is enable the googletests in the gpuci build script:

# GTEST_OUTPUT="xml:${WORKSPACE}/test-results/raft_cpp/" ./test_raft

@teju85
Copy link
Member Author

teju85 commented May 12, 2020

All I need is another approval from someone from cuML team. @JohnZed mentioned that he's going to take a look at this one soon.

In the meanwhile, if @dantegd can approve this, it'll be great. Asking for more approvals and reviews as I think we need to thoroughly review such a fundamental change to raft!

Copy link
Contributor

@JohnZed JohnZed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I can't claim I fully get all the rules in this clang tidy format, but it seems to match the google style guide and the code style clearly complies with our standard, so it looks great to me.

#include <cuda_runtime.h>
#include "cudart_utils.h"

namespace raft {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like RAFT or Raft as we use capitalized MLCommon in cuML

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuml's current naming convention are a rather exception than rule. cuML started out much earlier than other projects in RAPIDS. At that time, we did a mistake of not standardizing any of these rules (mostly due to lack of time for thinking about these things)!

If you look at rmm, cudf or cugraph, they too are following the lowercase namespace. Thus, in order to be aligned with the rest of RAPIDS project, even here these are being named in lowercase.

cpp/include/raft/handle.hpp Outdated Show resolved Hide resolved
cpp/include/raft/mr/allocator.hpp Show resolved Hide resolved
@dantegd dantegd merged commit e003de2 into rapidsai:branch-0.14 May 12, 2020
@teju85 teju85 deleted the fea-ext-migrate-cumlHandle_impl-into-raft branch May 13, 2020 01:53
afender pushed a commit that referenced this pull request Jul 29, 2020
Update fork from latest release
divyegala referenced this pull request in divyegala/raft Oct 28, 2020
rapids-bot bot pushed a commit that referenced this pull request Aug 22, 2023
rapids-bot bot pushed a commit that referenced this pull request Feb 15, 2024
Demangle the error stack trace provided by GCC.
Example output:
```bash
RAFT failure at file=/workspace/raft/cpp/bench/ann/src/raft/raft_ann_bench_utils.h line=127: Ooops!
Obtained 16 stack frames
#1 in /workspace/raft/cpp/build/libraft_ivf_pq_ann_bench.so: raft::logic_error::logic_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) +0x5e [0x7fb20acce45e]
#2 in /workspace/raft/cpp/build/libraft_ivf_pq_ann_bench.so: raft::bench::ann::configured_raft_resources::stream_wait(CUstream_st*) const +0x2e3 [0x7fb20acd0ac3]
#3 in /workspace/raft/cpp/build/libraft_ivf_pq_ann_bench.so: raft::bench::ann::RaftIvfPQ<float, long>::search(float const*, int, int, unsigned long*, float*, CUstream_st*) const +0x63e [0x7fb20acd44fe]
#4 in ./cpp/build/ANN_BENCH: void raft::bench::ann::bench_search<float>(benchmark::State&, raft::bench::ann::Configuration::Index, unsigned long, std::shared_ptr<raft::bench::ann::Dataset<float> const>, raft::bench::ann::Objective) +0xf76 [0x55853859f586]
#5 in ./cpp/build/ANN_BENCH: benchmark::internal::LambdaBenchmark<benchmark::RegisterBenchmark<void (&)(benchmark::State&, raft::bench::ann::Configuration::Index, unsigned long, std::shared_ptr<raft::bench::ann::Dataset<float> const>, raft::bench::ann::Objective), raft::bench::ann::Configuration::Index&, unsigned long&, std::shared_ptr<raft::bench::ann::Dataset<float> const>&, raft::bench::ann::Objective&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void (&)(benchmark::State&, raft::bench::ann::Configuration::Index, unsigned long, std::shared_ptr<raft::bench::ann::Dataset<float> const>, raft::bench::ann::Objective), raft::bench::ann::Configuration::Index&, unsigned long&, std::shared_ptr<raft::bench::ann::Dataset<float> const>&, raft::bench::ann::Objective&)::{lambda(benchmark::State&)#1}>::Run(benchmark::State&) +0x84 [0x558538548f14]
#6 in ./cpp/build/ANN_BENCH: benchmark::internal::BenchmarkInstance::Run(long, int, benchmark::internal::ThreadTimer*, benchmark::internal::ThreadManager*, benchmark::internal::PerfCountersMeasurement*) const +0x168 [0x5585385d6498]
#7 in ./cpp/build/ANN_BENCH(+0x149108) [0x5585385b7108]
#8 in ./cpp/build/ANN_BENCH: benchmark::internal::BenchmarkRunner::DoNIterations() +0x34f [0x5585385b8c7f]
#9 in ./cpp/build/ANN_BENCH: benchmark::internal::BenchmarkRunner::DoOneRepetition() +0x119 [0x5585385b99b9]
#10 in ./cpp/build/ANN_BENCH(+0x13afdd) [0x5585385a8fdd]
#11 in ./cpp/build/ANN_BENCH: benchmark::RunSpecifiedBenchmarks(benchmark::BenchmarkReporter*, benchmark::BenchmarkReporter*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) +0x58e [0x5585385aa8fe]
#12 in ./cpp/build/ANN_BENCH: benchmark::RunSpecifiedBenchmarks() +0x6a [0x5585385aaada]
#13 in ./cpp/build/ANN_BENCH: raft::bench::ann::run_main(int, char**) +0x11ed [0x5585385980cd]
#14 in /lib/x86_64-linux-gnu/libc.so.6(+0x28150) [0x7fb213e28150]
#15 in /lib/x86_64-linux-gnu/libc.so.6: __libc_start_main +0x89 [0x7fb213e28209]
#16 in ./cpp/build/ANN_BENCH(+0xbfcef) [0x55853852dcef]


```

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #2188
dantegd pushed a commit to dantegd/raft that referenced this pull request Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants