Fix syncing mechanism in `raft-ann-bench` C++ search #1961

divyegala · 2023-11-04T17:59:28Z

No description provided.

tfeher · 2023-11-05T12:48:24Z

cpp/bench/ann/src/common/benchmark.hpp

@@ -340,6 +349,11 @@ void bench_search(::benchmark::State& state,
      double actual_recall = static_cast<double>(match_count) / static_cast<double>(total_count);
      state.counters.insert({{"Recall", actual_recall}});
    }
+    std::cout << "Last thread about to acquire lock" << std::endl;
+    // std::unique_lock lk(init_mutex);
+    processed.store(false, std::memory_order_acq_rel);


Thanks @divyegala, this is going in the right direction! This line is only executed inside the conditional block `if (state.thread_index() == state.threads() - 1)`, so the last thread resets the `processed` variable and thereby signals thread 0 that it can load the next index.

But do we have any guarantee that the last thread finishes last? I don't think we can assume that. If, for example, thread 4 is stalled and the last thread finishes before it, we run into the same problem again: thread 0 reloads the index while thread 4 is still using it.

I think we need a counter to keep track of how many threads have finished computation, and only reset the `processed` variable once the counter reaches `state.threads()`.
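To make the counter idea concrete, here is a minimal standalone sketch. It is not the code from this PR: `finished_threads`, `mark_finished`, and `run_demo` are hypothetical names, and plain `std::thread` workers stand in for the gbench threads. Only the thread that arrives last, whichever one that happens to be, resets `processed`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the shared state in benchmark.hpp.
std::atomic<bool> processed{true};     // false means the next index may be loaded
std::atomic<int> finished_threads{0};  // threads done with the current index

// Called by every thread once it is done with the current index.
// Returns true only for the thread that arrived last.
bool mark_finished(int n_threads)
{
  // fetch_add returns the previous value, so the last arrival sees n_threads - 1.
  if (finished_threads.fetch_add(1, std::memory_order_acq_rel) == n_threads - 1) {
    finished_threads.store(0, std::memory_order_relaxed);  // re-arm for the next round
    processed.store(false, std::memory_order_release);     // signal the loader thread
    return true;
  }
  return false;
}

// Runs n_threads workers and returns how many of them performed the reset.
int run_demo(int n_threads)
{
  processed.store(true);
  finished_threads.store(0);
  std::atomic<int> resets{0};
  std::vector<std::thread> workers;
  for (int i = 0; i < n_threads; ++i) {
    workers.emplace_back([&resets, n_threads] {
      if (mark_finished(n_threads)) { resets.fetch_add(1); }
    });
  }
  for (auto& w : workers) { w.join(); }
  return resets.load();
}
```

With this shape the reset no longer depends on which thread index finishes last: `run_demo(8)` performs exactly one reset regardless of scheduling.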

gbench guarantees that all threads enter the function at the same time, right? So thread 0 will not reload until all threads have finished computation. Is the all-thread synchronization only a one time thing at the start of the benchmark loop, or every single time bench_search is invoked by gbench threads? I assumed it was the latter. If not, I can add the counter logic.

cc @cjnolet who might have a quick answer here

@tfeher I reworked the logic such that either case will work now IMO, whether thread sync has been achieved at the start of the function bench_search or not

The guarantees are outlined here. I guess it's not fully guaranteed that the threads will enter `bench_search()` at the same time. I wonder, though, could we just move the two lines of code that read the pointers written by thread 0 into the benchmark loop itself? That would then guarantee that all the threads enter the benchmark loop together. I can't imagine that just reading these two variables would have an effect on perf. What do you guys think?

@divyegala if we do this right and use an atomic counter for the number of threads that have entered the sync portion, we will have effectively created our own barrier, which I think we could add to the code here and reuse. The major benefit of this is that once we update to C++20, it'll be a single-line code change to go from `raft::barrier()` to `std::barrier()`.
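For illustration, a reusable barrier along those lines could look like the C++17 sketch below. This is an assumption-laden sketch, not the implementation that landed in this PR: `simple_barrier` and `count_early_readers` are hypothetical names, and it uses a mutex plus a condition variable rather than a bare atomic counter, because that keeps reuse across rounds easy to reason about. The method is named `arrive_and_wait()` to mirror the C++20 `std::barrier` member of the same name.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier for a fixed number of threads.
class simple_barrier {
 public:
  explicit simple_barrier(std::ptrdiff_t expected)
    : expected_(expected), waiting_(0), generation_(0)
  {
  }

  // Blocks until `expected` threads have called this for the current round.
  void arrive_and_wait()
  {
    std::unique_lock<std::mutex> lk(mutex_);
    const std::size_t my_generation = generation_;
    if (++waiting_ == expected_) {
      // Last arrival: start a new generation and release everyone.
      waiting_ = 0;
      ++generation_;
      cv_.notify_all();
    } else {
      // The generation check guards against spurious wakeups and makes reuse safe.
      cv_.wait(lk, [&] { return generation_ != my_generation; });
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  const std::ptrdiff_t expected_;
  std::ptrdiff_t waiting_;
  std::size_t generation_;
};

// Returns how many times a thread passed the barrier before all threads of
// that round had arrived (0 if the barrier works as intended).
int count_early_readers(int n_threads, int n_rounds)
{
  simple_barrier barrier(n_threads);
  std::atomic<int> arrived{0};
  std::atomic<int> violations{0};
  std::vector<std::thread> workers;
  for (int t = 0; t < n_threads; ++t) {
    workers.emplace_back([&, n_threads, n_rounds] {
      for (int r = 0; r < n_rounds; ++r) {
        arrived.fetch_add(1);
        barrier.arrive_and_wait();
        // After round r, at least (r + 1) * n_threads arrivals must be visible.
        if (arrived.load() < (r + 1) * n_threads) { violations.fetch_add(1); }
      }
    });
  }
  for (auto& w : workers) { w.join(); }
  return violations.load();
}
```

Because the interface matches `std::barrier::arrive_and_wait()`, swapping in the standard type after a move to C++20 should mostly be a matter of changing the declaration.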

@cjnolet I agree, please see my latest commit. Hopefully that seems like an iron-clad solution here.

@divyegala you are right, the current implementation is sufficient. The next benchmark cannot start until all threads have finished the current benchmark.

Note that this is not a documented feature of gbench; it is only visible if we look into the actual implementation.

Thanks for checking that @tfeher. I changed the implementation anyway, so that it now works even if gbench doesn't synchronize the threads before calling the function.

cjnolet · 2023-11-05T23:13:04Z

/merge

Authors:
- Divye Gala (https://github.com/divyegala)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#1961

attempt to fix syncing mechanism

2493674

divyegala added the bug (Something isn't working) and non-breaking (Non-breaking change) labels Nov 4, 2023

divyegala self-assigned this Nov 4, 2023

github-actions bot added the cpp label Nov 4, 2023

divyegala added 2 commits November 4, 2023 11:50

use atomic_bool

543a380

remove unused var

bd429fe

tfeher reviewed Nov 5, 2023


divyegala added 3 commits November 5, 2023 08:20

rework with thread count

49d3ee7

style fix

7e869b4

reorder headers

4d1cd94

cjnolet approved these changes Nov 5, 2023


divyegala marked this pull request as ready for review November 5, 2023 22:21

divyegala requested a review from a team as a code owner November 5, 2023 22:21

cjnolet approved these changes Nov 5, 2023


rapids-bot bot merged commit bafd2a8 into rapidsai:branch-23.12 Nov 5, 2023
57 checks passed
