Learn heuristic to pick fastest select_k algorithm #1523

benfred · 2023-05-17T18:42:32Z

This uses the select_k dataset from #1497 to learn a heuristic that picks the fastest select_k variant based on the rows/cols/k of the input. The heuristic is modelled as a DecisionTree, which is automatically exported as C++ code that is compiled into RAFT. This lets us learn a function to pick the fastest select_k method that requires only a few if statements in C++, making it very cheap to evaluate.
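To make the "few if statements" point concrete, here is a minimal sketch of what a decision tree exported to C++ can look like. It is not the generated code from this PR: the enum `SketchAlgo`, the function name `choose_algo_sketch`, and every threshold are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the algorithm enum; names are illustrative only.
enum class SketchAlgo { kRadix, kWarpSort };

// Illustrative shape of a decision tree exported as nested branches.
// All thresholds are made up for this example and are NOT learned values.
inline SketchAlgo choose_algo_sketch(std::size_t rows, std::size_t cols, int k)
{
  assert(k > 0);
  if (k > 128) {
    // Leaf reached by splitting on k alone.
    return SketchAlgo::kRadix;
  }
  if (cols > 100000) {
    // Inner node: a second split on rows decides between the two leaves.
    return rows > 16 ? SketchAlgo::kWarpSort : SketchAlgo::kRadix;
  }
  return SketchAlgo::kWarpSort;
}
```

Each call costs a handful of integer comparisons and no memory access, which is why a tree of this shape can be evaluated on every select_k call without measurable overhead.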


cjnolet

LGTM. Just one tiny suggestion.

cjnolet · 2023-05-17T20:36:45Z

cpp/include/raft/matrix/detail/select_k-inl.cuh

+ * on different values of rows/cols/k. The decision tree is converted to c++
+ * code, which is cut and paste below.
+ *
+ * The code to generate is in cpp/scripts/heuristics/select_k, running the


Just a tiny nitpick:

Suggested change

- * The code to generate is in cpp/scripts/heuristics/select_k, running the
+ * NOTE: The code to generate is in cpp/scripts/heuristics/select_k, running the

cjnolet · 2023-05-17T22:49:16Z

/merge

achirkin · 2023-05-18T06:23:18Z

cpp/include/raft/matrix/detail/select_k-inl.cuh

+ */
+inline Algo choose_select_k_algorithm(size_t rows, size_t cols, int k)
+{
+  if (k > 134) {


I think we'd better use log_2(k) instead of k when constructing the heuristic, so that the values of k are spaced in powers of two. For all warp-based algorithms, performance for a non-power-of-two k is equal to that of k rounded up to the next power of two (the queue capacity parameter).
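One way to read this suggestion is to map k to ceil(log_2(k)) before it is used as a feature, so that every k sharing the same rounded-up power of two yields the same value. The sketch below assumes that reading; the helper names `round_up_pow2` and `ceil_log2` are hypothetical and are not taken from RAFT.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= k (hypothetical helper).
inline std::uint32_t round_up_pow2(std::uint32_t k)
{
  // The upper bound keeps the shift below from overflowing.
  assert(k >= 1 && k <= (std::uint32_t{1} << 31));
  std::uint32_t p = 1;
  while (p < k) { p <<= 1; }
  return p;
}

// ceil(log2(k)): the exponent of the power of two returned above.
inline int ceil_log2(std::uint32_t k)
{
  assert(k >= 1 && k <= (std::uint32_t{1} << 31));
  int e = 0;
  while ((std::uint32_t{1} << e) < k) { ++e; }
  return e;
}
```

With these helpers, k = 134 (the threshold in the hunk above) and k = 256 both map to a capacity of 256 and a feature value of 8, so a tree trained on this feature could only split at power-of-two boundaries.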

benfred requested a review from a team as a code owner May 17, 2023 18:42

benfred added the improvement and non-breaking labels May 17, 2023

github-actions bot added the cpp label May 17, 2023

benfred mentioned this pull request May 17, 2023

Learn heuristic to pick fastest select_k algorithm #1455

Closed

benfred added 2 commits May 17, 2023 11:46

Try select_k benchmarks with and without a memory pool

46d8980

missing file

1b8ef37

benfred mentioned this pull request May 17, 2023

use matrix::select_k in brute_force::knn call #1463

Merged

cjnolet approved these changes May 17, 2023


code review updates

b5a2acc

cjnolet assigned benfred May 17, 2023

rapids-bot bot merged commit 618dc23 into rapidsai:branch-23.06 May 17, 2023

achirkin reviewed May 18, 2023


benfred deleted the select_k_heuristic2 branch May 15, 2024 22:09
