[Tech Debt] Improve build time #5060

Open
wphicks opened this issue Dec 5, 2022 · 3 comments
Labels
feature request New feature or request

Comments

@wphicks
Contributor

wphicks commented Dec 5, 2022

Summary

It would be useful to improve build time as much as possible for faster developer iteration and reduced CI resource consumption.

Current bottleneck files

The following are roughly in order of priority. Note that this does not correspond strictly to the longest build times (although it mostly does). Instead, it takes into account what can and can't be parallelized and ensures that we're tackling each dependent branch.

  • silhouette_score.cu: far and away our worst offender, with a compile duration of 22 minutes on my local machine. Fixing this alone could cut our compile time by a third
  • knn_regress_mg.cu
  • hdbscan.cu
  • tsne.cu
  • fil/infer.cu: will be resolved if #4890 (Provide FIL implementation for both CPU and GPU) replaces the existing implementation
  • hierarchy/linkage.cu
  • knn_classify_mg.cu
  • umap.cu
  • knn.cu
  • tree_shap.cu
  • trustworthiness.cu
  • svc.cu
  • knn_mg.cu
  • svr.cu
@wphicks added the "feature request" and "? - Needs Triage" labels Dec 5, 2022
@wphicks
Contributor Author

wphicks commented Dec 5, 2022

Note, this issue replaces the original #3501 since there has been a fair amount of movement since then on this problem.

@wphicks removed the "? - Needs Triage" label Dec 5, 2022
@cjnolet
Member

cjnolet commented Dec 5, 2022

Some of these, especially those that require pairwise distances, will likely benefit from using the pre-compiled specializations that are already being built in RAFT. We've scraped through most things and updated them, but unfortunately a few are still lingering and might need to be updated (thinking single linkage, trustworthiness, silhouette_score, hdbscan).
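
For readers unfamiliar with the idea, here is a minimal sketch of how pre-compiled specializations typically cut compile time in C++/CUDA, assuming the standard `extern template` pattern. This is not RAFT's actual code: the `demo` namespace and `squared_l2` function are hypothetical stand-ins for an expensive templated distance routine. The header declares that certain instantiations exist elsewhere, so each including file skips instantiating them, and exactly one source file pays the compile cost.

```cpp
#include <cstddef>
#include <vector>

// ---- Would live in a header (hypothetical name: distance.hpp) ----
namespace demo {

// Generic template. Without the declarations below, every translation
// unit that calls it would instantiate (and compile) its own copy.
template <typename T>
T squared_l2(const std::vector<T>& a, const std::vector<T>& b)
{
  T acc = T(0);
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
    T d = a[i] - b[i];
    acc += d * d;
  }
  return acc;
}

// Explicit instantiation declarations: includers must not instantiate
// these; the linker resolves them from the one file that defines them.
extern template float squared_l2<float>(const std::vector<float>&,
                                        const std::vector<float>&);
extern template double squared_l2<double>(const std::vector<double>&,
                                          const std::vector<double>&);

}  // namespace demo

// ---- Would live in exactly one source file (hypothetical: distance.cpp) ----
namespace demo {

// Explicit instantiation definitions: compiled once, reused everywhere.
template float squared_l2<float>(const std::vector<float>&,
                                 const std::vector<float>&);
template double squared_l2<double>(const std::vector<double>&,
                                   const std::vector<double>&);

}  // namespace demo
```

Callers use `demo::squared_l2<float>(a, b)` as usual. The saving comes from the slow files no longer re-instantiating the same templates for the same types.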

@wphicks
Contributor Author

wphicks commented Dec 5, 2022

Just submitted #5061 to get the low-hanging fruit, but silhouette score remains our most significant bottleneck.

2 participants