Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating cmake files to enable NVTX markers for cuML build #4684

Closed
wants to merge 1 commit into from

Conversation

vinaydes
Copy link
Contributor

@vinaydes vinaydes commented Apr 5, 2022

Currently cuML is not built with NVTX markers even when compiled with NVTX=ON. This PR enables NVTX markers for cuML (libcuml++so, tests and benchmark binary) build when RAFT is picked from conda environment (which is default behavior right now). It should work even when RAFT is picked from user specified path, but I haven't tested it.
As discussed previously elsewhere, we might decide to enable NVTX by default in future, this is a patch for time being.
@venkywonka @teju85 @achirkin @cjnolet @dantegd

@vinaydes vinaydes requested a review from a team as a code owner April 5, 2022 04:21
@GPUtester
Copy link
Contributor

Can one of the admins verify this patch?

@vinaydes
Copy link
Contributor Author

vinaydes commented Apr 5, 2022

IMO even this line

should be removed. Since RAFT expects NVTX_ENABLED and also RAFT does not get compiled as a lib. We need to compile the source that includes RAFT headers with -DNVTX_ENABLED. I could not test it and thus not doing that change. Let me know if it is okay to remove the line.

Copy link
Contributor

@achirkin achirkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been working with cmake for a while, but I've had an impression that normally this all should be handled by cmake as a regular header-only library. Raft is supposed to populate these compile/link options using the INTERFACE scope.

Also what puzzles me, get_raft.cmake has a few more parameters besides NVTX; how come some of them work and this one does not?..

I haven't tested this when raft is picked up from the conda environment, but the current setting works on my machine well with raft set by -DCPM_raft_SOURCE=../raft + DISABLE_FORCE_CLONE_RAFT=ON.

Comment on lines +107 to +110
if(NVTX)
target_compile_options(${PRIMS_BENCH_TARGET} PRIVATE "-DNVTX_ENABLED")
target_link_libraries(${PRIMS_BENCH_TARGET} PRIVATE nvToolsExt)
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all bench and test targets depend on the cuml library target, we shouldn't need to repeat the compile options / link libraries. Does it work properly without these lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw link time errors related to unresolved nvtx function calls. Thats why added them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it still error if you change the scope to PUBLIC in the library's CMakeLists?

@vinaydes
Copy link
Contributor Author

vinaydes commented Apr 5, 2022

Also what puzzles me, get_raft.cmake has a few more parameters besides NVTX; how come some of them work and this one does not?..
I haven't tested this when raft is picked up from the conda environment, but the current setting works on my machine well with raft set by -DCPM_raft_SOURCE=../raft + DISABLE_FORCE_CLONE_RAFT=ON.

@achirkin May be I am getting the CMake fundamentals wrong here but we are passing here NVTX as a parameter and if you look at nvtx.hpp it expects NVTX_ENABLED. IS CMake supposed to make this translation?

See line https://github.com/rapidsai/raft/blob/branch-22.06/cpp/include/raft/common/detail/nvtx.hpp#L23

@achirkin
Copy link
Contributor

achirkin commented Apr 5, 2022

May be I am getting the CMake fundamentals wrong here but we are passing here NVTX as a parameter and if you look at nvtx.hpp it expects NVTX_ENABLED. IS CMake supposed to make this translation?

NVTX is the cmake option, NVTX_ENABLED is the CPP macro; it's set in cpp/CMakeLists.txt#L166.

@vinaydes
Copy link
Contributor Author

vinaydes commented Apr 5, 2022

May be I am getting the CMake fundamentals wrong here but we are passing here NVTX as a parameter and if you look at nvtx.hpp it expects NVTX_ENABLED. IS CMake supposed to make this translation?

NVTX is the cmake option, NVTX_ENABLED is the CPP macro; it's set in cpp/CMakeLists.txt#L166.

I see. Does raft's CMakeLists.txt gets used while using raft in header only way?

@achirkin
Copy link
Contributor

achirkin commented Apr 5, 2022

I see. Does raft's CMakeLists.txt gets used while using raft in header only way?

It should be, to the best of my understanding. Yet, I see, NVTX is indeed not enabled when I ./build.sh --nvtx in a fresh standalone cuml environment. I wonder, if I miss something in cpp/cmake/thirdparty/get_raft.cmake?..

@achirkin
Copy link
Contributor

achirkin commented Apr 5, 2022

I've tried to trace cmake setup in cuml. It seems, the problem is ${CONDA_PATH}/envs/cuml_dev/lib/cmake/raft/raft-targets.cmake gets compiled during the environment setup. All options from raft/cpp/CMakeLists.txt are substituted by their default values at that moment. Then, at the moment we build cuml, and get_raft.cmake is executed, the NVTX option is passed along to the compiled raft-targets.cmake correctly, but never used due to being already substituted.

Ideally, it would be nice to prevent conda/cmake from substituting options like NVTX during the environment setup, but I don't know how to do that. It also is not clear to me, what should be the behavior of non-header-only raft components in that regard. However, just copying the directives from raft to dependent projects does not look like an optimal solution to me.

@vinaydes
Copy link
Contributor Author

vinaydes commented Apr 5, 2022

Thanks for looking in to it @achirkin. I am okay with adopting any of the approaches as long as we can enable NVTX for cuml.

@robertmaynard
Copy link
Contributor

I've tried to trace cmake setup in cuml. It seems, the problem is ${CONDA_PATH}/envs/cuml_dev/lib/cmake/raft/raft-targets.cmake gets compiled during the environment setup. All options from raft/cpp/CMakeLists.txt are substituted by their default values at that moment. Then, at the moment we build cuml, and get_raft.cmake is executed, the NVTX option is passed along to the compiled raft-targets.cmake correctly, but never used due to being already substituted.

Ideally, it would be nice to prevent conda/cmake from substituting options like NVTX during the environment setup, but I don't know how to do that. It also is not clear to me, what should be the behavior of non-header-only raft components in that regard. However, just copying the directives from raft to dependent projects does not look like an optimal solution to me.

Since NVTX_ENABLED is considered an INTERFACE compile definition for raft it is fixed at install time ( aka when the conda package is installed ). If this flag is meant to be controlled by the downstream consumers of RAFT we should remove it here https://github.com/rapidsai/raft/blob/branch-22.06/cpp/CMakeLists.txt#L166

@cjnolet
Copy link
Member

cjnolet commented Apr 5, 2022

ok to test

@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.06@c07e162). Click here to learn what that means.
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.06    #4684   +/-   ##
===============================================
  Coverage                ?   84.06%           
===============================================
  Files                   ?      252           
  Lines                   ?    20336           
  Branches                ?        0           
===============================================
  Hits                    ?    17095           
  Misses                  ?     3241           
  Partials                ?        0           
Flag Coverage Δ
dask 44.98% <0.00%> (?)
non-dask 77.35% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c07e162...930ae6b. Read the comment docs.

@github-actions
Copy link

github-actions bot commented May 6, 2022

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

@achirkin
Copy link
Contributor

rapidsai/raft#610 has been merged, so this one can be closed.

NB: in the current state, NVTX is ON by default within raft, but OFF by default in cuml. We'd need another PR to invert the --nvtx flag in cuml if we want that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants