Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] We shouldn't expose our dependent library symbols across libcuml++.so #2138

Open
teju85 opened this issue Apr 27, 2020 · 15 comments
Open
Assignees
Labels
CUDA / C++ CUDA issue inactive-30d inactive-90d proposal Change current process or code

Comments

@teju85
Copy link
Member

teju85 commented Apr 27, 2020

Describe the bug
While discussing RAFT integration, @dantegd described the reason for (re)cloning of dependencies of cuml here. Ideally, all symbols from these libraries that we depend upon should not have been exposed across libcuml++.so interface!

Expected behavior
If a user only has libcuml++.so and all headers inside cpp/include installed already, then they shouldn't need anything else (perhaps excepting cuda libraries and rmm), if they want to build python layer or just use the C++ library directly.

Advantages

  1. Simplifies the linker process, as the users don't have to worry too much about maintaining other dependencies.
  2. Simplifies a whole lot of cuML's python setup.py

Disadvantages

  1. This means we might have to link those dependencies statically, thereby further increasing our .so file size
@teju85 teju85 added bug Something isn't working ? - Needs Triage Need team to review and classify labels Apr 27, 2020
@hcho3
Copy link
Contributor

hcho3 commented Apr 27, 2020

@teju85 dmlc/xgboost#5590 may be relevant to this discussion, where XGBoost devs hid all C++ symbols from libxgboost.so and only exposed C API symbols. As a result, none of the C++ symbols from Rabit and dmlc-core dependencies leaked into libxgboost.so.

@teju85
Copy link
Member Author

teju85 commented Apr 27, 2020

@hcho3 unfortunately, we have to expose C++ symbols since our cython layer depends on those!

@hcho3
Copy link
Contributor

hcho3 commented Apr 27, 2020

@teju85 Got it. It should be however possible to selectively hide some symbols by using compiler attributes. In addition, we can hide C++ symbols that come from dependencies.

@teju85
Copy link
Member Author

teju85 commented Apr 27, 2020

Agreed. That's essentially the hope behind filing this issue.

@hcho3
Copy link
Contributor

hcho3 commented Apr 27, 2020

dmlc/xgboost#5590 used a white-list approach: hide all symbols, except a few symbols that were marked with a special macro.
Alternatively, we can consider a black-list approach: expose all symbols except those we elect to hide.

Either way, we are essentially mimicking what Windows does to all shared libs. On Windows platform, functions must be marked __declspec(dllexport) in order to be visible at DLL boundary.

@teju85
Copy link
Member Author

teju85 commented Apr 27, 2020

Good point. For now, if we can hide symbols from:

  1. treelite (and its dependencies)
  2. faiss
  3. spdlog

then that should be sufficient, for starters. Would you like to take this up, @hcho3 ?

@hcho3
Copy link
Contributor

hcho3 commented Apr 27, 2020

Is this issue blocking RAFT adoption?

@teju85
Copy link
Member Author

teju85 commented Apr 27, 2020

No, shouldn't be a blocking issue, but certainly very useful if this got done soon.

@hcho3
Copy link
Contributor

hcho3 commented Apr 27, 2020

Got it. I'll take a look at this issue once I make progress on speeding up Treelite serialization.

@teju85
Copy link
Member Author

teju85 commented Apr 27, 2020

Great! Thanks for volunteering. Agreed, this can certainly wait after the treelite-speedups work is over.

@hcho3
Copy link
Contributor

hcho3 commented Apr 27, 2020

Feel free to assign this issue to me.

@viclafargue viclafargue added CUDA / C++ CUDA issue proposal Change current process or code and removed ? - Needs Triage Need team to review and classify bug Something isn't working labels Apr 27, 2020
@JohnZed
Copy link
Contributor

JohnZed commented Apr 27, 2020

I believe we already do this for libdmlc and libprotobuf since we had issues with that? See https://github.com/rapidsai/cuml/blob/branch-0.14/cpp/CMakeLists.txt around line 318. Wish there were a more cmake-native way of doing it though.

@teju85
Copy link
Member Author

teju85 commented Apr 27, 2020

So, we should probably extend this for treelite, faiss and spdlog also?

@github-actions
Copy link

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@github-actions
Copy link

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CUDA / C++ CUDA issue inactive-30d inactive-90d proposal Change current process or code
Projects
None yet
Development

No branches or pull requests

4 participants