Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Make all nvcc warnings into errors #8916

Merged
merged 6 commits into from
Nov 2, 2022

Conversation

trxcllnt
Copy link
Contributor

@trxcllnt trxcllnt commented Jul 30, 2021

Seeing what impact -Werror=all-warnings has on device-side compilation.

Device warnings now treated as errors:

cudf/cpp/src/io/orc/stripe_enc.cu (633): error: dynamic initialization is not supported for a function-scope static __shared__ variable within a __device__/__global__ function

cudf/cpp/src/io/orc/writer_impl.cu
ptxas error   : Stack size for entry function '_ZN4cudf6detail20single_thread_kernelIZNS_2io6detail3orc19make_orc_table_viewERKNS_10table_viewERKNS_17table_device_viewEPKNS2_14table_metadataEN3rmm16cuda_stream_viewEEUlvE_EEvT_' cannot be statically determined

cudf/cpp/src/binaryop/compiled/binary_ops.cu(46): error: parameter "mr" was declared but never referenced

cudf/cpp/src/binaryop/compiled/binary_ops.cu(204): error: variable "out" was declared but never referenced

@trxcllnt trxcllnt requested a review from harrism July 30, 2021 17:31
@trxcllnt trxcllnt requested a review from a team as a code owner July 30, 2021 17:31
@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Jul 30, 2021
@trxcllnt trxcllnt added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Jul 30, 2021
@harrism
Copy link
Member

harrism commented Aug 3, 2021

I would love to enable this. But the recursive device function and the initialization of shared memory are really hard to work around... Is there a pragma we can use to disable those warnings only at those two locations?

@harrism
Copy link
Member

harrism commented Aug 3, 2021

For the unused parameters, just delete the parameter name from the parameter list.

@devavret
Copy link
Contributor

devavret commented Aug 4, 2021

I would love to enable this. But the recursive device function and the initialization of shared memory are really hard to work around... Is there a pragma we can use to disable those warnings only at those two locations?

I tried my hand at fixing those. I converted the recursive method into iterative with a separately allocated stack. The shared mem should be easy. Just need to replace the span with a data ptr and size and add a method that returns a span from it.

@harrism
Copy link
Member

harrism commented Aug 4, 2021

@devavret are you working on the cuIO changes, then?

@devavret
Copy link
Contributor

devavret commented Aug 5, 2021

@devavret are you working on the cuIO changes, then?

Yep. PR will be ready shortly.

Copy link
Contributor

@jrhemstad jrhemstad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not enable this. There are types in both libcu++ and cuCo where the shared memory static initialization warning cannot be worked around.

@harrism
Copy link
Member

harrism commented Aug 11, 2021

@jrhemstad We can disable errors based on that warning, while enabling all others. But fixing the cuIO static initialization warning should be done regardless.

rapids-bot bot pushed a commit that referenced this pull request Aug 18, 2021
Contributes to #8916

Authors:
  - Devavret Makkar (https://github.com/devavret)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - David Wendt (https://github.com/davidwendt)

URL: #8975
@github-actions
Copy link

This PR has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates.

@vyasr
Copy link
Contributor

vyasr commented Oct 27, 2022

@jrhemstad are you still opposed to this? As of today (just tested) our build succeeds with -Werror=all-warnings, so it seems like even if we are going to run into specific cases that are impossible to workaround we should be able to guard the necessary cuco/libcu++ includes with nvcc pragmas.

@vyasr vyasr requested review from a team as code owners November 1, 2022 19:26
@vyasr
Copy link
Contributor

vyasr commented Nov 1, 2022

I am fine with doing more, but that seems way too broad. It looks like the error is just in arrow's filesystem.h (and from looking at their code I can see the problem) so I would limit the use of the pragmas to just those cases. The only places I see it are

cpp/include/cudf/io/datasource.hpp:#include <arrow/filesystem/filesystem.h>
cpp/tests/io/arrow_io_source_test.cpp:#include <arrow/filesystem/filesystem.h>

Could you try adding the pragmas locally and see if tests pass for you? If so, then we can add that in this PR and hopefully be set.

I'm not really convinced that will fix the issue though because you are seeing the error when compiling arrow itself. I think we need to avoid propagating the new option to the arrow build itself. However, testing this out should confirm that theory.

@ttnghia
Copy link
Contributor

ttnghia commented Nov 1, 2022

I see warnings with filesystem.h and s3fs.h headers. Let me test disable such warning locally....

@vyasr
Copy link
Contributor

vyasr commented Nov 1, 2022

Sorry yes I missed that that S3FileSystem was in a different header file. The problems are all the same, not overloading all signatures of FileSystem::Equals (maybe something we should consider fixing upstream anyway).

@ttnghia
Copy link
Contributor

ttnghia commented Nov 1, 2022

Humnn, #pragma ... doesn't work with nvcc. We may need something else. CC @robertmaynard for help.

@vyasr
Copy link
Contributor

vyasr commented Nov 1, 2022

@ttnghia the pragmas you want are documented in the page I linked above. You should use the narrowest possible, so something like

#pragma nv_diag_suppress 611
#include <arrow/src/filesystem.h>
#pragma nv_diag_default 611
// Or maybe 611-D, I'm not sure how exactly the naming scheme for the warnings work

@codecov
Copy link

codecov bot commented Nov 1, 2022

Codecov Report

Base: 87.47% // Head: 88.11% // Increases project coverage by +0.63% 🎉

Coverage data is based on head (0104a3d) compared to base (f817d96).
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.12    #8916      +/-   ##
================================================
+ Coverage         87.47%   88.11%   +0.63%     
================================================
  Files               133      135       +2     
  Lines             21826    21985     +159     
================================================
+ Hits              19093    19372     +279     
+ Misses             2733     2613     -120     
Impacted Files Coverage Δ
python/cudf/cudf/io/text.py 91.66% <0.00%> (-8.34%) ⬇️
python/cudf/cudf/core/_base_index.py 81.28% <0.00%> (-4.27%) ⬇️
python/cudf/cudf/io/json.py 92.06% <0.00%> (-2.68%) ⬇️
python/cudf/cudf/utils/utils.py 89.91% <0.00%> (-0.69%) ⬇️
python/dask_cudf/dask_cudf/core.py 73.72% <0.00%> (-0.41%) ⬇️
python/cudf/cudf/io/parquet.py 90.45% <0.00%> (-0.39%) ⬇️
python/dask_cudf/dask_cudf/backends.py 84.90% <0.00%> (-0.37%) ⬇️
python/cudf/cudf/core/column/datetime.py 89.62% <0.00%> (-0.09%) ⬇️
python/cudf/cudf/io/orc.py 92.94% <0.00%> (-0.09%) ⬇️
python/cudf/cudf/core/dataframe.py 93.67% <0.00%> (-0.06%) ⬇️
... and 33 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@ttnghia
Copy link
Contributor

ttnghia commented Nov 1, 2022

Okay using #pragma nv_diag_suppress 611 works for me. So please add it to the PR. We only care if the Arrow headers included in our .hpp files.

@vyasr
Copy link
Contributor

vyasr commented Nov 1, 2022

@ttnghia I've pushed the changes, let me know if those are consistent with what you needed to do.

@vyasr vyasr requested a review from ttnghia November 2, 2022 00:06
@vyasr
Copy link
Contributor

vyasr commented Nov 2, 2022

@ttnghia
Copy link
Contributor

ttnghia commented Nov 2, 2022

We still miss this:

_deps/arrow-src/cpp/src/arrow/table.h(244): error #611-D: overloaded virtual function "arrow::RecordBatchReader::ReadNext" is only partially overridden in class "arrow::TableBatchReader"

when building src/interop/from_arrow.cu and src/interop/to_arrow.cu.

@vyasr
Copy link
Contributor

vyasr commented Nov 2, 2022

OK I don't see any direct inclusions of arrow/table.h, so I'm assuming that this is coming from the transitive include in api.h. I've blocked that one off, hopefully that's the last of it.

@ttnghia
Copy link
Contributor

ttnghia commented Nov 2, 2022

OK I don't see any direct inclusions of arrow/table.h, so I'm assuming that this is coming from the transitive include in api.h. I've blocked that one off, hopefully that's the last of it.

Yes, I confirm that the latest patch fixes all errors for me.

Thanks for working on this 😄 👍

@robertmaynard
Copy link
Contributor

@ttnghia the pragmas you want are documented in the page I linked above. You should use the narrowest possible, so something like

#pragma nv_diag_suppress 611
#include <arrow/src/filesystem.h>
#pragma nv_diag_default 611
// Or maybe 611-D, I'm not sure how exactly the naming scheme for the warnings work

Compiling with --display-error-number will tell you what values you need to provide to #pragma nv_diag_suppress

Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like this behavior to be controlled with a CMake option ( default to ON ). The big issue is that when new compiler versions are released they will have new warnings / errors are better heuristics which will cause hard build failures.

During this bringup phase it is nice to be able to compile with warnings not as errors, verify correctness of the binary and go back to fixing the warning or submitting a compiler regression.

@vyasr
Copy link
Contributor

vyasr commented Nov 2, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit d949cd2 into rapidsai:branch-22.12 Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants