Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Add ability to place Thrust in a custom namespace. #1464

Merged
merged 1 commit into from
Jul 19, 2021

Conversation

alliepiper
Copy link
Collaborator

@alliepiper alliepiper commented Jun 17, 2021

This provides a workaround for downstream projects that encounter
a variety of issues from dynamically linking multiple libraries that
use Thrust.

See the new thrust/detail/config/namespace.h header for details.

Added several tests and checks to validate that this behavior is correct,
and the __THRUST_DEFINE_HAS_MEMBER_FUNCTION utility has been rewritten
to WAR an nvcc bug when the old implementation was used with objects in
an anonymous namespace.

New tests:

  • testing/namespace_wrapped.cu
  • testing/cmake/check_namespace.cmake

Fixes #1401. This change requires NVIDIA/cub#326,

@alliepiper alliepiper marked this pull request as draft June 17, 2021 18:10
@alliepiper alliepiper added this to the 1.14.0 milestone Jun 17, 2021
@alliepiper
Copy link
Collaborator Author

run tests

@alliepiper
Copy link
Collaborator Author

run tests

@alliepiper alliepiper added testing: gpuCI passed Passed gpuCI testing. testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). labels Jun 21, 2021
@alliepiper
Copy link
Collaborator Author

DVS CL: 30103594

@alliepiper
Copy link
Collaborator Author

New DVS CL: 30104019

@alliepiper alliepiper force-pushed the new_namespace_macros branch 2 times, most recently from d1bf59b to 7fb9428 Compare June 22, 2021 21:07
@alliepiper
Copy link
Collaborator Author

New DVS CL: 30109376

@alliepiper
Copy link
Collaborator Author

NVBug 3333484 is preventing a clean build on DVS when the anonymous namespace wrapper is used.

@alliepiper alliepiper force-pushed the new_namespace_macros branch from 7fb9428 to e27e4bc Compare June 30, 2021 16:02
@alliepiper
Copy link
Collaborator Author

Dropped the anonymous namespace macros since the associated compiler bug won't be fixed by the time this is needed.

run tests

@alliepiper
Copy link
Collaborator Author

run tests

@alliepiper alliepiper requested a review from gevtushenko July 13, 2021 18:59
Copy link
Collaborator

@gevtushenko gevtushenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike in CUB, there are quite a lot of headers that don't include config.h or namespace.h directly but use the new macro. I'm fond of self-sufficient headers. It's quite hard to convince myself that there is no issue otherwise. Here is the list of headers:

for f in $(rg "THRUST_NAMESPACE_BEGIN" -l); do if ! grep -q "config.h" $f; then echo $f; fi; done
limits.h
mr/allocator.h
mr/fancy_pointer_resource.h
mr/memory_resource.h
mr/polymorphic_adaptor.h
mr/tls_pool.h
mr/new.h
mr/disjoint_sync_pool.h
mr/sync_pool.h
mr/disjoint_pool.h
mr/pool.h
mr/pool_options.h
mr/validator.h
detail/numeric_traits.h
mr/disjoint_tls_pool.h
detail/device_delete.inl
detail/device_new.inl
detail/trivial_sequence.h
detail/distance.inl
detail/tuple_transform.h
detail/device_ptr.inl
detail/tuple.inl
detail/swap_ranges.inl
detail/functional.inl
detail/temporary_array.inl
detail/caching_allocator.h
detail/reduce.inl
detail/fill.inl
detail/equal.inl
detail/cstdint.h
detail/merge.inl
detail/uninitialized_fill.inl
detail/pair.inl
detail/internal_functional.h
detail/uninitialized_copy.inl
detail/tuple_meta_transform.h
detail/scatter.inl
detail/transform.inl
detail/gather.inl
detail/transform_scan.inl
detail/generate.inl
random/detail/linear_congruential_engine_discard.h
random/detail/uniform_real_distribution.inl
random/detail/discard_block_engine.inl
random/detail/linear_feedback_shift_engine.inl
random/detail/xor_combine_engine_max.h
random/detail/normal_distribution.inl
random/detail/subtract_with_carry_engine.inl
random/detail/uniform_int_distribution.inl
random/detail/xor_combine_engine.inl
random/detail/linear_congruential_engine.inl
random/detail/random_core_access.h
random/detail/linear_feedback_shift_engine_wordmask.h
system/detail/error_category.inl
system/detail/bad_alloc.h
system/detail/error_condition.inl
system/detail/system_error.inl
system/cuda/memory_resource.h
system/detail/error_code.inl
detail/allocator/destroy_range.inl
detail/functional/composite.h
detail/config/namespace.h
detail/type_traits/is_metafunction_defined.h
detail/type_traits/minimum_type.h
detail/type_traits/function_traits.h
detail/type_traits/is_call_possible.h
iterator/detail/reverse_iterator_base.h
detail/util/align.h
iterator/detail/zip_iterator.inl
iterator/detail/permutation_iterator_base.h
iterator/detail/normal_iterator.h
iterator/detail/transform_output_iterator.inl
iterator/detail/iterator_traversal_tags.h
iterator/detail/transform_input_output_iterator.inl
iterator/detail/constant_iterator_base.h
iterator/detail/zip_iterator_base.h
iterator/detail/iterator_traits.inl
iterator/detail/counting_iterator.inl
iterator/detail/iterator_adaptor_base.h
iterator/detail/minimum_category.h
iterator/detail/transform_iterator.inl
iterator/detail/reverse_iterator.inl
detail/complex/clogf.h
detail/complex/ccoshf.h
detail/complex/cexpf.h
detail/complex/c99math.h
detail/complex/arithmetic.h
detail/complex/cpow.h
detail/complex/ccosh.h
detail/complex/complex.inl
detail/complex/csqrt.h
detail/complex/ctanhf.h
detail/complex/ctanh.h
detail/complex/csinhf.h
detail/complex/stream.h
detail/complex/csinh.h
detail/complex/clog.h
detail/complex/cproj.h
detail/complex/cexp.h
detail/complex/csqrtf.h
detail/mpl/math.h
system/tbb/detail/merge.inl
system/detail/sequential/sort.inl
system/detail/generic/reduce.inl
system/detail/generic/fill.h
system/detail/sequential/stable_radix_sort.inl
system/detail/generic/reduce_by_key.inl
system/detail/sequential/stable_merge_sort.inl
system/cuda/detail/uninitialized_copy.h
system/cuda/detail/make_unsigned_special.h
system/cuda/detail/error.inl
system/cuda/detail/merge.h
system/cuda/detail/uninitialized_fill.h
system/detail/sequential/binary_search.h
system/cuda/detail/cross_system.h
system/cuda/detail/remove.h
system/cuda/detail/reverse.h
system/cuda/detail/swap_ranges.h
system/cuda/detail/par_to_seq.h
system/cuda/detail/fill.h
system/cuda/detail/set_operations.h
system/cuda/detail/replace.h
system/cuda/detail/transform_reduce.h
system/cuda/detail/inner_product.h
system/cuda/detail/scan_by_key.h
system/cuda/detail/transform_scan.h
system/cuda/detail/binary_search.h
system/cuda/detail/gather.h
system/cuda/detail/scatter.h
system/cuda/detail/core/alignment.h

}}

} // end namespace detail

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any style recommendations for the number of new lines between namespace enclosing? For now it seems to be different in each case:

rm /tmp/{2,3}
find . -name "*.h" -exec bash -c 'lines=$(rg --multiline -e "}\s*THRUST" {} | wc -l); echo 1 >> /tmp/$lines' \;
wc -l /tmp/{2,3}
rm /tmp/{2,3}
 4 /tmp/2
41 /tmp/3
find . -name "*.h" -exec bash -c 'lines=$(rg --multiline -e "} // namespace detail\s*THRUST" {} | wc -l); echo 1 >> /tmp/$lines' \;
wc -l /tmp/{2,3}
 1 /tmp/2
12 /tmp/3

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing code is all over the place with these sorts of style issues, so there's not a firm rule.

I generally prefer a single blank line. This is what our proposed .clang-format file will do, and we'll fix this when we get around to setting that up since we can automate it.

@alliepiper
Copy link
Collaborator Author

Good point about the missing config headers. I'll add them before merging.

FYI, These projects haven't enforced IWYU in the past, and our current header tests only check that .h/.cuh files are self-sufficient -- and even then, none of the per-system Thrust headers are checked IIRC. This is another cleanup we may tackle at some point, but it's low priority. We'll definitely be more strict about this for 2.x.

@alliepiper alliepiper force-pushed the new_namespace_macros branch from e32899a to d6cce7b Compare July 16, 2021 21:14
@alliepiper
Copy link
Collaborator Author

Added missing config includes.

run tests

This provides a workaround for downstream projects that encounter
a variety of issues from dynamically linking multiple libraries that
use Thrust.

See the new `thrust/detail/config/namespace.h` header for details.

Added several tests and checks to validate that this behavior is correct,
and the `__THRUST_DEFINE_HAS_MEMBER_FUNCTION` utility has been rewritten
to WAR an nvcc bug when the old implementation was used with objects in
an anonymous namespace.

New tests:
  - testing/namespace_wrapped.cu
  - testing/cmake/check_namespace.cmake
@alliepiper alliepiper force-pushed the new_namespace_macros branch from d6cce7b to 363c352 Compare July 16, 2021 21:22
@alliepiper
Copy link
Collaborator Author

run tests

@alliepiper alliepiper added testing: gpuCI in progress Started gpuCI testing. and removed testing: gpuCI passed Passed gpuCI testing. labels Jul 16, 2021
@alliepiper alliepiper marked this pull request as ready for review July 19, 2021 19:29
@alliepiper alliepiper merged commit d35f44f into NVIDIA:main Jul 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
testing: gpuCI in progress Started gpuCI testing. testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS).
Projects
None yet
2 participants