Prune CUB's ChainedPolicy by __CUDA_ARCH_LIST__ #2154

Merged 6 commits on Sep 26, 2024

@bernhardmgruber bernhardmgruber commented Jul 31, 2024

Motivated by @gevtushenko and @elstehle explaining to me why CUB instantiates so many kernels and why having many tuning policies is costly, here is a mitigation: when the macro __CUDA_ARCH_LIST__ is available, we know at compile time which values the PTX version can take at runtime, so we can prune the dispatches CUB generates from the tuning policies to only those versions. This should give us faster compilation and allow us to use tuning policies more liberally.

Compile time and binary size of cub.example.device.radix_sort before and after:

before:
    ARCH=50;60;70;80;90: 23.826s 3886008B
    ARCH=86:              8.462s 1685520B
after:
    ARCH=50;60;70;80;90: 23.646s 3877904B
    ARCH=86:              6.095s 1232912B
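
A minimal sketch of the pruning idea, not taken from the PR's diff. It assumes that __CUDA_ARCH_LIST__ expands to a comma-separated list of architecture values; SKETCH_ARCH_LIST is a hard-coded stand-in for it, and the policy thresholds and all function names are hypothetical. The sketch only computes which policies of a chain can still be selected once the set of possible PTX versions is known at compile time.

```cpp
// Stand-in for nvcc's __CUDA_ARCH_LIST__, hard-coded so that the sketch also
// builds with a plain host compiler. The value mirrors the ARCH=86 case above.
#ifndef SKETCH_ARCH_LIST
#  define SKETCH_ARCH_LIST 860
#endif

// Architectures the binary is compiled for.
constexpr int arch_list[] = {SKETCH_ARCH_LIST};

// Hypothetical tuning policy thresholds in ascending order: each policy
// applies from its PTX version upwards until the next threshold takes over.
constexpr int policy_thresholds[] = {500, 600, 700, 800, 900};

// Threshold of the policy a chained dispatch would select for ptx_version:
// the largest threshold not exceeding it, or the lowest policy as a fallback.
constexpr int select_policy(int ptx_version)
{
  int selected = policy_thresholds[0];
  for (int threshold : policy_thresholds)
  {
    if (threshold <= ptx_version)
    {
      selected = threshold;
    }
  }
  return selected;
}

// True if at least one architecture in arch_list selects this policy.
constexpr bool is_reachable(int threshold)
{
  for (int arch : arch_list)
  {
    if (select_policy(arch) == threshold)
    {
      return true;
    }
  }
  return false;
}

// Number of policies that still need a dispatch after pruning.
constexpr int count_reachable()
{
  int count = 0;
  for (int threshold : policy_thresholds)
  {
    if (is_reachable(threshold))
    {
      ++count;
    }
  }
  return count;
}
```

With the single-architecture stand-in, only the policy at threshold 800 stays reachable, so count_reachable() yields 1 rather than 5. With a list that covers every threshold, nothing can be pruned, which is consistent with the nearly unchanged numbers for ARCH=50;60;70;80;90.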

@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label Jul 31, 2024
@bernhardmgruber bernhardmgruber force-pushed the chained_policy_prune branch 2 times, most recently from 71a0f9a to 04e42c9 Compare August 2, 2024 13:55
@bernhardmgruber

The unit tests now list all virtual architectures, since the list was shorter than I expected.

@bernhardmgruber bernhardmgruber marked this pull request as ready for review August 2, 2024 13:56
@bernhardmgruber bernhardmgruber requested review from a team as code owners August 2, 2024 13:56
@bernhardmgruber bernhardmgruber requested a review from griwes August 2, 2024 13:56
@bernhardmgruber

I reworked the feature so that it now only ever instantiates the dispatch for the PTX versions that appear in __CUDA_ARCH_LIST__, which is even better.
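
One way to picture this, as a sketch under stated assumptions rather than the code in this PR: the runtime PTX version is compared against each entry of a compile-time list, and the operation is instantiated only for those entries. DEMO_ARCH_LIST is a hard-coded stand-in for __CUDA_ARCH_LIST__, and arch_dispatch, report_arch, and dispatch are hypothetical names.

```cpp
// Stand-in for nvcc's __CUDA_ARCH_LIST__ (assumed to expand to a
// comma-separated list), hard-coded so the sketch builds with a host compiler.
#ifndef DEMO_ARCH_LIST
#  define DEMO_ARCH_LIST 600, 860
#endif

// Primary template: maps a runtime PTX version onto a compile-time list.
template <int... Archs>
struct arch_dispatch;

// Empty list: no listed architecture matched the runtime value.
template <>
struct arch_dispatch<>
{
  template <typename F>
  static constexpr int invoke(int /*ptx_version*/, F /*op*/)
  {
    return -1;
  }
};

// Non-empty list: test the first entry, then recurse into the rest.
template <int Arch, int... Rest>
struct arch_dispatch<Arch, Rest...>
{
  template <typename F>
  static constexpr int invoke(int ptx_version, F op)
  {
    if (ptx_version == Arch)
    {
      // The only place where op is instantiated for Arch.
      return op.template run<Arch>();
    }
    return arch_dispatch<Rest...>::invoke(ptx_version, op);
  }
};

// Hypothetical operation that reports the architecture it was instantiated
// for; a real one would launch the kernel tuned for Arch instead.
struct report_arch
{
  template <int Arch>
  constexpr int run() const
  {
    return Arch;
  }
};

// Entry point: dispatches over exactly the architectures in DEMO_ARCH_LIST.
constexpr int dispatch(int ptx_version)
{
  return arch_dispatch<DEMO_ARCH_LIST>::invoke(ptx_version, report_arch{});
}
```

In this sketch, run is instantiated for 600 and 860 only. A value outside the list, such as 700, falls through to the empty specialization and yields -1.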

@bernhardmgruber bernhardmgruber force-pushed the chained_policy_prune branch 3 times, most recently from f6522ab to 75d0f10 Compare August 14, 2024 21:30

🟨 CI finished in 6h 48m: Pass: 80%/250 | Total: 6d 04h | Avg: 35m 38s | Max: 1h 27m | Hits: 64%/17277
  • 🟨 cub: Pass: 74%/131 | Total: 3d 23h | Avg: 43m 53s | Max: 1h 27m | Hits: 41%/4272

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 00s | Avg: 22m 00s | Max: 22m 58s
      🔍 nvcc               Pass:  74%/129 | Total:  3d 23h | Avg: 44m 13s | Max:  1h 27m | Hits:  41%/4272  
    🟨 ctk
      🟩 11.1               Pass: 100%/15  | Total: 11h 09m | Avg: 44m 39s | Max: 53m 39s | Hits:  41%/712   
      🟥 11.8               Pass:   0%/3   | Total:  4h 01m | Avg:  1h 20m | Max:  1h 27m
      🟨 12.5               Pass:  73%/113 | Total:  3d 08h | Avg: 42m 49s | Max:  1h 11m | Hits:  41%/3560  
    🟨 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 44m 00s | Avg: 22m 00s | Max: 22m 58s
      🟩 nvcc11.1           Pass: 100%/15  | Total: 11h 09m | Avg: 44m 39s | Max: 53m 39s | Hits:  41%/712   
      🟥 nvcc11.8           Pass:   0%/3   | Total:  4h 01m | Avg:  1h 20m | Max:  1h 27m
      🟨 nvcc12.5           Pass:  72%/111 | Total:  3d 07h | Avg: 43m 11s | Max:  1h 11m | Hits:  41%/3560  
    🟨 cxx
      🟨 Clang9             Pass:  50%/6   | Total:  5h 22m | Avg: 53m 48s | Max:  1h 06m
      🟩 Clang10            Pass: 100%/3   | Total:  2h 26m | Avg: 48m 59s | Max: 49m 04s
      🟩 Clang11            Pass: 100%/4   | Total:  3h 20m | Avg: 50m 12s | Max: 51m 58s
      🟩 Clang12            Pass: 100%/4   | Total:  3h 16m | Avg: 49m 07s | Max: 49m 26s
      🟩 Clang13            Pass: 100%/4   | Total:  3h 14m | Avg: 48m 32s | Max: 49m 17s
      🟨 Clang14            Pass:  75%/4   | Total:  3h 32m | Avg: 53m 13s | Max:  1h 05m
      🟥 Clang15            Pass:   0%/4   | Total:  4h 13m | Avg:  1h 03m | Max:  1h 05m
      🟥 Clang16            Pass:   0%/4   | Total:  4h 08m | Avg:  1h 02m | Max:  1h 04m
      🟩 Clang17            Pass: 100%/26  | Total: 12h 50m | Avg: 29m 38s | Max: 57m 40s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 28m | Avg: 44m 17s | Max: 44m 28s
      🟩 GCC7               Pass: 100%/6   | Total:  4h 44m | Avg: 47m 20s | Max: 52m 29s
      🟩 GCC8               Pass: 100%/6   | Total:  4h 44m | Avg: 47m 23s | Max: 52m 42s
      🟩 GCC9               Pass: 100%/6   | Total:  4h 42m | Avg: 47m 03s | Max: 51m 27s
      🟥 GCC10              Pass:   0%/4   | Total:  4h 06m | Avg:  1h 01m | Max:  1h 07m
      🟥 GCC11              Pass:   0%/7   | Total:  8h 00m | Avg:  1h 08m | Max:  1h 27m
      🟨 GCC12              Pass:  25%/4   | Total:  4h 07m | Avg:  1h 01m | Max:  1h 04m
      🟨 GCC13              Pass:  75%/28  | Total: 12h 47m | Avg: 27m 24s | Max: 55m 08s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 37m | Avg: 52m 29s | Max: 53m 22s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 53m 39s | Avg: 53m 39s | Max: 53m 39s | Hits:  41%/712   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 47s | Max: 59m 10s | Hits:  41%/1424  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 15m | Avg:  1h 05m | Max:  1h 11m | Hits:  41%/2136  
    🟨 cxx_family
      🟨 Clang              Pass:  79%/59  | Total:  1d 18h | Avg: 43m 09s | Max:  1h 06m
      🟨 GCC                Pass:  66%/63  | Total:  1d 20h | Avg: 42m 33s | Max:  1h 27m
      🟩 Intel              Pass: 100%/3   | Total:  2h 37m | Avg: 52m 29s | Max: 53m 22s
      🟩 MSVC               Pass: 100%/6   | Total:  6h 04m | Avg:  1h 00m | Max:  1h 11m | Hits:  41%/4272  
    🟨 jobs
      🟨 Build              Pass:  68%/99  | Total:  3d 13h | Avg: 52m 05s | Max:  1h 27m | Hits:  41%/4272  
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 31m | Avg: 18m 55s | Max: 20m 53s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 05m | Avg: 15m 41s | Max: 17m 29s
      🟨 HostLaunch         Pass:  87%/8   | Total:  2h 23m | Avg: 17m 57s | Max: 20m 41s
      🟨 TestGPU            Pass:  87%/8   | Total:  2h 52m | Avg: 21m 33s | Max: 27m 35s
    🟨 gpu
      🟨 v100               Pass:  74%/131 | Total:  3d 23h | Avg: 43m 53s | Max:  1h 27m | Hits:  41%/4272  
    🟨 cpu
      🟨 amd64              Pass:  73%/123 | Total:  3d 17h | Avg: 43m 29s | Max:  1h 27m | Hits:  41%/4272  
      🟨 arm64              Pass:  87%/8   | Total:  6h 40m | Avg: 50m 01s | Max: 55m 08s
    🟥 sm
      🟥 60;70;80;90        Pass:   0%/3   | Total:  4h 01m | Avg:  1h 20m | Max:  1h 27m
      🟥 90a                Pass:   0%/4   | Total:  1h 17m | Avg: 19m 17s | Max: 19m 45s
    🟨 std
      🟨 11                 Pass:  79%/34  | Total:  1d 01h | Avg: 44m 24s | Max:  1h 27m
      🟨 14                 Pass:  75%/37  | Total:  1d 03h | Avg: 45m 07s | Max:  1h 16m | Hits:  41%/2136  
      🟨 17                 Pass:  72%/36  | Total:  1d 02h | Avg: 44m 27s | Max:  1h 17m | Hits:  41%/1424  
      🟨 20                 Pass:  70%/24  | Total: 16h 09m | Avg: 40m 23s | Max:  1h 11m | Hits:  41%/712   
    
  • 🟨 thrust: Pass: 87%/118 | Total: 2d 04h | Avg: 26m 47s | Max: 56m 29s | Hits: 71%/13005

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  86%/110 | Total:  2d 01h | Avg: 26m 50s | Max: 56m 29s | Hits:  71%/13005 
      🟩 arm64              Pass: 100%/8   | Total:  3h 28m | Avg: 26m 02s | Max: 30m 21s
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  84%/99  | Total:  2d 00h | Avg: 29m 38s | Max: 56m 29s | Hits:  57%/8670  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 48m | Avg:  9m 51s | Max: 19m 20s | Hits:  99%/4335  
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 58m | Avg: 14m 50s | Max: 17m 31s
    🟨 ctk
      🟩 11.1               Pass: 100%/15  | Total:  6h 52m | Avg: 27m 29s | Max: 56m 29s | Hits:  57%/1445  
      🟨 11.8               Pass:  66%/3   | Total:  2h 11m | Avg: 43m 42s | Max: 46m 18s
      🟨 12.5               Pass:  86%/100 | Total:  1d 19h | Avg: 26m 10s | Max: 55m 23s | Hits:  73%/11560 
    🟨 cudacxx
      🟥 ClangCUDA17        Pass:   0%/2   | Total:  1h 12m | Avg: 36m 25s | Max: 37m 46s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 52m | Avg: 27m 29s | Max: 56m 29s | Hits:  57%/1445  
      🟨 nvcc11.8           Pass:  66%/3   | Total:  2h 11m | Avg: 43m 42s | Max: 46m 18s
      🟨 nvcc12.5           Pass:  87%/98  | Total:  1d 18h | Avg: 25m 57s | Max: 55m 23s | Hits:  73%/11560 
    🟨 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 35m | Avg: 25m 53s | Max: 29m 20s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 22m | Avg: 27m 37s | Max: 28m 55s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 47m | Avg: 26m 51s | Max: 27m 45s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 48m | Avg: 27m 04s | Max: 28m 26s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 50m | Avg: 27m 35s | Max: 30m 09s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 52m | Avg: 28m 11s | Max: 29m 22s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 15s | Max: 30m 15s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 52m | Avg: 28m 00s | Max: 31m 23s
      🟨 Clang17            Pass:  88%/18  | Total:  6h 10m | Avg: 20m 34s | Max: 37m 46s
      🟩 GCC6               Pass: 100%/2   | Total: 50m 26s | Avg: 25m 13s | Max: 26m 27s
      🟨 GCC7               Pass:  50%/6   | Total:  1h 11m | Avg: 11m 54s | Max: 24m 40s
      🟨 GCC8               Pass:  83%/6   | Total:  2h 15m | Avg: 22m 35s | Max: 28m 42s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 46m | Avg: 27m 40s | Max: 30m 27s
      🟨 GCC10              Pass:  50%/4   | Total:  2h 32m | Avg: 38m 03s | Max: 39m 18s
      🟨 GCC11              Pass:  71%/7   | Total:  4h 43m | Avg: 40m 31s | Max: 46m 18s
      🟨 GCC12              Pass:  75%/4   | Total:  2h 32m | Avg: 38m 03s | Max: 38m 36s
      🟨 GCC13              Pass:  80%/20  | Total:  6h 26m | Avg: 19m 19s | Max: 32m 02s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 59m | Avg: 39m 46s | Max: 39m 47s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 56m 29s | Avg: 56m 29s | Max: 56m 29s | Hits:  57%/1445  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 39m | Avg: 49m 47s | Max: 50m 20s | Hits:  57%/2890  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 30m | Avg: 35m 08s | Max: 55m 23s | Hits:  78%/8670  
    🟨 cxx_family
      🟨 Clang              Pass:  96%/51  | Total: 21h 16m | Avg: 25m 01s | Max: 37m 46s
      🟨 GCC                Pass:  76%/55  | Total: 23h 18m | Avg: 25m 25s | Max: 46m 18s
      🟩 Intel              Pass: 100%/3   | Total:  1h 59m | Avg: 39m 46s | Max: 39m 47s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 06m | Avg: 40m 45s | Max: 56m 29s | Hits:  71%/13005 
    🟨 gpu
      🟨 v100               Pass:  87%/118 | Total:  2d 04h | Avg: 26m 47s | Max: 56m 29s | Hits:  71%/13005 
    🟨 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total:  1h 12m | Avg: 36m 25s | Max: 37m 46s
      🟨 nvcc               Pass:  88%/116 | Total:  2d 03h | Avg: 26m 37s | Max: 56m 29s | Hits:  71%/13005 
    🟨 sm
      🟨 60;70;80;90        Pass:  66%/3   | Total:  2h 11m | Avg: 43m 42s | Max: 46m 18s
      🟥 90a                Pass:   0%/4   | Total:  1h 19m | Avg: 19m 58s | Max: 26m 20s
    🟨 std
      🟨 11                 Pass:  83%/30  | Total: 12h 31m | Avg: 25m 02s | Max: 44m 12s
      🟨 14                 Pass:  85%/34  | Total: 15h 38m | Avg: 27m 36s | Max: 56m 29s | Hits:  67%/5780  
      🟨 17                 Pass:  90%/33  | Total: 15h 22m | Avg: 27m 57s | Max: 50m 13s | Hits:  71%/4335  
      🟨 20                 Pass:  90%/21  | Total:  9h 08m | Avg: 26m 06s | Max: 49m 32s | Hits:  78%/2890  
    
  • 🟥 pycuda: Pass: 0%/1

    🟥 cpu
      🟥 amd64              Pass:   0%/1  
    🟥 ctk
      🟥 12.5               Pass:   0%/1  
    🟥 cudacxx
      🟥 nvcc12.5           Pass:   0%/1  
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/1  
    🟥 cxx
      🟥 GCC13              Pass:   0%/1  
    🟥 cxx_family
      🟥 GCC                Pass:   0%/1  
    🟥 gpu
      🟥 v100               Pass:   0%/1  
    🟥 jobs
      🟥 Test               Pass:   0%/1  
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

🟨 CI finished in 1d 15h: Pass: 98%/250 | Total: 4d 21h | Avg: 28m 17s | Max: 1h 11m | Hits: 64%/17277
  • 🟨 cub: Pass: 96%/131 | Total: 2d 21h | Avg: 31m 58s | Max: 1h 11m | Hits: 41%/4272

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  96%/123 | Total:  2d 14h | Avg: 30m 41s | Max:  1h 11m | Hits:  41%/4272  
      🟩 arm64              Pass: 100%/8   | Total:  6h 53m | Avg: 51m 41s | Max: 55m 08s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total: 11h 09m | Avg: 44m 39s | Max: 53m 39s | Hits:  41%/712   
      🟩 11.8               Pass: 100%/3   | Total: 14m 02s | Avg:  4m 40s | Max:  5m 13s
      🔍 12.5               Pass:  96%/113 | Total:  2d 10h | Avg: 31m 00s | Max:  1h 11m | Hits:  41%/3560  
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 44m 00s | Avg: 22m 00s | Max: 22m 58s
      🟩 nvcc11.1           Pass: 100%/15  | Total: 11h 09m | Avg: 44m 39s | Max: 53m 39s | Hits:  41%/712   
      🟩 nvcc11.8           Pass: 100%/3   | Total: 14m 02s | Avg:  4m 40s | Max:  5m 13s
      🔍 nvcc12.5           Pass:  96%/111 | Total:  2d 09h | Avg: 31m 10s | Max:  1h 11m | Hits:  41%/3560  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 00s | Avg: 22m 00s | Max: 22m 58s
      🔍 nvcc               Pass:  96%/129 | Total:  2d 21h | Avg: 32m 07s | Max:  1h 11m | Hits:  41%/4272  
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  2h 26m | Avg: 24m 20s | Max: 45m 57s
      🟩 Clang10            Pass: 100%/3   | Total:  2h 26m | Avg: 48m 59s | Max: 49m 04s
      🟩 Clang11            Pass: 100%/4   | Total:  3h 20m | Avg: 50m 12s | Max: 51m 58s
      🟩 Clang12            Pass: 100%/4   | Total:  3h 16m | Avg: 49m 07s | Max: 49m 26s
      🟩 Clang13            Pass: 100%/4   | Total:  3h 14m | Avg: 48m 32s | Max: 49m 17s
      🟩 Clang14            Pass: 100%/4   | Total:  2h 31m | Avg: 37m 54s | Max: 50m 49s
      🟩 Clang15            Pass: 100%/4   | Total: 18m 17s | Avg:  4m 34s | Max:  4m 58s
      🟩 Clang16            Pass: 100%/4   | Total: 17m 49s | Avg:  4m 27s | Max:  4m 39s
      🟩 Clang17            Pass: 100%/26  | Total: 12h 50m | Avg: 29m 38s | Max: 57m 40s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 28m | Avg: 44m 17s | Max: 44m 28s
      🟩 GCC7               Pass: 100%/6   | Total:  4h 44m | Avg: 47m 20s | Max: 52m 29s
      🟩 GCC8               Pass: 100%/6   | Total:  4h 44m | Avg: 47m 23s | Max: 52m 42s
      🟩 GCC9               Pass: 100%/6   | Total:  4h 42m | Avg: 47m 03s | Max: 51m 27s
      🟩 GCC10              Pass: 100%/4   | Total: 16m 46s | Avg:  4m 11s | Max:  4m 22s
      🟩 GCC11              Pass: 100%/7   | Total: 31m 10s | Avg:  4m 27s | Max:  5m 13s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 18m | Avg: 19m 36s | Max:  1h 04m
      🔍 GCC13              Pass:  85%/28  | Total: 12h 37m | Avg: 27m 03s | Max: 55m 08s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 37m | Avg: 52m 29s | Max: 53m 22s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 53m 39s | Avg: 53m 39s | Max: 53m 39s | Hits:  41%/712   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 47s | Max: 59m 10s | Hits:  41%/1424  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 15m | Avg:  1h 05m | Max:  1h 11m | Hits:  41%/2136  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/59  | Total:  1d 06h | Avg: 31m 14s | Max: 57m 40s
      🔍 GCC                Pass:  93%/63  | Total:  1d 06h | Avg: 28m 56s | Max:  1h 04m
      🟩 Intel              Pass: 100%/3   | Total:  2h 37m | Avg: 52m 29s | Max: 53m 22s
      🟩 MSVC               Pass: 100%/6   | Total:  6h 04m | Avg:  1h 00m | Max:  1h 11m | Hits:  41%/4272  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  95%/99  | Total:  2d 11h | Avg: 35m 59s | Max:  1h 11m | Hits:  41%/4272  
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 31m | Avg: 18m 55s | Max: 20m 53s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 05m | Avg: 15m 41s | Max: 17m 29s
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 34m | Avg: 19m 19s | Max: 20m 41s
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 12m | Avg: 24m 05s | Max: 27m 35s
    🚨 sm: 90a 🚨
      🟩 60;70;80;90        Pass: 100%/3   | Total: 14m 02s | Avg:  4m 40s | Max:  5m 13s
      🔥 90a                Pass:   0%/4   | Total: 22m 54s | Avg:  5m 43s | Max:  6m 41s
    🟨 gpu
      🟨 v100               Pass:  96%/131 | Total:  2d 21h | Avg: 31m 58s | Max:  1h 11m | Hits:  41%/4272  
    🟨 std
      🟨 11                 Pass:  97%/34  | Total: 18h 46m | Avg: 33m 08s | Max:  1h 04m
      🟨 14                 Pass:  97%/37  | Total: 20h 58m | Avg: 34m 00s | Max:  1h 05m | Hits:  41%/2136  
      🟨 17                 Pass:  97%/36  | Total: 18h 33m | Avg: 30m 56s | Max: 58m 15s | Hits:  41%/1424  
      🟨 20                 Pass:  95%/24  | Total: 11h 28m | Avg: 28m 42s | Max:  1h 11m | Hits:  41%/712   
    
  • 🟨 thrust: Pass: 99%/118 | Total: 1d 23h | Avg: 24m 20s | Max: 56m 29s | Hits: 71%/13005

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/110 | Total:  1d 20h | Avg: 24m 12s | Max: 56m 29s | Hits:  71%/13005 
      🟩 arm64              Pass: 100%/8   | Total:  3h 28m | Avg: 26m 02s | Max: 30m 21s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  6h 52m | Avg: 27m 29s | Max: 56m 29s | Hits:  57%/1445  
      🟩 11.8               Pass: 100%/3   | Total:  1h 30m | Avg: 30m 07s | Max: 46m 18s
      🔍 12.5               Pass:  99%/100 | Total:  1d 15h | Avg: 23m 41s | Max: 55m 23s | Hits:  73%/11560 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 18s | Avg:  4m 09s | Max:  4m 12s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 52m | Avg: 27m 29s | Max: 56m 29s | Hits:  57%/1445  
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 30m | Avg: 30m 07s | Max: 46m 18s
      🔍 nvcc12.5           Pass:  98%/98  | Total:  1d 15h | Avg: 24m 05s | Max: 55m 23s | Hits:  73%/11560 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 18s | Avg:  4m 09s | Max:  4m 12s
      🔍 nvcc               Pass:  99%/116 | Total:  1d 23h | Avg: 24m 41s | Max: 56m 29s | Hits:  71%/13005 
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  2h 35m | Avg: 25m 53s | Max: 29m 20s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 22m | Avg: 27m 37s | Max: 28m 55s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 47m | Avg: 26m 51s | Max: 27m 45s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 48m | Avg: 27m 04s | Max: 28m 26s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 50m | Avg: 27m 35s | Max: 30m 09s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 52m | Avg: 28m 11s | Max: 29m 22s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 57m | Avg: 29m 15s | Max: 30m 15s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 52m | Avg: 28m 00s | Max: 31m 23s
      🟩 Clang17            Pass: 100%/18  | Total:  5h 05m | Avg: 16m 59s | Max: 28m 45s
      🟩 GCC6               Pass: 100%/2   | Total: 50m 26s | Avg: 25m 13s | Max: 26m 27s
      🟩 GCC7               Pass: 100%/6   | Total:  2h 31m | Avg: 25m 16s | Max: 29m 26s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 42m | Avg: 27m 04s | Max: 28m 42s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 46m | Avg: 27m 40s | Max: 30m 27s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 26m | Avg: 21m 31s | Max: 39m 18s
      🟩 GCC11              Pass: 100%/7   | Total:  2h 57m | Avg: 25m 20s | Max: 46m 18s
      🟩 GCC12              Pass: 100%/4   | Total: 51m 08s | Avg: 12m 47s | Max: 38m 15s
      🔍 GCC13              Pass:  95%/20  | Total:  5h 28m | Avg: 16m 26s | Max: 32m 02s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 59m | Avg: 39m 46s | Max: 39m 47s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 56m 29s | Avg: 56m 29s | Max: 56m 29s | Hits:  57%/1445  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 39m | Avg: 49m 47s | Max: 50m 20s | Hits:  57%/2890  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 30m | Avg: 35m 08s | Max: 55m 23s | Hits:  78%/8670  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/51  | Total: 20h 12m | Avg: 23m 45s | Max: 31m 23s
      🔍 GCC                Pass:  98%/55  | Total: 19h 33m | Avg: 21m 20s | Max: 46m 18s
      🟩 Intel              Pass: 100%/3   | Total:  1h 59m | Avg: 39m 46s | Max: 39m 47s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 06m | Avg: 40m 45s | Max: 56m 29s | Hits:  71%/13005 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  98%/99  | Total:  1d 20h | Avg: 26m 43s | Max: 56m 29s | Hits:  57%/8670  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 48m | Avg:  9m 51s | Max: 19m 20s | Hits:  99%/4335  
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 58m | Avg: 14m 50s | Max: 17m 31s
    🔍 sm: 90a 🔍
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 30m | Avg: 30m 07s | Max: 46m 18s
      🔍 90a                Pass:  75%/4   | Total: 22m 01s | Avg:  5m 30s | Max: 11m 41s
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/30  | Total: 10h 09m | Avg: 20m 19s | Max: 39m 47s
      🟩 14                 Pass: 100%/34  | Total: 15h 07m | Avg: 26m 41s | Max: 56m 29s | Hits:  67%/5780  
      🟩 17                 Pass: 100%/33  | Total: 14h 25m | Avg: 26m 13s | Max: 50m 13s | Hits:  71%/4335  
      🔍 20                 Pass:  95%/21  | Total:  8h 09m | Avg: 23m 18s | Max: 49m 32s | Hits:  78%/2890  
    🟨 gpu
      🟨 v100               Pass:  99%/118 | Total:  1d 23h | Avg: 24m 20s | Max: 56m 29s | Hits:  71%/13005 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s
    

@elstehle elstehle left a comment

Thanks for the great work, I love the idea! 💚
Since the mechanism is at the core of CUB, I want to make sure everything works as expected. I left a few comments that I hope will further improve test coverage.

@bernhardmgruber bernhardmgruber force-pushed the chained_policy_prune branch 3 times, most recently from 9c627c6 to 8766281 Compare August 16, 2024 18:52

🟨 CI finished in 9h 27m: Pass: 83%/250 | Total: 1d 23h | Avg: 11m 19s | Max: 48m 33s | Hits: 98%/16565
  • 🟨 cub: Pass: 68%/131 | Total: 1d 07h | Avg: 14m 30s | Max: 48m 33s | Hits: 97%/3560

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  66%/123 | Total:  1d 06h | Avg: 15m 04s | Max: 48m 33s | Hits:  97%/3560  
      🟩 arm64              Pass: 100%/8   | Total: 46m 16s | Avg:  5m 47s | Max:  7m 42s
    🟨 ctk
      🟥 11.1               Pass:   0%/15  | Total:  6h 51m | Avg: 27m 24s | Max: 30m 32s
      🟩 11.8               Pass: 100%/3   | Total: 17m 17s | Avg:  5m 45s | Max:  8m 23s
      🟨 12.5               Pass:  76%/113 | Total:  1d 00h | Avg: 13m 01s | Max: 48m 33s | Hits:  97%/3560  
    🟨 cudacxx
      🟥 ClangCUDA17        Pass:   0%/2   | Total: 19m 42s | Avg:  9m 51s | Max:  9m 59s
      🟥 nvcc11.1           Pass:   0%/15  | Total:  6h 51m | Avg: 27m 24s | Max: 30m 32s
      🟩 nvcc11.8           Pass: 100%/3   | Total: 17m 17s | Avg:  5m 45s | Max:  8m 23s
      🟨 nvcc12.5           Pass:  78%/111 | Total:  1d 00h | Avg: 13m 04s | Max: 48m 33s | Hits:  97%/3560  
    🟨 cxx
      🟨 Clang9             Pass:  50%/6   | Total:  1h 44m | Avg: 17m 20s | Max: 28m 45s
      🟩 Clang10            Pass: 100%/3   | Total: 20m 52s | Avg:  6m 57s | Max:  7m 14s
      🟩 Clang11            Pass: 100%/4   | Total: 25m 07s | Avg:  6m 16s | Max:  6m 37s
      🟩 Clang12            Pass: 100%/4   | Total: 25m 27s | Avg:  6m 21s | Max:  6m 51s
      🟩 Clang13            Pass: 100%/4   | Total: 25m 36s | Avg:  6m 24s | Max:  6m 54s
      🟩 Clang14            Pass: 100%/4   | Total: 19m 56s | Avg:  4m 59s | Max:  6m 24s
      🟩 Clang15            Pass: 100%/4   | Total: 20m 44s | Avg:  5m 11s | Max:  6m 27s
      🟩 Clang16            Pass: 100%/4   | Total: 20m 49s | Avg:  5m 12s | Max:  6m 47s
      🟨 Clang17            Pass:  46%/26  | Total:  8h 44m | Avg: 20m 09s | Max: 41m 58s
      🟥 GCC6               Pass:   0%/2   | Total: 56m 49s | Avg: 28m 24s | Max: 29m 21s
      🟨 GCC7               Pass:  50%/6   | Total:  1h 42m | Avg: 17m 01s | Max: 29m 58s
      🟨 GCC8               Pass:  50%/6   | Total:  1h 37m | Avg: 16m 16s | Max: 28m 36s
      🟨 GCC9               Pass:  50%/6   | Total:  1h 42m | Avg: 17m 03s | Max: 30m 32s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 01m | Avg: 15m 28s | Max: 48m 33s
      🟩 GCC11              Pass: 100%/7   | Total: 38m 04s | Avg:  5m 26s | Max:  8m 23s
      🟩 GCC12              Pass: 100%/4   | Total: 20m 12s | Avg:  5m 03s | Max:  6m 37s
      🟨 GCC13              Pass:  57%/28  | Total:  8h 49m | Avg: 18m 54s | Max: 48m 21s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 23m 23s | Avg:  7m 47s | Max:  8m 12s
      🟥 MSVC14.16          Pass:   0%/1   | Total: 16m 49s | Avg: 16m 49s | Max: 16m 49s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 25m 15s | Avg: 12m 37s | Max: 13m 37s | Hits:  97%/1424  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 39m 28s | Avg: 13m 09s | Max: 14m 18s | Hits:  97%/2136  
    🟨 cxx_family
      🟨 Clang              Pass:  71%/59  | Total: 13h 06m | Avg: 13m 20s | Max: 41m 58s
      🟨 GCC                Pass:  63%/63  | Total: 16h 48m | Avg: 16m 00s | Max: 48m 33s
      🟩 Intel              Pass: 100%/3   | Total: 23m 23s | Avg:  7m 47s | Max:  8m 12s
      🟨 MSVC               Pass:  83%/6   | Total:  1h 21m | Avg: 13m 35s | Max: 16m 49s | Hits:  97%/3560  
    🟨 jobs
      🟨 Build              Pass:  82%/99  | Total: 21h 41m | Avg: 13m 08s | Max: 48m 33s | Hits:  97%/3560  
      🟥 DeviceLaunch       Pass:   0%/8   | Total:  2h 24m | Avg: 18m 01s | Max: 27m 45s
      🟥 GraphCapture       Pass:   0%/8   | Total:  2h 05m | Avg: 15m 41s | Max: 19m 17s
      🟥 HostLaunch         Pass:   0%/8   | Total:  2h 18m | Avg: 17m 19s | Max: 18m 27s
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 10m | Avg: 23m 51s | Max: 26m 54s
    🟨 gpu
      🟨 v100               Pass:  68%/131 | Total:  1d 07h | Avg: 14m 30s | Max: 48m 33s | Hits:  97%/3560  
    🟨 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total: 19m 42s | Avg:  9m 51s | Max:  9m 59s
      🟨 nvcc               Pass:  69%/129 | Total:  1d 07h | Avg: 14m 34s | Max: 48m 33s | Hits:  97%/3560  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 17m 17s | Avg:  5m 45s | Max:  8m 23s
      🟩 90a                Pass: 100%/4   | Total: 46m 10s | Avg: 11m 32s | Max: 15m 50s
    🟨 std
      🟨 11                 Pass:  67%/34  | Total:  9h 13m | Avg: 16m 16s | Max: 48m 33s
      🟨 14                 Pass:  67%/37  | Total:  8h 50m | Avg: 14m 19s | Max: 45m 33s | Hits:  97%/1424  
      🟨 17                 Pass:  69%/36  | Total:  8h 05m | Avg: 13m 28s | Max: 45m 48s | Hits:  97%/1424  
      🟨 20                 Pass:  70%/24  | Total:  5h 31m | Avg: 13m 49s | Max: 48m 21s | Hits:  97%/712   
    
  • 🟨 thrust: Pass: 99%/118 | Total: 15h 21m | Avg: 7m 48s | Max: 24m 11s | Hits: 99%/13005

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/110 | Total: 14h 50m | Avg:  8m 05s | Max: 24m 11s | Hits:  99%/13005 
      🟩 arm64              Pass: 100%/8   | Total: 30m 17s | Avg:  3m 47s | Max:  4m 28s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 03m | Avg:  4m 14s | Max: 13m 52s | Hits:  99%/1445  
      🟩 11.8               Pass: 100%/3   | Total: 13m 32s | Avg:  4m 30s | Max:  5m 31s
      🔍 12.5               Pass:  99%/100 | Total: 14h 03m | Avg:  8m 26s | Max: 24m 11s | Hits:  99%/11560 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 11s | Avg:  4m 05s | Max:  4m 08s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 03m | Avg:  4m 14s | Max: 13m 52s | Hits:  99%/1445  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 32s | Avg:  4m 30s | Max:  5m 31s
      🔍 nvcc12.5           Pass:  98%/98  | Total: 13h 55m | Avg:  8m 31s | Max: 24m 11s | Hits:  99%/11560 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 11s | Avg:  4m 05s | Max:  4m 08s
      🔍 nvcc               Pass:  99%/116 | Total: 15h 13m | Avg:  7m 52s | Max: 24m 11s | Hits:  99%/13005 
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total: 24m 59s | Avg:  4m 09s | Max:  5m 06s
      🟩 Clang10            Pass: 100%/3   | Total: 14m 48s | Avg:  4m 56s | Max:  5m 24s
      🟩 Clang11            Pass: 100%/4   | Total: 16m 38s | Avg:  4m 09s | Max:  4m 18s
      🟩 Clang12            Pass: 100%/4   | Total: 15m 45s | Avg:  3m 56s | Max:  4m 04s
      🟩 Clang13            Pass: 100%/4   | Total: 15m 47s | Avg:  3m 56s | Max:  4m 21s
      🟩 Clang14            Pass: 100%/4   | Total: 15m 53s | Avg:  3m 58s | Max:  4m 21s
      🟩 Clang15            Pass: 100%/4   | Total: 16m 52s | Avg:  4m 13s | Max:  4m 29s
      🟩 Clang16            Pass: 100%/4   | Total: 16m 42s | Avg:  4m 10s | Max:  4m 21s
      🟩 Clang17            Pass: 100%/18  | Total:  2h 57m | Avg:  9m 51s | Max: 20m 35s
      🟩 GCC6               Pass: 100%/2   | Total:  7m 11s | Avg:  3m 35s | Max:  3m 47s
      🟩 GCC7               Pass: 100%/6   | Total: 44m 35s | Avg:  7m 25s | Max: 14m 17s
      🟩 GCC8               Pass: 100%/6   | Total: 55m 04s | Avg:  9m 10s | Max: 18m 49s
      🟩 GCC9               Pass: 100%/6   | Total:  1h 01m | Avg: 10m 11s | Max: 21m 30s
      🟩 GCC10              Pass: 100%/4   | Total: 21m 31s | Avg:  5m 22s | Max:  9m 36s
      🟩 GCC11              Pass: 100%/7   | Total: 35m 23s | Avg:  5m 03s | Max: 10m 09s
      🟩 GCC12              Pass: 100%/4   | Total: 46m 13s | Avg: 11m 33s | Max: 13m 30s
      🔍 GCC13              Pass:  95%/20  | Total:  3h 04m | Avg:  9m 12s | Max: 24m 11s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 42s | Avg:  5m 14s | Max:  6m 05s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 13m 52s | Avg: 13m 52s | Max: 13m 52s | Hits:  99%/1445  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 25m 14s | Avg: 12m 37s | Max: 12m 58s | Hits:  99%/2890  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 36m | Avg: 16m 02s | Max: 18m 44s | Hits:  99%/8670  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/51  | Total:  5h 14m | Avg:  6m 10s | Max: 20m 35s
      🔍 GCC                Pass:  98%/55  | Total:  7h 35m | Avg:  8m 16s | Max: 24m 11s
      🟩 Intel              Pass: 100%/3   | Total: 15m 42s | Avg:  5m 14s | Max:  6m 05s
      🟩 MSVC               Pass: 100%/9   | Total:  2h 15m | Avg: 15m 02s | Max: 18m 44s | Hits:  99%/13005 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  98%/99  | Total: 11h 45m | Avg:  7m 07s | Max: 24m 11s | Hits:  99%/8670  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 47m | Avg:  9m 49s | Max: 18m 44s | Hits:  99%/4335  
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 47m | Avg: 13m 25s | Max: 15m 56s
    🔍 sm: 90a 🔍
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 32s | Avg:  4m 30s | Max:  5m 31s
      🔍 90a                Pass:  75%/4   | Total: 14m 41s | Avg:  3m 40s | Max:  4m 01s
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/30  | Total:  3h 02m | Avg:  6m 05s | Max: 17m 09s
      🟩 14                 Pass: 100%/34  | Total:  4h 40m | Avg:  8m 15s | Max: 23m 14s | Hits:  99%/5780  
      🟩 17                 Pass: 100%/33  | Total:  4h 42m | Avg:  8m 32s | Max: 24m 11s | Hits:  99%/4335  
      🔍 20                 Pass:  95%/21  | Total:  2h 55m | Avg:  8m 22s | Max: 20m 35s | Hits:  99%/2890  
    🟨 gpu
      🟨 v100               Pass:  99%/118 | Total: 15h 21m | Avg:  7m 48s | Max: 24m 11s | Hits:  99%/13005 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 39s | Avg: 11m 39s | Max: 11m 39s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber force-pushed the chained_policy_prune branch 2 times, most recently from bd97bb1 to bedf081 on August 19, 2024 16:22
@bernhardmgruber force-pushed the chained_policy_prune branch 2 times, most recently from 44e810f to 5b3ef92 on August 28, 2024 10:42
Contributor

🟨 CI finished in 4h 01m: Pass: 98%/250 | Total: 5d 21h | Avg: 33m 57s | Max: 1h 17m | Hits: 65%/16657
  • 🟨 cub: Pass: 96%/131 | Total: 3d 19h | Avg: 42m 06s | Max: 1h 17m | Hits: 42%/3580

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  96%/123 | Total:  3d 12h | Avg: 41m 16s | Max:  1h 17m | Hits:  42%/3580  
      🟩 arm64              Pass: 100%/8   | Total:  7h 18m | Avg: 54m 46s | Max: 59m 14s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total: 11h 09m | Avg: 44m 37s | Max: 55m 04s | Hits:  42%/716   
      🟩 11.8               Pass: 100%/3   | Total:  3h 20m | Avg:  1h 06m | Max:  1h 09m
      🔍 12.5               Pass:  96%/113 | Total:  3d 05h | Avg: 41m 06s | Max:  1h 17m | Hits:  42%/2864  
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 46m 08s | Avg: 23m 04s | Max: 23m 54s
      🟩 nvcc11.1           Pass: 100%/15  | Total: 11h 09m | Avg: 44m 37s | Max: 55m 04s | Hits:  42%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total:  3h 20m | Avg:  1h 06m | Max:  1h 09m
      🔍 nvcc12.5           Pass:  96%/111 | Total:  3d 04h | Avg: 41m 26s | Max:  1h 17m | Hits:  42%/2864  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 46m 08s | Avg: 23m 04s | Max: 23m 54s
      🔍 nvcc               Pass:  96%/129 | Total:  3d 19h | Avg: 42m 23s | Max:  1h 17m | Hits:  42%/3580  
    🟨 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  4h 33m | Avg: 45m 39s | Max: 49m 23s
      🟩 Clang10            Pass: 100%/3   | Total:  2h 25m | Avg: 48m 34s | Max: 48m 56s
      🟩 Clang11            Pass: 100%/4   | Total:  3h 18m | Avg: 49m 32s | Max: 54m 03s
      🟩 Clang12            Pass: 100%/4   | Total:  3h 15m | Avg: 48m 56s | Max: 51m 04s
      🟩 Clang13            Pass: 100%/4   | Total:  3h 16m | Avg: 49m 09s | Max: 50m 30s
      🟩 Clang14            Pass: 100%/4   | Total:  3h 17m | Avg: 49m 23s | Max: 50m 30s
      🟩 Clang15            Pass: 100%/4   | Total:  3h 28m | Avg: 52m 09s | Max: 56m 34s
      🟩 Clang16            Pass: 100%/4   | Total:  3h 12m | Avg: 48m 06s | Max: 49m 05s
      🟨 Clang17            Pass:  96%/26  | Total: 12h 57m | Avg: 29m 53s | Max: 59m 14s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 27m | Avg: 43m 40s | Max: 44m 27s
      🟨 GCC7               Pass:  83%/6   | Total:  4h 39m | Avg: 46m 37s | Max: 50m 54s
      🟩 GCC8               Pass: 100%/6   | Total:  4h 42m | Avg: 47m 09s | Max: 53m 13s
      🟩 GCC9               Pass: 100%/6   | Total:  4h 49m | Avg: 48m 15s | Max: 53m 35s
      🟩 GCC10              Pass: 100%/4   | Total:  3h 22m | Avg: 50m 30s | Max: 52m 01s
      🟩 GCC11              Pass: 100%/7   | Total:  6h 41m | Avg: 57m 18s | Max:  1h 09m
      🟩 GCC12              Pass: 100%/4   | Total:  3h 31m | Avg: 52m 57s | Max: 55m 23s
      🟨 GCC13              Pass:  96%/28  | Total: 14h 31m | Avg: 31m 07s | Max:  1h 17m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 34m | Avg: 51m 35s | Max: 53m 12s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 55m 04s | Avg: 55m 04s | Max: 55m 04s | Hits:  42%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 58m | Avg: 59m 18s | Max:  1h 00m | Hits:  42%/1432  
      🟨 MSVC14.39          Pass:  66%/3   | Total:  2h 55m | Avg: 58m 26s | Max:  1h 01m | Hits:  42%/1432  
    🟨 cxx_family
      🟨 Clang              Pass:  98%/59  | Total:  1d 15h | Avg: 40m 26s | Max: 59m 14s
      🟨 GCC                Pass:  96%/63  | Total:  1d 19h | Avg: 41m 40s | Max:  1h 17m
      🟩 Intel              Pass: 100%/3   | Total:  2h 34m | Avg: 51m 35s | Max: 53m 12s
      🟨 MSVC               Pass:  83%/6   | Total:  5h 48m | Avg: 58m 09s | Max:  1h 01m | Hits:  42%/3580  
    🟨 jobs
      🟨 Build              Pass:  97%/99  | Total:  3d 08h | Avg: 48m 51s | Max:  1h 09m | Hits:  42%/3580  
      🟨 DeviceLaunch       Pass:  75%/8   | Total:  2h 04m | Avg: 15m 32s | Max: 26m 32s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 18m | Avg: 17m 18s | Max: 23m 53s
      🟩 HostLaunch         Pass: 100%/8   | Total:  3h 30m | Avg: 26m 16s | Max:  1h 17m
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 25m | Avg: 25m 42s | Max: 31m 35s
    🟨 std
      🟨 11                 Pass:  94%/34  | Total: 23h 18m | Avg: 41m 08s | Max:  1h 03m
      🟨 14                 Pass:  97%/37  | Total:  1d 02h | Avg: 43m 19s | Max:  1h 09m | Hits:  42%/2148  
      🟩 17                 Pass: 100%/36  | Total:  1d 02h | Avg: 44m 14s | Max:  1h 17m | Hits:  42%/1432  
      🟨 20                 Pass:  95%/24  | Total: 15h 21m | Avg: 38m 23s | Max: 57m 04s
    🟨 gpu
      🟨 v100               Pass:  96%/131 | Total:  3d 19h | Avg: 42m 06s | Max:  1h 17m | Hits:  42%/3580  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  3h 20m | Avg:  1h 06m | Max:  1h 09m
      🟩 90a                Pass: 100%/4   | Total:  1h 22m | Avg: 20m 32s | Max: 21m 09s
    
  • 🟨 thrust: Pass: 99%/118 | Total: 2d 01h | Avg: 25m 07s | Max: 57m 18s | Hits: 71%/13077

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/110 | Total:  1d 22h | Avg: 25m 05s | Max: 57m 18s | Hits:  71%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  3h 24m | Avg: 25m 31s | Max: 28m 57s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  6h 30m | Avg: 26m 02s | Max: 50m 42s | Hits:  57%/1453  
      🟩 11.8               Pass: 100%/3   | Total:  1h 41m | Avg: 33m 41s | Max: 36m 23s
      🔍 12.5               Pass:  99%/100 | Total:  1d 17h | Avg: 24m 43s | Max: 57m 18s | Hits:  73%/11624 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 52m 15s | Avg: 26m 07s | Max: 27m 40s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 30m | Avg: 26m 02s | Max: 50m 42s | Hits:  57%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 41m | Avg: 33m 41s | Max: 36m 23s
      🔍 nvcc12.5           Pass:  98%/98  | Total:  1d 16h | Avg: 24m 42s | Max: 57m 18s | Hits:  73%/11624 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 52m 15s | Avg: 26m 07s | Max: 27m 40s
      🔍 nvcc               Pass:  99%/116 | Total:  2d 00h | Avg: 25m 06s | Max: 57m 18s | Hits:  71%/13077 
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  2h 27m | Avg: 24m 33s | Max: 28m 12s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 20m | Avg: 26m 42s | Max: 29m 22s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 42m | Avg: 25m 38s | Max: 28m 20s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 57s | Max: 30m 03s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 10s | Max: 28m 51s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 09s | Max: 28m 15s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 45m | Avg: 26m 16s | Max: 28m 57s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 42m | Avg: 25m 44s | Max: 28m 02s
      🟩 Clang17            Pass: 100%/18  | Total:  5h 47m | Avg: 19m 18s | Max: 27m 41s
      🟩 GCC6               Pass: 100%/2   | Total: 45m 17s | Avg: 22m 38s | Max: 25m 59s
      🟩 GCC7               Pass: 100%/6   | Total:  2h 38m | Avg: 26m 21s | Max: 33m 26s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 35m | Avg: 25m 55s | Max: 29m 12s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 32m | Avg: 25m 23s | Max: 30m 22s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 50m | Avg: 27m 41s | Max: 30m 10s
      🟩 GCC11              Pass: 100%/7   | Total:  3h 35m | Avg: 30m 50s | Max: 36m 23s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 55m | Avg: 28m 58s | Max: 31m 16s
      🔍 GCC13              Pass:  95%/20  | Total:  5h 51m | Avg: 17m 34s | Max: 32m 40s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 34m | Avg: 31m 25s | Max: 34m 34s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 50m 42s | Avg: 50m 42s | Max: 50m 42s | Hits:  57%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 36m | Avg: 48m 20s | Max: 48m 58s | Hits:  57%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 39m | Avg: 36m 32s | Max: 57m 18s | Hits:  78%/8718  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/51  | Total: 19h 58m | Avg: 23m 30s | Max: 30m 03s
      🔍 GCC                Pass:  98%/55  | Total: 21h 45m | Avg: 23m 43s | Max: 36m 23s
      🟩 Intel              Pass: 100%/3   | Total:  1h 34m | Avg: 31m 25s | Max: 34m 34s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 06m | Avg: 40m 43s | Max: 57m 18s | Hits:  71%/13077 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  98%/99  | Total:  1d 21h | Avg: 27m 39s | Max: 57m 18s | Hits:  57%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 50m | Avg: 10m 04s | Max: 19m 44s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 56m | Avg: 14m 33s | Max: 22m 02s
    🔍 sm: 90a 🔍
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 41m | Avg: 33m 41s | Max: 36m 23s
      🔍 90a                Pass:  75%/4   | Total: 53m 25s | Avg: 13m 21s | Max: 15m 49s
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/30  | Total: 10h 21m | Avg: 20m 43s | Max: 29m 18s
      🟩 14                 Pass: 100%/34  | Total: 15h 27m | Avg: 27m 17s | Max: 57m 18s | Hits:  67%/5812  
      🟩 17                 Pass: 100%/33  | Total: 14h 55m | Avg: 27m 07s | Max: 49m 38s | Hits:  71%/4359  
      🔍 20                 Pass:  95%/21  | Total:  8h 40m | Avg: 24m 47s | Max: 54m 28s | Hits:  78%/2906  
    🟨 gpu
      🟨 v100               Pass:  99%/118 | Total:  2d 01h | Avg: 25m 07s | Max: 57m 18s | Hits:  71%/13077 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

cub/cub/util_device.cuh (outdated, resolved)
Contributor

🟨 CI finished in 7h 42m: Pass: 99%/251 | Total: 5d 23h | Avg: 34m 12s | Max: 1h 05m | Hits: 64%/17373
  • 🟨 cub: Pass: 99%/132 | Total: 3d 20h | Avg: 41m 59s | Max: 1h 05m | Hits: 42%/4296

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/124 | Total:  3d 12h | Avg: 40m 59s | Max:  1h 04m | Hits:  42%/4296  
      🟩 arm64              Pass: 100%/8   | Total:  7h 39m | Avg: 57m 28s | Max:  1h 05m
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total: 11h 12m | Avg: 44m 48s | Max: 54m 14s | Hits:  42%/716   
      🟩 11.8               Pass: 100%/3   | Total:  3h 12m | Avg:  1h 04m | Max:  1h 04m
      🔍 12.5               Pass:  99%/114 | Total:  3d 05h | Avg: 41m 02s | Max:  1h 05m | Hits:  42%/3580  
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 47m 53s | Avg: 23m 56s | Max: 25m 02s
      🟩 nvcc11.1           Pass: 100%/15  | Total: 11h 12m | Avg: 44m 48s | Max: 54m 14s | Hits:  42%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total:  3h 12m | Avg:  1h 04m | Max:  1h 04m
      🔍 nvcc12.5           Pass:  99%/112 | Total:  3d 05h | Avg: 41m 20s | Max:  1h 05m | Hits:  42%/3580  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 53s | Avg: 23m 56s | Max: 25m 02s
      🔍 nvcc               Pass:  99%/130 | Total:  3d 19h | Avg: 42m 16s | Max:  1h 05m | Hits:  42%/4296  
    🔍 cxx: Clang17 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  4h 49m | Avg: 48m 15s | Max: 54m 39s
      🟩 Clang10            Pass: 100%/3   | Total:  2h 40m | Avg: 53m 39s | Max:  1h 00m
      🟩 Clang11            Pass: 100%/4   | Total:  3h 26m | Avg: 51m 30s | Max: 53m 44s
      🟩 Clang12            Pass: 100%/4   | Total:  3h 20m | Avg: 50m 14s | Max: 51m 04s
      🟩 Clang13            Pass: 100%/4   | Total:  3h 16m | Avg: 49m 04s | Max: 50m 23s
      🟩 Clang14            Pass: 100%/4   | Total:  3h 22m | Avg: 50m 40s | Max: 52m 19s
      🟩 Clang15            Pass: 100%/4   | Total:  3h 29m | Avg: 52m 15s | Max: 54m 03s
      🟩 Clang16            Pass: 100%/4   | Total:  3h 19m | Avg: 49m 50s | Max: 54m 02s
      🔍 Clang17            Pass:  96%/26  | Total: 12h 55m | Avg: 29m 48s | Max:  1h 03m
      🟩 GCC6               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 08s | Max: 48m 19s
      🟩 GCC7               Pass: 100%/6   | Total:  4h 36m | Avg: 46m 00s | Max: 50m 38s
      🟩 GCC8               Pass: 100%/6   | Total:  4h 35m | Avg: 45m 52s | Max: 48m 40s
      🟩 GCC9               Pass: 100%/6   | Total:  4h 55m | Avg: 49m 17s | Max: 56m 14s
      🟩 GCC10              Pass: 100%/4   | Total:  3h 26m | Avg: 51m 41s | Max: 54m 50s
      🟩 GCC11              Pass: 100%/7   | Total:  6h 27m | Avg: 55m 23s | Max:  1h 04m
      🟩 GCC12              Pass: 100%/4   | Total:  3h 20m | Avg: 50m 11s | Max: 54m 01s
      🟩 GCC13              Pass: 100%/29  | Total: 14h 00m | Avg: 28m 59s | Max:  1h 05m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 43m | Avg: 54m 25s | Max: 56m 59s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 54m 14s | Avg: 54m 14s | Max: 54m 14s | Hits:  42%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m | Hits:  42%/1432  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 09m | Avg:  1h 03m | Max:  1h 03m | Hits:  42%/2148  
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  98%/59  | Total:  1d 16h | Avg: 41m 21s | Max:  1h 03m
      🟩 GCC                Pass: 100%/64  | Total:  1d 18h | Avg: 40m 14s | Max:  1h 05m
      🟩 Intel              Pass: 100%/3   | Total:  2h 43m | Avg: 54m 25s | Max: 56m 59s
      🟩 MSVC               Pass: 100%/6   | Total:  6h 04m | Avg:  1h 00m | Max:  1h 03m | Hits:  42%/4296  
    🔍 jobs: GraphCapture 🔍
      🟩 Build              Pass: 100%/99  | Total:  3d 10h | Avg: 49m 45s | Max:  1h 05m | Hits:  42%/4296  
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 22m | Avg: 17m 48s | Max: 19m 52s
      🔍 GraphCapture       Pass:  87%/8   | Total:  1h 50m | Avg: 13m 49s | Max: 16m 48s
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 17m | Avg: 17m 08s | Max: 19m 02s
      🟩 SmallGMem          Pass: 100%/1   | Total: 31m 16s | Avg: 31m 16s | Max: 31m 16s
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 15m | Avg: 24m 25s | Max: 30m 55s
    🔍 std: 14 🔍
      🟩 11                 Pass: 100%/34  | Total: 23h 28m | Avg: 41m 25s | Max:  1h 02m
      🔍 14                 Pass:  97%/37  | Total:  1d 02h | Avg: 42m 57s | Max:  1h 04m | Hits:  42%/2148  
      🟩 17                 Pass: 100%/37  | Total:  1d 02h | Avg: 43m 11s | Max:  1h 04m | Hits:  42%/1432  
      🟩 20                 Pass: 100%/24  | Total: 15h 46m | Avg: 39m 26s | Max:  1h 05m | Hits:  42%/716   
    🟨 gpu
      🟨 v100               Pass:  99%/132 | Total:  3d 20h | Avg: 41m 59s | Max:  1h 05m | Hits:  42%/4296  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  3h 12m | Avg:  1h 04m | Max:  1h 04m
      🟩 90a                Pass: 100%/4   | Total:  1h 21m | Avg: 20m 26s | Max: 21m 19s
    
  • 🟨 thrust: Pass: 99%/118 | Total: 2d 02h | Avg: 25m 40s | Max: 56m 46s | Hits: 71%/13077

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/110 | Total:  1d 23h | Avg: 25m 41s | Max: 56m 46s | Hits:  71%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  3h 24m | Avg: 25m 31s | Max: 28m 40s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  6h 34m | Avg: 26m 16s | Max: 56m 19s | Hits:  57%/1453  
      🟩 11.8               Pass: 100%/3   | Total:  1h 44m | Avg: 34m 40s | Max: 39m 16s
      🔍 12.5               Pass:  99%/100 | Total:  1d 18h | Avg: 25m 19s | Max: 56m 46s | Hits:  73%/11624 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 52m 45s | Avg: 26m 22s | Max: 27m 20s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 34m | Avg: 26m 16s | Max: 56m 19s | Hits:  57%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 44m | Avg: 34m 40s | Max: 39m 16s
      🔍 nvcc12.5           Pass:  98%/98  | Total:  1d 17h | Avg: 25m 17s | Max: 56m 46s | Hits:  73%/11624 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 52m 45s | Avg: 26m 22s | Max: 27m 20s
      🔍 nvcc               Pass:  99%/116 | Total:  2d 01h | Avg: 25m 39s | Max: 56m 46s | Hits:  71%/13077 
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  2h 29m | Avg: 24m 54s | Max: 28m 52s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 26m | Avg: 28m 55s | Max: 32m 23s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 05s | Max: 29m 15s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 47m | Avg: 26m 47s | Max: 30m 23s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 45m | Avg: 26m 22s | Max: 28m 20s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 48m | Avg: 27m 04s | Max: 30m 22s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 46m | Avg: 26m 34s | Max: 29m 27s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 54s | Max: 28m 48s
      🟩 Clang17            Pass: 100%/18  | Total:  5h 57m | Avg: 19m 50s | Max: 29m 44s
      🟩 GCC6               Pass: 100%/2   | Total: 46m 38s | Avg: 23m 19s | Max: 24m 46s
      🟩 GCC7               Pass: 100%/6   | Total:  2h 32m | Avg: 25m 23s | Max: 30m 26s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 31m | Avg: 25m 15s | Max: 28m 48s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 38m | Avg: 26m 20s | Max: 30m 02s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 54m | Avg: 28m 38s | Max: 32m 18s
      🟩 GCC11              Pass: 100%/7   | Total:  3h 40m | Avg: 31m 28s | Max: 39m 16s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 54m | Avg: 28m 33s | Max: 33m 08s
      🔍 GCC13              Pass:  95%/20  | Total:  6h 05m | Avg: 18m 15s | Max: 31m 04s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 37m | Avg: 32m 39s | Max: 38m 10s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 56m 19s | Avg: 56m 19s | Max: 56m 19s | Hits:  57%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 44m | Avg: 52m 16s | Max: 54m 21s | Hits:  57%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 39m | Avg: 36m 39s | Max: 56m 46s | Hits:  78%/8718  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/51  | Total: 20h 28m | Avg: 24m 05s | Max: 32m 23s
      🔍 GCC                Pass:  98%/55  | Total: 22h 02m | Avg: 24m 03s | Max: 39m 16s
      🟩 Intel              Pass: 100%/3   | Total:  1h 37m | Avg: 32m 39s | Max: 38m 10s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 20m | Avg: 42m 18s | Max: 56m 46s | Hits:  71%/13077 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  98%/99  | Total:  1d 22h | Avg: 28m 01s | Max: 56m 46s | Hits:  57%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 55m | Avg: 10m 27s | Max: 21m 39s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 20m | Avg: 17m 34s | Max: 19m 42s
    🔍 sm: 90a 🔍
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 44m | Avg: 34m 40s | Max: 39m 16s
      🔍 90a                Pass:  75%/4   | Total: 52m 52s | Avg: 13m 13s | Max: 15m 15s
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/30  | Total: 10h 35m | Avg: 21m 11s | Max: 29m 46s
      🟩 14                 Pass: 100%/34  | Total: 15h 25m | Avg: 27m 14s | Max: 56m 19s | Hits:  67%/5812  
      🟩 17                 Pass: 100%/33  | Total: 15h 17m | Avg: 27m 48s | Max: 54m 21s | Hits:  71%/4359  
      🔍 20                 Pass:  95%/21  | Total:  9h 10m | Avg: 26m 12s | Max: 56m 46s | Hits:  78%/2906  
    🟨 gpu
      🟨 v100               Pass:  99%/118 | Total:  2d 02h | Avg: 25m 40s | Max: 56m 46s | Hits:  71%/13077 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 251)

# Runner
178 linux-amd64-cpu16
42 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

bernhardmgruber and others added 4 commits August 30, 2024 12:23
```
/home/coder/cccl/thrust/thrust/cmake/../../thrust/iterator/detail/transform_input_output_iterator.inl:68:9: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
     68 |     *io = output_function(x);
        |     ~~~~^~~~~~~~~~~~~~~~~~~~~
```
Contributor

🟨 CI finished in 4h 12m: Pass: 99%/251 | Total: 3d 15h | Avg: 21m 01s | Max: 1h 40m | Hits: 72%/17373
  • 🟨 cub: Pass: 98%/132 | Total: 2d 08h | Avg: 25m 35s | Max: 1h 40m | Hits: 59%/4296

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  98%/124 | Total:  2d 04h | Avg: 25m 16s | Max:  1h 40m | Hits:  59%/4296  
      🟩 arm64              Pass: 100%/8   | Total:  4h 05m | Avg: 30m 37s | Max: 55m 28s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  5h 03m | Avg: 20m 14s | Max: 52m 23s | Hits:  59%/716   
      🟩 11.8               Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🔍 12.5               Pass:  98%/114 | Total:  2d 01h | Avg: 26m 11s | Max:  1h 40m | Hits:  59%/3580  
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  3m 40s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  5h 03m | Avg: 20m 14s | Max: 52m 23s | Hits:  59%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🔍 nvcc12.5           Pass:  98%/112 | Total:  2d 01h | Avg: 26m 35s | Max:  1h 40m | Hits:  59%/3580  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  3m 40s
      🔍 nvcc               Pass:  98%/130 | Total:  2d 08h | Avg: 25m 55s | Max:  1h 40m | Hits:  59%/4296  
    🔍 cxx: Clang17 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  2h 03m | Avg: 20m 39s | Max: 53m 29s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 06s | Max: 50m 46s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 18s | Max: 50m 10s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 54m | Avg: 28m 39s | Max: 50m 12s
      🟩 Clang13            Pass: 100%/4   | Total:  2h 04m | Avg: 31m 04s | Max: 54m 35s
      🟩 Clang14            Pass: 100%/4   | Total:  2h 03m | Avg: 30m 54s | Max: 54m 39s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 26s | Max: 51m 24s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 56m | Avg: 29m 03s | Max: 52m 03s
      🔍 Clang17            Pass:  92%/26  | Total:  8h 55m | Avg: 20m 35s | Max: 53m 48s
      🟩 GCC6               Pass: 100%/2   | Total:  6m 55s | Avg:  3m 27s | Max:  3m 31s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 49m | Avg: 18m 18s | Max: 47m 17s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 00m | Avg: 20m 02s | Max: 52m 47s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 37m | Avg: 26m 18s | Max: 51m 09s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 57m | Avg: 29m 17s | Max: 51m 53s
      🟩 GCC11              Pass: 100%/7   | Total:  3h 27m | Avg: 29m 37s | Max:  1h 11m
      🟩 GCC12              Pass: 100%/4   | Total:  1h 58m | Avg: 29m 33s | Max: 51m 53s
      🟩 GCC13              Pass: 100%/29  | Total: 11h 37m | Avg: 24m 03s | Max:  1h 40m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 02m | Avg: 20m 58s | Max: 52m 53s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 52m 23s | Avg: 52m 23s | Max: 52m 23s | Hits:  59%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 02m | Hits:  59%/1432  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 56m | Avg: 58m 45s | Max: 59m 26s | Hits:  59%/2148  
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  96%/59  | Total: 23h 51m | Avg: 24m 15s | Max: 54m 39s
      🟩 GCC                Pass: 100%/64  | Total:  1d 01h | Avg: 23m 59s | Max:  1h 40m
      🟩 Intel              Pass: 100%/3   | Total:  1h 02m | Avg: 20m 58s | Max: 52m 53s
      🟩 MSVC               Pass: 100%/6   | Total:  5h 48m | Avg: 58m 08s | Max:  1h 02m | Hits:  59%/4296  
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/34  | Total:  6h 05m | Avg: 10m 44s | Max: 41m 23s
      🟩 14                 Pass: 100%/37  | Total:  8h 16m | Avg: 13m 24s | Max: 58m 20s | Hits:  59%/2148  
      🟩 17                 Pass: 100%/37  | Total:  1d 03h | Avg: 44m 07s | Max:  1h 40m | Hits:  59%/1432  
      🔍 20                 Pass:  91%/24  | Total: 14h 44m | Avg: 36m 51s | Max: 59m 26s | Hits:  59%/716   
    🟨 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 20h | Avg: 27m 09s | Max:  1h 11m | Hits:  59%/4296  
      🟨 DeviceLaunch       Pass:  87%/8   | Total:  2h 10m | Avg: 16m 21s | Max: 19m 55s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 13m | Avg: 16m 42s | Max: 32m 53s
      🟨 HostLaunch         Pass:  87%/8   | Total:  2h 03m | Avg: 15m 28s | Max: 18m 33s
      🟩 SmallGMem          Pass: 100%/1   | Total: 31m 38s | Avg: 31m 38s | Max: 31m 38s
      🟩 TestGPU            Pass: 100%/8   | Total:  4h 29m | Avg: 33m 39s | Max:  1h 40m
    🟨 gpu
      🟨 v100               Pass:  98%/132 | Total:  2d 08h | Avg: 25m 35s | Max:  1h 40m | Hits:  59%/4296  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🟩 90a                Pass: 100%/4   | Total: 52m 24s | Avg: 13m 06s | Max: 21m 43s
    
  • 🟩 thrust: Pass: 100%/118 | Total: 1d 07h | Avg: 16m 00s | Max: 52m 53s | Hits: 76%/13077

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  1d 05h | Avg: 16m 04s | Max: 52m 53s | Hits:  76%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  2h 00m | Avg: 15m 05s | Max: 27m 02s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  3h 12m | Avg: 12m 50s | Max: 48m 28s | Hits:  65%/1453  
      🟩 11.8               Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 12.5               Pass: 100%/100 | Total:  1d 03h | Avg: 16m 31s | Max: 52m 53s | Hits:  78%/11624 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 34s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  3h 12m | Avg: 12m 50s | Max: 48m 28s | Hits:  65%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 03h | Avg: 16m 46s | Max: 52m 53s | Hits:  78%/11624 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 34s
      🟩 nvcc               Pass: 100%/116 | Total:  1d 07h | Avg: 16m 12s | Max: 52m 53s | Hits:  76%/13077 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  1h 12m | Avg: 12m 09s | Max: 28m 39s
      🟩 Clang10            Pass: 100%/3   | Total: 41m 59s | Avg: 13m 59s | Max: 31m 52s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 03m | Avg: 15m 45s | Max: 29m 03s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 28m 53s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 37s | Max: 27m 09s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 34s | Max: 28m 02s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 05m | Avg: 16m 28s | Max: 28m 48s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 06m | Avg: 16m 35s | Max: 29m 46s
      🟩 Clang17            Pass: 100%/18  | Total:  3h 30m | Avg: 11m 41s | Max: 28m 29s
      🟩 GCC6               Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  8m 21s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 45s | Max: 29m 45s
      🟩 GCC8               Pass: 100%/6   | Total:  1h 13m | Avg: 12m 12s | Max: 31m 52s
      🟩 GCC9               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 46s | Max: 30m 33s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 10m | Avg: 17m 34s | Max: 31m 09s
      🟩 GCC11              Pass: 100%/7   | Total:  1h 51m | Avg: 15m 52s | Max: 35m 56s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 30m 58s
      🟩 GCC13              Pass: 100%/20  | Total:  4h 53m | Avg: 14m 41s | Max: 52m 53s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 36m 38s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 48m 28s | Avg: 48m 28s | Max: 48m 28s | Hits:  65%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 36s | Max: 51m 56s | Hits:  65%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 29m | Avg: 34m 55s | Max: 51m 37s | Hits:  82%/8718  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 11h 50m | Avg: 13m 55s | Max: 31m 52s
      🟩 GCC                Pass: 100%/55  | Total: 12h 49m | Avg: 13m 59s | Max: 52m 53s
      🟩 Intel              Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 36m 38s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 01m | Avg: 40m 07s | Max: 51m 56s | Hits:  76%/13077 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  1d 07h | Avg: 16m 00s | Max: 52m 53s | Hits:  76%/13077 
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 03h | Avg: 16m 27s | Max: 51m 56s | Hits:  65%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 53m | Avg: 10m 21s | Max: 20m 22s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 25m | Avg: 18m 12s | Max: 52m 53s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 90a                Pass: 100%/4   | Total: 38m 12s | Avg:  9m 33s | Max: 16m 00s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 24m | Avg:  4m 48s | Max: 14m 42s
      🟩 14                 Pass: 100%/34  | Total:  5h 25m | Avg:  9m 35s | Max: 51m 16s | Hits:  74%/5812  
      🟩 17                 Pass: 100%/33  | Total: 14h 41m | Avg: 26m 43s | Max: 51m 56s | Hits:  76%/4359  
      🟩 20                 Pass: 100%/21  | Total:  8h 56m | Avg: 25m 32s | Max: 52m 53s | Hits:  82%/2906  
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 251)

# Runner
178 linux-amd64-cpu16
42 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

Contributor

🟨 CI finished in 4h 49m: Pass: 99%/251 | Total: 3d 16h | Avg: 21m 03s | Max: 1h 40m | Hits: 72%/17373
  • 🟨 cub: Pass: 99%/132 | Total: 2d 08h | Avg: 25m 39s | Max: 1h 40m | Hits: 59%/4296

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/124 | Total:  2d 04h | Avg: 25m 20s | Max:  1h 40m | Hits:  59%/4296  
      🟩 arm64              Pass: 100%/8   | Total:  4h 05m | Avg: 30m 37s | Max: 55m 28s
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  5h 03m | Avg: 20m 14s | Max: 52m 23s | Hits:  59%/716   
      🟩 11.8               Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🔍 12.5               Pass:  99%/114 | Total:  2d 01h | Avg: 26m 15s | Max:  1h 40m | Hits:  59%/3580  
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  3m 40s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  5h 03m | Avg: 20m 14s | Max: 52m 23s | Hits:  59%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🔍 nvcc12.5           Pass:  99%/112 | Total:  2d 01h | Avg: 26m 40s | Max:  1h 40m | Hits:  59%/3580  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  3m 40s
      🔍 nvcc               Pass:  99%/130 | Total:  2d 08h | Avg: 25m 59s | Max:  1h 40m | Hits:  59%/4296  
    🔍 cxx: Clang17 🔍
      🟩 Clang9             Pass: 100%/6   | Total:  2h 03m | Avg: 20m 39s | Max: 53m 29s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 06s | Max: 50m 46s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 18s | Max: 50m 10s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 54m | Avg: 28m 39s | Max: 50m 12s
      🟩 Clang13            Pass: 100%/4   | Total:  2h 04m | Avg: 31m 04s | Max: 54m 35s
      🟩 Clang14            Pass: 100%/4   | Total:  2h 03m | Avg: 30m 54s | Max: 54m 39s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 26s | Max: 51m 24s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 56m | Avg: 29m 03s | Max: 52m 03s
      🔍 Clang17            Pass:  96%/26  | Total:  9h 04m | Avg: 20m 55s | Max: 53m 48s
      🟩 GCC6               Pass: 100%/2   | Total:  6m 55s | Avg:  3m 27s | Max:  3m 31s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 49m | Avg: 18m 18s | Max: 47m 17s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 00m | Avg: 20m 02s | Max: 52m 47s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 37m | Avg: 26m 18s | Max: 51m 09s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 57m | Avg: 29m 17s | Max: 51m 53s
      🟩 GCC11              Pass: 100%/7   | Total:  3h 27m | Avg: 29m 37s | Max:  1h 11m
      🟩 GCC12              Pass: 100%/4   | Total:  1h 58m | Avg: 29m 33s | Max: 51m 53s
      🟩 GCC13              Pass: 100%/29  | Total: 11h 37m | Avg: 24m 03s | Max:  1h 40m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 02m | Avg: 20m 58s | Max: 52m 53s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 52m 23s | Avg: 52m 23s | Max: 52m 23s | Hits:  59%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 02m | Hits:  59%/1432  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 56m | Avg: 58m 45s | Max: 59m 26s | Hits:  59%/2148  
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  98%/59  | Total:  1d 00h | Avg: 24m 24s | Max: 54m 39s
      🟩 GCC                Pass: 100%/64  | Total:  1d 01h | Avg: 23m 59s | Max:  1h 40m
      🟩 Intel              Pass: 100%/3   | Total:  1h 02m | Avg: 20m 58s | Max: 52m 53s
      🟩 MSVC               Pass: 100%/6   | Total:  5h 48m | Avg: 58m 08s | Max:  1h 02m | Hits:  59%/4296  
    🔍 jobs: DeviceLaunch 🔍
      🟩 Build              Pass: 100%/99  | Total:  1d 20h | Avg: 27m 09s | Max:  1h 11m | Hits:  59%/4296  
      🔍 DeviceLaunch       Pass:  87%/8   | Total:  2h 09m | Avg: 16m 14s | Max: 19m 55s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 13m | Avg: 16m 42s | Max: 32m 53s
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 13m | Avg: 16m 41s | Max: 18m 33s
      🟩 SmallGMem          Pass: 100%/1   | Total: 31m 38s | Avg: 31m 38s | Max: 31m 38s
      🟩 TestGPU            Pass: 100%/8   | Total:  4h 29m | Avg: 33m 39s | Max:  1h 40m
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/34  | Total:  6h 05m | Avg: 10m 44s | Max: 41m 23s
      🟩 14                 Pass: 100%/37  | Total:  8h 16m | Avg: 13m 24s | Max: 58m 20s | Hits:  59%/2148  
      🟩 17                 Pass: 100%/37  | Total:  1d 03h | Avg: 44m 07s | Max:  1h 40m | Hits:  59%/1432  
      🔍 20                 Pass:  95%/24  | Total: 14h 53m | Avg: 37m 13s | Max: 59m 26s | Hits:  59%/716   
    🟨 gpu
      🟨 v100               Pass:  99%/132 | Total:  2d 08h | Avg: 25m 39s | Max:  1h 40m | Hits:  59%/4296  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🟩 90a                Pass: 100%/4   | Total: 52m 24s | Avg: 13m 06s | Max: 21m 43s
    
  • 🟩 thrust: Pass: 100%/118 | Total: 1d 07h | Avg: 16m 00s | Max: 52m 53s | Hits: 76%/13077

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  1d 05h | Avg: 16m 04s | Max: 52m 53s | Hits:  76%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  2h 00m | Avg: 15m 05s | Max: 27m 02s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  3h 12m | Avg: 12m 50s | Max: 48m 28s | Hits:  65%/1453  
      🟩 11.8               Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 12.5               Pass: 100%/100 | Total:  1d 03h | Avg: 16m 31s | Max: 52m 53s | Hits:  78%/11624 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 34s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  3h 12m | Avg: 12m 50s | Max: 48m 28s | Hits:  65%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 03h | Avg: 16m 46s | Max: 52m 53s | Hits:  78%/11624 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 34s
      🟩 nvcc               Pass: 100%/116 | Total:  1d 07h | Avg: 16m 12s | Max: 52m 53s | Hits:  76%/13077 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  1h 12m | Avg: 12m 09s | Max: 28m 39s
      🟩 Clang10            Pass: 100%/3   | Total: 41m 59s | Avg: 13m 59s | Max: 31m 52s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 03m | Avg: 15m 45s | Max: 29m 03s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 28m 53s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 37s | Max: 27m 09s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 34s | Max: 28m 02s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 05m | Avg: 16m 28s | Max: 28m 48s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 06m | Avg: 16m 35s | Max: 29m 46s
      🟩 Clang17            Pass: 100%/18  | Total:  3h 30m | Avg: 11m 41s | Max: 28m 29s
      🟩 GCC6               Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  8m 21s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 45s | Max: 29m 45s
      🟩 GCC8               Pass: 100%/6   | Total:  1h 13m | Avg: 12m 12s | Max: 31m 52s
      🟩 GCC9               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 46s | Max: 30m 33s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 10m | Avg: 17m 34s | Max: 31m 09s
      🟩 GCC11              Pass: 100%/7   | Total:  1h 51m | Avg: 15m 52s | Max: 35m 56s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 30m 58s
      🟩 GCC13              Pass: 100%/20  | Total:  4h 53m | Avg: 14m 41s | Max: 52m 53s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 36m 38s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 48m 28s | Avg: 48m 28s | Max: 48m 28s | Hits:  65%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 36s | Max: 51m 56s | Hits:  65%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 29m | Avg: 34m 55s | Max: 51m 37s | Hits:  82%/8718  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 11h 50m | Avg: 13m 55s | Max: 31m 52s
      🟩 GCC                Pass: 100%/55  | Total: 12h 49m | Avg: 13m 59s | Max: 52m 53s
      🟩 Intel              Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 36m 38s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 01m | Avg: 40m 07s | Max: 51m 56s | Hits:  76%/13077 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  1d 07h | Avg: 16m 00s | Max: 52m 53s | Hits:  76%/13077 
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 03h | Avg: 16m 27s | Max: 51m 56s | Hits:  65%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 53m | Avg: 10m 21s | Max: 20m 22s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 25m | Avg: 18m 12s | Max: 52m 53s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 90a                Pass: 100%/4   | Total: 38m 12s | Avg:  9m 33s | Max: 16m 00s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 24m | Avg:  4m 48s | Max: 14m 42s
      🟩 14                 Pass: 100%/34  | Total:  5h 25m | Avg:  9m 35s | Max: 51m 16s | Hits:  74%/5812  
      🟩 17                 Pass: 100%/33  | Total: 14h 41m | Avg: 26m 43s | Max: 51m 56s | Hits:  76%/4359  
      🟩 20                 Pass: 100%/21  | Total:  8h 56m | Avg: 25m 32s | Max: 52m 53s | Hits:  82%/2906  
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 251)

# Runner
178 linux-amd64-cpu16
42 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

Contributor

🟩 CI finished in 5h 24m: Pass: 100%/251 | Total: 3d 16h | Avg: 21m 08s | Max: 1h 40m | Hits: 72%/17373
  • 🟩 cub: Pass: 100%/132 | Total: 2d 08h | Avg: 25m 48s | Max: 1h 40m | Hits: 59%/4296

    🟩 cpu
      🟩 amd64              Pass: 100%/124 | Total:  2d 04h | Avg: 25m 29s | Max:  1h 40m | Hits:  59%/4296  
      🟩 arm64              Pass: 100%/8   | Total:  4h 05m | Avg: 30m 37s | Max: 55m 28s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  5h 03m | Avg: 20m 14s | Max: 52m 23s | Hits:  59%/716   
      🟩 11.8               Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🟩 12.5               Pass: 100%/114 | Total:  2d 02h | Avg: 26m 26s | Max:  1h 40m | Hits:  59%/3580  
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  3m 40s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  5h 03m | Avg: 20m 14s | Max: 52m 23s | Hits:  59%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🟩 nvcc12.5           Pass: 100%/112 | Total:  2d 02h | Avg: 26m 50s | Max:  1h 40m | Hits:  59%/3580  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 07s | Avg:  3m 33s | Max:  3m 40s
      🟩 nvcc               Pass: 100%/130 | Total:  2d 08h | Avg: 26m 09s | Max:  1h 40m | Hits:  59%/4296  
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 03m | Avg: 20m 39s | Max: 53m 29s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 06s | Max: 50m 46s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 18s | Max: 50m 10s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 54m | Avg: 28m 39s | Max: 50m 12s
      🟩 Clang13            Pass: 100%/4   | Total:  2h 04m | Avg: 31m 04s | Max: 54m 35s
      🟩 Clang14            Pass: 100%/4   | Total:  2h 03m | Avg: 30m 54s | Max: 54m 39s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 26s | Max: 51m 24s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 56m | Avg: 29m 03s | Max: 52m 03s
      🟩 Clang17            Pass: 100%/26  | Total:  9h 23m | Avg: 21m 41s | Max: 53m 48s
      🟩 GCC6               Pass: 100%/2   | Total:  6m 55s | Avg:  3m 27s | Max:  3m 31s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 49m | Avg: 18m 18s | Max: 47m 17s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 00m | Avg: 20m 02s | Max: 52m 47s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 37m | Avg: 26m 18s | Max: 51m 09s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 57m | Avg: 29m 17s | Max: 51m 53s
      🟩 GCC11              Pass: 100%/7   | Total:  3h 27m | Avg: 29m 37s | Max:  1h 11m
      🟩 GCC12              Pass: 100%/4   | Total:  1h 58m | Avg: 29m 33s | Max: 51m 53s
      🟩 GCC13              Pass: 100%/29  | Total: 11h 37m | Avg: 24m 03s | Max:  1h 40m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 02m | Avg: 20m 58s | Max: 52m 53s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 52m 23s | Avg: 52m 23s | Max: 52m 23s | Hits:  59%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 02m | Hits:  59%/1432  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 56m | Avg: 58m 45s | Max: 59m 26s | Hits:  59%/2148  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  1d 00h | Avg: 24m 44s | Max: 54m 39s
      🟩 GCC                Pass: 100%/64  | Total:  1d 01h | Avg: 23m 59s | Max:  1h 40m
      🟩 Intel              Pass: 100%/3   | Total:  1h 02m | Avg: 20m 58s | Max: 52m 53s
      🟩 MSVC               Pass: 100%/6   | Total:  5h 48m | Avg: 58m 08s | Max:  1h 02m | Hits:  59%/4296  
    🟩 gpu
      🟩 v100               Pass: 100%/132 | Total:  2d 08h | Avg: 25m 48s | Max:  1h 40m | Hits:  59%/4296  
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 20h | Avg: 27m 09s | Max:  1h 11m | Hits:  59%/4296  
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 29m | Avg: 18m 42s | Max: 23m 29s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 13m | Avg: 16m 42s | Max: 32m 53s
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 13m | Avg: 16m 41s | Max: 18m 33s
      🟩 SmallGMem          Pass: 100%/1   | Total: 31m 38s | Avg: 31m 38s | Max: 31m 38s
      🟩 TestGPU            Pass: 100%/8   | Total:  4h 29m | Avg: 33m 39s | Max:  1h 40m
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 29m | Avg: 29m 48s | Max:  1h 11m
      🟩 90a                Pass: 100%/4   | Total: 52m 24s | Avg: 13m 06s | Max: 21m 43s
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  6h 05m | Avg: 10m 44s | Max: 41m 23s
      🟩 14                 Pass: 100%/37  | Total:  8h 16m | Avg: 13m 24s | Max: 58m 20s | Hits:  59%/2148  
      🟩 17                 Pass: 100%/37  | Total:  1d 03h | Avg: 44m 07s | Max:  1h 40m | Hits:  59%/1432  
      🟩 20                 Pass: 100%/24  | Total: 15h 13m | Avg: 38m 03s | Max: 59m 26s | Hits:  59%/716   
    
  • 🟩 thrust: Pass: 100%/118 | Total: 1d 07h | Avg: 16m 00s | Max: 52m 53s | Hits: 76%/13077

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  1d 05h | Avg: 16m 04s | Max: 52m 53s | Hits:  76%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  2h 00m | Avg: 15m 05s | Max: 27m 02s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  3h 12m | Avg: 12m 50s | Max: 48m 28s | Hits:  65%/1453  
      🟩 11.8               Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 12.5               Pass: 100%/100 | Total:  1d 03h | Avg: 16m 31s | Max: 52m 53s | Hits:  78%/11624 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 34s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  3h 12m | Avg: 12m 50s | Max: 48m 28s | Hits:  65%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 03h | Avg: 16m 46s | Max: 52m 53s | Hits:  78%/11624 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 41s | Avg:  4m 20s | Max:  4m 34s
      🟩 nvcc               Pass: 100%/116 | Total:  1d 07h | Avg: 16m 12s | Max: 52m 53s | Hits:  76%/13077 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  1h 12m | Avg: 12m 09s | Max: 28m 39s
      🟩 Clang10            Pass: 100%/3   | Total: 41m 59s | Avg: 13m 59s | Max: 31m 52s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 03m | Avg: 15m 45s | Max: 29m 03s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 28m 53s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 37s | Max: 27m 09s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 34s | Max: 28m 02s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 05m | Avg: 16m 28s | Max: 28m 48s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 06m | Avg: 16m 35s | Max: 29m 46s
      🟩 Clang17            Pass: 100%/18  | Total:  3h 30m | Avg: 11m 41s | Max: 28m 29s
      🟩 GCC6               Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  8m 21s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 45s | Max: 29m 45s
      🟩 GCC8               Pass: 100%/6   | Total:  1h 13m | Avg: 12m 12s | Max: 31m 52s
      🟩 GCC9               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 46s | Max: 30m 33s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 10m | Avg: 17m 34s | Max: 31m 09s
      🟩 GCC11              Pass: 100%/7   | Total:  1h 51m | Avg: 15m 52s | Max: 35m 56s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 30m 58s
      🟩 GCC13              Pass: 100%/20  | Total:  4h 53m | Avg: 14m 41s | Max: 52m 53s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 36m 38s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 48m 28s | Avg: 48m 28s | Max: 48m 28s | Hits:  65%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 36s | Max: 51m 56s | Hits:  65%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 29m | Avg: 34m 55s | Max: 51m 37s | Hits:  82%/8718  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 11h 50m | Avg: 13m 55s | Max: 31m 52s
      🟩 GCC                Pass: 100%/55  | Total: 12h 49m | Avg: 13m 59s | Max: 52m 53s
      🟩 Intel              Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 36m 38s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 01m | Avg: 40m 07s | Max: 51m 56s | Hits:  76%/13077 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  1d 07h | Avg: 16m 00s | Max: 52m 53s | Hits:  76%/13077 
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 03h | Avg: 16m 27s | Max: 51m 56s | Hits:  65%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 53m | Avg: 10m 21s | Max: 20m 22s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  2h 25m | Avg: 18m 12s | Max: 52m 53s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 43m 46s | Avg: 14m 35s | Max: 35m 56s
      🟩 90a                Pass: 100%/4   | Total: 38m 12s | Avg:  9m 33s | Max: 16m 00s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 24m | Avg:  4m 48s | Max: 14m 42s
      🟩 14                 Pass: 100%/34  | Total:  5h 25m | Avg:  9m 35s | Max: 51m 16s | Hits:  74%/5812  
      🟩 17                 Pass: 100%/33  | Total: 14h 41m | Avg: 26m 43s | Max: 51m 56s | Hits:  76%/4359  
      🟩 20                 Pass: 100%/21  | Total:  8h 56m | Avg: 25m 32s | Max: 52m 53s | Hits:  82%/2906  
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 36s | Avg: 11m 36s | Max: 11m 36s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 251)

# Runner
178 linux-amd64-cpu16
42 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber
Contributor Author

@gevtushenko: @elstehle said he wouldn't want to merge without your approval. So we are waiting for your approval to merge this PR.

@gevtushenko ping.

@davidwendt
Contributor

Per @elstehle's request, I tested this successfully with libcudf 24.10 without our current scan-tuning patch, and it worked well for us.


Contributor

🟩 CI finished in 2h 05m: Pass: 100%/251 | Total: 6d 03h | Avg: 35m 22s | Max: 1h 42m | Hits: 64%/17373
  • 🟩 cub: Pass: 100%/132 | Total: 4d 02h | Avg: 44m 50s | Max: 1h 42m | Hits: 42%/4296

    🟩 cpu
      🟩 amd64              Pass: 100%/124 | Total:  3d 19h | Avg: 44m 12s | Max:  1h 42m | Hits:  42%/4296  
      🟩 arm64              Pass: 100%/8   | Total:  7h 19m | Avg: 54m 54s | Max:  1h 02m
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total: 16h 00m | Avg:  1h 04m | Max:  1h 09m | Hits:  42%/716   
      🟩 11.8               Pass: 100%/3   | Total:  4h 53m | Avg:  1h 37m | Max:  1h 42m
      🟩 12.5               Pass: 100%/114 | Total:  3d 05h | Avg: 40m 56s | Max:  1h 02m | Hits:  42%/3580  
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 47m 50s | Avg: 23m 55s | Max: 26m 01s
      🟩 nvcc11.1           Pass: 100%/15  | Total: 16h 00m | Avg:  1h 04m | Max:  1h 09m | Hits:  42%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total:  4h 53m | Avg:  1h 37m | Max:  1h 42m
      🟩 nvcc12.5           Pass: 100%/112 | Total:  3d 04h | Avg: 41m 14s | Max:  1h 02m | Hits:  42%/3580  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 50s | Avg: 23m 55s | Max: 26m 01s
      🟩 nvcc               Pass: 100%/130 | Total:  4d 01h | Avg: 45m 10s | Max:  1h 42m | Hits:  42%/4296  
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  5h 42m | Avg: 57m 08s | Max:  1h 06m
      🟩 Clang10            Pass: 100%/3   | Total:  2h 32m | Avg: 50m 50s | Max: 51m 57s
      🟩 Clang11            Pass: 100%/4   | Total:  3h 12m | Avg: 48m 06s | Max: 48m 24s
      🟩 Clang12            Pass: 100%/4   | Total:  3h 22m | Avg: 50m 40s | Max: 54m 57s
      🟩 Clang13            Pass: 100%/4   | Total:  3h 22m | Avg: 50m 31s | Max: 54m 36s
      🟩 Clang14            Pass: 100%/4   | Total:  3h 18m | Avg: 49m 38s | Max: 52m 41s
      🟩 Clang15            Pass: 100%/4   | Total:  3h 14m | Avg: 48m 30s | Max: 49m 25s
      🟩 Clang16            Pass: 100%/4   | Total:  3h 29m | Avg: 52m 17s | Max: 58m 03s
      🟩 Clang17            Pass: 100%/26  | Total: 13h 17m | Avg: 30m 40s | Max: 54m 10s
      🟩 GCC6               Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m
      🟩 GCC7               Pass: 100%/6   | Total:  5h 32m | Avg: 55m 22s | Max:  1h 03m
      🟩 GCC8               Pass: 100%/6   | Total:  5h 41m | Avg: 56m 56s | Max:  1h 06m
      🟩 GCC9               Pass: 100%/6   | Total:  5h 45m | Avg: 57m 39s | Max:  1h 09m
      🟩 GCC10              Pass: 100%/4   | Total:  3h 23m | Avg: 50m 50s | Max: 53m 13s
      🟩 GCC11              Pass: 100%/7   | Total:  8h 06m | Avg:  1h 09m | Max:  1h 42m
      🟩 GCC12              Pass: 100%/4   | Total:  3h 16m | Avg: 49m 09s | Max: 50m 35s
      🟩 GCC13              Pass: 100%/29  | Total: 14h 32m | Avg: 30m 04s | Max:  1h 02m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 50m | Avg: 56m 55s | Max: 59m 41s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 53m 30s | Avg: 53m 30s | Max: 53m 30s | Hits:  42%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 38s | Max: 58m 49s | Hits:  42%/1432  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 59m | Avg: 59m 58s | Max:  1h 01m | Hits:  42%/2148  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  1d 17h | Avg: 42m 14s | Max:  1h 06m
      🟩 GCC                Pass: 100%/64  | Total:  2d 00h | Avg: 45m 25s | Max:  1h 42m
      🟩 Intel              Pass: 100%/3   | Total:  2h 50m | Avg: 56m 55s | Max: 59m 41s
      🟩 MSVC               Pass: 100%/6   | Total:  5h 50m | Avg: 58m 26s | Max:  1h 01m | Hits:  42%/4296  
    🟩 gpu
      🟩 v100               Pass: 100%/132 | Total:  4d 02h | Avg: 44m 50s | Max:  1h 42m | Hits:  42%/4296  
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  3d 15h | Avg: 52m 45s | Max:  1h 42m | Hits:  42%/4296  
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 25m | Avg: 18m 12s | Max: 23m 28s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 28m | Avg: 18m 35s | Max: 23m 44s
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 33m | Avg: 19m 12s | Max: 22m 55s
      🟩 SmallGMem          Pass: 100%/1   | Total: 38m 37s | Avg: 38m 37s | Max: 38m 37s
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 30m | Avg: 26m 16s | Max: 30m 05s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  4h 53m | Avg:  1h 37m | Max:  1h 42m
      🟩 90a                Pass: 100%/4   | Total:  1h 25m | Avg: 21m 21s | Max: 22m 59s
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  1d 02h | Avg: 46m 16s | Max:  1h 36m
      🟩 14                 Pass: 100%/37  | Total:  1d 04h | Avg: 46m 05s | Max:  1h 34m | Hits:  42%/2148  
      🟩 17                 Pass: 100%/37  | Total:  1d 04h | Avg: 45m 52s | Max:  1h 42m | Hits:  42%/1432  
      🟩 20                 Pass: 100%/24  | Total: 15h 44m | Avg: 39m 20s | Max:  1h 02m | Hits:  42%/716   
    
  • 🟩 thrust: Pass: 100%/118 | Total: 2d 01h | Avg: 24m 58s | Max: 58m 41s | Hits: 71%/13077

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  1d 21h | Avg: 24m 56s | Max: 58m 41s | Hits:  71%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  3h 22m | Avg: 25m 20s | Max: 28m 13s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  6h 25m | Avg: 25m 42s | Max: 50m 44s | Hits:  57%/1453  
      🟩 11.8               Pass: 100%/3   | Total:  1h 47m | Avg: 35m 57s | Max: 43m 27s
      🟩 12.5               Pass: 100%/100 | Total:  1d 16h | Avg: 24m 32s | Max: 58m 41s | Hits:  73%/11624 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 49m 38s | Avg: 24m 49s | Max: 25m 14s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 25m | Avg: 25m 42s | Max: 50m 44s | Hits:  57%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 47m | Avg: 35m 57s | Max: 43m 27s
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 16h | Avg: 24m 31s | Max: 58m 41s | Hits:  73%/11624 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 49m 38s | Avg: 24m 49s | Max: 25m 14s
      🟩 nvcc               Pass: 100%/116 | Total:  2d 00h | Avg: 24m 58s | Max: 58m 41s | Hits:  71%/13077 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 30m | Avg: 25m 07s | Max: 30m 51s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 20m | Avg: 26m 53s | Max: 30m 29s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 47m | Avg: 26m 53s | Max: 31m 59s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 01s | Max: 29m 32s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 45s | Max: 28m 28s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 45s | Max: 28m 33s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 10s | Max: 27m 24s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 40m | Avg: 25m 12s | Max: 27m 28s
      🟩 Clang17            Pass: 100%/18  | Total:  5h 27m | Avg: 18m 12s | Max: 29m 22s
      🟩 GCC6               Pass: 100%/2   | Total: 48m 01s | Avg: 24m 00s | Max: 27m 53s
      🟩 GCC7               Pass: 100%/6   | Total:  2h 28m | Avg: 24m 41s | Max: 28m 27s
      🟩 GCC8               Pass: 100%/6   | Total:  2h 27m | Avg: 24m 33s | Max: 30m 02s
      🟩 GCC9               Pass: 100%/6   | Total:  2h 30m | Avg: 25m 04s | Max: 28m 23s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 48m | Avg: 27m 01s | Max: 28m 38s
      🟩 GCC11              Pass: 100%/7   | Total:  3h 37m | Avg: 31m 01s | Max: 43m 27s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 50m | Avg: 27m 34s | Max: 29m 52s
      🟩 GCC13              Pass: 100%/20  | Total:  5h 54m | Avg: 17m 42s | Max: 29m 12s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 36m | Avg: 32m 07s | Max: 33m 59s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 50m 44s | Avg: 50m 44s | Max: 50m 44s | Hits:  57%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 45m | Avg: 52m 56s | Max: 58m 41s | Hits:  57%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 48m | Avg: 38m 00s | Max: 58m 10s | Hits:  78%/8718  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 19h 42m | Avg: 23m 11s | Max: 31m 59s
      🟩 GCC                Pass: 100%/55  | Total: 21h 23m | Avg: 23m 20s | Max: 43m 27s
      🟩 Intel              Pass: 100%/3   | Total:  1h 36m | Avg: 32m 07s | Max: 33m 59s
      🟩 MSVC               Pass: 100%/9   | Total:  6h 24m | Avg: 42m 44s | Max: 58m 41s | Hits:  71%/13077 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  2d 01h | Avg: 24m 58s | Max: 58m 41s | Hits:  71%/13077 
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 21h | Avg: 27m 30s | Max: 58m 41s | Hits:  57%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 54m | Avg: 10m 24s | Max: 22m 51s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 50m | Avg: 13m 45s | Max: 16m 41s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 47m | Avg: 35m 57s | Max: 43m 27s
      🟩 90a                Pass: 100%/4   | Total: 57m 30s | Avg: 14m 22s | Max: 15m 24s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total: 10h 11m | Avg: 20m 23s | Max: 29m 56s
      🟩 14                 Pass: 100%/34  | Total: 14h 53m | Avg: 26m 16s | Max: 50m 44s | Hits:  67%/5812  
      🟩 17                 Pass: 100%/33  | Total: 15h 13m | Avg: 27m 40s | Max: 58m 41s | Hits:  71%/4359  
      🟩 20                 Pass: 100%/21  | Total:  8h 48m | Avg: 25m 10s | Max: 58m 10s | Hits:  78%/2906  
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 21s | Avg: 11m 21s | Max: 11m 21s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 251)

# Runner
178 linux-amd64-cpu16
42 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@elstehle elstehle merged commit bda69fd into NVIDIA:main Sep 26, 2024
265 of 266 checks passed
@bernhardmgruber bernhardmgruber deleted the chained_policy_prune branch September 26, 2024 17:47
Labels
cub For all items related to CUB
Projects
Archived in project
6 participants