This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Add 1.14.0 changelog. #358

Merged
merged 1 commit into from
Aug 24, 2021

54 changes: 54 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,57 @@
# CUB 1.14.0 (NVIDIA HPC SDK 21.9)

## Summary

CUB 1.14.0 is a major release accompanying the NVIDIA HPC SDK 21.9.

This release provides the often-requested merge sort algorithm, ported from the
`thrust::sort` implementation. Merge sort provides more flexibility than the
existing radix sort by supporting arbitrary data types and comparators, though
radix sorting is still faster for supported inputs. This functionality is
provided through the new `cub::DeviceMergeSort` and `cub::BlockMergeSort`
algorithms.
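
As a rough illustration of the flexibility described above, the sketch below sorts a user-defined struct with a custom comparator through `cub::DeviceMergeSort::SortKeys`, using CUB's usual two-call pattern (size query, then the actual run). The `Record` type, the `ByScoreDescending` comparator, and the `sort_records_by_score` helper are hypothetical names introduced only for this example, and CUDA error checking is omitted for brevity.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

#include <cstddef>
#include <vector>

// Hypothetical record type (not part of CUB): a struct like this needs a
// comparator-based sort rather than a radix sort on a primitive key.
struct Record
{
  int   id;
  float score;
};

// Hypothetical comparator: orders records by descending score.
struct ByScoreDescending
{
  __host__ __device__ bool operator()(const Record &lhs, const Record &rhs) const
  {
    return lhs.score > rhs.score;
  }
};

// Copies the records to the device, sorts them there, and copies them back.
// Error checking is omitted to keep the sketch short.
std::vector<Record> sort_records_by_score(std::vector<Record> h_records)
{
  const int num_items = static_cast<int>(h_records.size());
  if (num_items == 0)
  {
    return h_records;
  }

  const std::size_t bytes = num_items * sizeof(Record);
  Record *d_records = nullptr;
  cudaMalloc(&d_records, bytes);
  cudaMemcpy(d_records, h_records.data(), bytes, cudaMemcpyHostToDevice);

  // First call: d_temp_storage is null, so only the required size is computed.
  void *d_temp_storage = nullptr;
  std::size_t temp_storage_bytes = 0;
  cub::DeviceMergeSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                 d_records, num_items, ByScoreDescending{});

  // Second call: performs the in-place sort using the allocated scratch space.
  cudaMalloc(&d_temp_storage, temp_storage_bytes);
  cub::DeviceMergeSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                 d_records, num_items, ByScoreDescending{});

  cudaMemcpy(h_records.data(), d_records, bytes, cudaMemcpyDeviceToHost);
  cudaFree(d_temp_storage);
  cudaFree(d_records);
  return h_records;
}
```

Calling `sort_records_by_score` on a small host vector returns the same records ordered from highest to lowest score. `SortKeys` is not documented as stable, so the relative order of records with equal scores should not be relied on.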

The namespace wrapping mechanism has been overhauled for 1.14. The existing
macros (`CUB_NS_PREFIX`/`CUB_NS_POSTFIX`) can now be replaced by a single macro,
`CUB_WRAPPED_NAMESPACE`, which is set to the name of the desired wrapped
namespace. Defining a similar `THRUST_CUB_WRAPPED_NAMESPACE` macro will embed
both `thrust::` and `cub::` symbols in the same external namespace. The
prefix/postfix macros are still supported, but now require a new
`CUB_NS_QUALIFIER` macro to be defined, which provides the fully qualified CUB
namespace (e.g. `::foo::cub`). See `cub/util_namespace.cuh` for details.
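
The wrapping mechanism described above might be used roughly as follows. This is a minimal sketch, not an excerpt from `cub/util_namespace.cuh`: the namespace name `my_lib` and the `wrapped_sum` helper are hypothetical, and the macro has to be defined before the first CUB or Thrust header is included in the translation unit. Error checking is omitted.

```cuda
// Must appear before any CUB/Thrust include. `my_lib` is a made-up name.
// With this macro, both libraries are embedded in the same outer namespace:
// ::my_lib::cub and ::my_lib::thrust.
#define THRUST_CUB_WRAPPED_NAMESPACE my_lib

// CUB-only alternative:
//   #define CUB_WRAPPED_NAMESPACE my_lib
//
// Legacy alternative (the qualifier is now required alongside prefix/postfix):
//   #define CUB_NS_PREFIX namespace my_lib {
//   #define CUB_NS_POSTFIX }
//   #define CUB_NS_QUALIFIER ::my_lib::cub

#include <cub/cub.cuh>
#include <cuda_runtime.h>

#include <cstddef>
#include <vector>

// Sums a host vector on the device, referring to CUB through the wrapped
// namespace. Error checking is omitted to keep the sketch short.
int wrapped_sum(const std::vector<int> &h_in)
{
  const int num_items = static_cast<int>(h_in.size());

  int *d_in  = nullptr;
  int *d_out = nullptr;
  cudaMalloc(&d_in, num_items * sizeof(int));
  cudaMalloc(&d_out, sizeof(int));
  cudaMemcpy(d_in, h_in.data(), num_items * sizeof(int), cudaMemcpyHostToDevice);

  // Size query, then the actual reduction.
  void *d_temp_storage = nullptr;
  std::size_t temp_storage_bytes = 0;
  my_lib::cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes,
                                 d_in, d_out, num_items);
  cudaMalloc(&d_temp_storage, temp_storage_bytes);
  my_lib::cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes,
                                 d_in, d_out, num_items);

  int h_out = 0;
  cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);

  cudaFree(d_temp_storage);
  cudaFree(d_out);
  cudaFree(d_in);
  return h_out;
}
```

With the macro in place, an unqualified `::cub::DeviceReduce` would no longer be declared in this translation unit, which is the point of wrapping: two libraries that bundle different CUB versions can coexist in one binary without symbol clashes.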

## Breaking Changes

- NVIDIA/cub#350: When the `CUB_NS_[PRE|POST]FIX` macros are set,
`CUB_NS_QUALIFIER` must also be defined to the fully qualified CUB namespace
(e.g. `#define CUB_NS_QUALIFIER ::foo::cub`). Note that this is handled
automatically when using the new `[THRUST_]CUB_WRAPPED_NAMESPACE` mechanism.

## New Features

- NVIDIA/cub#322: Ported the merge sort algorithm from Thrust:
`cub::BlockMergeSort` and `cub::DeviceMergeSort` are now available.
- NVIDIA/cub#326: Simplified the namespace wrapper macros and added detection of
Thrust symbols that are in a wrapped namespace.

## Bug Fixes

- NVIDIA/cub#160, NVIDIA/cub#163, NVIDIA/cub#352: Fixed several bugs in
`cub::DeviceSpmv` and added basic tests for this algorithm. Thanks to James
Wyles and Seunghwa Kang for their contributions.
- NVIDIA/cub#328: Fixed error handling bug and incorrect debugging output in
`cub::CachingDeviceAllocator`. Thanks to Felix Kallenborn for this
contribution.
- NVIDIA/cub#335: Fixed a compile error affecting clang and NVRTC. Thanks to
Jiading Guo for this contribution.
- NVIDIA/cub#351: Fixed some errors in the `cub::DeviceHistogram` documentation.

## Enhancements

- NVIDIA/cub#348: Added an example that demonstrates how to use dynamic shared
memory with a CUB block algorithm. Thanks to Matthias Jouanneaux for this
contribution.
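
The general pattern behind such an example can be sketched as below. This is not the example added in NVIDIA/cub#348, only an illustration under stated assumptions: it uses `cub::BlockReduce` as the block algorithm, and the kernel name, `kBlockThreads`, and the `block_sum` helper are hypothetical. The idea is to place the algorithm's `TempStorage` in the dynamically sized `extern __shared__` buffer and pass its size as the third launch parameter. Error checking is omitted.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

#include <algorithm>
#include <cstddef>
#include <vector>

constexpr int kBlockThreads = 128; // hypothetical block size for this sketch

template <int BLOCK_THREADS>
__global__ void block_sum_kernel(const int *d_in, int *d_out)
{
  using BlockReduceT = cub::BlockReduce<int, BLOCK_THREADS>;
  using TempStorageT = typename BlockReduceT::TempStorage;

  // Dynamic shared memory: its size is chosen at launch time, not compile time.
  extern __shared__ __align__(16) unsigned char smem[];
  TempStorageT &temp_storage = *reinterpret_cast<TempStorageT *>(smem);

  // Each thread contributes one item; the result is valid in thread 0 only.
  int aggregate = BlockReduceT(temp_storage).Sum(d_in[threadIdx.x]);
  if (threadIdx.x == 0)
  {
    *d_out = aggregate;
  }
}

// Sums up to kBlockThreads values with a single thread block; shorter inputs
// are zero-padded and extra elements are ignored. Error checking is omitted.
int block_sum(const std::vector<int> &h_in)
{
  std::vector<int> padded(kBlockThreads, 0);
  std::copy_n(h_in.begin(),
              std::min<std::size_t>(h_in.size(), kBlockThreads),
              padded.begin());

  int *d_in  = nullptr;
  int *d_out = nullptr;
  cudaMalloc(&d_in, kBlockThreads * sizeof(int));
  cudaMalloc(&d_out, sizeof(int));
  cudaMemcpy(d_in, padded.data(), kBlockThreads * sizeof(int),
             cudaMemcpyHostToDevice);

  // The dynamic shared memory request must cover the algorithm's TempStorage.
  const std::size_t smem_bytes =
    sizeof(cub::BlockReduce<int, kBlockThreads>::TempStorage);
  block_sum_kernel<kBlockThreads><<<1, kBlockThreads, smem_bytes>>>(d_in, d_out);

  int h_out = 0;
  cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);

  cudaFree(d_out);
  cudaFree(d_in);
  return h_out;
}
```

Dynamic allocation is mainly useful when the required `TempStorage` size depends on a tuning choice made at run time or exceeds the static shared memory limit; for small fixed sizes, a plain `__shared__ TempStorage` declaration is simpler.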

# CUB 1.13.1 (CUDA Toolkit 11.5)

CUB 1.13.1 is a minor release accompanying the CUDA Toolkit 11.5.
2 changes: 2 additions & 0 deletions README.md
@@ -100,6 +100,8 @@ See the [changelog](CHANGELOG.md) for details about specific releases.

| CUB Release | Included In |
| ------------------------- | --------------------------------------- |
| 1.14.0 | NVIDIA HPC SDK 21.9 |
| 1.13.1 | CUDA Toolkit 11.5 |
| 1.13.0 | NVIDIA HPC SDK 21.7 |
| 1.12.1 | CUDA Toolkit 11.4 |
| 1.12.0 | NVIDIA HPC SDK 21.3 |