- PR #375 Support out-of-band buffers in Python pickling
- PR #391 Add
get_default_resource_type
- PR #396 Remove deprecated RMM APIs
- PR #425 Add CUDA per-thread default stream support and thread safety to
pool_memory_resource
- PR #436 Always build and test with per-thread default stream enabled in the GPU CI build
- PR #444 Add
owning_wrapper
to simplify lifetime management of resources and their upstreams - PR #450 Add support for new build process (Project Flash)
- PR #428 Add the option to automatically flush memory allocate/free logs
- PR #378 Use CMake
FetchContent
to obtain latest release ofcub
andthrust
- PR #377 A better way to fetch
spdlog
- PR #372 Use CMake
FetchContent
to obtaincnmem
instead of git submodule - PR #382 Rely on NumPy arrays for out-of-band pickling
- PR #386 Add short commit to conda package name
- PR #401 Update
get_ipc_handle()
to use cuda driver API - PR #404 Make all memory resources thread safe in Python
- PR #402 Install dependencies via rapids-build-env
- PR #405 Move doc customization scripts to Jenkins
- PR #427 Add DeviceBuffer.release() cdef method
- PR #414 Add element-wise access for device_uvector
- PR #421 Capture thread id in logging and improve logger testing
- PR #426 Added multi-threaded support to replay benchmark.
- PR #429 Fix debug build and add new CUDA assert utility.
- PR #435 Update conda upload versions for new supported CUDA/Python
- PR #437 Test with
pickle5
(for older Python versions) - PR #443 Remove thread safe adaptor from PoolMemoryResource
- PR #445 Make all resource operators/ctors explicit
- PR #447 Update Python README with info about DeviceBuffer/MemoryResource and external libraries
- PR #433 Fix python imports
- PR #400 Fix segfault in RANDOM_ALLOCATIONS_BENCH
- PR #383 Explicitly require NumPy
- PR #398 Fix missing head flag in merge_blocks (pool_memory_resource) and improve block class
- PR #403 Mark Cython
memory_resource_wrappers
extern
asnogil
- PR #406 Sets Google Benchmark to a fixed version, v1.5.1.
- PR #434: Fix issue with incorrect docker image being used in local build script
- PR #317 Provide External Memory Management Plugin for Numba
- PR #362 Add spdlog as a dependency in the conda package
- PR #360 Support logging to stdout/stderr
- PR #341 Enable logging
- PR #343 Add in option to statically link against cudart
- PR #364 Added new uninitialized device vector type,
device_uvector
- PR #369 Use CMake
FetchContent
to obtainspdlog
instead of vendoring - PR #366 Remove installation of extra test dependencies
- PR #354 Add CMake option for per-thread default stream
- PR #350 Add .clang-format file & format all files
- PR #358 Fix typo in
rmm_cupy_allocator
docstring - PR #357 Add Docker 19 support to local gpuci build
- PR #365 Make .clang-format consistent with cuGRAPH and cuDF
- PR #371 Add docs build script to repository
- PR #363 Expose
memory_resources
in Python
- PR #373 Fix build.sh
- PR #346 Add clearer exception message when RMM_LOG_FILE is unset
- PR #347 Mark rmmFinalizeWrapper nogil
- PR #348 Fix unintentional use of pool-managed resource.
- PR #367 Fix flake8 issues
- PR #368 Fix
clang-format
missing comma bug - PR #370 Fix stream and mr use in
device_buffer
methods - PR #379 Remove deprecated calls from synchronization.cpp
- PR #381 Remove test_benchmark.cpp from cmakelists
- PR #392 SPDLOG matches other header-only acquisition patterns
- PR #253 Add
frombytes
to convertbytes
-like toDeviceBuffer
- PR #252 Add
__sizeof__
method toDeviceBuffer
- PR #258 Define pickling behavior for
DeviceBuffer
- PR #261 Add
__bytes__
method toDeviceBuffer
- PR #262 Moved device memory resource files to
mr/device
directory - PR #266 Drop
rmm.auto_device
- PR #268 Add Cython/Python
copy_to_host
andto_device
- PR #272 Add
host_memory_resource
. - PR #273 Moved device memory resource tests to
device/
directory. - PR #274 Add
copy_from_host
method toDeviceBuffer
- PR #275 Add
copy_from_device
method toDeviceBuffer
- PR #283 Add random allocation benchmark.
- PR #287 Enabled CUDA CXX11 for unit tests.
- PR #292 Revamped RMM exceptions.
- PR #297 Use spdlog to implement
logging_resource_adaptor
. - PR #303 Added replay benchmark.
- PR #319 Add
thread_safe_resource_adaptor
class. - PR #314 New suballocator memory_resources.
- PR #330 Fixed incorrect name of
stream_free_blocks_
debug symbol. - PR #331 Move to C++14 and deprecate legacy APIs.
- PR #246 Type
DeviceBuffer
arguments to__cinit__
- PR #249 Use
DeviceBuffer
indevice_array
- PR #255 Add standard header to all Cython files
- PR #256 Cast through
uintptr_t
tocudaStream_t
- PR #254 Use
const void*
inDeviceBuffer.__cinit__
- PR #257 Mark Cython-exposed C++ functions that raise
- PR #269 Doc sync behavior in
copy_ptr_to_host
- PR #278 Allocate a
bytes
object to fill up with RMM log data - PR #280 Drop allocation/deallocation of
offset
- PR #282
DeviceBuffer
use default constructor for size=0 - PR #296 Use CuPy's
UnownedMemory
for RMM-backed allocations - PR #310 Improve
device_buffer
allocation logic. - PR #309 Sync default stream in
DeviceBuffer
constructor - PR #326 Sync only on copy construction
- PR #308 Fix typo in README
- PR #334 Replace
rmm_allocator
for Thrust allocations - PR #345 Remove stream synchronization from
device_scalar
constructor andset_value
- PR #298 Remove RMM_CUDA_TRY from cuda_event_timer destructor
- PR #299 Fix assert condition blocking debug builds
- PR #300 Fix host mr_tests compile error
- PR #312 Fix libcudf compilation errors due to explicit defaulted device_buffer constructor.
- PR #218 Add
_DevicePointer
- PR #219 Add method to copy
device_buffer
back to host memory - PR #222 Expose free and total memory in Python interface
- PR #235 Allow construction of
DeviceBuffer
with astream
- PR #214 Add codeowners
- PR #226 Add some tests of the Python
DeviceBuffer
- PR #233 Reuse the same
CUDA_HOME
logic from cuDF - PR #234 Add missing
size_t
inDeviceBuffer
- PR #239 Cleanup
DeviceBuffer
's__cinit__
- PR #242 Special case 0-size
DeviceBuffer
intobytes
- PR #244 Explicitly force
DeviceBuffer.size
to anint
- PR #247 Simplify casting in
tobytes
and other cleanup
- PR #215 Catch polymorphic exceptions by reference instead of by value
- PR #221 Fix segfault calling rmmGetInfo when uninitialized
- PR #225 Avoid invoking Python operations in c_free
- PR #230 Fix duplicate symbol issues with
copy_to_host
- PR #232 Move
copy_to_host
doc back to header file
- PR #106 Added multi-GPU initialization
- PR #167 Added value setter to
device_scalar
- PR #163 Add Cython bindings to
device_buffer
- PR #177 Add
__cuda_array_interface__
toDeviceBuffer
- PR #198 Add
rmm.rmm_cupy_allocator()
- PR #161 Use
std::atexit
to finalize RMM after Python interpreter shutdown - PR #165 Align memory resource allocation sizes to 8-byte
- PR #171 Change public API of RMM to only expose
reinitialize(...)
- PR #175 Drop
cython
from run requirements - PR #169 Explicit stream argument for device_buffer methods
- PR #186 Add nbytes and len to DeviceBuffer
- PR #188 Require kwargs in
DeviceBuffer
's constructor - PR #194 Drop unused imports from
device_buffer.pyx
- PR #196 Remove unused CUDA conda labels
- PR #200 Simplify DeviceBuffer methods
- PR #174 Make
device_buffer
default ctor explicit to work around type_dispatcher issue in libcudf. - PR #170 Always build librmm and rmm, but conditionally upload based on CUDA / Python version
- PR #182 Prefix
DeviceBuffer
's C functions - PR #189 Drop
__reduce__
fromDeviceBuffer
- PR #193 Remove thrown exception from
rmm_allocator::deallocate
- PR #224 Slice the CSV log before converting to bytes
- PR #99 Added
device_buffer
class - PR #133 Added
device_scalar
class
- PR #123 Remove driver install from ci scripts
- PR #131 Use YYMMDD tag in nightly build
- PR #137 Replace CFFI python bindings with Cython
- PR #127 Use Memory Resource classes for allocations
- PR #107 Fix local build generated file ownerships
- PR #110 Fix Skip Test Functionality
- PR #125 Fixed order of private variables in LogIt
- PR #139 Expose
_make_finalizer
python API needed by cuDF - PR #142 Fix ignored exceptions in Cython
- PR #146 Fix rmmFinalize() not freeing memory pools
- PR #149 Force finalization of RMM objects before RMM is finalized (Python)
- PR #154 Set ptr to 0 on rmm::alloc error
- PR #157 Check if initialized before freeing for Numba finalizer and use
weakref
instead ofatexit
- PR #96 Added
device_memory_resource
for beginning of overhaul of RMM design - PR #103 Add and use unified build script
- PR #111 Streamline CUDA_REL environment variable
- PR #113 Handle ucp.BufferRegion objects in auto_device
...
- PR #95 Add skip test functionality to build.sh
...
- PR #92 Update docs version
- PR #67 Add random_allocate microbenchmark in tests/performance
- PR #70 Create conda environments and conda recipes
- PR #77 Add local build script to mimic gpuCI
- PR #80 Add build script for docs
- PR #76 Add cudatoolkit conda dependency
- PR #84 Use latest release version in update-version CI script
- PR #90 Avoid using c++14 auto return type for thrust_rmm_allocator.h
- PR #68 Fix signed/unsigned mismatch in random_allocate benchmark
- PR #74 Fix rmm conda recipe librmm version pinning
- PR #72 Remove unnecessary _BSD_SOURCE define in random_allocate.cpp
- PR #43 Add gpuCI build & test scripts
- PR #44 Added API to query whether RMM is initialized and with what options.
- PR #60 Default to CXX11_ABI=ON
- PR #58 Eliminate unreliable check for change in available memory in test
- PR #49 Fix pep8 style errors detected by flake8
- PR #2 Added CUDA Managed Memory allocation mode
- PR #12 Enable building RMM as a submodule
- PR #13 CMake: Added CXX11ABI option and removed Travis references
- PR #16 CMake: Added PARALLEL_LEVEL environment variable handling for GTest build parallelism (matches cuDF)
- PR #17 Update README with v0.5 changes including Managed Memory support
- PR #10 Change cnmem submodule URL to use https
- PR #15 Temporarily disable hanging AllocateTB test for managed memory
- PR #28 Fix invalid reference to local stack variable in
rmm::exec_policy
- PR #1 Spun off RMM from cuDF into its own repository.
- CUDF PR #472 RMM: Created centralized rmm::device_vector alias and rmm::exec_policy
- CUDF PR #465 Added templated C++ API for RMM to avoid explicit cast to
void**
RMM was initially implemented as part of cuDF, so we include the relevant changelog history below.
- PR #336 CSV Reader string support
- CUDF PR #333 Add Rapids Memory Manager documentation
- CUDF PR #321 Rapids Memory Manager adds file/line location logging and convenience macros
These were initial releases of cuDF based on previously separate pyGDF and libGDF libraries. RMM was initially implemented as part of libGDF.