[RELEASE] cudf v21.12 #9689

Merged: 252 commits, Dec 3, 2021
Changes from 226 commits
a4771b3
Update cudf java bindings to 21.12.0-SNAPSHOT (#9248)
pxLi Sep 22, 2021
3ed97af
Merge branch-21.10 into branch-21.12
galipremsagar Sep 23, 2021
e0cf38b
Merge pull request #9285 from galipremsagar/branch-21.12-merge-21.10
ajschmidt8 Sep 23, 2021
e2098e5
Merge pull request #9288 from rapidsai/branch-21.10
GPUtester Sep 23, 2021
65850e2
Merge pull request #9293 from rapidsai/branch-21.10
GPUtester Sep 23, 2021
7277443
Merge pull request #9294 from rapidsai/branch-21.10
GPUtester Sep 23, 2021
8d4c523
Merge pull request #9295 from rapidsai/branch-21.10
GPUtester Sep 24, 2021
20498f7
Added deprecation warning for `.label_encoding()` (#9289)
mayankanand007 Sep 24, 2021
a269820
Merge pull request #9297 from rapidsai/branch-21.10
GPUtester Sep 24, 2021
2718443
Merge pull request #9298 from rapidsai/branch-21.10
GPUtester Sep 24, 2021
908c130
Merge pull request #9302 from rapidsai/branch-21.10
GPUtester Sep 24, 2021
165cdac
Pin mypy in .pre-commit-config.yaml to match conda environment pinnin…
bdice Sep 24, 2021
b4560f4
Potential overflow of `decimal32` when casting to `int64_t` (#9287)
codereport Sep 24, 2021
fd77ca6
add a new memory buffer that directly allocates with cudaMalloc (#9311)
rongou Sep 27, 2021
4de968e
Unpin `dask` and `distributed` in CI (#9307)
galipremsagar Sep 27, 2021
42ffe78
Use nvcomp's snappy decompressor in avro reader (#9181)
devavret Sep 27, 2021
0b007a5
Add shallow hash function and shallow equality comparison for column_…
karthikeyann Sep 27, 2021
e245cd5
Consolidate more methods in Frame (#9305)
vyasr Sep 27, 2021
6484e2a
Merge pull request #9323 from rapidsai/branch-21.10
GPUtester Sep 28, 2021
cafd943
Merge pull request #9335 from rapidsai/branch-21.10
GPUtester Sep 29, 2021
fd0b710
Merge pull request #9336 from rapidsai/branch-21.10
GPUtester Sep 29, 2021
bef2c35
Use f-string in join helper warning message. (#9325)
bdice Sep 29, 2021
fdb9e3b
Fix some unused variable warnings in libcudf (#9326)
davidwendt Sep 29, 2021
f9ce870
Pure-python masked UDFs (#9174)
brandon-b-miller Sep 29, 2021
8935f6d
Fix Cython compilation warnings. (#9327)
bdice Sep 29, 2021
e63f455
Merge pull request #9342 from rapidsai/branch-21.10
GPUtester Sep 29, 2021
840faf5
Avoid casting to list or struct dtypes in dask_cudf.read_parquet (#9314)
rjzamora Sep 30, 2021
ef50796
Add `isocalendar` API support (#9169)
marlenezw Sep 30, 2021
5cea6b5
Optionally nullify out-of-bounds indices in segmented_gather(). (#9318)
mythrocks Sep 30, 2021
ffb3814
Use Default Memory Resource for Temporaries in `reduction.cpp` (#9344)
isVoid Sep 30, 2021
65fe400
Set pass_filenames: false in mypy pre-commit configuration. (#9349)
bdice Sep 30, 2021
4d7e69a
Refactor cuIO timestamp processing with `cuda::std::chrono` (#9278)
PointKernel Sep 30, 2021
8e371cd
Merge pull request #9352 from rapidsai/branch-21.10
GPUtester Oct 1, 2021
4090c45
Fix memcheck error in groupby-tdigest get_scalar_minmax (#9339)
davidwendt Oct 1, 2021
91f1dea
Add optional-iterator support to indexalator (#9306)
davidwendt Oct 1, 2021
34b54ca
Deprecate method parameters to DataFrame.join, DataFrame.merge. (#9291)
bdice Oct 1, 2021
e597075
New array conversion methods (#9236)
vyasr Oct 1, 2021
cf0b2ca
doc reorder mr, stream to stream, mr (#9308)
karthikeyann Oct 1, 2021
3648783
Series `apply` method backed by masked UDFs (#9217)
brandon-b-miller Oct 1, 2021
e5203dc
Remove usage of deprecated thrust::host_space_tag. (#9350)
bdice Oct 1, 2021
1e5835a
Fixing SubwordTokenizer docs issue (#9354)
mayankanand007 Oct 4, 2021
d68e626
Use gather.hpp when gather-map exists in device memory (#9299)
davidwendt Oct 4, 2021
8bd7d68
Remove Table class (#9315)
vyasr Oct 4, 2021
fb18491
Add Arrow-NativeFile and PythonFile support to read_parquet and read_…
rjzamora Oct 4, 2021
122da20
Move rank scan implementations from scan_inclusive.cu to rank_scan.cu…
davidwendt Oct 5, 2021
fe04d21
Fix the crash in stats code (#9368)
devavret Oct 5, 2021
6593339
Fix libcudf compile warnings on debug 11.4 build (#9360)
davidwendt Oct 5, 2021
88eefe5
Merge pull request #9374 from rapidsai/branch-21.10
GPUtester Oct 5, 2021
25f8363
Improved deprecation warnings. (#9347)
bdice Oct 5, 2021
2f75ff3
Fix many documentation errors in libcudf. (#9355)
karthikeyann Oct 5, 2021
1424a2d
Update groupby result_cache to allow sharing intermediate results bas…
karthikeyann Oct 5, 2021
56edd42
Fix cudf_assert in cudf::io::orc::gpu::gpuDecodeOrcColumnData (#9348)
davidwendt Oct 5, 2021
4d6ecd1
Fix Java table partition test to account for non-deterministic orderi…
jlowe Oct 6, 2021
a743ce8
Change strings copy_if_else to use optional-iterator instead of pair-…
davidwendt Oct 6, 2021
4e7c820
Fix `StructColumn.to_pandas` type handling issues (#9388)
galipremsagar Oct 6, 2021
68c56b7
Use Arrow PythonFile for remote CSV storage (#9376)
rjzamora Oct 6, 2021
3f09f96
Fix null count in statistics for parquet (#9303)
devavret Oct 6, 2021
187ab65
Change all DeprecationWarnings to FutureWarning. (#9392)
bdice Oct 7, 2021
c6bc111
Add detail interface for `split` and `slice(table_view)`, refactors b…
isVoid Oct 7, 2021
f4ff454
Support Arrow NativeFile and PythonFile for remote ORC storage (#9377)
rjzamora Oct 7, 2021
aaea353
Add support for writing ORC with map columns (#9369)
vuule Oct 7, 2021
8203d3d
Fix timestamp truncation/overflow bugs in orc/parquet (#9382)
PointKernel Oct 7, 2021
dda5210
Frame copy to use __class__ instead of type() (#9397)
madsbk Oct 8, 2021
8bb1e86
Add multi-threaded writing to GDS writes (#9372)
devavret Oct 8, 2021
9b17f08
Correct issues in the build dir cudf-config.cmake (#9386)
robertmaynard Oct 8, 2021
f400ab1
Add parameters to control row index stride and stripe size in ORC wri…
vuule Oct 8, 2021
c593859
Add Java API to deserialize a table to host columns (#9402)
jlowe Oct 11, 2021
e6caaf5
Implement `one_hot_encoding` in libcudf and bind to python (#9229)
isVoid Oct 11, 2021
56eb91a
Expose OutOfBoundsPolicy in JNI for Table.gather (#9406)
abellina Oct 11, 2021
5e46c7e
Add cudf strings is_title API (#9380)
davidwendt Oct 12, 2021
7fa2738
Support Python UDFs written in terms of rows (#9343)
brandon-b-miller Oct 12, 2021
18c9763
Update to UCX-Py 0.23 (#9407)
pentschev Oct 12, 2021
8ff97f7
Merge pull request #9421 from rapidsai/branch-21.10
GPUtester Oct 12, 2021
fa571a7
Skip Comparing Uniform Window Results in Var/std Tests (#9416)
isVoid Oct 12, 2021
6dbea58
MD5 Python hash API (#9390)
bdice Oct 12, 2021
1ab315a
Use pre-commit for CI (#9412)
vyasr Oct 12, 2021
df27da2
JNI: Support nested types in ORC writer (#9334)
firestarman Oct 13, 2021
a4f6c6d
Use single kernel to extract all groups in cudf::strings::extract (#9…
davidwendt Oct 13, 2021
794863c
extract_list_elements() with column_view indices (#9367)
mythrocks Oct 13, 2021
f9806ff
Use optional-iterator for copy-if-else kernel (#9324)
davidwendt Oct 13, 2021
cff71ff
Add `ascending` parameter for dask-cudf `sort_values` (#9250)
charlesbluca Oct 13, 2021
bdd0922
Consolidate binary ops into `Frame` (#9357)
isVoid Oct 13, 2021
8a18d7a
Add IndexedFrame class and move SingleColumnFrame to a separate modul…
vyasr Oct 14, 2021
5bcb3e8
Fix quantile division / partition handling for dask-cudf sort on null…
charlesbluca Oct 14, 2021
4f47480
Make Series.hash_encode results reproducible. (#9366)
bdice Oct 14, 2021
800fd7b
BUG FIX: CSV Writer ignores the header parameter when no metadata is …
skirui-source Oct 14, 2021
a286974
Revert "Fix quantile division / partition handling for dask-cudf sort…
charlesbluca Oct 14, 2021
a85b56f
Allow int-like objects for the `decimals` argument in `round` (#9428)
shwina Oct 14, 2021
690cb39
Remove pyarrow import in `dask_cudf.io.parquet` (#9429)
charlesbluca Oct 14, 2021
061aa3a
Adds Deprecation Warnings to `one_hot_encoding` and Implement `get_du…
isVoid Oct 14, 2021
08ae072
Move Several Series Function to Frame (#9394)
isVoid Oct 15, 2021
13e9ec0
Update pre-commit hook URLs. (#9433)
bdice Oct 15, 2021
b66f14e
Fix stream compaction's `drop_duplicates` API to use stable sort (#9417)
ttnghia Oct 15, 2021
057728c
Fix bug in dask_cudf.read_parquet for index=False (#9453)
rjzamora Oct 15, 2021
baeffb8
Preserve the decimal scale when creating a default scalar (#9449)
revans2 Oct 15, 2021
edb6c78
Augment `order_by` to Accept a List of `null_precedence` (#9455)
isVoid Oct 16, 2021
8acf0dd
add missing stream to scalar.is_valid() wherever stream is available …
karthikeyann Oct 16, 2021
410efd9
Add Covariance, Pearson correlation for sort groupby (libcudf) (#9154)
karthikeyann Oct 18, 2021
4beee70
Improvements to tdigest aggregation code. (#9403)
nvdbaranec Oct 18, 2021
74763ba
Fix memcheck error in gtest SegmentedGatherTest/GatherSliced (#9442)
davidwendt Oct 18, 2021
399d5b5
Small clean up to simplify column selection code in ORC reader (#9444)
vuule Oct 18, 2021
823958b
Implement `lists::stable_sort_lists` for stable sorting of elements w…
ttnghia Oct 18, 2021
d6acedd
Push down parent nulls when flattening nested columns. (#9443)
mythrocks Oct 19, 2021
a3a27c6
add ctest memcheck using cuda-sanitizer (#9414)
karthikeyann Oct 19, 2021
a19bd23
Optimizations for `cudf.concat` when `axis=1` (#9333)
galipremsagar Oct 19, 2021
5e2aaf9
Support Unary Operations in Masked UDF (#9409)
isVoid Oct 19, 2021
4e04334
Deprecate Series.hash_encode. (#9457)
bdice Oct 19, 2021
6fa562f
Fix memcheck error in copy-if-else (#9467)
davidwendt Oct 19, 2021
2144034
Implement DataFrame.hash_values, deprecate DataFrame.hash_columns. (#…
bdice Oct 19, 2021
7e84aa1
Miscellaneous documentation fixes to `cudf` (#9471)
galipremsagar Oct 20, 2021
bb98505
Add fixed point to AllTypes in libcudf unit tests (#9472)
karthikeyann Oct 20, 2021
61f79f8
Add `na_position` param to dask-cudf `sort_values` (#9264)
charlesbluca Oct 20, 2021
fc868b8
Resolve `hash_columns` `FutureWarning` in `dask_cudf` (#9481)
pentschev Oct 20, 2021
919fedf
Enable Datetime/Timedelta dtypes in Masked UDFs (#9451)
brandon-b-miller Oct 20, 2021
52b7a9e
Rename strings/array_tests.cu to strings/array_tests.cpp (#9480)
davidwendt Oct 20, 2021
e4e4870
Miscellaneous column cleanup (#9370)
vyasr Oct 21, 2021
23de7d0
Add cudf python groupby.diff (#9446)
karthikeyann Oct 21, 2021
5d67946
Match conda pinnings for style checks (revert part of #9412, #9433). …
bdice Oct 21, 2021
5c76bc2
Update help message to escape quotes in ./build.sh --cmake-args. (#9494)
bdice Oct 21, 2021
72694d2
Refactor MD5 implementation. (#9212)
bdice Oct 21, 2021
ca77ca5
Improve Python docstring formatting. (#9493)
bdice Oct 25, 2021
6e88d26
Update table of I/O supported types (#9476)
vuule Oct 25, 2021
30c31c9
Fix mismatched types error in clip() when using non int64 numeric typ…
davidwendt Oct 25, 2021
ca40e18
Enable casting to int64, uint64, and double in AST code. (#9379)
vyasr Oct 25, 2021
5cd3003
Enable running tests using RMM arena and async memory resources (#9506)
rongou Oct 26, 2021
063c982
Update Java nvcomp JNI bindings to nvcomp 2.x API (#9384)
jbrennan333 Oct 26, 2021
d5228f0
Initial pass of generalizing `decimal` support in `cudf` python layer…
galipremsagar Oct 26, 2021
fea54e6
Cleanup some libcudf strings gtests (#9489)
davidwendt Oct 26, 2021
d6b624c
Fix for inserting duplicates in groupby result cache (#9508)
karthikeyann Oct 26, 2021
c0951ba
add min_periods, ddof to groupby covariance, & correlation aggregatio…
karthikeyann Oct 26, 2021
2208ed1
Fix regex handling of embedded null characters (#9470)
davidwendt Oct 26, 2021
90dd9fe
Use raw strings to avoid SyntaxErrors in parsed docstrings. (#9526)
bdice Oct 27, 2021
306e42f
Cleanup for flattening nested columns (#9509)
rwlee Oct 27, 2021
1c9a92b
Fix several test and benchmark issues related to bitmask allocations.…
nvdbaranec Oct 27, 2021
2e76b1a
Add Java bindings for rolling window stddev aggregation (#9527)
razajafri Oct 27, 2021
3c6d1ee
Simplify read_csv by removing unnecessary reader/impl classes (#9041)
cwharris Oct 27, 2021
d8f23c1
catch rmm::out_of_memory exceptions in jni (#9525)
rongou Oct 27, 2021
0606325
Add example to docstrings in `rolling.apply` (#9522)
isVoid Oct 27, 2021
ce385ce
Add an overload of `make_empty_column` with `type_id` parameter (#9524)
ttnghia Oct 27, 2021
baf8275
Generalize some more indexed frame methods (#9529)
vyasr Oct 27, 2021
28d9a55
Expose APIs to wrap CUDA or RMM allocations with a Java device buffer…
jlowe Oct 28, 2021
dce38d4
Fix pytests failing in `cuda-11.5` environment (#9547)
galipremsagar Oct 28, 2021
8e0e70d
More granular column selection in ORC reader (#9496)
vuule Oct 28, 2021
a40df90
Deprecate DataFrame.label_encoding, use private _label_encoding metho…
bdice Oct 29, 2021
e1a05f4
Add groupby scan min/max support for strings values (#9502)
davidwendt Oct 29, 2021
f41e05f
Document invalid regex patterns as undefined behavior (#9473)
davidwendt Oct 29, 2021
201f750
Implement Series.datetime.floor (#9488)
skirui-source Oct 29, 2021
c12b691
Accelerate conditional inner joins with larger right tables (#9523)
vyasr Oct 29, 2021
5cbcf49
Enable linting of CMake files using pre-commit (#9484)
vyasr Oct 29, 2021
fe6c93c
remove alignment options for RMM jni (#9550)
rongou Oct 29, 2021
8aeea8f
Revert "Implement Series.datetime.floor (#9488)" (#9560)
galipremsagar Oct 29, 2021
77c6f1d
Add list output option to character_ngrams() function (#9499)
davidwendt Oct 30, 2021
67019f7
Add axis parameter passthrough to `DataFrame` and `Series` take for p…
dantegd Nov 1, 2021
59f2bdd
Remove dependency on six. (#9495)
bdice Nov 1, 2021
ca347ff
Increase max RLE stream size estimate to avoid potential overflows (#…
vuule Nov 1, 2021
83f605c
compile libnvcomp with PTDS if requested (#9540)
jbrennan333 Nov 1, 2021
237b0ce
Move libcudacxx to use `rapids_cpm` and use newer versions (#9539)
robertmaynard Nov 1, 2021
d073ecb
Add support for single-line regex anchors ^/$ in contains_re (#9482)
davidwendt Nov 1, 2021
e632f97
Update docstring of `DataFrame.merge` (#9572)
galipremsagar Nov 1, 2021
0e76035
Refactor hash join with cuCollections multimap (#8934)
PointKernel Nov 2, 2021
d674c55
Add librdkafka and python-confluent-kafka to dev conda environments s…
jdye64 Nov 2, 2021
32db39e
Build CUDA version agnostic packages for dask-cudf (#9578)
jjacobelli Nov 2, 2021
8a5adad
Support re.Pattern object for pat arg in str.replace (#9573)
davidwendt Nov 2, 2021
1c10790
Generalize comparison binary operations (#9542)
vyasr Nov 2, 2021
2ecebe1
Refactor sorting APIs (#9464)
vyasr Nov 2, 2021
cfcf90f
Fix test failure with cuda 11.5 in row_bit_count tests. (#9581)
nvdbaranec Nov 2, 2021
1a2c347
Add scan min/max support for chrono types to libcudf reduction-scan (…
davidwendt Nov 3, 2021
395b190
Adds cudaProfilerStart/cudaProfilerStop in JNI api (#9543)
abellina Nov 3, 2021
0674316
Enable CMake format in CI and fix style (#9570)
vyasr Nov 3, 2021
4dd8293
Set RMM pool to a fixed size in JNI (#9583)
rongou Nov 3, 2021
7e11268
Add offsets_begin/end() to strings_column_view (#9559)
davidwendt Nov 3, 2021
275f5fc
Fix edge case in tdigest scalar generation for groups containing all …
nvdbaranec Nov 3, 2021
9d375f5
Correct _LIBCUDACXX_CUDACC_VER value computation (#9579)
robertmaynard Nov 4, 2021
f041a47
Support `args=` in `apply` (#9514)
brandon-b-miller Nov 4, 2021
2ef82f2
Fix `segmented_gather()` for null LIST rows (#9537)
mythrocks Nov 4, 2021
713a3b2
remove deprecated Rmm.initialize method (#9607)
rongou Nov 4, 2021
9fa6ec0
Add `xfail` for parquet reader `11.5` issue (#9612)
galipremsagar Nov 5, 2021
5427c22
Use HostColumnVectorCore for child columns in JCudfSerialization.unpa…
sperlingxx Nov 5, 2021
59460af
Add `11.5` dev.yml to `cudf` (#9617)
galipremsagar Nov 5, 2021
893fae6
Add NVTX Start/End Ranges to JNI (#9563)
abellina Nov 5, 2021
52f2619
Add handling of mixed numeric types in `to_dlpack` (#9585)
galipremsagar Nov 5, 2021
4cba672
Fix `usecols` parameter handling in `dask_cudf.read_csv` (#9618)
galipremsagar Nov 5, 2021
7a6c728
Add scan sum support for duration types to libcudf (#9536)
davidwendt Nov 8, 2021
eda31b6
Avoid passing NativeFileDatasource to pyarrow in read_parquet (#9608)
rjzamora Nov 8, 2021
8f16d4c
Add `calendrical_month_sequence` in c++ and `date_range` in python (#…
shwina Nov 8, 2021
a7a8250
Add dedicated page for `StringHandling` in python docs (#9624)
galipremsagar Nov 8, 2021
9a60296
Add support for string `'nan', 'inf' & '-inf'` values while type-cast…
galipremsagar Nov 8, 2021
34832f6
Enable Series.divide and DataFrame.divide (#9630)
vyasr Nov 8, 2021
7666d4a
Use nvCOMP for Snappy compression/decompression (#9582)
vuule Nov 8, 2021
281fed9
Use std::size_t when computing join output size (#9626)
jlowe Nov 8, 2021
3280be2
Update cuCollections to version that supports installed libcudacxx (#…
robertmaynard Nov 9, 2021
a7d520c
Make sure all dask-cudf supported aggs are handled in `_tree_node_agg…
charlesbluca Nov 9, 2021
499ebae
Add format API for list column of strings (#9454)
davidwendt Nov 9, 2021
b47ff24
Fail gracefully when compiling python UDFs that attempt to access col…
brandon-b-miller Nov 9, 2021
d60e2e6
add const when getting data from a JNI data wrapper (#9637)
wjxiz1992 Nov 10, 2021
f301b28
Fixed tests warning: "TYPED_TEST_CASE is deprecated, please use TYPED…
ttnghia Nov 10, 2021
3a97ac1
Support structs column in `min`, `max`, `argmin` and `argmax` groupby…
ttnghia Nov 10, 2021
4f7173e
Update Documentation to use `TYPED_TEST_SUITE` (#9654)
codereport Nov 10, 2021
9e3a89d
Fix behavior of equals for non-DataFrame Frames and add tests. (#9653)
vyasr Nov 10, 2021
3ca7c96
Dont recompute output size if it is already available (#9649)
abellina Nov 10, 2021
77dc477
Simplify read_json by removing unnecessary reader/impl classes (#9088)
cwharris Nov 11, 2021
544643c
Reimplement `lists::drop_list_duplicates` for keys-values lists colum…
ttnghia Nov 11, 2021
9f9a377
Miscellaneous improvements for UDFs (#9422)
isVoid Nov 11, 2021
1e4afd1
Fix read_parquet bug for extended dtypes from remote storage (#9638)
rjzamora Nov 11, 2021
507a3c5
Fix debrotli issue on CUDA 11.5 (#9632)
vuule Nov 11, 2021
98f5a28
Update `conda` recipes for Enhanced Compatibility effort (#9456)
ajschmidt8 Nov 11, 2021
5402787
Updating cudf version also updates rapids cmake branch (#9249)
robertmaynard Nov 11, 2021
ba2b51d
Update `bitmask_and` and `bitmask_or` to return a pair of resulting m…
PointKernel Nov 11, 2021
36b3344
Followup to PR 9088 comments (#9659)
cwharris Nov 11, 2021
10cbbd7
Add JNI for `lists::drop_list_duplicates` with keys-values input colu…
ttnghia Nov 12, 2021
79b4f54
Remove sizeof and standardize on memory_usage (#9544)
vyasr Nov 12, 2021
84e5a03
Force inlining to improve AST performance (#9530)
vyasr Nov 12, 2021
157a8c3
Fix read_parquet bug for bytes input (#9669)
rjzamora Nov 12, 2021
47af69a
Use `_gather` internal for `sort_*` (#9668)
isVoid Nov 12, 2021
49d1cc2
Grouping by frequency and resampling (#9178)
shwina Nov 13, 2021
dd0f8db
Don't overflow on preceding and following computation (#9674)
revans2 Nov 15, 2021
10d80b1
Correct `to_cupy`/`to_numpy` order (#9688)
dantegd Nov 16, 2021
22a2ac9
Apply Type Metadata to Column Children (#9391)
isVoid Nov 16, 2021
d317147
Fix multiindex memory_usage (#9693)
ChrisJar Nov 16, 2021
0da63f0
Update copyright text on resample.py (#9701)
shwina Nov 16, 2021
c1f20c7
Add support for `decimal128` (#9483)
codereport Nov 16, 2021
9aefbc2
Java Support for Decimal 128 (#9485)
revans2 Nov 17, 2021
d623c93
change minimum pin of cupy (#9636)
galipremsagar Nov 18, 2021
012bfe9
[REVIEW] Upgrade `clang` to 11.1.0 (#9716)
galipremsagar Nov 18, 2021
cb894e0
Fix `dask-cudf` recipe for Enhanced Compatibility (#9733)
ajschmidt8 Nov 19, 2021
e05bd4b
update changelog
ajschmidt8 Nov 19, 2021
dbf3c1c
Merge remote-tracking branch 'upstream/branch-21.10' into fix-changelog
ajschmidt8 Nov 19, 2021
0906a09
fix merge issues
ajschmidt8 Nov 19, 2021
ff5d987
Merge pull request #9739 from ajschmidt8/fix-changelog
ajschmidt8 Nov 19, 2021
9fc35b7
Update `DEFAULT_CUDA_VER` in `ci/cpu/prebuild.sh` (#9749)
ajschmidt8 Nov 22, 2021
8e2ac44
[REVIEW] Pin max `dask` & `distributed` versions (#9734)
galipremsagar Nov 22, 2021
8d9d222
[FIX] Add `arrow_dataset` and `parquet` targets to build exports (#9491)
trxcllnt Nov 29, 2021
a1ca8c1
Use ptxcompiler to patch Numba at runtime to support CUDA enhanced co…
shwina Nov 29, 2021
0ebeffa
Only run runtime jit tests with CUDA 11.5 runtime
robertmaynard Nov 23, 2021
dfcb48d
Fix style issues found by CI
robertmaynard Nov 23, 2021
bbf137e
WIP: disable csv test
robertmaynard Nov 24, 2021
a24d2a8
WIP: disable all io tests
robertmaynard Nov 24, 2021
f614395
remove jit integration tests
karthikeyann Nov 25, 2021
16fcf48
remove jit code which are supported by compiled binops
karthikeyann Nov 25, 2021
1b9d624
remove jit benchmark
karthikeyann Nov 25, 2021
e49a343
skip generic op udf (jit ptx) in pytest CUDA<11.5
karthikeyann Nov 25, 2021
8f64086
add deleted UserDefinedOp
karthikeyann Nov 25, 2021
efb203b
fix missing includes
karthikeyann Nov 25, 2021
011fb48
fix segfault by nullptr check in cufile_shim dtor
karthikeyann Nov 29, 2021
9bdc28b
enable cuio tests again
karthikeyann Nov 29, 2021
a3ba687
addres review comments
karthikeyann Nov 29, 2021
2c6f919
Merge pull request #9794 from robertmaynard/Remove_jit_code_of_binary…
jjacobelli Nov 30, 2021
74ac6ed
fix make_empty_scalar_like (#9782)
sperlingxx Nov 30, 2021
69e6dbb
Move the binary_ops common dispatcher logic to be executed on the CPU…
robertmaynard Dec 3, 2021
a93d333
Fix ORC writer crash with empty input columns (#9831)
vuule Dec 3, 2021
55 changes: 37 additions & 18 deletions .pre-commit-config.yaml
@@ -1,5 +1,5 @@
repos:
- - repo: https://github.com/pycqa/isort
+ - repo: https://github.com/PyCQA/isort
rev: 5.6.4
hooks:
- id: isort
@@ -27,12 +27,12 @@ repos:
name: isort-dask-cudf
args: ["--settings-path=python/dask_cudf/setup.cfg"]
files: python/dask_cudf/.*
- - repo: https://github.com/ambv/black
+ - repo: https://github.com/psf/black
rev: 19.10b0
hooks:
- id: black
files: python/.*
- - repo: https://gitlab.com/pycqa/flake8
+ - repo: https://github.com/PyCQA/flake8
rev: 3.8.3
hooks:
- id: flake8
@@ -45,30 +45,49 @@ repos:
name: flake8-cython
args: ["--config=python/.flake8.cython"]
types: [cython]
+ - repo: https://github.com/pre-commit/mirrors-mypy
+ rev: 'v0.782'
+ hooks:
+ - id: mypy
+ args: ["--config-file=python/cudf/setup.cfg", "python/cudf/cudf"]
+ pass_filenames: false
+ - repo: https://github.com/PyCQA/pydocstyle
+ rev: 6.1.1
+ hooks:
+ - id: pydocstyle
+ args: ["--config=python/.flake8"]
- repo: local
hooks:
- id: clang-format
+ # Using the pre-commit stage to simplify invocation of all
+ # other hooks simultaneously (via any other hook stage). This
+ # can be removed if we also move to running clang-format
+ # entirely through pre-commit.
+ stages: [commit]
name: clang-format
description: Format files with ClangFormat.
entry: clang-format -i
language: system
files: \.(cu|cuh|h|hpp|cpp|inl)$
args: ['-fallback-style=none']
- - repo: local
- hooks:
- - id: mypy
- name: mypy
- description: mypy
- pass_filenames: false
- entry: mypy --config-file=python/cudf/setup.cfg python/cudf/cudf
- language: system
- types: [python]
- - repo: https://github.com/pycqa/pydocstyle
- rev: 6.0.0
- hooks:
- - id: pydocstyle
- args: ["--config=python/.flake8"]
-
+ - id: cmake-format
+ name: cmake-format
+ entry: bash cpp/scripts/run-cmake-format.sh cmake-format
+ language: python
+ types: [cmake]
+ # Note that pre-commit autoupdate does not update the versions
+ # of dependencies, so we'll have to update this manually.
+ additional_dependencies:
+ - cmake-format==0.6.11
+ - id: cmake-lint
+ name: cmake-lint
+ entry: bash cpp/scripts/run-cmake-format.sh cmake-lint
+ language: python
+ types: [cmake]
+ # Note that pre-commit autoupdate does not update the versions
+ # of dependencies, so we'll have to update this manually.
+ additional_dependencies:
+ - cmake-format==0.6.11

default_language_version:
python: python3
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
+ # cuDF 21.12.00 (Date TBD)
+
+ Please see https://github.com/rapidsai/cudf/releases/tag/v21.12.00a for the latest changes to this development branch.
+
# cuDF 21.10.00 (Date TBD)

Please see https://github.com/rapidsai/cudf/releases/tag/v21.10.00a for the latest changes to this development branch.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -62,12 +62,12 @@ The following instructions are for developers and contributors to cuDF OSS devel
Compilers:

* `gcc` version 9.3+
- * `nvcc` version 11.0+
+ * `nvcc` version 11.5+
* `cmake` version 3.20.1+

CUDA/GPU:

- * CUDA 11.0+
+ * CUDA 11.5+
* NVIDIA driver 450.80.02+
* Pascal architecture or better

2 changes: 1 addition & 1 deletion build.sh
@@ -18,7 +18,7 @@ ARGS=$*
REPODIR=$(cd $(dirname $0); pwd)

VALIDARGS="clean libcudf cudf dask_cudf benchmarks tests libcudf_kafka cudf_kafka custreamz -v -g -n -l --allgpuarch --disable_nvtx --show_depr_warn --ptds -h"
- HELP="$0 [clean] [libcudf] [cudf] [dask_cudf] [benchmarks] [tests] [libcudf_kafka] [cudf_kafka] [custreamz] [-v] [-g] [-n] [-h] [-l] [--cmake-args=\"<args>\"]
+ HELP="$0 [clean] [libcudf] [cudf] [dask_cudf] [benchmarks] [tests] [libcudf_kafka] [cudf_kafka] [custreamz] [-v] [-g] [-n] [-h] [-l] [--cmake-args=\\\"<args>\\\"]
clean - remove all existing build artifacts and configuration (start
over)
libcudf - build the cudf C++ code only
2 changes: 1 addition & 1 deletion ci/benchmark/build.sh
@@ -37,7 +37,7 @@ export GBENCH_BENCHMARKS_DIR="$WORKSPACE/cpp/build/gbenchmarks/"
export LIBCUDF_KERNEL_CACHE_PATH="$HOME/.jitify-cache"

# Dask & Distributed git tag
- export DASK_DISTRIBUTED_GIT_TAG='2021.09.1'
+ export DASK_DISTRIBUTED_GIT_TAG='main'

function remove_libcudf_kernel_cache_dir {
EXITCODE=$?
117 changes: 7 additions & 110 deletions ci/checks/style.sh
@@ -14,119 +14,18 @@ LANG=C.UTF-8
. /opt/conda/etc/profile.d/conda.sh
conda activate rapids

- # Run isort-cudf and get results/return code
- ISORT_CUDF=`isort python/cudf --check-only --settings-path=python/cudf/setup.cfg 2>&1`
- ISORT_CUDF_RETVAL=$?
+ FORMAT_FILE_URL=https://raw.githubusercontent.com/rapidsai/rapids-cmake/main/cmake-format-rapids-cmake.json
+ export RAPIDS_CMAKE_FORMAT_FILE=/tmp/rapids_cmake_ci/cmake-formats-rapids-cmake.json
+ mkdir -p $(dirname ${RAPIDS_CMAKE_FORMAT_FILE})
+ wget -O ${RAPIDS_CMAKE_FORMAT_FILE} ${FORMAT_FILE_URL}

- # Run isort-cudf-kafka and get results/return code
- ISORT_CUDF_KAFKA=`isort python/cudf_kafka --check-only --settings-path=python/cudf_kafka/setup.cfg 2>&1`
- ISORT_CUDF_KAFKA_RETVAL=$?
-
- # Run isort-custreamz and get results/return code
- ISORT_CUSTREAMZ=`isort python/custreamz --check-only --settings-path=python/custreamz/setup.cfg 2>&1`
- ISORT_CUSTREAMZ_RETVAL=$?
-
- # Run isort-dask-cudf and get results/return code
- ISORT_DASK_CUDF=`isort python/dask_cudf --check-only --settings-path=python/dask_cudf/setup.cfg 2>&1`
- ISORT_DASK_CUDF_RETVAL=$?
-
- # Run black and get results/return code
- BLACK=`black --check python 2>&1`
- BLACK_RETVAL=$?
-
- # Run flake8 and get results/return code
- FLAKE=`flake8 --config=python/.flake8 python 2>&1`
- FLAKE_RETVAL=$?
-
- # Run flake8-cython and get results/return code
- FLAKE_CYTHON=`flake8 --config=python/.flake8.cython 2>&1`
- FLAKE_CYTHON_RETVAL=$?
-
- # Run mypy and get results/return code
- MYPY_CUDF=`mypy --config=python/cudf/setup.cfg python/cudf/cudf 2>&1`
- MYPY_CUDF_RETVAL=$?
-
- # Run pydocstyle and get results/return code
- PYDOCSTYLE=`pydocstyle --config=python/.flake8 python 2>&1`
- PYDOCSTYLE_RETVAL=$?
+ pre-commit run --hook-stage manual --all-files
+ PRE_COMMIT_RETVAL=$?

# Run clang-format and check for a consistent code format
CLANG_FORMAT=`python cpp/scripts/run-clang-format.py 2>&1`
CLANG_FORMAT_RETVAL=$?

# Output results if failure otherwise show pass
- if [ "$ISORT_CUDF_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: isort-cudf style check; begin output\n\n"
- echo -e "$ISORT_CUDF"
- echo -e "\n\n>>>> FAILED: isort-cudf style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: isort-cudf style check\n\n"
- fi
-
- if [ "$ISORT_CUDF_KAFKA_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: isort-cudf-kafka style check; begin output\n\n"
- echo -e "$ISORT_CUDF_KAFKA"
- echo -e "\n\n>>>> FAILED: isort-cudf-kafka style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: isort-cudf-kafka style check\n\n"
- fi
-
- if [ "$ISORT_CUSTREAMZ_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: isort-custreamz style check; begin output\n\n"
- echo -e "$ISORT_CUSTREAMZ"
- echo -e "\n\n>>>> FAILED: isort-custreamz style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: isort-custreamz style check\n\n"
- fi
-
- if [ "$ISORT_DASK_CUDF_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: isort-dask-cudf style check; begin output\n\n"
- echo -e "$ISORT_DASK_CUDF"
- echo -e "\n\n>>>> FAILED: isort-dask-cudf style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: isort-dask-cudf style check\n\n"
- fi
-
- if [ "$BLACK_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: black style check; begin output\n\n"
- echo -e "$BLACK"
- echo -e "\n\n>>>> FAILED: black style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: black style check\n\n"
- fi
-
- if [ "$FLAKE_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: flake8 style check; begin output\n\n"
- echo -e "$FLAKE"
- echo -e "\n\n>>>> FAILED: flake8 style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: flake8 style check\n\n"
- fi
-
- if [ "$FLAKE_CYTHON_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: flake8-cython style check; begin output\n\n"
- echo -e "$FLAKE_CYTHON"
- echo -e "\n\n>>>> FAILED: flake8-cython style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: flake8-cython style check\n\n"
- fi
-
- if [ "$MYPY_CUDF_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: mypy style check; begin output\n\n"
- echo -e "$MYPY_CUDF"
- echo -e "\n\n>>>> FAILED: mypy style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: mypy style check\n\n"
- fi
-
- if [ "$PYDOCSTYLE_RETVAL" != "0" ]; then
- echo -e "\n\n>>>> FAILED: pydocstyle style check; begin output\n\n"
- echo -e "$PYDOCSTYLE"
- echo -e "\n\n>>>> FAILED: pydocstyle style check; end output\n\n"
- else
- echo -e "\n\n>>>> PASSED: pydocstyle style check\n\n"
- fi
-
if [ "$CLANG_FORMAT_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: clang format check; begin output\n\n"
echo -e "$CLANG_FORMAT"
@@ -141,9 +40,7 @@ HEADER_META_RETVAL=$?
echo -e "$HEADER_META"

RETVALS=(
- $ISORT_CUDF_RETVAL $ISORT_CUDF_KAFKA_RETVAL $ISORT_CUSTREAMZ_RETVAL $ISORT_DASK_CUDF_RETVAL
- $BLACK_RETVAL $FLAKE_RETVAL $FLAKE_CYTHON_RETVAL $PYDOCSTYLE_RETVAL $CLANG_FORMAT_RETVAL
- $HEADER_META_RETVAL $MYPY_CUDF_RETVAL
+ $PRE_COMMIT_RETVAL $CLANG_FORMAT_RETVAL
)
IFS=$'\n'
RETVAL=`echo "${RETVALS[*]}" | sort -nr | head -n1`
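
The `RETVALS` lines in this hunk collapse several exit codes into one by taking the largest, so the script reports failure if any single check failed. A minimal sketch of that idiom follows, assuming made-up return codes; the `max_retval` helper is hypothetical and is not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): print the largest of the given exit codes.
max_retval() {
  # With IFS set to a newline, "$*" joins the arguments one per line.
  local IFS=$'\n'
  # Sort numerically in descending order and keep the first (largest) value.
  echo "$*" | sort -nr | head -n1
}

# Illustrative values only: one check passed (0), two failed (2 and 1).
RETVALS=(0 2 1)
RETVAL=$(max_retval "${RETVALS[@]}")
echo "overall exit code: $RETVAL"
```

Because a passing check returns 0, the maximum is non-zero exactly when at least one check failed, which is why replacing the many per-tool codes with `$PRE_COMMIT_RETVAL` does not change how the final status is computed.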
4 changes: 2 additions & 2 deletions ci/gpu/build.sh
@@ -31,7 +31,7 @@ export GIT_DESCRIBE_TAG=`git describe --tags`
export MINOR_VERSION=`echo $GIT_DESCRIBE_TAG | grep -o -E '([0-9]+\.[0-9]+)'`

# Dask & Distributed git tag
- export DASK_DISTRIBUTED_GIT_TAG='2021.09.1'
+ export DASK_DISTRIBUTED_GIT_TAG='main'

################################################################################
# TRAP - Setup trap for removing jitify cache
@@ -83,7 +83,7 @@ gpuci_mamba_retry install -y \
"rapids-notebook-env=$MINOR_VERSION.*" \
"dask-cuda=${MINOR_VERSION}" \
"rmm=$MINOR_VERSION.*" \
- "ucx-py=0.22.*"
+ "ucx-py=0.23.*"

# https://docs.rapids.ai/maintainers/depmgmt/
# gpuci_mamba_retry remove --force rapids-build-env rapids-notebook-env
2 changes: 1 addition & 1 deletion ci/gpu/java.sh
@@ -80,7 +80,7 @@ gpuci_conda_retry install -y \
"rapids-notebook-env=$MINOR_VERSION.*" \
"dask-cuda=${MINOR_VERSION}" \
"rmm=$MINOR_VERSION.*" \
- "ucx-py=0.22.*" \
+ "ucx-py=0.23.*" \
"openjdk=8.*" \
"maven"

3 changes: 3 additions & 0 deletions ci/release/update-version.sh
@@ -38,6 +38,9 @@ sed_runner 's/'"CUDA_KAFKA VERSION .* LANGUAGES"'/'"CUDA_KAFKA VERSION ${NEXT_FU
# cpp cudf_jni update
sed_runner 's/'"CUDF_JNI VERSION .* LANGUAGES"'/'"CUDF_JNI VERSION ${NEXT_FULL_TAG} LANGUAGES"'/g' java/src/main/native/CMakeLists.txt

+ # rapids-cmake version
+ sed_runner 's/'"branch-.*\/RAPIDS.cmake"'/'"branch-${NEXT_SHORT_TAG}\/RAPIDS.cmake"'/g' fetch_rapids.cmake
+
# doxyfile update
sed_runner 's/PROJECT_NUMBER = .*/PROJECT_NUMBER = '${NEXT_FULL_TAG}'/g' cpp/doxygen/Doxyfile
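
The new `rapids-cmake version` substitution rewrites the branch segment of a path ending in `RAPIDS.cmake`. The sketch below applies the same expression with plain `sed` on standard input, since `sed_runner` is defined elsewhere in the script and is not shown in this diff. The `bump_rapids_cmake_branch` function, the example URL, and the `22.02` tag are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not in the PR): apply the PR's substitution to stdin.
bump_rapids_cmake_branch() {
  local next_short_tag="$1"
  # Same expression as the diff: replace "branch-<anything>/RAPIDS.cmake"
  # with "branch-<next tag>/RAPIDS.cmake".
  sed 's/'"branch-.*\/RAPIDS.cmake"'/'"branch-${next_short_tag}\/RAPIDS.cmake"'/g'
}

# Illustrative input line; the real contents of fetch_rapids.cmake are not shown in the diff.
echo 'https://example.invalid/rapids-cmake/branch-21.12/RAPIDS.cmake' \
  | bump_rapids_cmake_branch 22.02
# prints: https://example.invalid/rapids-cmake/branch-22.02/RAPIDS.cmake
```

The `.*` is greedy, so the pattern relies on `branch-` appearing only once before `/RAPIDS.cmake` on a given line.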

22 changes: 13 additions & 9 deletions conda/environments/cudf_dev_cuda11.0.yml
@@ -7,10 +7,10 @@ channels:
- rapidsai-nightly
- conda-forge
dependencies:
- - clang=11.0.0
- - clang-tools=11.0.0
- - cupy>7.1.0,<10.0.0a0
- - rmm=21.10.*
+ - clang=11.1.0
+ - clang-tools=11.1.0
+ - cupy>=9.5.0,<10.0.0a0
+ - rmm=21.12.*
- cmake>=3.20.1
- cmake_setuptools>=0.1.3
- python>=3.7,<3.9
@@ -19,6 +19,7 @@ dependencies:
- pandas>=1.0,<1.4.0dev0
- pyarrow=5.0.0=*cuda
- fastavro>=0.22.9
+ - python-snappy>=0.6.0
- notebook>=0.5.0
- cython>=0.29,<0.30
- fsspec>=0.6.0
@@ -37,10 +38,11 @@ dependencies:
- black=19.10
- isort=5.6.4
- mypy=0.782
+ - pydocstyle=6.1.1
- typing_extensions
- - pre_commit
- - dask=2021.09.1
- - distributed=2021.09.1
+ - pre-commit
+ - dask>=2021.09.1
+ - distributed>=2021.09.1
- streamz
- arrow-cpp=5.0.0
- dlpack>=0.5,<0.6.0a0
@@ -57,8 +59,10 @@ dependencies:
- cachetools
- transformers<=4.10.3
- pydata-sphinx-theme
+ - librdkafka=1.7.0
+ - python-confluent-kafka=1.7.0
- pip:
- - git+https://github.com/dask/dask.git@2021.09.1
- - git+https://github.com/dask/distributed.git@2021.09.1
+ - git+https://github.com/dask/dask.git@main
+ - git+https://github.com/dask/distributed.git@main
- git+https://github.com/python-streamz/streamz.git@master
- pyorc