[RELEASE] cudf v0.13 #4714

Merged
merged 4,223 commits into from
Mar 31, 2020
bc1acf1
Use accumulate to compute extent.
jrhemstad Mar 17, 2020
9a38419
Fix out of bounds error
devavret Mar 17, 2020
55dcb32
Merge branch 'fea-argmin-sort-groupby' of https://github.com/devavret…
devavret Mar 17, 2020
5590777
Revert "Remove RMM init/finalize from cudf test fixture."
jrhemstad Mar 17, 2020
1410b60
Merge pull request #4537 from jakirkham/use_elif_deserialize
jakirkham Mar 17, 2020
2fd1d9b
Use void* instead of ptrdiff_t.
jrhemstad Mar 17, 2020
1f68ea1
Update columns without children to not need a deleter.
jrhemstad Mar 17, 2020
2850c3d
Update child extent calculation to use accumulate.
jrhemstad Mar 17, 2020
9e58610
Update child device storage to use a device buffer.
jrhemstad Mar 17, 2020
7480e2a
Use deleter that deletes the device_buffer.
jrhemstad Mar 17, 2020
eee6339
Remove uneeded stream sync.
jrhemstad Mar 17, 2020
806db49
Grouped Rolling Window: Resolve merge conflict in rolling.cu
mythrocks Mar 17, 2020
7046517
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 17, 2020
5e7b079
fix bad imports
kkraus14 Mar 17, 2020
8bbc9f7
Merge pull request #4456 from devavret/fea-argmin-sort-groupby
kkraus14 Mar 17, 2020
63496fe
Disable compile-errors on deprecation warnings, for JNI
mythrocks Mar 17, 2020
57b278e
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into p…
shwina Mar 17, 2020
299f1ca
Updated CHANGELOG.md for #4557
mythrocks Mar 17, 2020
812788f
Merge pull request #4549 from jrhemstad/option-disable-deprecation-wa…
harrism Mar 17, 2020
3482431
Merge pull request #4557 from mythrocks/deprecation-error-disable-bra…
kkraus14 Mar 18, 2020
bd58def
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 18, 2020
2122272
Fix serializable classes list
shwina Mar 18, 2020
11903d2
Merge pull request #4235 from millerhooks/fea-cython-nvtx
kkraus14 Mar 18, 2020
6f6b793
Merge branch 'branch-0.13' of github.com:rapidsai/cudf into fea-optim…
trxcllnt Mar 18, 2020
e44f913
Merge branch 'branch-0.13' into port-groupby-libxx-python
trxcllnt Mar 18, 2020
3d278ba
remove conditional from fused_concatenate_kernel
trxcllnt Mar 18, 2020
fb0f594
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into p…
shwina Mar 18, 2020
94793a6
Merge branch 'port-groupby-libxx-python' of https://github.com/shwina…
shwina Mar 18, 2020
5094c6f
Remove duplicate arg in docstring
shwina Mar 18, 2020
a5bcacf
Remove extraneous #includes
shwina Mar 18, 2020
4f84872
remove copies to host in generate_pandas_metadata
galipremsagar Mar 18, 2020
974d835
Update CHANGELOG.md
galipremsagar Mar 18, 2020
7049215
Optimize `__reduce__` in `StringColumn`
jakirkham Mar 18, 2020
b2598d5
INT32 to data_type size_type
karthikeyann Mar 18, 2020
cf38a6d
Merge branch 'branch-0.13' into bug-slice-step-1
davidwendt Mar 18, 2020
30bc08f
Merge pull request #4346 from shwina/port-groupby-libxx-python
kkraus14 Mar 18, 2020
029e58a
Merge branch 'branch-0.13' into bug-slice-step-1
davidwendt Mar 18, 2020
f182668
Load JNI native dependencies for Scalar class
jlowe Mar 18, 2020
56ed1a7
Remove unneeded temp pointers.
jrhemstad Mar 18, 2020
3de8948
changelog
jlowe Mar 18, 2020
713505d
Remove extraneous setting of child members.
jrhemstad Mar 18, 2020
730858c
Rename descendant storage buffers.
jrhemstad Mar 18, 2020
d75fc0c
Rename storage buffers.
jrhemstad Mar 18, 2020
7794bd3
Use size_t for extent.
jrhemstad Mar 18, 2020
af87d81
Add the sync back w/ documentation.
jrhemstad Mar 18, 2020
11840d9
Merge pull request #4571 from jlowe/load-scalar-deps
kkraus14 Mar 18, 2020
24632d3
Merge pull request #4538 from davidwendt/bug-slice-step-1
kkraus14 Mar 18, 2020
1bfe231
add categorical validation
galipremsagar Mar 18, 2020
a374146
Merge branch 'parquet_fix' of https://github.com/galipremsagar/cudf i…
galipremsagar Mar 18, 2020
5cd8999
Merge remote-tracking branch 'upstream/branch-0.13' into new_branch
galipremsagar Mar 18, 2020
3ed3a67
Merge branch 'branch-0.13' into new_branch
galipremsagar Mar 18, 2020
9761235
remove special handling in tests
galipremsagar Mar 18, 2020
8f35314
Merge branch 'new_branch' of https://github.com/galipremsagar/cudf in…
galipremsagar Mar 18, 2020
cfb401e
Add Cython declarations for datetime.hpp
shwina Mar 18, 2020
03b9864
Fix typo in serialize.py
shwina Mar 18, 2020
6e6b149
Changelog
shwina Mar 18, 2020
04d9137
Merge branch 'branch-0.13' into fix-groupby-serialize-typo
shwina Mar 18, 2020
f1cf7ab
include thrust/advance.h
trxcllnt Mar 18, 2020
9c75f30
Merge branch 'branch-0.13' of github.com:rapidsai/cudf into port/cyth…
trxcllnt Mar 18, 2020
6b3db1a
update Frame._concat nvtx range calls
trxcllnt Mar 18, 2020
585ece4
Merge branch 'fix-groupby-serialize-typo' of github.com:shwina/cudf i…
trxcllnt Mar 18, 2020
0abd8a2
Merge branch 'fea-optimize-concatenate-for-many-cols' of github.com:t…
trxcllnt Mar 18, 2020
66ac48e
Merge pull request #4563 from galipremsagar/parquet_fix
kkraus14 Mar 18, 2020
f14729e
Merge branch 'branch-0.13' into new_branch
kkraus14 Mar 18, 2020
6e7d3b8
Merge pull request #4576 from shwina/fix-groupby-serialize-typo
kkraus14 Mar 18, 2020
9e03ff4
Revert "Add the sync back w/ documentation."
jrhemstad Mar 18, 2020
975224f
remove thrust::prev since it's not in cuda 10.0
trxcllnt Mar 18, 2020
5edf856
Merge branch 'fea-optimize-concatenate-for-many-cols' of github.com:t…
trxcllnt Mar 18, 2020
c5a9307
Merge branch 'branch-0.13' of github.com:rapidsai/cudf into port/cyth…
trxcllnt Mar 18, 2020
f96db08
Merge pull request #4567 from jakirkham/opt_strcol_pickle
kkraus14 Mar 18, 2020
a417d0a
Add sync.
jrhemstad Mar 18, 2020
ba51799
Merge remote-tracking branch 'upstream/branch-0.13' into remove-rmm-a…
jrhemstad Mar 18, 2020
0a5ea3e
Remove uneccessary type def for deleter type.
jrhemstad Mar 18, 2020
b0ca83c
changelog.
jrhemstad Mar 18, 2020
dec4a03
add a temporary column to hold order and sort based on it post-join
galipremsagar Mar 18, 2020
17b1e99
add tests related to Multi-index join order fix
galipremsagar Mar 18, 2020
ba95ac0
Update CHANGELOG.md
galipremsagar Mar 18, 2020
0804085
Add Cython bindings for datetime functions and use in Python
shwina Mar 18, 2020
e1f4a44
Merge pull request #4516 from galipremsagar/new_branch
kkraus14 Mar 18, 2020
945d512
Drop duplicate registration of `Series`
jakirkham Mar 18, 2020
08f2099
Register `Index` and `MultiIndex`
jakirkham Mar 18, 2020
1d1d0d7
Note Dask serialization registration of `*Index`
jakirkham Mar 18, 2020
b4015a2
Fix
Mar 18, 2020
6abeb93
Register `CategoricalDtype` for Dask serialization
jakirkham Mar 18, 2020
a67b9e0
fix Frame._concat libcudf++ nvtx cython calls
trxcllnt Mar 18, 2020
7529758
Note a few types registered for Dask serialization
jakirkham Mar 18, 2020
709793a
Add `_Grouping` to Dask serializable objects
jakirkham Mar 18, 2020
37f5611
Register `ColumnBase` for serialization
jakirkham Mar 18, 2020
af5aa9f
Merge pull request #4224 from trevorsm7/fea-optimize-concatenate-for-…
kkraus14 Mar 18, 2020
f24d6bd
Grouped Rolling Window: Switched to upper_bound/lower_bound for windo…
mythrocks Mar 18, 2020
4bb6068
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 18, 2020
6426215
Merge branch 'branch-0.13' of github.com:rapidsai/cudf into port/cyth…
trxcllnt Mar 18, 2020
d26d136
move concatenate.hpp import to the bottom
trxcllnt Mar 18, 2020
5f164ae
Fix exec dangling pointer issue in legacy groupby
devavret Mar 19, 2020
b3ea342
CHANGELOG
devavret Mar 19, 2020
65058d9
code changes
rgsl888prabhu Mar 19, 2020
a1b87fe
CHANGELOG.md
rgsl888prabhu Mar 19, 2020
d409446
fix pd.DataFrame type input
galipremsagar Mar 19, 2020
caa6840
Merge branch 'branch-0.13' into multiindex_slice_issue_with_popn
kkraus14 Mar 19, 2020
0c14dbc
add tests
galipremsagar Mar 19, 2020
6c359cd
Update CHANGELOG.md
galipremsagar Mar 19, 2020
fd738af
Merge pull request #4590 from jakirkham/reg_index_dask_serialize
kkraus14 Mar 19, 2020
3274143
push down pandas check below rapids check
galipremsagar Mar 19, 2020
a2ebc45
Merge branch 'df_fix' of https://github.com/galipremsagar/cudf into d…
galipremsagar Mar 19, 2020
49f80f6
Merge pull request #4344 from trxcllnt/port/cython-libxx-concat
kkraus14 Mar 19, 2020
dad350d
Merge remote-tracking branch 'upstream/branch-0.13' into multi_index
galipremsagar Mar 19, 2020
eb2dd7d
Merge pull request #4594 from devavret/bug-exec-legacy-sort-helper
harrism Mar 19, 2020
fc2601e
Merge pull request #4596 from rgsl888prabhu/multiindex_slice_issue_wi…
kkraus14 Mar 19, 2020
96c9c6a
Apply suggestions from code review: *rank_data with begin
karthikeyann Mar 19, 2020
033b167
Merge branch 'branch-0.13' into df_fix
kkraus14 Mar 19, 2020
5790826
doc update
karthikeyann Mar 19, 2020
66bf832
review updates by jake - mainly iterator usage
karthikeyann Mar 19, 2020
e901cd7
Added missing rmm_api.h include to benchmark_fixture.hpp
harrism Mar 19, 2020
f9910a2
CHANGELOG for #4600
harrism Mar 19, 2020
8b7fa45
Merge remote-tracking branch 'nv/branch-0.13' into bugfix/parquet-row…
Mar 19, 2020
54de42c
Remove legacy unaryops and stream_compaction bindings
shwina Mar 19, 2020
6c22be8
Style/imports
shwina Mar 19, 2020
1d5e5fc
Changelog
shwina Mar 19, 2020
a3e901f
Remove legacy stream_compaction.pxd
shwina Mar 19, 2020
bd18eaa
Unused import
shwina Mar 19, 2020
47f0510
Fix typo of using constant iterator instead of counting.
jrhemstad Mar 19, 2020
5463667
Merge pull request #4600 from harrism/fix-benchmark-fixture-rmm-include
jrhemstad Mar 19, 2020
3a8380b
Make `_get_dt_field` non public in DatetimeProperties
shwina Mar 19, 2020
263f6db
Improving fix, Python tests passing now
Mar 19, 2020
d47c60a
handling pandas mult-index in a to_frame flow
galipremsagar Mar 19, 2020
20b49ae
Merge branch 'multi_index' of https://github.com/galipremsagar/cudf i…
galipremsagar Mar 19, 2020
bb5841d
Merge branch 'branch-0.13' into multi_index
galipremsagar Mar 19, 2020
80221df
Updating Changelog
Mar 19, 2020
ee558b6
Merge remote-tracking branch 'nv/branch-0.13' into bugfix/parquet-row…
Mar 19, 2020
f2c5aea
Merge pull request #4598 from galipremsagar/df_fix
kkraus14 Mar 19, 2020
e69b390
Get rid of legacy typecast bindings
shwina Mar 19, 2020
5af7a90
add source_data arg & use gather instead of join + sort
galipremsagar Mar 19, 2020
769b116
Merge branch 'multi_index' of https://github.com/galipremsagar/cudf i…
galipremsagar Mar 19, 2020
e3c8f80
Add except +
shwina Mar 19, 2020
044f230
Stale import
shwina Mar 19, 2020
efe23a9
Update python/cudf/cudf/core/multiindex.py
galipremsagar Mar 19, 2020
8282bb1
Update python/cudf/cudf/core/multiindex.py
galipremsagar Mar 19, 2020
69dae5e
Update python/cudf/cudf/core/multiindex.py
galipremsagar Mar 19, 2020
af1982e
Grouped Rolling Window: Error when grouping-columns empty. Assert wit…
mythrocks Mar 19, 2020
db2b2a0
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 19, 2020
7a304d6
add names param
galipremsagar Mar 19, 2020
30335fa
Merge pull request #4591 from rommelDB/bugfix/parquet-rowgroups
OlivierNV Mar 19, 2020
21cb70b
Update python/cudf/cudf/core/multiindex.py
galipremsagar Mar 19, 2020
d361427
Grouped Rolling Window: Roll back debug prints in grouped_rolling_test
mythrocks Mar 19, 2020
04278da
Merge pull request #4588 from galipremsagar/multi_index
kkraus14 Mar 19, 2020
45e8fde
code changes
rgsl888prabhu Mar 19, 2020
009bbcb
CHANGELOG.md
rgsl888prabhu Mar 19, 2020
72ba430
Update CHANGELOG.md
rgsl888prabhu Mar 19, 2020
8b763fb
Update indexing.py
rgsl888prabhu Mar 19, 2020
a7944c3
Update CHANGELOG.md
rgsl888prabhu Mar 19, 2020
140b577
Merge branch 'branch-0.13' into direct_slicing_than_calling_row_major
rgsl888prabhu Mar 19, 2020
482e43c
Merge pull request #4602 from shwina/add-datetime-cython
kkraus14 Mar 19, 2020
7de0ca2
Grouped Rolling Window: Added better method documentation
mythrocks Mar 19, 2020
6893a91
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 19, 2020
7ad3eca
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into d…
rgsl888prabhu Mar 20, 2020
a6bf4fb
remove nvstrings usage
galipremsagar Mar 20, 2020
356534e
Update CHANGELOG.md
galipremsagar Mar 20, 2020
9059bf1
Update python/cudf/cudf/core/series.py
galipremsagar Mar 20, 2020
17f707b
Merge pull request #4554 from jrhemstad/remove-rmm-alloc-device-view
harrism Mar 20, 2020
e7aac2c
Merge branch 'branch-0.13' into nvstrings_cleanup
harrism Mar 20, 2020
1b0a48c
Applying suggestions to improve documentation
mythrocks Mar 20, 2020
e55c52d
fix hash function for StringColumn in dask_cudf
galipremsagar Mar 20, 2020
fa3494f
Grouped Rolling Window: Fixing documentation for grouped_time_range_r…
mythrocks Mar 20, 2020
540a719
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 20, 2020
2e080ec
review update, int64 rank bug in pandas, pytest
karthikeyann Mar 20, 2020
fe6f1d7
fix type bug after hash-based repartition
rjzamora Mar 20, 2020
b083f65
Merge pull request #4619 from galipremsagar/nvstrings_cleanup
kkraus14 Mar 20, 2020
de5d3b7
changelog
rjzamora Mar 20, 2020
9d7c32e
remove dangling pointer to RMM exec policy.
jrhemstad Mar 20, 2020
5be6019
Change log.
jrhemstad Mar 20, 2020
3fd8daf
Return empty column if all columns to concat are empty.
jrhemstad Mar 20, 2020
16c36cb
Add tests for empty concat inputs.
jrhemstad Mar 20, 2020
9dc0c8f
Changelog.
jrhemstad Mar 20, 2020
2f136f0
Not a pointer.
jrhemstad Mar 20, 2020
7bb1cae
Merge pull request #4625 from rjzamora/fix-repartition-bug
kkraus14 Mar 20, 2020
a58423d
Merge branch 'branch-0.13' into fix-drop-duplicates-legacy-test
kkraus14 Mar 20, 2020
6ca1226
categorical column handling in column.pyx
rgsl888prabhu Mar 20, 2020
7010740
Merge pull request #4611 from rgsl888prabhu/direct_slicing_than_calli…
kkraus14 Mar 20, 2020
f4aa7de
Merge pull request #4630 from jrhemstad/fix-drop-duplicates-legacy-test
kkraus14 Mar 20, 2020
8a973d2
Merge branch 'branch-0.13' into fix-empty-columns-to-concate
jrhemstad Mar 20, 2020
0e416c2
added _libxx/sort.pxd for rank_method
karthikeyann Mar 20, 2020
2519fb6
percentage=true non-dense tests
karthikeyann Mar 20, 2020
82fc60f
style fix cython
karthikeyann Mar 20, 2020
3268d8d
add replace exp
rnyak Mar 21, 2020
bf12f84
Most vexing parse.
jrhemstad Mar 21, 2020
991e8b5
Merge branch 'fix-empty-columns-to-concate' of github.com:jrhemstad/c…
jrhemstad Mar 21, 2020
1b94068
Merge pull request #4632 from jrhemstad/fix-empty-columns-to-concate
jrhemstad Mar 21, 2020
8f47b32
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into b…
rnyak Mar 23, 2020
916ea1a
update
rnyak Mar 23, 2020
f9f101c
Update CHANGELOG.md
rnyak Mar 23, 2020
f66222f
upgrade numba version requirement
kkraus14 Mar 23, 2020
b51ff31
changelog
kkraus14 Mar 23, 2020
94fa244
fix groupby bad merge
kkraus14 Mar 23, 2020
ec56567
code changes
rgsl888prabhu Mar 23, 2020
b0af062
CHANGELOG.md
rgsl888prabhu Mar 23, 2020
60b0256
Merge remote-tracking branch 'origin/branch-0.13' into grouped-rollin…
mythrocks Mar 23, 2020
d9c46d6
Merge pull request #4662 from rgsl888prabhu/4647_partition_by_hash_ke…
kkraus14 Mar 23, 2020
f7e6fc1
Grouped Rolling Window: More window type checks in JNI, more tests
mythrocks Mar 23, 2020
2dc5e50
Grouped Rolling Window: Fixed spelling error in rolling.hpp documenta…
mythrocks Mar 23, 2020
dac6aa0
Missing linebreak in changelog
kkraus14 Mar 24, 2020
ee59cd2
Merge pull request #4641 from rnyak/branch-0.13-dataframe
shwina Mar 24, 2020
323e02a
Merge branch 'branch-0.13' of github.com:rapidsai/cudf into fea-serie…
karthikeyann Mar 24, 2020
9159d69
Raise when building a categorical column with empty categories and
shwina Mar 24, 2020
78f2c5d
Fix building a categorical column from codes with offsets
shwina Mar 24, 2020
ec5c309
Fix non-empty meta generation for categoricals
shwina Mar 24, 2020
786dcdb
Add test for grouping by categorical key
shwina Mar 24, 2020
458f767
Black
shwina Mar 24, 2020
d0b15d5
Update changelog
shwina Mar 24, 2020
d4d3624
Merge branch 'branch-0.13' into fix-dask-categorical-issues
kkraus14 Mar 24, 2020
aa06791
_shuffle_group bug fix
rjzamora Mar 24, 2020
879ef91
cahngelog and test coverage
rjzamora Mar 24, 2020
9891bbc
make note about test params
rjzamora Mar 24, 2020
a4a4899
Merge pull request #4676 from rjzamora/fix-mod-bug
rjzamora Mar 24, 2020
6b59e12
code changes and test cases
rgsl888prabhu Mar 25, 2020
dd68f70
fix test_repr tests that generated RangeIndex column names
kkraus14 Mar 25, 2020
38b39bf
add linebreak
kkraus14 Mar 25, 2020
db87236
changelog
kkraus14 Mar 25, 2020
9db6c49
Include frame lengths in Dask serialized header
jakirkham Mar 25, 2020
99b7d6c
code changes and tests
rgsl888prabhu Mar 25, 2020
9ccaede
CHANGELOG.md
rgsl888prabhu Mar 25, 2020
001a58f
Merge branch 'branch-0.13' into 4678_dataframe_slice_copy_issur
rgsl888prabhu Mar 25, 2020
557d106
Merge pull request #4681 from kkraus14/fix_test_repr
kkraus14 Mar 25, 2020
4d87706
Revert "Fix building a categorical column from codes with offsets"
shwina Mar 25, 2020
0eb9c04
missed changes
rgsl888prabhu Mar 25, 2020
132742a
Merge branch '4678_dataframe_slice_copy_issur' of https://github.com/…
rgsl888prabhu Mar 25, 2020
fe98e94
Use column categories if they exist in _meta_nonempty
shwina Mar 25, 2020
5af8583
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
shwina Mar 25, 2020
20f9892
Merge branch 'fix-dask-categorical-issues' of github.com:shwina/cudf …
shwina Mar 25, 2020
f60d00e
Merge pull request #4654 from kkraus14/update_numba_ver
kkraus14 Mar 25, 2020
8d4e9b6
Grouped Rolling Window: Fixed error in documented example
mythrocks Mar 25, 2020
7b00c5a
Merge pull request #4669 from shwina/fix-dask-categorical-issues
kkraus14 Mar 25, 2020
08e74a5
review changes
rgsl888prabhu Mar 25, 2020
44d2b82
missed changes
rgsl888prabhu Mar 25, 2020
8dd4811
Merge pull request #4682 from jakirkham/add_cuda_frame_lengths
shwina Mar 25, 2020
f4a5799
Merge pull request #4363 from mythrocks/grouped-rolling-window-rebase…
harrism Mar 25, 2020
2d923fb
Merge branch 'branch-0.13' into fea-series_rank
devavret Mar 25, 2020
bf8d6b6
ENH Hide deprecation warnings by default, add flag
mike-wendt Mar 25, 2020
149ff45
Merge pull request #4698 from rapidsai/enh-hide-depr-warns
mike-wendt Mar 25, 2020
ed6cccb
review changes
rgsl888prabhu Mar 25, 2020
2313943
review changes to use base_children in cython section and to set mask…
rgsl888prabhu Mar 26, 2020
67f8e5b
review changes
rgsl888prabhu Mar 26, 2020
f90b91e
Merge pull request #4294 from karthikeyann/fea-series_rank
karthikeyann Mar 26, 2020
0661cfa
adding assert for children of cat not to have mask
rgsl888prabhu Mar 26, 2020
36d73f0
missed change
rgsl888prabhu Mar 26, 2020
4ee660b
review changes and resolved circular dependency
rgsl888prabhu Mar 26, 2020
b8c4924
Merge pull request #4683 from rgsl888prabhu/4678_dataframe_slice_copy…
raydouglass Mar 26, 2020
75a1ea2
FIX Restrict fsspec to prevent dask test failures
mike-wendt Mar 28, 2020
b91dd8c
FIX Update fsspec and change to CUDA 10.2
mike-wendt Mar 28, 2020
c554cfb
DOC Update change log
mike-wendt Mar 28, 2020
6158033
Merge pull request #4729 from rapidsai/fix-fsspec
mike-wendt Mar 28, 2020
1 change: 1 addition & 0 deletions .gitattributes
@@ -1 +1,2 @@
python/cudf/cudf/_version.py export-subst
CHANGELOG.md merge=union
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -3,6 +3,7 @@ cpp/ @rapidsai/cudf-cpp-codeowners

#python code owners
python/ @rapidsai/cudf-python-codeowners
notebooks/ @rapidsai/cudf-python-codeowners
python/dask_cudf/ @rapidsai/cudf-dask-codeowners

#cmake code owners
35 changes: 35 additions & 0 deletions .github/workflows/new-issues-to-triage-projects.yml
@@ -0,0 +1,35 @@
name: Auto Assign New Issues to Triage Project

on:
issues:
types: [opened]

env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

jobs:
assign_one_project:
runs-on: ubuntu-latest
name: Assign to New Issues to Triage Project
steps:
- name: Process bug issues
uses: docker://takanabe/github-actions-automate-projects:v0.0.1
if: contains(github.event.issue.labels.*.name, 'bug') && contains(github.event.issue.labels.*.name, '? - Needs Triage')
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_PROJECT_URL: https://github.com/rapidsai/cudf/projects/1
GITHUB_PROJECT_COLUMN_NAME: 'Needs prioritizing'
- name: Process feature issues
uses: docker://takanabe/github-actions-automate-projects:v0.0.1
if: contains(github.event.issue.labels.*.name, 'feature request') && contains(github.event.issue.labels.*.name, '? - Needs Triage')
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_PROJECT_URL: https://github.com/rapidsai/cudf/projects/9
GITHUB_PROJECT_COLUMN_NAME: 'Needs prioritizing'
- name: Process other issues
uses: docker://takanabe/github-actions-automate-projects:v0.0.1
if: contains(github.event.issue.labels.*.name, '? - Needs Triage') && (!contains(github.event.issue.labels.*.name, 'bug') && !contains(github.event.issue.labels.*.name, 'feature request'))
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_PROJECT_URL: https://github.com/rapidsai/cudf/projects/10
GITHUB_PROJECT_COLUMN_NAME: 'Needs prioritizing'
13 changes: 13 additions & 0 deletions .gitignore
@@ -14,12 +14,15 @@ DartConfiguration.tcl
.DS_Store
*.manifest
*.spec
.nfs*

## Python build directories & artifacts
dask-worker-space/
dist/
cudf.egg-info/
python/build
python/*/build
python/cudf/cudf-coverage.xml
python/cudf/*/_lib/**/*.cpp
python/cudf/*/_lib/**/*.h
python/cudf/*/_lib/.nfs*
@@ -28,6 +31,7 @@ python/cudf/*/_libxx/**/*.h
python/cudf/*/_libxx/.nfs*
python/cudf/*.ipynb
python/cudf/.ipynb_checkpoints
python/nvstrings/nvstrings-coverage.xml
python/*/record.txt
.Python
env/
@@ -55,8 +59,11 @@ htmlcov/
.cache
nosetests.xml
coverage.xml
junit-cudf.xml
junit-nvstrings.xml
*.cover
.hypothesis/
test-results

## Patching
*.diff
@@ -142,3 +149,9 @@ ENV/

# mypy
.mypy_cache/

## VSCode IDE
.vscode

# Dask
dask-worker-space/
326 changes: 318 additions & 8 deletions CHANGELOG.md

Large diffs are not rendered by default.

14 changes: 9 additions & 5 deletions CONTRIBUTING.md
@@ -159,11 +159,12 @@ git submodule update --init --remote --recursive
# create the conda environment (assuming in base `cudf` directory)
conda env create --name cudf_dev --file conda/environments/cudf_dev_cuda10.0.yml
# activate the environment
source activate cudf_dev
conda activate cudf_dev
```
- If you're using CUDA 9.2, you will need to create the environment with `conda env create --name cudf_dev --file conda/environments/cudf_dev_cuda9.2.yml` instead.
- If using CUDA 9.2, create the environment with `conda env create --name cudf_dev --file conda/environments/cudf_dev_cuda9.2.yml` instead.
- For other CUDA versions, check the corresponding cudf_dev_cuda*.yml file in conda/environments

- Build and install `libcudf`. CMake depends on the `nvcc` executable being on your path or defined in `$CUDACXX`.
- Build and install `libcudf` after its dependencies. CMake depends on the `nvcc` executable being on your path or defined in `$CUDACXX`.
```bash
$ cd $CUDF_HOME/cpp # navigate to C/C++ CUDA source root directory
$ mkdir build # make a build directory
@@ -173,15 +174,18 @@ $ cd build # ente
# -DCMAKE_INSTALL_PREFIX set to the install path for your libraries or $CONDA_PREFIX if you're using Anaconda, i.e. -DCMAKE_INSTALL_PREFIX=/install/path or -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX
# -DCMAKE_CXX11_ABI set to ON or OFF depending on the ABI version you want, defaults to ON. When turned ON, ABI compability for C++11 is used. When OFF, pre-C++11 ABI compability is used.
$ cmake .. -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX -DCMAKE_CXX11_ABI=ON # configure cmake ...

$ make -j # compile the libraries librmm.so, libcudf.so ... '-j' will start a parallel job using the number of physical cores available on your system
$ make install # install the libraries librmm.so, libcudf.so to the CMAKE_INSTALL_PREFIX
```

- As a convenience, a `build.sh` script is provided in `$CUDF_HOME`. To execute the same build commands above, run the script as shown below. Note that the libraries will be installed to the location set in `$INSTALL_PREFIX` if set (i.e. `export INSTALL_PREFIX=/install/path`), otherwise to `$CONDA_PREFIX`.
```bash
$ cd $CUDF_HOME
$ ./build.sh libcudf # compile the cuDF libraries and install them to $INSTALL_PREFIX if set, otherwise $CONDA_PREFIX
$ ./build.sh # To build both C++ and Python cuDF versions with their dependencies
```
- To build only the C++ component with the script
```bash
$ ./build.sh libnvstrings libcudf # Build only the cuDF C++ components and install them to $INSTALL_PREFIX if set, otherwise $CONDA_PREFIX
```

- To run tests (Optional):
2 changes: 1 addition & 1 deletion README.md
@@ -1,6 +1,6 @@
# <div align="left"><img src="img/rapids_logo.png" width="90px"/>&nbsp;cuDF - GPU DataFrames</div>

[![Build Status](https://gpuci.gpuopenanalytics.com/buildStatus/icon?job=gpuCI%2Fcudf%2Fbranches%2Fcudf-gpu-branch-0.12)](https://gpuci.gpuopenanalytics.com/job/gpuCI/job/cudf/job/branches/job/cudf-gpu-branch-0.12/)
[![Build Status](https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/branches/job/cudf-branch-pipeline/badge/icon)](https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/branches/job/cudf-branch-pipeline/)

**NOTE:** For the latest stable [README.md](https://github.com/rapidsai/cudf/blob/master/README.md) ensure you are on the `master` branch.

71 changes: 52 additions & 19 deletions build.sh
@@ -18,21 +18,25 @@ ARGS=$*
# script, and that this script resides in the repo dir!
REPODIR=$(cd $(dirname $0); pwd)

VALIDARGS="clean libnvstrings nvstrings libcudf cudf dask_cudf benchmarks -v -g -n --allgpuarch -h"
HELP="$0 [clean] [libcudf] [cudf] [dask_cudf] [benchmarks] [-v] [-g] [-n] [-h]
clean - remove all existing build artifacts and configuration (start
over)
libnvstrings - build the nvstrings C++ code only
nvstrings - build the nvstrings Python package
libcudf - build the cudf C++ code only
cudf - build the cudf Python package
dask_cudf - build the dask_cudf Python package
benchmarks - build benchmarks
-v - verbose build mode
-g - build for debug
-n - no install step
--allgpuarch - build for all supported GPU architectures
-h - print this text
VALIDARGS="clean libnvstrings nvstrings libcudf cudf dask_cudf benchmarks tests -v -g -n -l --allgpuarch --disable_nvtx --show_depr_warn -h"
HELP="$0 [clean] [libcudf] [cudf] [dask_cudf] [benchmarks] [tests] [-v] [-g] [-n] [-h] [-l]
clean - remove all existing build artifacts and configuration (start
over)
libnvstrings - build the nvstrings C++ code only
nvstrings - build the nvstrings Python package
libcudf - build the cudf C++ code only
cudf - build the cudf Python package
dask_cudf - build the dask_cudf Python package
benchmarks - build benchmarks
tests - build tests
-v - verbose build mode
-g - build for debug
-n - no install step
-l - build legacy tests
--allgpuarch - build for all supported GPU architectures
--disable_nvtx - disable inserting NVTX profiling ranges
--show_depr_warn - show cmake deprecation warnings
-h - print this text

default action (no args) is to build and install 'libnvstrings' then
'nvstrings' then 'libcudf' then 'cudf' then 'dask_cudf' targets
@@ -49,6 +53,10 @@ BUILD_TYPE=Release
INSTALL_TARGET=install
BENCHMARKS=OFF
BUILD_ALL_GPU_ARCH=0
BUILD_NVTX=ON
BUILD_TESTS=OFF
BUILD_LEGACY_TESTS=OFF
BUILD_DISABLE_DEPRECATION_WARNING=ON

# Set defaults for vars that may not have been defined externally
# FIXME: if INSTALL_PREFIX is not set, check PREFIX, then check
@@ -88,12 +96,26 @@ if hasArg -g; then
fi
if hasArg -n; then
INSTALL_TARGET=""
LIBCUDF_BUILD_DIR=${LIB_BUILD_DIR}
LIBNVSTRINGS_BUILD_DIR=${LIB_BUILD_DIR}
fi
if hasArg -l; then
BUILD_LEGACY_TESTS=ON
fi
if hasArg --allgpuarch; then
BUILD_ALL_GPU_ARCH=1
fi
if hasArg benchmarks; then
BENCHMARKS=ON
BENCHMARKS="ON"
fi
if hasArg tests; then
BUILD_TESTS=ON
fi
if hasArg --disable_nvtx; then
BUILD_NVTX="OFF"
fi
if hasArg --show_depr_warn; then
BUILD_DISABLE_DEPRECATION_WARNING=OFF
fi

# If clean given, run it prior to any other steps
@@ -128,7 +150,10 @@ if buildAll || hasArg libnvstrings || hasArg libcudf; then
cmake -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} \
-DCMAKE_CXX11_ABI=ON \
${GPU_ARCH} \
-DUSE_NVTX=${BUILD_NVTX} \
-DBUILD_BENCHMARKS=${BENCHMARKS} \
-DBUILD_LEGACY_TESTS=${BUILD_LEGACY_TESTS} \
-DDISABLE_DEPRECATION_WARNING=${BUILD_DISABLE_DEPRECATION_WARNING} \
-DCMAKE_BUILD_TYPE=${BUILD_TYPE} ..
fi

@@ -140,6 +165,10 @@ if buildAll || hasArg libnvstrings; then
else
make -j${PARALLEL_LEVEL} nvstrings VERBOSE=${VERBOSE}
fi

if [[ ${BUILD_TESTS} == "ON" ]]; then
make -j${PARALLEL_LEVEL} build_tests_nvstrings VERBOSE=${VERBOSE}
fi
fi

# Build and install the nvstrings Python package
@@ -150,7 +179,7 @@ if buildAll || hasArg nvstrings; then
python setup.py build_ext
python setup.py install --single-version-externally-managed --record=record.txt
else
python setup.py build_ext --library-dir=${LIBNVSTRINGS_BUILD_DIR}
python setup.py build_ext --build-lib=${PWD} --library-dir=${LIBNVSTRINGS_BUILD_DIR}
fi
fi

@@ -163,17 +192,21 @@ if buildAll || hasArg libcudf; then
else
make -j${PARALLEL_LEVEL} cudf VERBOSE=${VERBOSE}
fi

if [[ ${BUILD_TESTS} == "ON" ]]; then
make -j${PARALLEL_LEVEL} build_tests_cudf VERBOSE=${VERBOSE}
fi
fi

# Build and install the cudf Python package
if buildAll || hasArg cudf; then

cd ${REPODIR}/python/cudf
if [[ ${INSTALL_TARGET} != "" ]]; then
python setup.py build_ext --inplace
PARALLEL_LEVEL=${PARALLEL_LEVEL} python setup.py build_ext --inplace
python setup.py install --single-version-externally-managed --record=record.txt
else
python setup.py build_ext --inplace --library-dir=${LIBCUDF_BUILD_DIR}
PARALLEL_LEVEL=${PARALLEL_LEVEL} python setup.py build_ext --inplace --library-dir=${LIBCUDF_BUILD_DIR}
fi
fi
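
The new `tests`, `-l`, `--disable_nvtx`, and `--show_depr_warn` options all follow the same pattern: a default is set near the top of the script and a `hasArg` check overrides it. The definition of `hasArg` is not part of this diff, so the sketch below uses a hypothetical `has_arg` helper that does a whole-word match against the argument string. It is an assumption about how such a check could work, not a copy of the real function.

```shell
#!/bin/bash
# Sketch only: has_arg is a hypothetical stand-in for the hasArg helper in build.sh.
ARGS="libcudf tests --disable_nvtx"

has_arg() {
    # Pad both sides with spaces so only whole words match;
    # -F treats the pattern literally and -- stops option parsing.
    echo " ${ARGS} " | grep -qF -- " $1 "
}

# Defaults, mirroring the ones added in the diff
BUILD_TESTS=OFF
BUILD_NVTX=ON
BUILD_LEGACY_TESTS=OFF

# Each flag flips exactly one default
if has_arg tests; then BUILD_TESTS=ON; fi
if has_arg --disable_nvtx; then BUILD_NVTX=OFF; fi
if has_arg -l; then BUILD_LEGACY_TESTS=ON; fi

echo "BUILD_TESTS=${BUILD_TESTS} USE_NVTX=${BUILD_NVTX} BUILD_LEGACY_TESTS=${BUILD_LEGACY_TESTS}"
```

With the example arguments this prints `BUILD_TESTS=ON USE_NVTX=OFF BUILD_LEGACY_TESTS=OFF`. The resulting values are what the script would then pass to `cmake` as `-DUSE_NVTX=...` and `-DBUILD_LEGACY_TESTS=...`.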

2 changes: 1 addition & 1 deletion ci/cpu/prebuild.sh
@@ -1,7 +1,7 @@
#!/usr/bin/env bash

#Upload cudf once per PYTHON
if [[ "$CUDA" == "9.2" ]]; then
if [[ "$CUDA" == "10.0" ]]; then
export UPLOAD_CUDF=1
else
export UPLOAD_CUDF=0
28 changes: 18 additions & 10 deletions ci/gpu/build.sh
@@ -59,10 +59,10 @@ logger "Activate conda env..."
source activate gdf
conda install "rmm=$MINOR_VERSION.*" "cudatoolkit=$CUDA_REL" \
"dask>=2.1.0" "distributed>=2.1.0" "numpy>=1.16" "double-conversion" \
"rapidjson" "flatbuffers" "boost-cpp" "fsspec>=0.3.3" "dlpack" \
"rapidjson" "flatbuffers" "boost-cpp" "fsspec>=0.3.3,<0.7.0a0" "dlpack" \
"feather-format" "cupy>=6.6.0,<8.0.0a0,!=7.1.0" "arrow-cpp=0.15.0" "pyarrow=0.15.0" \
"fastavro>=0.22.0" "pandas>=0.25,<0.26" "hypothesis" "s3fs" "gcsfs" \
"boto3" "moto" "httpretty" "streamz"
"boto3" "moto" "httpretty" "streamz" "ipython=7.3*" "jupyterlab"

# Install the master version of dask, distributed, and streamz
logger "pip install git+https://github.com/dask/distributed.git --upgrade --no-deps"
@@ -83,7 +83,11 @@ conda list
################################################################################

logger "Build libcudf..."
$WORKSPACE/build.sh clean libnvstrings nvstrings libcudf cudf dask_cudf benchmarks
if [[ ${BUILD_MODE} == "pull-request" ]]; then
$WORKSPACE/build.sh clean libnvstrings nvstrings libcudf cudf dask_cudf benchmarks tests
else
$WORKSPACE/build.sh clean libnvstrings nvstrings libcudf cudf dask_cudf benchmarks tests -l
fi

################################################################################
# TEST - Run GoogleTest and py.tests for libnvstrings, nvstrings, libcudf, and
@@ -96,20 +100,22 @@ else
logger "Check GPU usage..."
nvidia-smi

logger "GoogleTest for libnvstrings..."
logger "GoogleTests..."
cd $WORKSPACE/cpp/build
GTEST_OUTPUT="xml:${WORKSPACE}/test-results/" make -j${PARALLEL_LEVEL} test_nvstrings

logger "GoogleTest for libcudf..."
cd $WORKSPACE/cpp/build
GTEST_OUTPUT="xml:${WORKSPACE}/test-results/" make -j${PARALLEL_LEVEL} test_cudf
for gt in ${WORKSPACE}/cpp/build/gtests/* ; do
test_name=$(basename ${gt})
echo "Running GoogleTest $test_name"
${gt} --gtest_output=xml:${WORKSPACE}/test-results/
done


# set environment variable for numpy 1.16
# will be enabled for later versions by default
np_ver=$(python -c "import numpy; print('.'.join(numpy.__version__.split('.')[:-1]))")
if [ "$np_ver" == "1.16" ];then
logger "export NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1"
export NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1
logger "export NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1"
export NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1
fi

cd $WORKSPACE/python/nvstrings
@@ -128,4 +134,6 @@ else
logger "Python py.test for cuStreamz..."
py.test --cache-clear --junitxml=${WORKSPACE}/junit-custreamz.xml -v --cov-config=.coveragerc --cov=custreamz --cov-report=xml:${WORKSPACE}/python/custreamz/custreamz-coverage.xml --cov-report term

${WORKSPACE}/ci/gpu/test-notebooks.sh 2>&1 | tee nbtest.log
python ${WORKSPACE}/ci/utils/nbtestlog2junitxml.py nbtest.log
fi
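
The GoogleTest step in this file now runs each executable under `cpp/build/gtests/` directly instead of going through `make` targets. A minimal sketch of that loop follows. The two generated scripts are hypothetical stand-ins for real test binaries, and the `FAILED` list is an addition for illustration that the diff itself does not contain.

```shell
#!/bin/bash
# Sketch only: fake "test binaries" stand in for the real GoogleTest executables.
TEST_DIR=$(mktemp -d)
printf '#!/bin/bash\nexit 0\n' > "${TEST_DIR}/PASSING_TEST"
printf '#!/bin/bash\nexit 1\n' > "${TEST_DIR}/FAILING_TEST"
chmod +x "${TEST_DIR}"/*

FAILED=""
# The glob expands in sorted order, so every executable runs exactly once
for gt in "${TEST_DIR}"/*; do
    test_name=$(basename "${gt}")
    echo "Running ${test_name}"
    # Record the name of any test that exits non-zero (not done in the diff)
    if ! "${gt}"; then
        FAILED="${FAILED} ${test_name}"
    fi
done

rm -rf "${TEST_DIR}"
echo "Failed tests:${FAILED}"
```

In the real script each binary is also given `--gtest_output=xml:...` so that results land in the `test-results` directory.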
47 changes: 47 additions & 0 deletions ci/gpu/test-notebooks.sh
@@ -0,0 +1,47 @@
#!/bin/bash

NOTEBOOKS_DIR=${WORKSPACE}/notebooks
NBTEST=${WORKSPACE}/ci/utils/nbtest.sh
LIBCUDF_KERNEL_CACHE_PATH=${WORKSPACE}/.jitcache

cd ${NOTEBOOKS_DIR}
TOPLEVEL_NB_FOLDERS=$(find . -name *.ipynb |cut -d'/' -f2|sort -u)

# Add notebooks that should be skipped here
# (space-separated list of filenames without paths)

SKIPNBS=""

## Check env
env

EXITCODE=0

# Always run nbtest in all TOPLEVEL_NB_FOLDERS, set EXITCODE to failure
# if any run fails

cd ${NOTEBOOKS_DIR}
for nb in $(find . -name "*.ipynb"); do
nbBasename=$(basename ${nb})
# Skip all NBs that use dask (in the code or even in their name)
if ((echo ${nb}|grep -qi dask) || \
(grep -q dask ${nb})); then
echo "--------------------------------------------------------------------------------"
echo "SKIPPING: ${nb} (suspected Dask usage, not currently automatable)"
echo "--------------------------------------------------------------------------------"
elif (echo " ${SKIPNBS} " | grep -q " ${nbBasename} "); then
echo "--------------------------------------------------------------------------------"
echo "SKIPPING: ${nb} (listed in skip list)"
echo "--------------------------------------------------------------------------------"
else
nvidia-smi
${NBTEST} ${nbBasename}
EXITCODE=$((EXITCODE | $?))
rm -rf ${LIBCUDF_KERNEL_CACHE_PATH}/*
fi
done


nvidia-smi

exit ${EXITCODE}
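
The notebook runner keeps going after a failure and folds every result into one exit status with `EXITCODE=$((EXITCODE | $?))`. A small sketch of that accumulation is shown below. `run_step` is a hypothetical stand-in for an `nbtest.sh` invocation and simply returns the status it is given.

```shell
#!/bin/bash
# Sketch only: run_step is a hypothetical stand-in for one notebook test run.
run_step() {
    return "$1"
}

EXITCODE=0
# Simulate three runs: pass, fail, pass
for status in 0 1 0; do
    run_step "${status}"
    # Bitwise OR stays non-zero once any run has failed
    EXITCODE=$((EXITCODE | $?))
done

echo "EXITCODE=${EXITCODE}"
```

This prints `EXITCODE=1`. Because the statuses are combined with a bitwise OR, the final value is non-zero whenever any run failed, but it is not necessarily equal to any single run's exit code.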