[WIP] Allow merging index column with data column using keyword "on" #6453
…into skiprowbranch
Please update the changelog in order to start CI tests. View the gpuCI docs here.
Codecov Report
@@ Coverage Diff @@
## branch-0.19 #6453 +/- ##
==============================================
Coverage ? 82.95%
==============================================
Files ? 95
Lines ? 14919
Branches ? 0
==============================================
Hits ? 12376
Misses ? 2543
Partials ? 0
Continue to review full report at Codecov.
Adds a new developer guide for libcudf. This is based on the existing libcudf++ transition guide. Fixes rapidsai#5273
TODO
- [x] Description of `dictionary_column_wrapper` and `fixed_point_column_wrapper`
- [x] Benchmarking Section (put in a new file, Benchmarking.md)?
- [x] Better discussion of nested types
- [x] Introductory section on data types
- [x] Consider splitting into multiple documents: DEVELOPER_GUIDE.md, TESTING.md, BENCHMARKING.md?
- [x] Placeholder for cuIO?
- [x] Add section on code and documentation style and formatting
Authors: - Mark Harris (@harrism) - Jake Hemstad (@jrhemstad) Approvers: - @nvdbaranec - Conor Hoekstra (@codereport) - Jake Hemstad (@jrhemstad) - David (@davidwendt) URL: rapidsai#6977
[gpuCI] Auto-merge branch-0.18 to branch-0.19 [skip ci]
NumPy 1.20 is [typed](https://numpy.org/devdocs/release/1.20.0-notes.html#numpy-is-now-typed), which exposed a few typing errors in cuDF that this PR addresses. Authors: - Ashwin Srinath (@shwina) Approvers: - Keith Kraus (@kkraus14) - GALI PREM SAGAR (@galipremsagar) - AJ Schmidt (@ajschmidt8) URL: rapidsai#7279
[gpuCI] Auto-merge branch-0.18 to branch-0.19 [skip ci]
This PR prepares the changelog to be automatically updated during releases. Authors: - AJ Schmidt (@ajschmidt8) Approvers: - Keith Kraus (@kkraus14) URL: rapidsai#7272
Fix merge conflicts for rapidsai#7295
Add synchronization in `cleanImpl` and `close` in various places where race conditions could exist, and also within the `MemoryCleaner`, to address some concurrent modification issues we've seen in tests while shutting down (i.e. invoking the cleaner); see, e.g., NVIDIA/spark-rapids#1797. Authors: - Alessandro Bellina (@abellina) Approvers: - Robert (Bobby) Evans (@revans2) - Jason Lowe (@jlowe) URL: rapidsai#7474
Final step, closes rapidsai#5133 Authors: - Devavret Makkar (@devavret) Approvers: - Nghia Truong (@ttnghia) - Vukasin Milovanovic (@vuule) URL: rapidsai#7510
Reference rapidsai#7285 This PR adds Cython wrappers for `cudf::strings::to_fixed_point`, `cudf::strings::from_fixed_point`, and `cudf::strings::is_fixed_point` libcudf functions. Authors: - David (@davidwendt) Approvers: - GALI PREM SAGAR (@galipremsagar) - Ashwin Srinath (@shwina) - Conor Hoekstra (@codereport) URL: rapidsai#7429
This PR supports skipping nulls for the `collect` aggregation in the JVM by creating a new class, `CollectAggregation`, that accepts a `NullPolicy` argument indicating whether to include nulls. Skipping nulls is already supported by the `collect` aggregation with rolling in native code (rapidsai#7264), so this PR just exposes the feature in the JVM. This PR also introduces `NullPolicy` and updates the related aggregates. Signed-off-by: firestarman <[email protected]> Authors: - Liangcai Li (@firestarman) Approvers: - Robert (Bobby) Evans (@revans2) - MithunR (@mythrocks) URL: rapidsai#7457
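As a rough illustration of what a null policy on a `collect` aggregation means, here is a minimal Python sketch. It is not the cuDF Java API: the `NullPolicy` enum members, the `collect` function, and its default are assumptions made for this example, and nulls are modeled as `None`.

```python
from enum import Enum
from typing import List, Optional, Sequence


class NullPolicy(Enum):
    """Hypothetical stand-in for a null policy; member names are assumptions."""
    INCLUDE = "include"
    EXCLUDE = "exclude"


def collect(values: Sequence[Optional[int]],
            null_policy: NullPolicy = NullPolicy.INCLUDE) -> List[Optional[int]]:
    """Gather the values of one group into a list, optionally dropping nulls."""
    if null_policy is NullPolicy.EXCLUDE:
        # Skip nulls: only non-null values end up in the collected list.
        return [v for v in values if v is not None]
    # Include nulls: the collected list mirrors the input, nulls and all.
    return list(values)


group = [1, None, 2]
with_nulls = collect(group)                          # nulls kept
without_nulls = collect(group, NullPolicy.EXCLUDE)   # nulls skipped
```

The policy is passed as an argument rather than baked into separate functions, which mirrors the commit's description of a single aggregation class that accepts a `NullPolicy`.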
…ake (rapidsai#7518) Rename `ARROW_STATIC_LIB` because it conflicts with a CMake variable in Arrow's `FindArrow.cmake`. Here's the new way to statically link Arrow with libcudf:
```
cmake -D CUDF_USE_ARROW_STATIC=ON ..
```
Authors: - Paul Taylor (@trxcllnt) Approvers: - Keith Kraus (@kkraus14) URL: rapidsai#7518
Addresses rapidsai#7347 Authors: - Kumar Aatish (@kaatish) Approvers: - David (@davidwendt) - Devavret Makkar (@devavret) - Vukasin Milovanovic (@vuule) URL: rapidsai#7439
This updates the Java build scripts and documentation to use the new CUDF_USE_ARROW_STATIC flag after the rename from ARROW_STATIC_LIB in rapidsai#7518. Authors: - Jason Lowe (@jlowe) Approvers: - Alessandro Bellina (@abellina) - Keith Kraus (@kkraus14) URL: rapidsai#7526
This changes the root directory of the build folder for conda. Instead of generating a random build folder name, it will create a consistent build folder name at the `croot` location. This folder name is unique in CI, as every build has a unique `${WORKSPACE}` that is used. Lots of workarounds added to properly work with Project Flash. Several `mv` commands are added to put build artifacts in a folder Project Flash expects them to be in. Authors: - Dillon Cullinan (@dillon-cullinan) Approvers: - AJ Schmidt (@ajschmidt8) URL: rapidsai#7508
This PR adds a couple of very specialized methods that help us cast columns inside nested types. Authors: - Raza Jafri (@razajafri) Approvers: - Robert (Bobby) Evans (@revans2) - Jason Lowe (@jlowe) - MithunR (@mythrocks) URL: rapidsai#7417
Refactors the bitmask merging functionality to support any binary function, allowing for `bitwise_or` support in addition to the existing `bitwise_and` support. Includes changes to the Java API and JNI to access the `bitwise_or` functionality. Authors: - @rwlee Approvers: - Jason Lowe (@jlowe) - Jake Hemstad (@jrhemstad) - Christopher Harris (@cwharris) URL: rapidsai#7406
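The idea of merging bitmasks with an arbitrary binary function can be sketched in a few lines of Python. This is an illustration only, not libcudf code: `merge_bitmasks` is a hypothetical helper, and each validity mask is modeled as a plain integer in which bit `i` set means row `i` is valid.

```python
from functools import reduce
from operator import and_, or_
from typing import Callable, Sequence


def merge_bitmasks(masks: Sequence[int], op: Callable[[int, int], int]) -> int:
    """Fold any number of validity bitmasks into one using a binary operator."""
    if not masks:
        raise ValueError("at least one bitmask is required")
    return reduce(op, masks)


# Bit i set means row i is valid (non-null).
mask_a = 0b1101
mask_b = 0b0111

# AND: a row is valid only if it is valid in every input mask.
all_valid = merge_bitmasks([mask_a, mask_b], and_)
# OR: a row is valid if it is valid in at least one input mask.
any_valid = merge_bitmasks([mask_a, mask_b], or_)
```

Taking the operator as a parameter means the merging loop is written once, and supporting another combination rule only requires passing a different function.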
Closes rapidsai#7320 This PR adds an additional preprocessing step in documentation generation. It traverses the doctree generated by Sphinx and replaces unresolved type shorthands with proper target references, while keeping the shortened name for display text. An additional preprocessing step is added to ignore internal types to APIs facing both internally and externally, such as `cudf.core.column.string.StringColumn`. The `cupy` API reference is added to intersphinx. Minor changes: - Fixes a small doc bug in `frame.copy` Authors: - Michael Wang (@isVoid) Approvers: - Ashwin Srinath (@shwina) URL: rapidsai#7416
`dask` and `distributed` are changing their default branch names from `master` to `main`, which will break our dev environments and CI; this PR updates the required files. `distributed` has already merged the PR that makes the change, and `dask` will probably do the same very soon, so a PR that updates both seems to be the best approach. Authors: - Dante Gama Dessavre (@dantegd) Approvers: - Keith Kraus (@kkraus14) - AJ Schmidt (@ajschmidt8) URL: rapidsai#7532
Reference rapidsai#5698 This creates a gbenchmark for the `cudf::strings::extract` function. The benchmark measures various-sized rows as well as string lengths. It also has measurements for small, medium, and large regex instructions. The extract performance is affected by the number of instructions in the regex pattern. Authors: - David (@davidwendt) Approvers: - Keith Kraus (@kkraus14) - Karthikeyan (@karthikeyann) - Mark Harris (@harrism) URL: rapidsai#7522
This PR reduces the number of calls to `inclusive_scan` and `exclusive_scan` by using a `null_replace_accessor` that allows non-nullable columns. This reduces the compile time and size of `scan.cu` by half. This PR also includes a scan gbenchmark that shows no change in performance from the original implementation. Authors: - David (@davidwendt) Approvers: - Paul Taylor (@trxcllnt) - Jake Hemstad (@jrhemstad) URL: rapidsai#7516
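One way to read the null-replacing accessor idea is that nulls are swapped for the operation's identity value as elements are read, so a single scan code path serves both nullable and non-nullable inputs. The Python sketch below shows that reading under stated assumptions: `inclusive_scan` here is a hypothetical helper built on `itertools.accumulate`, nulls are modeled as `None`, and how libcudf treats the output at null positions is not covered.

```python
from itertools import accumulate
from operator import add
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def inclusive_scan(values: Sequence[Optional[T]],
                   op: Callable[[T, T], T],
                   identity: T) -> List[T]:
    """Inclusive scan that reads nulls as the operator's identity value."""
    # A null contributes nothing to the running result because
    # op(x, identity) == x for a correctly chosen identity.
    replaced = (identity if v is None else v for v in values)
    return list(accumulate(replaced, op))


# The same function handles input with and without nulls.
running_sum_nullable = inclusive_scan([1, None, 3, 4], add, 0)
running_sum_plain = inclusive_scan([1, 2, 3], add, 0)
running_max = inclusive_scan([2, None, 1, 5], max, float("-inf"))
```

Because the replacement happens while reading, no separate nullable and non-nullable versions of the scan are needed, which is in the spirit of the reduction in scan calls the commit describes.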
…i#7535) A few renames of master --> main were missed in the recent dask branch rename; this fixes them. Authors: - Keith Kraus (@kkraus14) Approvers: - AJ Schmidt (@ajschmidt8) - GALI PREM SAGAR (@galipremsagar) URL: rapidsai#7535
Fix for issue caused by stale PR issue from rapidsai#7406 Authors: - @rwlee - Keith Kraus (@kkraus14) Approvers: - Keith Kraus (@kkraus14) - Mike Wilson (@hyperbolic2346) - GALI PREM SAGAR (@galipremsagar) - Jake Hemstad (@jrhemstad) - Vukasin Milovanovic (@vuule) - Paul Taylor (@trxcllnt) URL: rapidsai#7533
) Presume that a project is using `cudf` via CPM like the following, and the machine doesn't have cudf installed, but does have rmm.
```
CPMAddPackage(NAME cudf
  VERSION "0.19.0"
  GIT_REPOSITORY https://github.com/rapidsai/cudf.git
  GIT_TAG branch-0.19
  GIT_SHALLOW TRUE
  SOURCE_SUBDIR cpp
  OPTIONS "BUILD_TESTS OFF"
          "BUILD_BENCHMARKS OFF"
          "ARROW_STATIC_LIB ON"
          "JITIFY_USE_CACHE ON"
          "CUDA_STATIC_RUNTIME ON"
          "DISABLE_DEPRECATION_WARNING ON"
          "AUTO_DETECT_CUDA_ARCHITECTURES ON"
)
add_library(cudf_example cudf_example.cu)
target_link_libraries(cudf_example PRIVATE cudf::cudf)
add_library(rmm_example rmm_example.cu)
target_link_libraries(rmm_example PRIVATE rmm::rmm)
```
While CPM will fail to find `cudf`, it will find the local install of `rmm` and use it. This poses a problem as CMake import targets have different default visibility compared to 'real' targets. This means that while `cudf::cudf` can see and resolve `rmm::rmm`, the `rmm_example` target won't be able to. This change makes it possible for users of cudf via CPM to directly access the `rmm::rmm` target. Authors: - Robert Maynard (@robertmaynard) Approvers: - Keith Kraus (@kkraus14) URL: rapidsai#7524
Replaced by PR 7569.
No description provided.