forked from rapidsai/cudf
Branch 22.12 #6
Merged
Root cause:

```python
In [1]: import numpy as np

In [2]: x = np.uint8(1)

In [3]: y = np.float64(1.0)

In [4]: x.__ge__(y)
Out[4]: NotImplemented

In [8]: x >= y
Out[8]: True
```

This leads to the following error whenever a Scalar binary operation is involved:

```python
python/cudf/cudf/tests/test_series.py:449:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../envs/cudfdev/lib/python3.9/contextlib.py:79: in inner
    return func(*args, **kwds)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:2988: in describe
    data = _describe_categorical(self, percentiles)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:152: in _describe_categorical
    val_counts = obj.value_counts(ascending=False)
../envs/cudfdev/lib/python3.9/contextlib.py:79: in inner
    return func(*args, **kwds)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:2862: in value_counts
    res = res.sort_values(ascending=ascending)
../envs/cudfdev/lib/python3.9/contextlib.py:79: in inner
    return func(*args, **kwds)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:1910: in sort_values
    return super().sort_values(
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/indexed_frame.py:1916: in sort_values
    out = self._gather(
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/indexed_frame.py:1523: in _gather
    if not libcudf.copying._gather_map_is_valid(
copying.pyx:67: in cudf._lib.copying._gather_map_is_valid
    ???
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/mixins/mixin_factory.py:11: in wrapper
    return method(self, *args1, *args2, **kwargs1, **kwargs2)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/scalar.py:350: in _binaryop
    return Scalar(result, dtype=out_dtype)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/scalar.py:56: in __call__
    obj = super().__call__(value, dtype=dtype)
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/scalar.py:128: in __init__
    self._host_value, self._host_dtype = self._preprocess_host_value(
../envs/cudfdev/lib/python3.9/site-packages/cudf/core/scalar.py:222: in _preprocess_host_value
    value = to_cudf_compatible_scalar(value, dtype=dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

val = NotImplemented, dtype = <class 'numpy.bool_'>

    def to_cudf_compatible_scalar(val, dtype=None):
        """
        Converts the value `val` to a numpy/Pandas scalar,
        optionally casting to `dtype`.

        If `val` is None, returns None.
        """

        if cudf._lib.scalar._is_null_host_scalar(val) or isinstance(
            val, cudf.Scalar
        ):
            return val

        if not cudf.api.types._is_scalar_or_zero_d_array(val):
>           raise ValueError(
                f"Cannot convert value of type {type(val).__name__} "
                "to cudf scalar"
            )
E           ValueError: Cannot convert value of type NotImplementedType to cudf scalar

../envs/cudfdev/lib/python3.9/site-packages/cudf/utils/dtypes.py:248: ValueError
```

This PR fixes the issue by first trying to call `op` through the standard-library `operator` module, and falling back to `getattr` only if `op` is not found in the `operator` module.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - https://github.com/brandon-b-miller

URL: #11816
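The dispatch described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `NarrowInt`, `WideFloat`, and `apply_binop` are hypothetical names, and the two classes only mimic the `uint8`/`float64` situation shown above, where the direct dunder call returns `NotImplemented` but the operator form succeeds through the reflected method.

```python
import operator


class NarrowInt:
    """Hypothetical stand-in for a scalar that only compares with its own type."""

    def __init__(self, value):
        self.value = value

    def __ge__(self, other):
        if isinstance(other, NarrowInt):
            return self.value >= other.value
        # Defer to the other operand, as the NumPy scalar does above.
        return NotImplemented


class WideFloat:
    """Hypothetical stand-in for a type that knows how to handle NarrowInt."""

    def __init__(self, value):
        self.value = value

    def __le__(self, other):
        # Reflected counterpart of `other >= self`.
        return self.value <= other.value

    def __radd__(self, other):
        return other.value + self.value


def apply_binop(lhs, rhs, op):
    """Apply the dunder-named operation `op` (e.g. "__ge__") to lhs and rhs."""
    func = getattr(operator, op, None)
    if func is not None:
        # operator.__ge__(lhs, rhs) evaluates `lhs >= rhs`, so Python retries
        # with the reflected method when the first attempt is NotImplemented.
        return func(lhs, rhs)
    # Names such as "__radd__" have no counterpart in `operator`;
    # fall back to looking the method up on the left operand.
    return getattr(lhs, op)(rhs)
```

With these definitions, `NarrowInt(1).__ge__(WideFloat(1.0))` is `NotImplemented`, while `apply_binop(NarrowInt(1), WideFloat(1.0), "__ge__")` goes through `operator` and yields a real boolean.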
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
A compile warning was introduced in #11652 in `bgzip_data_chunk_source.cu`.
The warning can be seen here https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/12417/consoleFull
(search for `177-D`)

```
/cudf/cpp/src/io/text/bgzip_data_chunk_source.cu(362): warning #177-D: variable "nvtx3_range__" was declared but never referenced
```

The `nvtx3_range__` variable is part of the `CUDF_FUNC_RANGE()` macro. The warning is incorrect and likely a compiler bug. The workaround in this PR is to add `[[maybe_unused]]` to the variable declaration. I was not able to create a small reproducer for filing a compiler bug.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Tobias Ribizel (https://github.com/upsj)
  - MithunR (https://github.com/mythrocks)

URL: #11798
We need to actually call the method; otherwise we will get false positives for the validity of the operands. Fortunately, this seems to have been a benign bug since the host pandas `NAType` handles all of the operations appropriately, so the code was "working" before, but the logic was incorrect.

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Bradley Dice (https://github.com/bdice)

URL: #11818
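The class of bug described here can be illustrated with a small sketch. The commit message does not name the method involved, so `HostScalar` and `is_valid` below are hypothetical and do not reflect cudf's actual code; the point is only that a bound method referenced without parentheses is always truthy.

```python
class HostScalar:
    """Hypothetical scalar wrapper, used only for illustration."""

    def __init__(self, value):
        self.value = value

    def is_valid(self):
        return self.value is not None


def both_valid_buggy(lhs, rhs):
    # Bug: the bound methods are referenced but never called.
    # A bound method object is always truthy, so this reports True
    # even when one operand holds no value (a false positive).
    return bool(lhs.is_valid and rhs.is_valid)


def both_valid_fixed(lhs, rhs):
    # Fix: call the methods so the actual validity is checked.
    return lhs.is_valid() and rhs.is_valid()
```

For `HostScalar(None)` and `HostScalar(1)`, the buggy check reports the pair as valid, while the fixed check does not.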
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
Adding some examples to show off the nested type JSON reading

Authors:
  - Gregory Kimball (https://github.com/GregoryKimball)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #11814
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
## Description
Disable the use of nvCOMP DEFLATE because of issues with nvCOMP 2.4.
Also fix a Python test (it did not block CI because the comparison in the test is only done with `LIBCUDF_NVCOMP_POLICY="ALWAYS"`).

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Jim Brennan (https://github.com/jbrennan333)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Robert Maynard (https://github.com/robertmaynard)
  - Vyas Ramasubramani (https://github.com/vyasr)
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
This PR resolves #10323 and phases out the `gitutils.py` module in favor of a dependency on GitPython that is managed by pre-commit.

It fixes the pre-commit check for copyright years so that only modifications between the target branch (`branch-X.Y`) and the current git stage will trigger copyright changes (years will not be updated for unmodified files, or for changes that have not been staged). Additionally, it changes the return code to `1` if changes are requested and applied (if modifications were required, that should be considered a failure).

This is the last step to making our entire style check pipeline friendly to pre-commit.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Jordan Jacobelli (https://github.com/Ethyling)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #11711
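The return-code convention described above (exit with `1` whenever a fix was applied) could look roughly like the sketch below. It is an assumption-laden illustration, not the repository's script: the regular expression, `update_copyright_year`, and `check_modified_files` are hypothetical, and the step that asks git which files differ from the target branch is left out, so the caller is assumed to pass in only those files.

```python
import re

# Hypothetical header pattern: "Copyright (c) 2019" or "Copyright (c) 2019-2021".
COPYRIGHT_RE = re.compile(r"Copyright \(c\) (\d{4})(?:-(\d{4}))?")


def update_copyright_year(text, current_year):
    """Return `text` with the first copyright year range extended to `current_year`."""

    def fix(match):
        first = match.group(1)
        last = match.group(2) or first
        if int(last) >= current_year:
            return match.group(0)  # already up to date
        return f"Copyright (c) {first}-{current_year}"

    return COPYRIGHT_RE.sub(fix, text, count=1)


def check_modified_files(contents_by_path, current_year):
    """Check only the files passed in (assumed to be the staged, modified ones).

    Returns (exit_code, updated), where exit_code is 1 if any file needed a
    change, so that the hook run counts as a failure.
    """
    updated = {}
    for path, text in contents_by_path.items():
        new_text = update_copyright_year(text, current_year)
        if new_text != text:
            updated[path] = new_text
    return (1 if updated else 0), updated
```

Returning a non-zero code after applying fixes matches the usual pre-commit behavior, where a hook that modifies files is treated as failed so the changes get reviewed and re-staged.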
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
## Description
This switches to using CubinLinker (from PTXCompiler, but CubinLinker uses PTXCompiler internally) for Minor Version Compatibility. This enables support for all Numba features except linking archives with MVC, in support of use cases such as String UDFs (#11319) with MVC.

## Checklist
- [X] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [X] New or existing tests cover these changes.
- [X] The documentation is up to date with these changes.

Authors:
  - Graham Markall (https://github.com/gmarkall)
  - https://github.com/brandon-b-miller
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
## Description
The docstring for `cudf.read_text` did not include the `byte_range` argument.

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

Authors:
  - Gregory Kimball ([email protected])

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Lawrence Mitchell (https://github.com/wence-)
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
etseidl pushed a commit that referenced this pull request on Jun 9, 2023
This implements stacktrace and adds a stacktrace string into any exception thrown by cudf. By doing so, the exception carries information about where it originated, allowing the downstream application to trace back with much less effort.

Closes rapidsai#12422.

### Example:
```
#0: cudf/cpp/build/libcudf.so : std::unique_ptr<cudf::column, std::default_delete<cudf::column> > cudf::detail::sorted_order<false>(cudf::table_view, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x446
#1: cudf/cpp/build/libcudf.so : cudf::detail::sorted_order(cudf::table_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x113
#2: cudf/cpp/build/libcudf.so : std::unique_ptr<cudf::column, std::default_delete<cudf::column> > cudf::detail::segmented_sorted_order_common<(cudf::detail::sort_method)1>(cudf::table_view const&, cudf::column_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x66e
#3: cudf/cpp/build/libcudf.so : cudf::detail::segmented_sort_by_key(cudf::table_view const&, cudf::table_view const&, cudf::column_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x88
#4: cudf/cpp/build/libcudf.so : cudf::segmented_sort_by_key(cudf::table_view const&, cudf::table_view const&, cudf::column_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::mr::device_memory_resource*)+0xb9
#5: cudf/cpp/build/gtests/SORT_TEST : ()+0xe3027
#6: cudf/cpp/build/lib/libgtest.so.1.13.0 : void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x8f
#7: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::Test::Run()+0xd6
#8: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::TestInfo::Run()+0x195
#9: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::TestSuite::Run()+0x109
#10: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::internal::UnitTestImpl::RunAllTests()+0x44f
#11: cudf/cpp/build/lib/libgtest.so.1.13.0 : bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x87
#12: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::UnitTest::Run()+0x95
#13: cudf/cpp/build/gtests/SORT_TEST : ()+0xdb08c
#14: /lib/x86_64-linux-gnu/libc.so.6 : ()+0x29d90
#15: /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0x80
#16: cudf/cpp/build/gtests/SORT_TEST : ()+0xdf3d5
```

### Usage

In order to retrieve a stacktrace with fully human-readable symbols, some compile options must be adjusted. To make this adjustment convenient, a new cmake option (`CUDF_BUILD_STACKTRACE_DEBUG`) has been added. Just set this option to `ON` before building cudf and it will be ready to use.

For downstream applications, whenever a cudf-type exception is thrown, the application can retrieve the stored stacktrace and do whatever it wants with it. For example:
```
try {
  // cudf API calls
} catch (cudf::logic_error const& e) {
  std::cout << e.what() << std::endl;
  std::cout << e.stacktrace() << std::endl;
  throw e;
}
// similar with catching other exception types
```

### Follow-up work

The next step would be patching `rmm` to attach a stacktrace to `rmm::` exceptions. Doing so will allow debugging various memory exceptions thrown from libcudf using their stacktrace.

### Note:
 * This feature doesn't require libcudf to be built in Debug mode.
 * The flag `CUDF_BUILD_STACKTRACE_DEBUG` should not be turned on in production as it may affect code optimization. Instead, libcudf compiled with that flag turned on should be used only when needed, when debugging exceptions thrown by cudf.
 * This flag removes the current optimization flag from the compile options (such as `-O2` or `-O3`, if in Release mode) and replaces it with `-Og` (optimize for debugging).
 * If this option is not set to `ON`, the stacktrace will not be available. This is to avoid expensive stacktrace retrieval if the thrown exception is expected.

Authors:
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Robert Maynard (https://github.com/robertmaynard)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Jason Lowe (https://github.com/jlowe)

URL: rapidsai#13298
etseidl pushed a commit that referenced this pull request on Nov 8, 2023
Fix to_datetime with format allowing out-of-range values