`decimal128` Support for `to/from_arrow` #9986

codereport · 2022-01-06T19:49:34Z

Resolves the C++ side of #9980.

This PR is breaking because Arrow only has a notion of decimal128 (see arrow::Type::DECIMAL). We can still support both decimal64 and decimal128 for to_arrow, but for from_arrow it only makes sense to support one of them, and decimal128 (now that we have it) is the logical choice. Therefore, switching the type of a column returned by from_arrow from decimal64 to decimal128 is a breaking change.
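To make the width argument concrete, here is a minimal Python sketch using only the standard library. It is not the libcudf implementation. It assumes the usual fixed-point representation, in which a decimal is stored as a signed integer (the "unscaled value") plus a scale, and the helper names `fits_in_signed` and `max_full_digits` are hypothetical. A 64-bit unscaled value can hold every 18-digit number and a 128-bit one every 38-digit number, so a value read from an Arrow decimal128 column is not guaranteed to fit in decimal64.

```python
def fits_in_signed(unscaled: int, bits: int) -> bool:
    """True if `unscaled` is representable as a two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= unscaled < (1 << (bits - 1))


def max_full_digits(bits: int) -> int:
    """Largest d such that every d-digit integer fits in a signed `bits`-bit integer."""
    largest = (1 << (bits - 1)) - 1
    return len(str(largest)) - 1


# Hypothetical sample values, written as unscaled integers with two fractional digits.
narrow = 12345                        # represents 123.45
wide = 12345678901234567890123456     # represents 123456789012345678901234.56

print(max_full_digits(64), max_full_digits(128))
print(fits_in_signed(narrow, 64), fits_in_signed(wide, 64), fits_in_signed(wide, 128))
```

Going the other way is always safe: any decimal64 value widens to decimal128 without loss, which is why to_arrow can keep accepting both types.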

Requires:

codecov · 2022-01-06T22:31:50Z

Codecov Report

Merging #9986 (da7c5e0) into branch-22.02 (967a333) will decrease coverage by 0.09%.
The diff coverage is n/a.

❗ Current head da7c5e0 differs from pull request most recent head e52781c. Consider uploading reports for the commit e52781c to get more accurate results

@@               Coverage Diff                @@
##           branch-22.02    #9986      +/-   ##
================================================
- Coverage         10.49%   10.39%   -0.10%     
================================================
  Files               119      119              
  Lines             20305    20064     -241     
================================================
- Hits               2130     2086      -44     
+ Misses            18175    17978     -197

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/custreamz/custreamz/tests/conftest.py | `71.42% <0.00%> (-7.15%)` | ⬇️ |
| python/custreamz/custreamz/tests/test_kafka.py | `38.46% <0.00%> (-4.40%)` | ⬇️ |
| ...ython/custreamz/custreamz/tests/test_dataframes.py | `96.97% <0.00%> (-2.42%)` | ⬇️ |
| python/custreamz/custreamz/kafka.py | `29.16% <0.00%> (-0.63%)` | ⬇️ |
| python/dask_cudf/dask_cudf/backends.py | `82.53% <0.00%> (-0.61%)` | ⬇️ |
| python/dask_cudf/dask_cudf/sorting.py | `92.30% <0.00%> (-0.61%)` | ⬇️ |
| python/dask_cudf/dask_cudf/accessors.py | `92.00% <0.00%> (-0.31%)` | ⬇️ |
| python/dask_cudf/dask_cudf/io/tests/test_s3.py | `95.77% <0.00%> (-0.18%)` | ⬇️ |
| python/dask_cudf/dask_cudf/io/parquet.py | `93.46% <0.00%> (-0.17%)` | ⬇️ |
| python/dask_cudf/dask_cudf/core.py | `70.85% <0.00%> (-0.17%)` | ⬇️ |
| ... and 57 more | | |


codereport · 2022-01-09T03:22:17Z

@galipremsagar Are you able to work on the Python changes now? Looks like cuDF Python doesn't currently have support for Decimal128DType, and that will most likely need to be added.

galipremsagar · 2022-01-10T15:20:36Z

> @galipremsagar Are you able to work on the Python changes now? Looks like cuDF Python doesn't currently have support for Decimal128DType, and that will most likely need to be added.

Yup, this PR unblocks #9533.

~~But there are a bunch of pytests(and code-paths) that need changes which I'm currently working on and will update you once done.~~

Update: Python changes ready: #9533

codereport · 2022-01-14T16:21:44Z

> Update: Python changes ready: #9533

@galipremsagar I am not exactly sure what needs to be done. I pulled down the changes in #9533, but they don't seem to fix the ORC Python test failures. Are there still changes you need to make?

Can you share what the failures are? I'm not able to see any failures with both PRs merged:
pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/ -n 17 --dist=loadscope
============================================ 88438 passed, 2401 skipped, 985 xfailed, 1978 xpassed, 20246 warnings in 389.68s (0:06:29) =============================================

(rapids) rapids@compose:~/cudf/python$ py.test cudf/cudf/tests/test_orc.py 
==================================================================================== test session starts =====================================================================================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=1 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=0)
rootdir: /home/cph/dev/rapids/cudf/python/cudf, configfile: setup.cfg
plugins: forked-1.3.0, hypothesis-6.34.1, benchmark-3.4.1, xdist-2.5.0
collected 207 items                                                                                                                                                                          

cudf/cudf/tests/test_orc.py .....................s.................................................................FF................................................................. [ 74%]
...................................F.................                                                                                                                                  [100%]

========================================================================================== FAILURES ==========================================================================================
_______________________________________________________________________________ test_orc_write_statistics[100] _______________________________________________________________________________

tmpdir = local('/tmp/pytest-of-rapids/pytest-0/test_orc_write_statistics_100_0'), datadir = PosixPath('/home/cph/dev/rapids/cudf/python/cudf/cudf/tests/data/orc'), nrows = 100

    @pytest.mark.parametrize("nrows", [1, 100, 6000000])
    def test_orc_write_statistics(tmpdir, datadir, nrows):
        supported_stat_types = supported_numpy_dtypes + ["str"]
        # Can't write random bool columns until issue #6763 is fixed
        if nrows == 6000000:
            supported_stat_types.remove("bool")
    
        # Make a dataframe
        gdf = cudf.DataFrame(
            {
                "col_" + str(dtype): gen_rand_series(dtype, nrows, has_nulls=True)
                for dtype in supported_stat_types
            }
        )
        fname = tmpdir.join("gdf.orc")
    
        # Write said dataframe to ORC with cuDF
        gdf.to_orc(fname.strpath)
    
        # Read back written ORC's statistics
        orc_file = pa.orc.ORCFile(fname)
        (file_stats, stripes_stats,) = cudf.io.orc.read_orc_statistics([fname])
    
        # check file stats
        for col in gdf:
            if "minimum" in file_stats[0][col]:
                stats_min = file_stats[0][col]["minimum"]
                actual_min = gdf[col].min()
                assert normalized_equals(actual_min, stats_min)
            if "maximum" in file_stats[0][col]:
                stats_max = file_stats[0][col]["maximum"]
                actual_max = gdf[col].max()
                assert normalized_equals(actual_max, stats_max)
            if "number_of_values" in file_stats[0][col]:
                stats_num_vals = file_stats[0][col]["number_of_values"]
                actual_num_vals = gdf[col].count()
                assert stats_num_vals == actual_num_vals
    
        # compare stripe statistics with actual min/max
        for stripe_idx in range(0, orc_file.nstripes):
            stripe = orc_file.read_stripe(stripe_idx)
            # pandas is unable to handle min/max of string col with nulls
            stripe_df = cudf.DataFrame(stripe.to_pandas())
            for col in stripe_df:
                if "minimum" in stripes_stats[stripe_idx][col]:
                    actual_min = stripe_df[col].min()
                    stats_min = stripes_stats[stripe_idx][col]["minimum"]
>                   assert normalized_equals(actual_min, stats_min)
E                   AssertionError: assert False
E                    +  where False = normalized_equals(numpy.datetime64('1971-09-25T05:48:04.902661000'), datetime.datetime(1971, 9, 25, 4, 48, 4, 902000, tzinfo=datetime.timezone.utc))

cudf/cudf/tests/test_orc.py:642: AssertionError
_____________________________________________________________________________ test_orc_write_statistics[6000000] _____________________________________________________________________________

tmpdir = local('/tmp/pytest-of-rapids/pytest-0/test_orc_write_statistics_60000'), datadir = PosixPath('/home/cph/dev/rapids/cudf/python/cudf/cudf/tests/data/orc'), nrows = 6000000

    @pytest.mark.parametrize("nrows", [1, 100, 6000000])
    def test_orc_write_statistics(tmpdir, datadir, nrows):
        supported_stat_types = supported_numpy_dtypes + ["str"]
        # Can't write random bool columns until issue #6763 is fixed
        if nrows == 6000000:
            supported_stat_types.remove("bool")
    
        # Make a dataframe
        gdf = cudf.DataFrame(
            {
                "col_" + str(dtype): gen_rand_series(dtype, nrows, has_nulls=True)
                for dtype in supported_stat_types
            }
        )
        fname = tmpdir.join("gdf.orc")
    
        # Write said dataframe to ORC with cuDF
        gdf.to_orc(fname.strpath)
    
        # Read back written ORC's statistics
        orc_file = pa.orc.ORCFile(fname)
        (file_stats, stripes_stats,) = cudf.io.orc.read_orc_statistics([fname])
    
        # check file stats
        for col in gdf:
            if "minimum" in file_stats[0][col]:
                stats_min = file_stats[0][col]["minimum"]
                actual_min = gdf[col].min()
                assert normalized_equals(actual_min, stats_min)
            if "maximum" in file_stats[0][col]:
                stats_max = file_stats[0][col]["maximum"]
                actual_max = gdf[col].max()
                assert normalized_equals(actual_max, stats_max)
            if "number_of_values" in file_stats[0][col]:
                stats_num_vals = file_stats[0][col]["number_of_values"]
                actual_num_vals = gdf[col].count()
                assert stats_num_vals == actual_num_vals
    
        # compare stripe statistics with actual min/max
        for stripe_idx in range(0, orc_file.nstripes):
            stripe = orc_file.read_stripe(stripe_idx)
            # pandas is unable to handle min/max of string col with nulls
            stripe_df = cudf.DataFrame(stripe.to_pandas())
            for col in stripe_df:
                if "minimum" in stripes_stats[stripe_idx][col]:
                    actual_min = stripe_df[col].min()
                    stats_min = stripes_stats[stripe_idx][col]["minimum"]
                    assert normalized_equals(actual_min, stats_min)
    
                if "maximum" in stripes_stats[stripe_idx][col]:
                    actual_max = stripe_df[col].max()
                    stats_max = stripes_stats[stripe_idx][col]["maximum"]
>                   assert normalized_equals(actual_max, stats_max)
E                   AssertionError: assert False
E                    +  where False = normalized_equals(numpy.datetime64('2001-09-09T01:55:11.684000000'), datetime.datetime(2001, 9, 9, 0, 55, 11, 684000, tzinfo=datetime.timezone.utc))

cudf/cudf/tests/test_orc.py:647: AssertionError
_____________________________________________________________________________ test_writer_timestamp_stream_size ______________________________________________________________________________

datadir = PosixPath('/home/cph/dev/rapids/cudf/python/cudf/cudf/tests/data/orc'), tmpdir = local('/tmp/pytest-of-rapids/pytest-0/test_writer_timestamp_stream_s0')

    def test_writer_timestamp_stream_size(datadir, tmpdir):
        pdf_fname = datadir / "TestOrcFile.largeTimestamps.orc"
        gdf_fname = tmpdir.join("gdf.orc")
    
        try:
            orcfile = pa.orc.ORCFile(pdf_fname)
        except Exception as excpr:
            if type(excpr).__name__ == "ArrowIOError":
                pytest.skip(".orc file is not found")
            else:
                print(type(excpr).__name__)
    
        expect = orcfile.read().to_pandas()
        cudf.from_pandas(expect).to_orc(gdf_fname.strpath)
        got = pa.orc.ORCFile(gdf_fname).read().to_pandas()
    
>       assert_eq(expect, got)

cudf/cudf/tests/test_orc.py:1347: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
cudf/cudf/testing/_utils.py:99: in assert_eq
    tm.assert_frame_equal(left, right, **kwargs)
../../compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/site-packages/pandas/_testing/asserters.py:823: in assert_extension_array_equal
    assert_numpy_array_equal(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

left = array([-7669881092049722624,  7648224533347138688, -6222510175959171008,
        8549737800416241920,  614612853626634..., -9223372036854775808, -3750193145076206464,
       -2160566338078412928, -2660170922871551616,  5272288178396793536])
right = array([-7669881092049722624,  7648228133347138688, -6222510175959171008,
        8549737800416241920,  614613213626634..., -9223372036854775808, -3750193145076206464,
       -2160566338078412928, -2660170922871551616,  5272288178396793536])
err_msg = None

    def _raise(left, right, err_msg):
        if err_msg is None:
            if left.shape != right.shape:
                raise_assert_detail(
                    obj, f"{obj} shapes are different", left.shape, right.shape
                )
    
            diff = 0
            for left_arr, right_arr in zip(left, right):
                # count up differences
                if not array_equivalent(left_arr, right_arr, strict_nan=strict_nan):
                    diff += 1
    
            diff = diff * 100.0 / left.size
            msg = f"{obj} values are different ({np.round(diff, 5)} %)"
>           raise_assert_detail(obj, msg, left, right, index_values=index_values)
E           AssertionError: numpy array are different
E           
E           numpy array values are different (33.22222 %)
E           [index]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, ...]
E           [left]:  [-7669881092049722624, 7648224533347138688, -6222510175959171008, 8549737800416241920, 6146128536266345152, 2441036433687896768, 3650772260249380608, 2332906905725587072, -9223372036854775808, 4475216653615896768, 2309676104077448384, 708018238214241920, 7349827464912241920, -4205035984398516160, 7950968341204587072, -824084909390412928, 7804156348090448384, -2994625549828067776, -8469696152774516160, -6730562805393171008, 2639706197684932224, -717671797637551616, 7063788662534345152, 3217072016374896768, 7226203069142448384, 6505784724113828992, 8594927921036138688, 8146191118464793536, 1428982696134241920, 1858171360120345152, -9223372036854775808, 7179979598144793536, 3238360557770380608, -5640549174212206464, -8587295148522171008, -585024738495206464, 6573093981656483840, 5200957053455896768, 4144107275066448384, -6363642277141171008, -3969334531150171008, -702067096111412928, 5235343668810345152, -7516128544576171008, -9223372036854775808, -8904699717227412928, -5818148069828067776, 8357824609078241920, -4425198473557103232, 2138011121379241920, 4562302698413000000, -4383109832547067776, -7353441282320654848, -941244422442551616, 943052456681828992, 8634767695118380608, 3473892101459035456, -1867915186675103232, -6158847852176171008, 7316153599335448384, -4226046974394103232, -570712734417654848, -9223372036854775808, 5765365075195138688, -1250550807378551616, 5614747407174587072, -4524629975923171008, 1809270148747448384, -6384125029734861312, 2356792018799138688, -6075396581701067776, -9223372036854775808, 3229649468364138688, -5642672980323964544, 1914574403785345152, -9223372036854775808, 7639640642052896768, 1458591435396587072, 2885227406013483840, 7296288310445483840, 6429541409356828992, -1809733922439516160, 636366145952483840, -8465719059390067776, 3812965091885448384, 5636317058179587072, -6830572464385619392, -9223372036854775808, 5079295996197483840, -7192331781971722624, 359434315528896768, 7764217513921932224, 
4276428227991448384, -6281166102142067776, 7453217930120345152, -2968945999044067776, 6352231469177828992, 5548356516984828992, -4012309001064619392, -8709567771225654848, ...]
E           [right]: [-7669881092049722624, 7648228133347138688, -6222510175959171008, 8549737800416241920, 6146132136266345152, 2441040033687896768, 3650775860249380608, 2332906905725587072, -9223372036854775808, 4475220253615896768, 2309679704077448384, 708021838214241920, 7349827464912241920, -4205035984398516160, 7950968341204587072, -824081309390412928, 7804159948090448384, -2994625549828067776, -8469696152774516160, -6730562805393171008, 2639709797684932224, -717671797637551616, 7063788662534345152, 3217072016374896768, 7226203069142448384, 6505784724113828992, 8594931521036138688, 8146191118464793536, 1428986296134241920, 1858171360120345152, -9223372036854775808, 7179983198144793536, 3238364157770380608, -5640549174212206464, -8587295148522171008, -585021138495206464, 6573097581656483840, 5200960653455896768, 4144110875066448384, -6363642277141171008, -3969334531150171008, -702067096111412928, 5235343668810345152, -7516128544576171008, -9223372036854775808, -8904699717227412928, -5818148069828067776, 8357824609078241920, -4425198473557103232, 2138014721379241920, 4562306298413000000, -4383109832547067776, -7353441282320654848, -941244422442551616, 943052456681828992, 8634771295118380608, 3473892101459035456, -1867915186675103232, -6158847852176171008, 7316153599335448384, -4226046974394103232, -570712734417654848, -9223372036854775808, 5765368675195138688, -1250547207378551616, 5614747407174587072, -4524629975923171008, 1809273748747448384, -6384125029734861312, 2356795618799138688, -6075396581701067776, -9223372036854775808, 3229653068364138688, -5642672980323964544, 1914578003785345152, -9223372036854775808, 7639640642052896768, 1458595035396587072, 2885231006013483840, 7296291910445483840, 6429545009356828992, -1809733922439516160, 636366145952483840, -8465719059390067776, 3812968691885448384, 5636320658179587072, -6830572464385619392, -9223372036854775808, 5079295996197483840, -7192331781971722624, 359437915528896768, 7764217513921932224, 
4276431827991448384, -6281166102142067776, 7453221530120345152, -2968945999044067776, 6352235069177828992, 5548360116984828992, -4012309001064619392, -8709567771225654848, ...]

../../compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/site-packages/pandas/_testing/asserters.py:735: AssertionError
====================================================================================== warnings summary ======================================================================================
cudf/tests/test_orc.py: 14 warnings
  /home/cph/dev/rapids/compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/site-packages/pyorc/writer.py:100: DeprecationWarning: In future, it will be an error for 'np.bool_' scalars to be interpreted as an index
    self.write(row)

cudf/tests/test_orc.py::test_orc_write_statistics[1]
  /home/cph/dev/rapids/compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/site-packages/pandas/util/__init__.py:15: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
    import pandas.util.testing

cudf/tests/test_orc.py: 17 warnings
  /home/cph/dev/rapids/cudf/python/cudf/cudf/tests/test_orc.py:586: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
    value2 = np.datetime64(value2, "ms")

cudf/tests/test_orc.py::test_orc_writer_lists[data0]
  /home/cph/dev/rapids/compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/site-packages/pandas/core/dtypes/missing.py:499: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
    if np.any(np.asarray(left_value != right_value)):

cudf/tests/test_orc.py::test_orc_writer_lists[data2]
  /home/cph/dev/rapids/compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/site-packages/pandas/core/dtypes/missing.py:499: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
    if np.any(np.asarray(left_value != right_value)):

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================================================================================== short test summary info ===================================================================================
FAILED cudf/cudf/tests/test_orc.py::test_orc_write_statistics[100] - AssertionError: assert False
FAILED cudf/cudf/tests/test_orc.py::test_orc_write_statistics[6000000] - AssertionError: assert False
FAILED cudf/cudf/tests/test_orc.py::test_writer_timestamp_stream_size - AssertionError: numpy array are different
============================================================= 3 failed, 203 passed, 1 skipped, 34 warnings in 263.29s (0:04:23) ==============================================================

galipremsagar · 2022-01-14T16:22:53Z

test_orc_write_statistics

These are unrelated failures: #7314
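As a side note on the log above: where the compared values differ, they appear to differ by exactly one hour, which would be consistent with a timezone-offset problem rather than anything decimal-related. The short Python check below uses a few values copied from the pytest output. It only verifies the arithmetic and does not establish the root cause (see #7314 for that).

```python
from datetime import datetime, timedelta

HOUR_NS = 3600 * 10**9  # one hour in nanoseconds

# Mismatching entries copied from test_writer_timestamp_stream_size (int64 nanoseconds).
left_ns = [7648224533347138688, 6146128536266345152, 2441036433687896768]
right_ns = [7648228133347138688, 6146132136266345152, 2441040033687896768]
diffs = [r - l for l, r in zip(left_ns, right_ns)]

# Wall-clock fields copied from the test_orc_write_statistics[6000000] failure.
actual_max = datetime(2001, 9, 9, 1, 55, 11, 684000)
stats_max = datetime(2001, 9, 9, 0, 55, 11, 684000)
stat_delta = actual_max - stats_max

print(all(d == HOUR_NS for d in diffs), stat_delta == timedelta(hours=1))
```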

hyperbolic2346

Looks good to me. Just some copyrights and a question.

cpp/src/interop/from_arrow.cu

cpp/src/interop/to_arrow.cu

cpp/tests/interop/from_arrow_test.cpp

jrhemstad · 2022-01-17T16:19:06Z

cpp/tests/interop/arrow_utils.hpp

+[[nodiscard]] auto make_decimal128_arrow_array(std::vector<T> const& data,
+                                               std::optional<std::vector<int>> const& validity,


This is mostly a nit, but I'd prefer to see these be iterators, or at least spans.

codereport · 2022-01-17

Because of the `.data()` call, I am not sure I can use iterators here, and `cudf::host_span` doesn't work with `std::vector`.

Resolves: #10031
Depends on #9483, #9986

Note: The CI for this PR is not going to pass until #9986 is admin-merged (an admin merge is needed since #9986 requires this PR's changes too).

- [x] Introduced `Decimal128Dtype` and `Decimal128Column`.
- [x] Enabled Python-side support for both of the above.
- [x] Enabled complete support for `Decimal32Column`, which is currently lacking.
- [x] Enabled the ORC writer to use decimal128.
- [x] Enabled Parquet to read a decimal128 type.
- [x] Enabled Scalar support for `Decimal128Dtype`.
- [x] Covered all decimal types in `string` <-> `decimal` conversions.
- [x] **Made `Decimal128Dtype` the default type when reading in a Decimal Series or Scalar. Users can choose a specific decimal type by passing a `dtype`.** (Breaking)
- [x] **Fixed issues in the binop precision & scale calculation logic to correctly choose a decimal type.** (Breaking)
- [x] Fixed type metadata handling issues seen across APIs while making changes.
- [x] Added parametrizations for all missing `decimal32` tests.
- [x] Added parametrizations for `decimal128` along with existing decimal type-specific tests.

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Robert (Bobby) Evans (https://github.com/revans2)
- Conor Hoekstra (https://github.com/codereport)

Approvers:
- Devavret Makkar (https://github.com/devavret)
- Vyas Ramasubramani (https://github.com/vyasr)

URL: #9533

After #9986, reading Arrow in libcudf returns DECIMAL128 instead of DECIMAL64. This updates the Java tests to expect DECIMAL128 instead of DECIMAL64 by upcasting the decimal columns in the original table being round-tripped through Arrow before comparing the result.

Authors:
- Jason Lowe (https://github.com/jlowe)

Approvers:
- Rong Ou (https://github.com/rongou)
- MithunR (https://github.com/mythrocks)
- Nghia Truong (https://github.com/ttnghia)

URL: #10073
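The upcast-before-compare idea described for the Java tests can be illustrated with a toy Python model. This is a sketch under stated assumptions, not the cuDF Java API: `DecimalColumn`, `upcast_to_decimal128`, and `arrow_round_trip` are hypothetical names, and the round trip is simulated rather than performed through Arrow.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DecimalColumn:
    """Toy stand-in for a fixed-point column (hypothetical, for illustration only)."""
    dtype: str                            # "DECIMAL64" or "DECIMAL128"
    fractional_digits: int                # scale, as a digit count in this toy model
    unscaled: Tuple[Optional[int], ...]   # None marks a null row


def upcast_to_decimal128(col: DecimalColumn) -> DecimalColumn:
    """Widen the type; values and scale are unchanged."""
    return DecimalColumn("DECIMAL128", col.fractional_digits, col.unscaled)


def arrow_round_trip(col: DecimalColumn) -> DecimalColumn:
    """Simulated to_arrow + from_arrow after #9986: the result is always DECIMAL128."""
    return DecimalColumn("DECIMAL128", col.fractional_digits, col.unscaled)


original = DecimalColumn("DECIMAL64", 2, (12345, None, -678))
result = arrow_round_trip(original)

# Comparing against the original fails on type alone; comparing against the
# upcast original checks that values, nulls, and scale survived the round trip.
print(result == original, result == upcast_to_decimal128(original))
```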

Initial test

a44d66e

codereport added the `bug`, `2 - In Progress`, and `non-breaking` labels Jan 6, 2022

codereport self-assigned this Jan 6, 2022

github-actions bot added the `libcudf` label Jan 6, 2022

codereport added 2 commits January 6, 2022 15:53

Turn test green :)

16f6b84

Copyright update

b5a0008

codereport added 4 commits January 8, 2022 00:28

Another test

42d9c27

Delete dead code

0e90f3b

Null tests + small cleanup

8e346e0

from_arrow + Tests

62e1b17

codereport added the `3 - Ready for Review`, `4 - Needs Review`, and `breaking` labels and removed the `2 - In Progress` and `non-breaking` labels Jan 8, 2022

codereport marked this pull request as ready for review January 8, 2022 14:45

codereport requested a review from a team as a code owner January 8, 2022 14:45

codereport requested review from devavret and jrhemstad January 8, 2022 14:45

galipremsagar mentioned this pull request Jan 13, 2022

[REVIEW] Add support for decimal128 in cudf python #9533

Merged

12 tasks

codereport added 4 commits January 12, 2022 23:32

Small refactor

5f36d9b

Use instead of

719539b

to_arrow refactor

46beced

from_arrow refactor

0e8c0d6

hyperbolic2346 requested changes Jan 14, 2022

cpp/src/interop/from_arrow.cu

cpp/src/interop/to_arrow.cu

codereport requested a review from a team as a code owner January 14, 2022 20:40

codereport requested review from brandon-b-miller and marlenezw January 14, 2022 20:40

Update copyright

f74414d

codereport force-pushed the decimal128-to-from-arrow branch from 9000a26 to f74414d Compare January 14, 2022 20:50

codereport removed request for a team, brandon-b-miller and marlenezw January 14, 2022 20:50

jrhemstad reviewed Jan 17, 2022

cpp/tests/interop/from_arrow_test.cpp

jrhemstad reviewed Jan 17, 2022

Addressing PR comments

e52781c

codereport requested review from jrhemstad and hyperbolic2346 January 17, 2022 19:20

hyperbolic2346 approved these changes Jan 17, 2022

codereport added the `5 - Ready to Merge` label and removed the `3 - Ready for Review` and `4 - Needs Review` labels Jan 18, 2022

ajschmidt8 merged commit 45c20d1 into rapidsai:branch-22.02 Jan 18, 2022

jlowe mentioned this pull request Jan 18, 2022

ORC writer API changes for granular statistics #10058

Merged

jlowe mentioned this pull request Jan 18, 2022

Update Java tests to expect DECIMAL128 from Arrow #10073

Merged

codereport mentioned this pull request Feb 6, 2022

[BUG] {to/from}_arrow in libcudf doesn't seem to handle decimal128 types #9980

Closed

codereport mentioned this pull request Aug 21, 2023

[BUG] Decimal128 uses a fixed precision of 18 when converting to an arrow array #13749

Closed
