[BUG] Can't write Parquet file containing Decimal/Fixed Point column(s) #7669

Closed
randerzander opened this issue Mar 22, 2021 · 0 comments · Fixed by #7673
Labels: cuIO, feature request, Python

Comments

@randerzander
Contributor

Using the latest rapidsai-nightly conda package ('0.19.0a+248.g5d7767ec2f'):

import cudf
from cudf.core.dtypes import Decimal64Dtype

df = cudf.DataFrame()
df['id'] = [0, 1, 2]
df['val'] = [0.00, 0.01, 0.02]

df['dec_val'] = df['val'].astype(Decimal64Dtype(7,2))
df.to_parquet('test.parquet')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-904034a55c73> in <module>
----> 1 df.to_parquet('test.parquet')

/conda/lib/python3.8/site-packages/cudf/core/dataframe.py in to_parquet(self, path, *args, **kwargs)
   7319         from cudf.io import parquet as pq
   7320 
-> 7321         return pq.to_parquet(self, path, *args, **kwargs)
   7322 
   7323     @ioutils.doc_to_feather()

/conda/lib/python3.8/site-packages/cudf/io/parquet.py in to_parquet(df, path, engine, compression, index, partition_cols, partition_file_name, statistics, metadata_file_path, int96_timestamps, *args, **kwargs)
    320                 )
    321         else:
--> 322             write_parquet_res = libparquet.write_parquet(
    323                 df,
    324                 path=path_or_buf,

cudf/_lib/parquet.pyx in cudf._lib.parquet.write_parquet()

cudf/_lib/parquet.pyx in cudf._lib.parquet.write_parquet()

cudf/_lib/utils.pyx in cudf._lib.utils.generate_pandas_metadata()

cudf/_lib/utils.pyx in cudf._lib.utils.generate_pandas_metadata()

/conda/lib/python3.8/site-packages/cudf/utils/dtypes.py in np_to_pa_dtype(dtype)
    123         # default fallback unit is ns
    124         return pa.duration("ns")
--> 125     return _np_pa_dtypes[np.dtype(dtype).type]
    126 
    127 

TypeError: Cannot interpret 'Decimal64Dtype(precision=7, scale=2)' as a data type
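
The last frame of the traceback passes the dtype object straight to np.dtype(), which only understands NumPy-compatible type descriptions. A minimal sketch of that failure mode follows, using NumPy alone. FakeDecimalDtype and describe_dtype are hypothetical names invented for illustration; they are not cuDF classes or functions, and this is not necessarily how #7673 resolved the issue. It only shows why checking for a custom dtype before calling np.dtype() avoids the error.

```python
import numpy as np


class FakeDecimalDtype:
    """Hypothetical stand-in for a non-NumPy dtype object (not cuDF's class)."""

    def __init__(self, precision, scale):
        self.precision = precision
        self.scale = scale

    def __repr__(self):
        return f"FakeDecimalDtype(precision={self.precision}, scale={self.scale})"


def describe_dtype(dtype):
    """Return a label for the dtype, special-casing the stand-in decimal type."""
    # Handle the custom type first, before anything reaches np.dtype().
    if isinstance(dtype, FakeDecimalDtype):
        return f"decimal({dtype.precision}, {dtype.scale})"
    return np.dtype(dtype).name


# np.dtype() cannot interpret an arbitrary Python object, so it raises TypeError.
try:
    np.dtype(FakeDecimalDtype(7, 2))
    raised = False
except TypeError:
    raised = True

# With the explicit check in place, the custom dtype never reaches np.dtype().
label = describe_dtype(FakeDecimalDtype(7, 2))
```

Under these assumptions, any lookup that funnels every dtype through np.dtype() would need a similar branch for decimal columns before the NumPy conversion.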
randerzander added the bug, Python, and cuIO labels on Mar 22, 2021
devavret self-assigned this on Mar 22, 2021
devavret added the feature request label and removed the bug label on Mar 22, 2021
rapids-bot pushed a commit that referenced this issue on Mar 23, 2021
Resolves #7669

Authors:
  - Devavret Makkar (@devavret)

Approvers:
  - Keith Kraus (@kkraus14)

URL: #7673