pl.read_parquet() and pl.write_parquet() for pl.Decimal #8191

Closed
sslivkoff opened this issue Apr 12, 2023 · 2 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@sslivkoff

Problem description

Following the progress on pl.Decimal has been very exciting:

  1. oldest thread on Decimal
  2. PR for physical i128 type
  3. Decimal design discussion
  4. PR for Decimal series

All of these issues have been merged/closed and it is now possible to create Decimal series.

I don't know whether parquet IO functions are the next logical step, but just wanted to create this issue to specifically flag the use of Decimal in parquet.

Minimal example of writing and reading parquet with Decimal:

import decimal
import polars as pl

# create dataframe
data = {
    'hi': [True, False, True, False],
    'bye': [1, 2, 3, decimal.Decimal(47283957238957239875)]
}
df = pl.DataFrame(data)
assert df['bye'].dtype == pl.Decimal

# write file
df.write_parquet('decimal_test.parquet')

# read file
df2 = pl.read_parquet('decimal_test.parquet')

# test that pl.Decimal is dtype (this fails, column has dtype pl.Float64)
assert df2['bye'].dtype == pl.Decimal

# check that DataFrames are equal (this fails, equality comparison not implemented)
assert df.frame_equal(df2)

Running pqrs schema decimal_test.parquet shows that the written parquet file uses column type DOUBLE.
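To illustrate why a DOUBLE column matters for the value in the example above, here is a minimal sketch using only the standard library. It does not touch polars or parquet; it only shows that this particular integer is too large for a 64-bit integer and cannot be represented exactly as a 64-bit float. The variable names are just for illustration.

```python
import decimal

# The large value from the example above.
big = 47283957238957239875

# It does not fit in a 64-bit integer, signed or unsigned.
assert big > 2**64 - 1

# Converting to a 64-bit float (what a DOUBLE column stores) rounds the value.
as_double = float(big)

exact = decimal.Decimal(big)        # exact decimal representation
lossy = decimal.Decimal(as_double)  # exact expansion of the rounded float

# The difference is the precision lost by going through DOUBLE.
error = lossy - exact
print("exact:", exact)
print("via double:", lossy)
print("error:", error)
```

So even if the file round-trips without an exception, a DOUBLE column would not preserve this value.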

@sslivkoff added the enhancement label on Apr 12, 2023
@ritchie46
Member

Did you run pl.Config.activate_decimals()?

@sslivkoff
Author

activating decimals makes write_parquet work. thanks
