BUG: keep value types when calling .items() on Series #50147

Closed
wants to merge 2 commits into from

ghost commented Dec 9, 2022

I have found that the .items() method of Series changes the type of the values before returning them.
One of the major consequences (for me) is that the .to_dict() method unexpectedly returns values of a different type.
This can be problematic with float32 or float16 types: .items() converts them to Python's float (a 64-bit double), so the output shows extra decimal digits that were not visible in the original values.

.items() changes the type because it calls the .__iter__() method of the parent class base.IndexOpsMixin, which enforces the return of Python types.

From my point of view, .items() and .to_dict() should not change the type of the values. This can easily be done by referring to Series._values, as proposed in this PR.
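
For illustration, here is a minimal sketch of the behaviour described above, assuming a recent pandas and NumPy. The Series contents and the key "x" are made up for the example, and the public to_numpy() stands in for the private Series._values that the PR refers to.

```python
import numpy as np
import pandas as pd

# Hypothetical one-element Series stored as float32.
s = pd.Series([123.457], index=["x"], dtype="float32")

# Iterating through .items() yields built-in Python floats, not np.float32.
key, value = next(iter(s.items()))
items_type = type(value)

# .to_dict() is built on the same iteration, so it also holds Python floats.
as_dict = s.to_dict()

# Reading from the underlying array keeps the NumPy scalar type.
raw_type = type(s.to_numpy()[0])

print(items_type, raw_type, as_dict)
```

The value itself is the same in both cases; only the container type, and therefore the printed representation, differs.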

ghost (Author) commented Dec 9, 2022

Perhaps I jumped the gun on this one, as returning Python types instead of the original types may be intentional: see #20791 (comment).

I obviously don't have a holistic view on this one, but my user experience was not optimal when I got to_dict() results that differed from what I saw in the original DataFrame.

phofl (Member) commented Dec 9, 2022

We want to return Python types in I/O.

ghost (Author) commented Dec 9, 2022

> We want to return Python types in I/O.

I see. Is there anything we can do to improve the conversion to float?
My initial issue came from using .to_dict() after .round(), which under the hood does something like:

import numpy as np

np.float32(123.456789).round(3).item()
# 123.45700073242188 when expecting 123.457

Hence I see rounded values throughout my work with pandas DataFrames, but when I finally output a dict to feed an API, it has unexpected decimals.
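
To make the source of those decimals more concrete, here is a small sketch assuming only NumPy. As far as I can tell, the rounded float32 is simply the nearest float32 to 123.457, and .item() widens it to a Python float without changing its value; the variable names are just for this example.

```python
import numpy as np

x32 = np.float32(123.456789).round(3)  # nearest float32 to 123.457
as_python = x32.item()                 # widened to a 64-bit Python float

# The widening is lossless: both conversions give the same number.
same_value = float(x32) == as_python

# But that number is not the 64-bit float closest to 123.457,
# which is why printing it shows more digits.
differs_from_literal = as_python != 123.457

# Narrowing the literal 123.457 to float32 lands on the same float32.
round_trip = bool(np.float32(123.457) == x32)

print(same_value, differs_from_literal, round_trip)
```

So the extra digits are already present in the float32 value; they only become visible once the number is printed as a 64-bit float.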

phofl (Member) commented Dec 9, 2022

You'll have to check where things go wrong to see if we can improve. I'm not familiar with the implementation myself.

ghost (Author) commented Dec 12, 2022

OK, it seems like this is going to be unavoidable.
When converting from np.float32 to Python's float, I don't see a way to do it more cleanly than it is done now.
So be it: if to_dict() must return Python types, then we have to live with this kind of trouble.
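
For anyone hitting the same thing, one possible caller-side workaround (not a change to pandas, and only a sketch with made-up sample values) is to convert to Python floats first and round afterwards, so the rounding happens in double precision:

```python
import numpy as np

# Hypothetical float32 data, as it might come out of a DataFrame column.
values = np.array([123.456789, 0.1], dtype=np.float32)

# Widen each value to a Python float, then round in double precision.
cleaned = [round(float(v), 3) for v in values]

print(cleaned)
```

The same idea should apply to a Series by casting with astype("float64") before calling round() and to_dict(), at the cost of the extra memory for the wider dtype.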

@ghost ghost closed this Dec 12, 2022
This pull request was closed.