-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change inner dtypes of structs to tuple lists #851
Conversation
This change is necessary to keep the position of fields predictable.
@philss I expected the order of the I crafted the following tests to test my hypothesis: Mix.install([
{:explorer,
github: "elixir-explorer/explorer", branch: "ps-change-struct-inner-dtypes-to-keylists"}
])
alias Explorer.DataFrame
alias Explorer.Series
ExUnit.configure(exclude: [:skip])
ExUnit.start(autorun: false)
defmodule DTypeTest do
use ExUnit.Case
test "preserves manually provided dtype order" do
assert {:struct, [{"b", :string}, {"a", :string}]} =
[%{"a" => "a", "b" => "b"}, %{"b" => "b", "a" => "a"}]
|> Series.from_list(dtype: {:struct, [{"b", :string}, {"a", :string}]})
|> then(& &1.dtype)
assert {:struct, [{"a", :string}, {"b", :string}]} =
[%{"a" => "a", "b" => "b"}, %{"b" => "b", "a" => "a"}]
|> Series.from_list(dtype: {:struct, [{"a", :string}, {"b", :string}]})
|> then(& &1.dtype)
end
test "can cast dtype order" do
assert {:struct, [{"b", :string}, {"a", :string}]} =
[%{"a" => "a", "b" => "b"}, %{"b" => "b", "a" => "a"}]
|> Series.from_list(dtype: {:struct, [{"a", :string}, {"b", :string}]})
|> Series.cast({:struct, [{"b", :string}, {"a", :string}]})
|> then(& &1.dtype)
end
test "infers correctly ordered dtype from ordered source" do
assert {:struct, [{"b", :string}, {"a", :string}]} =
"""
{"col": {"b": "b", "a": "a"}}
"""
|> DataFrame.load_ndjson!()
|> then(& &1["col"].dtype)
assert {:struct, [{"a", :string}, {"b", :string}]} =
"""
{"col": {"a": "a", "b": "b"}}
"""
|> DataFrame.load_ndjson!()
|> then(& &1["col"].dtype)
end
end
ExUnit.run() Output:
|
@maennchen yeah, I think some sorting I'm doing is not necessary. Let me try to fix it. |
This prevents us to change the order that the user prefers in the case of a series created using "from_list/2" with a given dtype, or a series casted with `cast/2`.
@maennchen please try again. I think it is correct now :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philss Looking good :)
Co-authored-by: José Valim <[email protected]>
This change is necessary to keep the position of fields predictable.
This is related to #847