Add Struct json_decode/3 for decoding json from string #841
lkarthee
commented
Jan 29, 2024
What happens if the string is not valid JSON?
Currently I get error unless I pass Please advise on how to handle this:
** (RuntimeError) DataFrame mismatch.
expected:
names: ["a", "aj"]
dtypes: %{"a" => :string, "aj" => nil}
got:
names: ["a", "aj"]
dtypes: %{"a" => :string, "aj" => {:struct, %{"n" => {:s, 64}}}}
(explorer 0.9.0-dev) lib/explorer/polars_backend/shared.ex:62: Explorer.PolarsBackend.Shared.apply_dataframe/4
iex:3: (file)
explorer/lib/explorer/polars_backend/shared.ex Lines 42 to 79 in 8b03726
Raises a polars error.
iex(6)> d = DF.new([%{a: "{\"n\": 1"}])
iex(7)> DF.mutate(d, aj: json_decode(a, dtype: {:struct, %{"n" => {:s, 64}}}))
** (RuntimeError) Polars Error: error deserializing JSON: json parsing error: 'ExpectedObjectContent at character 8 (']')'
(explorer 0.9.0-dev) lib/explorer/polars_backend/shared.ex:81: Explorer.PolarsBackend.Shared.apply_dataframe/4
iex:7: (file)
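Since invalid JSON surfaces as a RuntimeError in the session above, a caller that prefers a tagged tuple over an exception could wrap the call. This is a minimal sketch, assuming the failure is always raised as a RuntimeError; `SafeDecodeSketch` and `safe/1` are hypothetical names that are not part of Explorer, and the anonymous functions stand in for a call such as the DF.mutate shown above.

```elixir
defmodule SafeDecodeSketch do
  # Hypothetical helper, not part of Explorer.
  # `fun` is a zero-arity function standing in for a call that may raise,
  # such as the DF.mutate/json_decode call shown above.
  def safe(fun) when is_function(fun, 0) do
    {:ok, fun.()}
  rescue
    error in RuntimeError -> {:error, Exception.message(error)}
  end
end

# Usage with stand-in functions, so no Explorer dependency is needed here.
{:ok, 42} = SafeDecodeSketch.safe(fn -> 42 end)
{:error, "Polars Error: bad JSON"} =
  SafeDecodeSketch.safe(fn -> raise "Polars Error: bad JSON" end)
```

Exceptions other than RuntimeError are deliberately not rescued, so unrelated bugs still surface.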
I would start with the simplest API possible:
3a83583 to 846eff0
Update expressions.rs add tests Update series.ex
846eff0 to f48624d
@@ -36,6 +36,9 @@ object_store = { version = "0.8", default-features = false, optional = true }
[target.'cfg(not(any(all(windows, target_env = "gnu"), all(target_os = "linux", target_env = "musl"))))'.dependencies]
mimalloc = { version = "*", default-features = false }

[patch.crates-io]
context here - pola-rs/polars#14008
3482d27 to 24c4f82
@@ -172,6 +173,11 @@ defmodule Explorer.PolarsBackend.Expression do
raise ArgumentError, "missing #{inspect(__MODULE__)} nodes: #{inspect(missing)}"
end

def to_expr(%LazySeries{op: :from_json, args: _}) do
def to_expr(%LazySeries{op: :from_json, args: _}) do
def to_expr(%LazySeries{op: :from_json}) do
@josevalim If I drop args: _, I get a lot of warnings even though all tests pass. Why does this happen?
Log containing warnings
09:59 $ mix ci
Finished dev [unoptimized + debuginfo] target(s) in 0.42s
Compiling 3 files (.ex)
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
warning: this clause cannot match because a previous clause at line 265 always matches
│
314 │ def to_expr(%LazySeries{op: unquote(op), args: unquote(args)}) do
│ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│
└─ lib/explorer/polars_backend/expression.ex:314
Compiling crate explorer in debug mode (native/explorer)
Compiling explorer v0.1.0 (explorer/native/explorer)
Finished dev [unoptimized + debuginfo] target(s) in 5.26s
Excluding tags: [:cloud_integration]
.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................*................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Finished in 2.8 seconds (2.8s async, 0.00s sync)
582 doctests, 1390 tests, 0 failures, 18 excluded, 1 skipped
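The thread does not record an answer to the question above. As general background only: Elixir tries function clauses from top to bottom, and the compiler emits a "this clause cannot match" warning when it can prove that an earlier clause already covers a later one. The sketch below shows the usual ordering, with the specific clause ahead of the general one. `ClauseOrderSketch`, `Lazy`, `new/2`, and `to_tag/1` are hypothetical names, and this is not a reproduction of expression.ex.

```elixir
defmodule ClauseOrderSketch do
  defmodule Lazy do
    # Hypothetical stand-in for a struct with :op and :args fields.
    defstruct op: nil, args: []
  end

  # Builds a node; kept inside the module so the struct is expanded here.
  def new(op, args), do: %Lazy{op: op, args: args}

  # Specific clause first: only the :from_json op matches here.
  # A struct pattern that omits :args matches any value of :args.
  def to_tag(%Lazy{op: :from_json}), do: :json

  # General clause last: every other op falls through to this one.
  def to_tag(%Lazy{op: op, args: args}) when is_list(args), do: {op, length(args)}
end

:json = ClauseOrderSketch.to_tag(ClauseOrderSketch.new(:from_json, ["a"]))
{:add, 2} = ClauseOrderSketch.to_tag(ClauseOrderSketch.new(:add, [1, 2]))
```

Swapping the two `to_tag/1` clauses would leave the `:from_json` clause unreachable at runtime, because the general clause would match first.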
Unfortunately I think this is inconsistent. All of our Therefore, if we introduce So, after sleeping on it, I think we should first add read_json/write_json, and then from_json and to_json. If it is too much work, let us know, and @philss can pick it up soon (he pretty much wrote all file integration stuff :D).
@josevalim I can contribute an hour a day for a few more months. Now that I am familiar with explorer/polars, it does not take much time to implement things from polars. Maintainers can feel free to tag me or assign PRs/bugs to me; I will try my best to resolve them. I am confused about your comments and from_json, so feel free to correct me.
Json Decode
Description about data:
Possible Operations:
Problems with having
Solution:
Thank you for the context. I see what you mean and I don't have a good answer for it. The best I can think of is to call it "json_decode" but always expect the "schema"/"dtype" to be given, so we don't have to guess.
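One way to read "always expect the dtype to be given" is that the decoding call rejects a missing dtype instead of inferring a schema. The following is a minimal sketch of that contract in plain Elixir; `RequiredDtypeSketch` and `fetch_dtype!/1` are hypothetical names, and this is not Explorer's actual implementation.

```elixir
defmodule RequiredDtypeSketch do
  # Hypothetical helper, not Explorer's implementation.
  # Returns the :dtype option or raises, so the schema is never guessed.
  def fetch_dtype!(opts) when is_list(opts) do
    case Keyword.fetch(opts, :dtype) do
      {:ok, dtype} when not is_nil(dtype) ->
        dtype

      _missing ->
        raise ArgumentError,
              "json_decode expects an explicit :dtype, got: #{inspect(opts)}"
    end
  end
end

# The dtype shape is copied from the iex session earlier in the thread.
dtype = {:struct, %{"n" => {:s, 64}}}
^dtype = RequiredDtypeSketch.fetch_dtype!(dtype: dtype)
```

Failing early with an ArgumentError keeps the error on the Elixir side, before any backend call is made.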
We could but I would like to avoid "branching" for now, if that makes any sense.
Jose's reply is to my question: could we have a Series-only op, so that we could avoid passing dtype? I felt my question was partly answered in his previous reply and deleted it (without seeing him answer it). I am posting it again after seeing Jose's reply.
24c4f82 to 33a31a7
I removed
Thank you! It was a bit bumpy but I am glad with where we arrived!
💚 💙 💜 💛 ❤️
Thank you Jose, I don't mind bumpy rides.