[FEA] Add write_json method to cudf/io/json.hpp #11165
Comments
This issue has been labeled |
Thank you @dagardner-nv for sharing this request. Currently our main JSON focus is on reading arbitrarily-nested data types (Nested JSON reader). Writing JSON is complex because there are several common ways to represent tabular data in JSON. Would you please review the EDIT: I see from your link that you are writing JSON Lines as records. Thank you! |
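To make "several common ways to represent tabular data in JSON" concrete, here is a minimal sketch using only the Python standard library. The table contents and the three layout names are illustrative assumptions, not cuDF API; the point is only that the same two-column table serializes to visibly different JSON depending on the layout chosen.

```python
import json

# A small table held column-wise, as a dataframe would store it (made-up data).
table = {"a": [1, 2], "b": ["x", "y"]}
names = list(table)
n_rows = len(table[names[0]])

# Layout 1: array of row objects (the "records" style).
as_records = [{name: table[name][i] for name in names} for i in range(n_rows)]

# Layout 2: one object mapping each column name to an array of values.
as_column_arrays = table

# Layout 3: array of row arrays, with the column names carried separately.
as_row_arrays = {
    "columns": names,
    "data": [[table[name][i] for name in names] for i in range(n_rows)],
}

for layout in (as_records, as_column_arrays, as_row_arrays):
    print(json.dumps(layout))
```

A writer has to commit to one of these layouts (or expose an option for it), which is why the request needed narrowing down before work could start.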
@GregoryKimball thanks, and yes: for the purposes of Morpheus, by the time we get around to writing the data it would usually be flat. I wonder if it would be possible to do something like:
|
Hello @chinmaychandak, would this |
Thanks for pointing me to this @GregoryKimball! Yes, @jsmaupin - anything to add? |
Nothing to add from me. I agree with @chinmaychandak's comments. |
Adds a JSON writer with nested support. It supports numeric, datetime, duration, and string types, as well as nested types such as struct and list. Only `orient='records'` is supported for now, with `lines=True/False`.
Usage: `df.to_json(engine='cudf')`
closes #11165
Authors:
- Karthikeyan (https://github.com/karthikeyann)
Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
- GALI PREM SAGAR (https://github.com/galipremsagar)
- David Wendt (https://github.com/davidwendt)
- Michael Wang (https://github.com/isVoid)
- Robert Maynard (https://github.com/robertmaynard)
URL: #12474
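For readers unfamiliar with the `lines` option, the sketch below shows what the two record layouts look like. It uses only the standard library; `to_json_records` is a hypothetical stand-in written for this illustration and is not the cuDF writer, so details such as whitespace or a trailing newline may differ from what `df.to_json(engine='cudf')` actually emits.

```python
import json

def to_json_records(rows, lines=False):
    """Hypothetical helper: serialize row dicts in the 'records' orientation."""
    if lines:
        # JSON Lines: one JSON object per line, with no enclosing array.
        # (This sketch omits a trailing newline; real writers may add one.)
        return "\n".join(json.dumps(row) for row in rows)
    # Plain records: a single JSON array containing every row object.
    return json.dumps(rows)

rows = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]  # made-up data

print(to_json_records(rows, lines=True))
print(to_json_records(rows, lines=False))
```

The line-delimited form can be appended to and consumed one record at a time, which is the shape the original request (JSON Lines as records) was after.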
Is your feature request related to a problem? Please describe.
Currently the only method for serializing a cuDF DataFrame to JSON requires acquiring the Python GIL.
`cudf/io/json.hpp` contains a `read_json` method, and the corresponding `csv.hpp` header contains both `read_csv` and `write_csv` methods.
Describe the solution you'd like
Add a `write_json` method which is callable from C++ and does not require the Python GIL.
Describe alternatives you've considered
Currently we have to use the Python API from C++:
https://github.com/nv-morpheus/Morpheus/blob/branch-22.08/morpheus/_lib/src/io/serializers.cpp#L109
Additional context
Not requiring the GIL in our serialize stage would be a big performance win for us.