[BUG] write_json fails for struct with zero columns #17413

Closed
karthikeyann opened this issue Nov 22, 2024 · 0 comments · Fixed by #17414
Assignees: karthikeyann
Labels: bug (Something isn't working), cuIO (cuIO issue)

Comments

@karthikeyann
Contributor

Describe the bug
When a table contains a struct column with zero child columns, cudf::write_json fails to write it.

Steps/Code to reproduce bug
case 1:

# Write JSON Lines where the nested struct "B" has zero child columns and is sometimes null
json_str = """
{"A": {"B": null, "C": 1}}
{"A": {"B": null}}
{"A": {"B": {}}}
{"A": {"B": {}}}
{"A": {}}
{"A": {}}
"""
import cudf
from io import StringIO
df1 = cudf.read_json(StringIO(json_str), lines=True)
df1[["A"]].to_json('tmp', engine='cudf')

A second repro, which triggers a different issue:
case 2:

# Write JSON Lines where the top-level struct "A" has zero child columns and is sometimes null or missing
json_str = """
{"A": null}
{"A": null}
{"A": {}}
{"A": {}}
{}
{}
"""
import cudf
from io import StringIO
df1 = cudf.read_json(StringIO(json_str), lines=True)
df1[["A"]].to_json('tmp', engine='cudf')

Expected behavior
With the pandas engine, df1[["A"]].to_json() produces:

case 1: '{"A":{"0":{"B":null,"C":1},"1":{"B":null,"C":null},"2":{"B":{},"C":null},"3":{"B":{},"C":null},"4":{"B":null,"C":null},"5":{"B":null,"C":null}}}'
case 2: '{"A":{"0":null,"1":null,"2":{},"3":{},"4":null,"5":null}}'
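
For regression checks it can help to build these expected strings without cudf or pandas. The sketch below is only an illustration using the standard library: `columns_oriented_json` is a hypothetical helper, not part of cudf, and it assumes the column-oriented layout shown above (row index as a string key, no whitespace between tokens).

```python
import json


def columns_oriented_json(name, values):
    """Hypothetical helper: mimic the {"col": {"row_index": value}} layout shown above."""
    # Row indices become string keys, in row order.
    indexed = {str(i): v for i, v in enumerate(values)}
    # Compact separators match the whitespace-free strings quoted in this issue.
    return json.dumps({name: indexed}, separators=(",", ":"))


# Case 2: null struct rows and struct rows with zero fields.
case2 = columns_oriented_json("A", [None, None, {}, {}, None, None])

# Case 1: the nested struct "B" has zero fields and is sometimes null.
case1 = columns_oriented_json(
    "A",
    [
        {"B": None, "C": 1},
        {"B": None, "C": None},
        {"B": {}, "C": None},
        {"B": {}, "C": None},
        {"B": None, "C": None},
        {"B": None, "C": None},
    ],
)
print(case1)
print(case2)
```

Comparing the parsed result (`json.loads`) rather than the raw string would make such a check less sensitive to key order or spacing differences between engines.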

Additional context
@GregoryKimball found the issue while working with synthesized JSON files.

@karthikeyann added the bug and cuIO labels Nov 22, 2024
@karthikeyann self-assigned this Nov 22, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 27, 2024
Closes #17413

num_rows is passed to ensure that an empty `{}` is created for structs with zero columns.

Authors:
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #17414
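
The reason a row count matters can be illustrated with a small pure-Python sketch. This is not the libcudf implementation: `struct_rows_to_json` and its parameters are hypothetical names used only to show that, when a struct has zero child columns, the number of rows cannot be inferred from the children and has to be supplied explicitly.

```python
def struct_rows_to_json(children, validity, num_rows):
    """Hypothetical helper: serialize a struct column one row at a time.

    children: dict mapping field name -> list of already-serialized JSON values
    validity: list of bools; False marks a null struct row
    num_rows: explicit row count, required when `children` is empty
    """
    rows = []
    for i in range(num_rows):
        if not validity[i]:
            rows.append("null")
            continue
        # With zero children this join yields "", so the row becomes "{}".
        fields = ",".join(
            f'"{name}":{values[i]}' for name, values in children.items()
        )
        rows.append("{" + fields + "}")
    return rows


# Zero child columns: without num_rows there is nothing to iterate over.
empty_struct_rows = struct_rows_to_json(
    {}, [False, False, True, True, False, False], 6
)
print(empty_struct_rows)
```

Deriving the row count from the first child instead would leave no rows to emit for a zero-column struct, which is the kind of gap an explicit `num_rows` closes in this sketch.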