BUG: read_json(lines=True) returns Index instead of RangeIndex in 2.2 #57429

Closed
mroeschke opened this issue Feb 15, 2024 · 1 comment · Fixed by #57439
Labels
IO JSON read_json, to_json, json_normalize Regression Functionality that used to work in a prior pandas version

Comments

@mroeschke
Member

In [2]: import pandas as pd; from io import StringIO

In [3]: data = """\
   ...: {"a": 1, "b": 2}
   ...: {"a": 3, "b": 4}"""

In [4]: pd.read_json(StringIO(data), lines=True).index
Out[4]: Index([0, 1], dtype='int64')  # main
Out[3]: RangeIndex(start=0, stop=2, step=1)  # 2.1
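
On an affected version, a caller that depends on a RangeIndex could normalize the result itself. This is only a minimal sketch of a possible workaround, assuming a default 0..n-1 index is what the caller wants; the helper name `ensure_range_index` is hypothetical and not part of pandas.

```python
from io import StringIO

import pandas as pd


def ensure_range_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return df unchanged if it already has a RangeIndex, else reset it to 0..n-1.

    Hypothetical helper for illustration; not a pandas API.
    """
    if isinstance(df.index, pd.RangeIndex):
        return df
    # reset_index(drop=True) discards the current index and installs a RangeIndex
    return df.reset_index(drop=True)


data = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}'
df = ensure_range_index(pd.read_json(StringIO(data), lines=True))
print(type(df.index).__name__)
```

Because the helper checks the index type first, it should be a no-op on versions where `read_json` already returns a RangeIndex.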
@mroeschke mroeschke added Regression Functionality that used to work in a prior pandas version IO JSON read_json, to_json, json_normalize labels Feb 15, 2024
@VISWESWARAN1998
Contributor

VISWESWARAN1998 commented Feb 15, 2024

On line 1163, an Index object is created explicitly. I commented that out and used the existing index of the newly generated Series instead.


After changes:

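from io import StringIO

import pandas as pd

# Same two-record JSON Lines input as in the report, with blank lines around it
data = """
{"a": 1, "b": 2}
{"a": 3, "b": 4}
"""

print(pd.read_json(StringIO(data), lines=True).index)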

gives me expected results:

[1/1] Generating write_version_file with a custom command
IntervalIndex([ (1.0, 2.0),  (2.0, 3.0),  (3.0, 4.0),  (4.0, 5.0),  (5.0, 6.0),
                (6.0, 7.0),  (7.0, 8.0),  (8.0, 9.0), (9.0, 10.0)],
              dtype='interval[float32, neither]')
RangeIndex(start=0, stop=2, step=1)
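
The difference between the two index types matters mostly to strict comparisons in tests. The sketch below is not taken from the pandas test suite; it only illustrates, under that assumption, how `pandas.testing.assert_index_equal` treats an int64 Index and a RangeIndex with the same values. The variable names are illustrative.

```python
import pandas as pd
import pandas.testing as tm

plain = pd.Index([0, 1], dtype="int64")
ranged = pd.RangeIndex(start=0, stop=2, step=1)

# Default exact="equiv": a RangeIndex may stand in for an int64 Index
tm.assert_index_equal(plain, ranged)

# exact=True also compares the index class, so the mismatch is reported
try:
    tm.assert_index_equal(plain, ranged, exact=True)
    strict_check_passed = True
except AssertionError:
    strict_check_passed = False

print(strict_check_passed)
```

A test that should catch this kind of regression would therefore need the strict form, or an explicit `isinstance(..., pd.RangeIndex)` check.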

rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Feb 21, 2024
`test_order_nested_json_reader` was refactored to use `assert_eq` instead of comparing via pyarrow. This was failing in pandas 2.2 due to pandas-dev/pandas#57429

`test_orc_reader_trailing_nulls` was failing, I believe, due to a change in how integers are compared with `assert_series_equal`: pandas-dev/pandas#55882. The "casting workaround" doesn't seem necessary in pandas 2.2, so this change avoids it altogether.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #15062