Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-17641: [python] Fix ParseOptions deserialization of invalid_row_handler #14061

Merged
merged 2 commits into from
Sep 8, 2022

Conversation

krfricke
Copy link
Contributor

@krfricke krfricke commented Sep 7, 2022

No description provided.

@github-actions
Copy link

github-actions bot commented Sep 7, 2022

@github-actions
Copy link

github-actions bot commented Sep 7, 2022

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch!

Comment on lines 658 to 662
parse_opts.invalid_row_handler = InvalidRowHandler('skip')
state = parse_opts.__getstate__()

parse_opts = ParseOptions()
parse_opts.__setstate__(state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calling getstate/setstate explicitly, can you pickle/unpickle instead? (like new_parse_opts = pickle.loads(pickle.dumps(parse_opts)))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated!

I've also updated __getstate__ to return self.invalid_row_handler instead of self._invalid_row_handler, let me know if this looks good to you.

@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 7, 2022

Thanks for fixing this, this is great! @krfricke has a little more context on the bug here: ray-project/ray#28326 (comment)

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jorisvandenbossche jorisvandenbossche merged commit 6ff5224 into apache:master Sep 8, 2022
@krfricke krfricke deleted the csv-parseoptions-deser branch September 8, 2022 08:22
@ursabot
Copy link

ursabot commented Sep 8, 2022

Benchmark runs are scheduled for baseline = 47314c3 and contender = 6ff5224. 6ff5224 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.17% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.55% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.11% ⬆️0.04%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 6ff52243 ec2-t3-xlarge-us-east-2
[Failed] 6ff52243 test-mac-arm
[Failed] 6ff52243 ursa-i9-9960x
[Finished] 6ff52243 ursa-thinkcentre-m75q
[Finished] 47314c39 ec2-t3-xlarge-us-east-2
[Finished] 47314c39 test-mac-arm
[Failed] 47314c39 ursa-i9-9960x
[Finished] 47314c39 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

zagto pushed a commit to zagto/arrow that referenced this pull request Oct 7, 2022
…_handler (apache#14061)

Authored-by: Kai Fricke <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
fatemehp pushed a commit to fatemehp/arrow that referenced this pull request Oct 17, 2022
…_handler (apache#14061)

Authored-by: Kai Fricke <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants