Java cleaner synchronization [skip ci] #7474

Merged

abellina (Contributor) commented Mar 1, 2021

Add synchronization in cleanImpl and close in the various places where race conditions could exist, and also within the MemoryCleaner, to address concurrent modification issues we have seen in tests while shutting down (i.e. while invoking the cleaner). See NVIDIA/spark-rapids#1797.
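For readers unfamiliar with the failure mode, here is a minimal sketch of the kind of race described above. It is not the cuDF code: SketchCleaner, SketchRegistry, and every method name below are hypothetical stand-ins, and the snapshot-then-clean approach is just one reasonable way to avoid modifying a collection while it is being iterated.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a native-resource cleaner; not the cuDF class.
abstract class SketchCleaner {
    private boolean cleaned = false;

    // Subclasses release the underlying resource here.
    protected abstract void release();

    // synchronized: an explicit close and a shutdown sweep may both try to
    // clean the same object, and release() must run at most once.
    public synchronized boolean clean() {
        if (cleaned) {
            return false; // already released, nothing to do
        }
        release();
        cleaned = true;
        return true;
    }
}

// Hypothetical registry of live cleaners that is swept at shutdown.
final class SketchRegistry {
    private final Set<SketchCleaner> live = new HashSet<>();

    public void register(SketchCleaner c) {
        synchronized (live) {
            live.add(c);
        }
    }

    public void unregister(SketchCleaner c) {
        synchronized (live) {
            live.remove(c);
        }
    }

    // Take a snapshot under the lock, then clean outside it, so that no
    // thread iterates the set while another thread adds or removes entries.
    public int cleanAll() {
        List<SketchCleaner> snapshot;
        synchronized (live) {
            snapshot = new ArrayList<>(live);
            live.clear();
        }
        int cleanedNow = 0;
        for (SketchCleaner c : snapshot) {
            if (c.clean()) {
                cleanedNow++;
            }
        }
        return cleanedNow;
    }
}

class CleanerSketchDemo {
    public static void main(String[] args) {
        SketchRegistry registry = new SketchRegistry();
        for (int i = 0; i < 3; i++) {
            registry.register(new SketchCleaner() {
                @Override
                protected void release() {
                    // A real cleaner would free native memory here.
                }
            });
        }
        System.out.println("cleaned at shutdown: " + registry.cleanAll());
    }
}
```

Without the synchronized blocks, a thread calling unregister during the sweep could trigger a ConcurrentModificationException in the iterating thread, and without the synchronized clean, two threads could both pass the cleaned check and release twice.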

@github-actions github-actions bot added the Java Affects Java cuDF API. label Mar 1, 2021
@abellina abellina changed the title Java cleaner synchronization Java cleaner synchronization [skip ci] Mar 1, 2021
@abellina abellina added bug Something isn't working non-breaking Non-breaking change labels Mar 1, 2021
abellina (Contributor, Author) commented Mar 1, 2021

I am running some tests on this but posting as draft for now.

@abellina abellina marked this pull request as ready for review March 1, 2021 16:48
@abellina abellina requested a review from a team as a code owner March 1, 2021 16:48
abellina (Contributor, Author) commented Mar 1, 2021

I am not seeing unit test failures as a result of this change in cudf or the Spark plugin.

@@ -114,7 +114,7 @@ public String toString() {
   }
 
   @Override
-  public void close() {
+  public synchronized void close() {

Contributor

Why do we need to synchronize closing a stream, but none of the other classes?

Contributor Author

Most of the other classes already synchronize on close; Stream and Event were the exceptions. As of this PR, these still don't synchronize:

  • HostColumnVectorCore (as far as I can tell, its close is overridden by subclasses)
  • nvcomp: BatchedMetadata and Metadata. Neither of these is called in a multi-threaded context.

I think we can synchronize close for the nvcomp ones as well, since leaving them unsynchronized could become a missed race later. If you think this is unnecessary, let me know and I can revert parts of this PR.
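To illustrate why an unsynchronized close can be a problem even when it looks harmless, here is a small sketch. SketchHandle is a hypothetical class, not one of the cuDF or nvcomp classes named above, and the counter merely stands in for a native free call.

```java
// Hypothetical handle around a native pointer; not a cuDF class.
final class SketchHandle implements AutoCloseable {
    private long nativePtr;    // 0 means the resource was already released
    private int freeCount = 0; // stand-in for calls to a native free function

    SketchHandle(long nativePtr) {
        this.nativePtr = nativePtr;
    }

    // The check and the reset form a check-then-act sequence. Without
    // synchronized, two threads could both see a non-zero pointer and
    // both "free" it, which for real native memory is a double free.
    @Override
    public synchronized void close() {
        if (nativePtr != 0) {
            freeCount++;
            nativePtr = 0;
        }
    }

    synchronized int freeCount() {
        return freeCount;
    }
}

class HandleSketchDemo {
    public static void main(String[] args) throws InterruptedException {
        SketchHandle handle = new SketchHandle(42L);
        Thread a = new Thread(handle::close);
        Thread b = new Thread(handle::close);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("free calls: " + handle.freeCount());
    }
}
```

Making close synchronized also makes it idempotent under concurrency: however many threads call it, the release path runs once.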

Contributor Author

  • HostColumnVectorCore (as far as I can tell, its close is overridden by subclasses)

Child columns are instances of this class, so we should probably add synchronized to the close in HostColumnVectorCore.

Contributor Author

@revans2 did you want me to make a change around this?

Contributor

Please do

Contributor Author

OK, done. FYI, I need to re-run my tests after upmerging.

jlowe (Contributor) commented Mar 4, 2021

@abellina Please edit the PR description (which becomes part of the commit message) to drop the mention of a draft and I think this is good to go.

abellina (Contributor, Author) commented Mar 4, 2021

@abellina Please edit the PR description (which becomes part of the commit message) to drop the mention of a draft and I think this is good to go.

Done!

jlowe (Contributor) commented Mar 4, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 85c1f8f into rapidsai:branch-0.19 Mar 4, 2021
@abellina abellina deleted the close_cleaner_synchronization branch March 4, 2021 16:45
hyperbolic2346 pushed a commit to hyperbolic2346/cudf that referenced this pull request Mar 25, 2021
Add synchronization in `cleanImpl` and `close` in various places where race conditions could exist, and also within the `MemoryCleaner` to address some concurrent modification issues we've seen in tests while shutting down (i.e. invoking the cleaner) (i.e. NVIDIA/spark-rapids#1797)

Authors:
  - Alessandro Bellina (@abellina)

Approvers:
  - Robert (Bobby) Evans (@revans2)
  - Jason Lowe (@jlowe)

URL: rapidsai#7474