Dedupe proxy rapids shuffle manager byte code #3602


@gerashegalov gerashegalov commented Sep 22, 2021

Stop inheriting ShuffleManager in the base class. Inheriting Spark's ShuffleManager causes bytecode discrepancies in the unshimmed area, so inherit it only in the version-specific code:

Previously, the class was bitwise-identical only across 3.1+:

$ grep ProxyRapidsShuffleInternalManagerBase.class dist/target/binary-diffs/*
dist/target/binary-diffs/spark311cdh-spark312.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
dist/target/binary-diffs/spark311cdh-spark320.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class

After this PR, the class is bitwise-identical across all supported 3.x versions:

$ grep ProxyRapidsShuffleInternalManagerBase.class dist/target/binary-diffs/*
dist/target/binary-diffs/spark311cdh-spark302.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
dist/target/binary-diffs/spark311cdh-spark312.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
dist/target/binary-diffs/spark311cdh-spark320.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
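
For readers who want to reproduce a bitwise comparison by hand, the following is a minimal sketch of the idea, not the project's actual build logic that produces the `.identical` files. The function name `list_identical` and the two directory arguments are hypothetical; it assumes each shim's classes have been unpacked into its own directory tree.

```shell
# Hypothetical helper, not part of the spark-rapids build.
# Usage: list_identical DIR_A DIR_B
# Prints the relative path of every .class file under DIR_A that also
# exists under DIR_B with exactly the same bytes.
list_identical() {
  local dir_a="$1" dir_b="$2" rel
  # Enumerate class files relative to DIR_A; the cd happens in a subshell,
  # so the loop below still runs in the caller's working directory.
  (cd "$dir_a" && find . -type f -name '*.class' | sort) |
    while IFS= read -r rel; do
      rel="${rel#./}"
      # cmp -s is silent and exits 0 only when the two files are byte-identical.
      if [ -f "$dir_b/$rel" ] && cmp -s "$dir_a/$rel" "$dir_b/$rel"; then
        printf '%s\n' "$rel"
      fi
    done
}
```

Piping the output through `grep ProxyRapidsShuffleInternalManagerBase.class` would then show whether that class matches between the two trees, assuming both directories exist.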

Other changes:

  • Shim Dev doc
  • Fixes incorrect path for org.apache.spark.sql.rapids.shims.spark301.RapidsShuffleInternalManager

Signed-off-by: Gera Shegalov [email protected]

Spark ShuffleManager causes bytecode discrepancies in the unshimmed area. Inherit it only
in the shim-protected code:

```
$ grep ProxyRapidsShuffleInternalManagerBase.class dist/target/binary-diffs/*
dist/target/binary-diffs/spark311cdh-spark302.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
dist/target/binary-diffs/spark311cdh-spark312.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
dist/target/binary-diffs/spark311cdh-spark320.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class
```

Signed-off-by: Gera Shegalov <[email protected]>
@gerashegalov gerashegalov requested review from tgravescs and abellina and removed request for tgravescs September 22, 2021 14:36
@gerashegalov gerashegalov self-assigned this Sep 22, 2021
@gerashegalov gerashegalov added the bug Something isn't working label Sep 22, 2021
@gerashegalov gerashegalov added this to the Sep 13 - Sep 24 milestone Sep 22, 2021

@revans2 revans2 left a comment

Some comments explaining what is happening would be nice. It is a little confusing who calls into whom and what the different levels of indirection are for.

@gerashegalov

> Some comments explaining what is happening would be nice. It is a little confusing who calls into whom and what the different levels of indirection are for.

@revans2 I agree. However, this is a subtle area where I found it safer to make the Proxy follow the existing implementation in lockstep. I added a shim README to explain the general background. I think we can get rid of some of the intermediate classes, but I would prefer to do that on the 21.12 branch.

@gerashegalov

build

@gerashegalov

build

* Trait that makes it easy to check whether we are dealing with
* a RAPIDS Shuffle Manager
*
* TODO name does not match its function anymore
Collaborator

Are you going to fix this TODO?

Collaborator Author

I filed #3624 for this.


@abellina abellina left a comment

Thanks for the thorough README change, that was useful.

@gerashegalov gerashegalov merged commit 384d956 into NVIDIA:branch-21.10 Sep 23, 2021
@gerashegalov gerashegalov deleted the dedupeProxyRapidsShuffleManagerByteCode branch September 23, 2021 13:54
@gerashegalov gerashegalov added the documentation Improvements or additions to documentation label Sep 23, 2021