-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dedupe proxy rapids shuffle manager byte code #3602
Dedupe proxy rapids shuffle manager byte code #3602
Conversation
…pidsShuffleManagerByteCode
Spark ShuffleManager causes bytecode discrepancies in the unshimmed area. Inherit it only in the shim-protected code: ``` $ grep ProxyRapidsShuffleInternalManagerBase.class dist/target/binary-diffs/* dist/target/binary-diffs/spark311cdh-spark302.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class dist/target/binary-diffs/spark311cdh-spark312.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class dist/target/binary-diffs/spark311cdh-spark320.identical:org/apache/spark/sql/rapids/ProxyRapidsShuffleInternalManagerBase.class ``` Signed-off-by: Gera Shegalov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments explaining what is happening would be nice. It is a little confusing who calls into whom and what the different levels of indirection are for.
…pidsShuffleManagerByteCode
Signed-off-by: Gera Shegalov <[email protected]>
@revans2 I agree. However, this was a subtle area where I found it safer to make the Proxy to follow the existing implementation in lockstep. I added a shim README to explain the general background. I think we can get rid of some of the intermediate classes. I prefer it to be done on 21.12 branch |
build |
Signed-off-by: Gera Shegalov <[email protected]>
build |
* Trait that makes it easy to check whether we are dealing with the | ||
* a RAPIDS Shuffle Manager | ||
* | ||
* TODO name does not match its function anymore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to fix this TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed #3624 for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the thorough README change, that was useful.
Stop inheriting ShuffleManaager in base class. Spark ShuffleManager causes bytecode discrepancies in the unshimmed area. Inherit it only in the version-specific code:
Previously, only 3.1+ was bitwise-identical
After the PR, bitwise-identical across all supported 3.x
Other changes:
org.apache.spark.sql.rapids.shims.spark301.RapidsShuffleInternalManager
Signed-off-by: Gera Shegalov [email protected]