Skip to content

Commit

Permalink
Ant task to automatically convert to a simple shim layout (#7561)
Browse files Browse the repository at this point in the history
Closes #7528

Usage:

```bash
mvn -B clean generate-sources antrun:run@shimplify-shim-sources -pl sql-plugin -Dshimplify=true -Dshimplify.add.base=323 -Dshimplify.add.shim=324
```

- Prints warning if detects a shim containing multiple source roots not shared with other shims. By construction a shim should have only a single directory that is not shared. Example:
```
shimplify - WARNING - Consider consolidating 340, it spans multiple dedicated directories ['/home/user/gits/NVIDIA/spark-rapids/sql-plugin/src/main/340+/scala', '/home/user/gits/NVIDIA/spark-rapids/sql-plugin/src/main/340/scala']
shimplify - WARNING - Consider consolidating 312db, it spans multiple dedicated directories ['/home/user/gits/NVIDIA/spark-rapids/sql-plugin/src/main/312db/scala', '/home/user/gits/NVIDIA/spark-rapids/sql-plugin/src/main/31xdb/scala']
```
- Updates shimplfy json comments without moving the files they are the old locations
- Move files to the canonical fist-shim-affected location
- Clones an existing shim to a new shim
- Removes an existing shim

The goal is to deprecate the range dirs.
 
Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Niranjan Artal <[email protected]>
Co-authored-by: Jason Lowe <[email protected]>
Co-authored-by: Raza Jafri <[email protected]>
  • Loading branch information
4 people authored Feb 23, 2023
1 parent 1449fd0 commit c5b8016
Show file tree
Hide file tree
Showing 4 changed files with 908 additions and 6 deletions.
33 changes: 27 additions & 6 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -186,22 +186,38 @@ The following acronyms may appear in directory names:
|cdh |Cloudera CDH|321cdh |Cloudera CDH Spark based on Apache Spark 3.2.1|

The version-specific directory names have one of the following forms / use cases:
- `src/main/312/scala` contains Scala source code for a single Spark version, 3.1.2 in this case
- `src/main/312+-apache/scala`contains Scala source code for *upstream* **Apache** Spark builds,

#### Version range directories

The following source directory system is deprecated. See below and [shimplify.md][1]

* `src/main/312/scala` contains Scala source code for a single Spark version, 3.1.2 in this case
* `src/main/312+-apache/scala`contains Scala source code for *upstream* **Apache** Spark builds,
only beginning with version Spark 3.1.2, and + signifies there is no upper version boundary
among the supported versions
- `src/main/311until320-all` contains code that applies to all shims between 3.1.1 *inclusive*,
* `src/main/311until320-all` contains code that applies to all shims between 3.1.1 *inclusive*,
3.2.0 *exclusive*
- `src/main/pre320-treenode` contains shims for the Catalyst `TreeNode` class before the
* `src/main/pre320-treenode` contains shims for the Catalyst `TreeNode` class before the
[children trait specialization in Apache Spark 3.2.0](https://issues.apache.org/jira/browse/SPARK-34906).
- `src/main/post320-treenode` contains shims for the Catalyst `TreeNode` class after the
* `src/main/post320-treenode` contains shims for the Catalyst `TreeNode` class after the
[children trait specialization in Apache Spark 3.2.0](https://issues.apache.org/jira/browse/SPARK-34906).

For each Spark shim, we use Ant path patterns to compute the property
`spark${buildver}.sources` in [sql-plugin/pom.xml](./sql-plugin/pom.xml) that is
picked up as additional source code roots. When possible path patterns are reused using
the conventions outlined in the pom.

#### Simplified version directory structure

Going forward new shim files should be added under:

* `src/main/spark${buildver}`, example: `src/main/spark330db`
* `src/test/spark${buildver}`, example: `src/test/spark340`

with a special shim descriptor as a Scala/Java comment. See [shimplify.md][1]

[1]: ./docs/dev/shimplify.md

### Setting up an Integrated Development Environment

Our project currently uses `build-helper-maven-plugin` for shimming against conflicting definitions of superclasses
Expand Down Expand Up @@ -238,7 +254,12 @@ Known Issues:

* There is a known issue that the test sources added via the `build-helper-maven-plugin` are not handled
[properly](https://youtrack.jetbrains.com/issue/IDEA-100532). The workaround is to `mark` the affected folders
such as `tests/src/test/320+-noncdh-nondb` manually as `Test Sources Root`
such as

* `tests/src/test/320+-noncdh-nondb`
* `tests/src/test/spark340`

manually as `Test Sources Root`

* There is a known issue where, even after selecting a different Maven profile in the Maven submenu,
the source folders from a previously selected profile may remain active. As a workaround,
Expand Down
Loading

0 comments on commit c5b8016

Please sign in to comment.