[FEA] Access test files in resources from Spark source code when running Spark UT #10875
Labels: test (Only impacts tests)
Comments
thirtiseven added the "feature request" (New feature or request) and "? - Needs Triage" (Need team to review and classify) labels on May 23, 2024
The preference is to include a test jar artifact rather than depend on the full Spark source.
gerashegalov added a commit to gerashegalov/spark-rapids that referenced this issue on May 29, 2024:
Closes NVIDIA#10875
Contributes to NVIDIA#10773
Spark UTs need to be able to spark.read data
Signed-off-by: Gera Shegalov <[email protected]>
gerashegalov added a commit that referenced this issue on May 30, 2024:
Closes #10875
Contributes to #10773
Unjar, cache, and share the test jar content among all test suites from the same jar
Test:
```bash
mvn package -Dbuildver=330 -pl tests -am -Dsuffixes='.*\.RapidsJsonSuite'
```
Signed-off-by: Gera Shegalov <[email protected]>
Closed by #10946
SurajAralihalli pushed a commit to SurajAralihalli/spark-rapids that referenced this issue on Jul 12, 2024:
Closes NVIDIA#10875
Contributes to NVIDIA#10773
Unjar, cache, and share the test jar content among all test suites from the same jar
Test:
```bash
mvn package -Dbuildver=330 -pl tests -am -Dsuffixes='.*\.RapidsJsonSuite'
```
Signed-off-by: Gera Shegalov <[email protected]>
Is your feature request related to a problem? Please describe.
Some Spark unit tests (UTs) read files from the resources directory of Spark's source tree. When we bring those Spark UTs into the plugin, we can't read those files directly.
For example, "SPARK-31716: inferring should handle malformed input" in `RapidsJsonSuite` fails with an error if it is included.
I hope we can find a way to read the test files in Spark's resources so that we can really test the related Spark UT cases.
Describe the solution you'd like
Gluten overrides `testFile`, where `getWorkspaceFilePath` in SparkGluten relies on the system property "spark.test.home". To run Spark UTs on shim 3.x.y, Gluten prepares a Docker container with a source folder containing the Spark 3.x.y code.
To do the same thing, we need to set up a Spark source folder before running the Spark UT CI, and document how to set up the environment when running Spark UTs locally.
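A minimal sketch of that kind of path resolution, written in Java for brevity. `SparkTestHome`, `resolve`, and `testFile` are hypothetical names, not Gluten or plugin APIs, and the fallback to the `SPARK_HOME` environment variable is an assumption. Only the use of the "spark.test.home" system property comes from the description above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Spark, Gluten, or the plugin.
final class SparkTestHome {
    private SparkTestHome() {}

    /** Resolves path segments against an explicitly given Spark source root. */
    static Path resolve(String sparkHome, String first, String... more) {
        if (sparkHome == null || sparkHome.isEmpty()) {
            throw new IllegalStateException("spark.test.home or SPARK_HOME is not set");
        }
        Path path = Paths.get(sparkHome, first);
        for (String segment : more) {
            path = path.resolve(segment);
        }
        return path;
    }

    /**
     * Resolves path segments against the "spark.test.home" system property,
     * falling back to the SPARK_HOME environment variable (an assumption).
     */
    static Path testFile(String first, String... more) {
        String home = System.getProperty("spark.test.home", System.getenv("SPARK_HOME"));
        return resolve(home, first, more);
    }
}
```

With the JVM started with `-Dspark.test.home=<Spark source folder>`, a call such as `SparkTestHome.testFile("sql", "core", "src", "test", "resources", "test-data")` would point into that folder. Failing fast when neither setting is present makes a missing CI setup step obvious, rather than surfacing later as a file-not-found error inside a test.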
Describe alternatives you've considered
Those resource files are also included in jars, so it is possible to read them from the jars if we know where the files are located.
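As a rough sketch of this alternative, the Java snippet below copies a resource found on the classpath (for example inside a test jar) to a temporary file, so that code expecting a regular file path can read it. `JarResources` and `copyToTempFile` are hypothetical names, and the resource name passed in is assumed to match the file's location inside the jar. This is an illustration under those assumptions, not how the plugin does it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only.
final class JarResources {
    private JarResources() {}

    /**
     * Copies a classpath resource, such as a file packaged in a test jar,
     * to a temporary file and returns the path of that file.
     */
    static Path copyToTempFile(ClassLoader loader, String resourceName) throws IOException {
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resourceName);
            }
            // Keep the original file name as a suffix so the extension is preserved.
            String fileName = resourceName.substring(resourceName.lastIndexOf('/') + 1);
            Path target = Files.createTempFile("spark-ut-", "-" + fileName);
            target.toFile().deleteOnExit();
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        }
    }
}
```

Copying to a real file sidesteps readers that cannot open `jar:` URLs. The trade-off is one copy per call; extracting the jar once and sharing the result across suites avoids the repeated work.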
Additional context
PR that copies the files to the plugin: #10864
Spark UT RapidsJsonSuite issue: #10773