
[FEA] Add a pre-merge check to validate that a PR has been committed using git signoff #399

Closed
sameerz opened this issue Jul 22, 2020 · 1 comment
Labels
build Related to CI / CD or cleanly building P0 Must have for release

Comments

@sameerz
Collaborator

sameerz commented Jul 22, 2020

Is your feature request related to a problem? Please describe.

We are introducing a contributor license agreement sign-off requirement. We need to ensure that commits are signed off using `git commit -s` (see https://git-scm.com/docs/git-commit#Documentation/git-commit.txt--s). If a PR is not signed off, it should fail a pre-merge check.
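For reference, the sign-off is a single flag on `git commit`. A minimal sketch of what the check would look for, using a throwaway repository and a placeholder identity:

```shell
# Sketch: make a signed-off commit in a scratch repo and inspect the
# trailer a pre-merge check would search for. Name/email are placeholders.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
git config user.name "Test User"
git config user.email "test@example.com"
echo hello > file.txt
git add file.txt
git commit -q -s -m "Add file"   # -s appends the Signed-off-by trailer
git log -1 --format=%B | grep '^Signed-off-by:'
# → Signed-off-by: Test User <test@example.com>
```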

Describe the solution you'd like
Pre-merge checks (builds) should fail unless they see a `-s` sign-off from anyone committing to a PR. If the author adds a single commit with a `-s` sign-off, that will be sufficient for the pre-merge build to pass.

Describe alternatives you've considered
None

Additional context
https://github.com/NVIDIA/spark-rapids/blob/branch-0.2/CONTRIBUTING.md#sign-your-work

@sameerz sameerz added the build Related to CI / CD or cleanly building label Jul 22, 2020
@sameerz sameerz added the P0 Must have for release label Jul 22, 2020
@jlowe
Member

jlowe commented Aug 14, 2020

Fixed by #439

@jlowe jlowe closed this as completed Aug 14, 2020
pxLi pushed a commit to pxLi/spark-rapids that referenced this issue May 12, 2022
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023
This PR is the initial version of a CUDA fault injection tool to explore and test the correctness of CUDA error handling in fault-tolerant CUDA applications.

The tool is designed with both automated and interactive testing use cases in mind. It is a dynamically linked library, `libcufaultinj.so`, that is loaded by the CUDA process at `cuInit` (via the CUDA Driver API) when its path is provided in the `CUDA_INJECTION64_PATH` environment variable.

As an example, it can be used to test the RAPIDS Accelerator for Apache Spark.

### Local Mode
```bash
CUDA_INJECTION64_PATH=$PWD/target/cmake-build/faultinj/libcufaultinj.so \
FAULT_INJECTOR_CONFIG_PATH=src/test/cpp/faultinj/test_faultinj.json \
$SPARK_HOME/bin/pyspark \
  --jars $SPARK_RAPIDS_REPO/dist/target/rapids-4-spark_2.12-22.08.0-SNAPSHOT-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin
```
### Distributed Mode
```bash
$SPARK_HOME/bin/spark-shell \
  --jars $SPARK_RAPIDS_REPO/dist/target/rapids-4-spark_2.12-22.08.0-SNAPSHOT-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --files ./target/cmake-build/faultinj/libcufaultinj.so,./src/test/cpp/faultinj/test_faultinj.json \
  --conf spark.executorEnv.CUDA_INJECTION64_PATH=./libcufaultinj.so \
  --conf spark.executorEnv.FAULT_INJECTOR_CONFIG_PATH=test_faultinj.json \
  --conf spark.rapids.memory.gpu.minAllocFraction=0 \
  --conf spark.rapids.memory.gpu.allocFraction=0.2 \
  --master spark://hostname:7077 
```
When configuring the executor environment via `spark.executorEnv.CUDA_INJECTION64_PATH`, the value must contain a path separator, i.e. `./libcufaultinj.so` with the leading `./`, to make sure that `dlopen` loads the library file submitted with the job. Otherwise `dlopen` assumes a locally installed library resolvable by the dynamic linker via `LD_LIBRARY_PATH` and similar mechanisms. See the [dlopen man page](https://man7.org/linux/man-pages/man3/dlopen.3.html).

### Fault injection configuration 

Fault injection configuration is provided via the `FAULT_INJECTOR_CONFIG_PATH` environment variable. It is a set of rules for applying fault injection, with a given probability, when a CUDA Driver or Runtime call is matched by function name or callback id.

There are currently three types of fault injection:
- launch a kernel with the PTX `trap` instruction
- launch a kernel with a device assert
- replace the return code for the CUDA Runtime call

Example config:
```json
{
    "logLevel": 1,
    "dynamic": true,
    "cudaRuntimeFaults": {
        "cudaLaunchKernel_ptsz": {
            "percent": 0,
            "injectionType": 0,
            "injectionType_comment": "PTX trap = 0, C assert = 1",
            "interceptionCount": 1
        }
    },
    "cudaDriverFaults": {
        "cuMemFreeAsync_ptsz": {
            "percent": 0,
            "injectionType": 2,
            "injectionType_comment": "substitute return code",
            "substituteReturnCode": 999,
            "interceptionCount": 1
        }
    }
}
```
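Since the config is read at process start, it may be worth sanity-checking the file beforehand. A hypothetical check, not part of the tool itself; the path below is the sample config shipped in the repo:

```shell
# Hypothetical sanity check: confirm the config file parses as valid JSON
# before pointing FAULT_INJECTOR_CONFIG_PATH at it.
CONFIG=src/test/cpp/faultinj/test_faultinj.json
python3 -m json.tool "$CONFIG" > /dev/null && echo "config OK"
```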

Signed-off-by: Gera Shegalov <[email protected]>