Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java APIs to fetch CUDA runtime info [skip ci] #8465

Merged
merged 5 commits into from
Jun 11, 2021

Conversation

sperlingxx
Copy link
Contributor

Closes #8084 #6363

Signed-off-by: sperlingxx [email protected]

@sperlingxx sperlingxx requested a review from jlowe June 9, 2021 09:52
@sperlingxx sperlingxx requested a review from a team as a code owner June 9, 2021 09:52
@sperlingxx sperlingxx requested a review from firestarman June 9, 2021 09:53
@github-actions github-actions bot added the Java Affects Java cuDF API. label Jun 9, 2021
@sperlingxx sperlingxx added feature request New feature or request non-breaking Non-breaking change labels Jun 9, 2021
java/src/main/java/ai/rapids/cudf/CudaComputeMode.java Outdated Show resolved Hide resolved
java/src/test/java/ai/rapids/cudf/RmmTest.java Outdated Show resolved Hide resolved
java/src/main/java/ai/rapids/cudf/Cuda.java Show resolved Hide resolved
java/src/main/java/ai/rapids/cudf/Cuda.java Outdated Show resolved Hide resolved
java/src/main/java/ai/rapids/cudf/CudaComputeMode.java Outdated Show resolved Hide resolved
public enum CudaComputeMode {
cudaComputeModeDefault(0),
cudaComputeModeExclusive(1),
cudaComputeModeProhibited(2),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thee enums do not match the java naming convention, which is UPPER_CASE_WITH_UNDERSCORES. Also because all of them are under the CudaComputeMode class we don't need to prefix them all with cudaComputeMode

Could you please change them to

public enum CudaComputeMode {
    DEFAULT(0),
    EXCLUSIVE(1),
    PROHIBITED(2),
    EXCLUSIVE_PROCESS(3);
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied the UPPER_CASE_WITH_UNDERSCORES style.

*/
public static CudaComputeMode getComputeMode() {
int nativeMode = Cuda.getNativeComputeMode();
switch (nativeMode) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally for other enums we put the mapping from native in the enum class itself. That way all of the info is in a single file and is consistent. We also can make the code a lot smaller if we don't worry about performance. Because this is not something that is going to be called in a tight loop where performance matters, I would much rather have smaller code with guaranteed consistency. You can use the BinaryOp.fromNative as an example.

  static BinaryOp fromNative(int nativeId) {
    for (BinaryOp type : OPS) {
      if (type.nativeId == nativeId) {
        return type;
      }
    }
    throw new IllegalArgumentException("Could not translate " + nativeId + " into a BinaryOp");
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the way of native mapping. For now, it follows above style.

@@ -399,6 +399,14 @@ public void testPoolLimitNonPoolMode() {
() -> Rmm.initialize(RmmAllocationMode.CUDA_DEFAULT, false, 1024, 2048));
}

@Test
public void testGetCudaRuntimeInfo() {
Rmm.initialize(RmmAllocationMode.POOL, false, 1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we run these APIs without initializing the pool? Because in the common case I suspect that is how they are going to be used.

Also why are the tests a part of RMM. They should have nothing to do with RMM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this initialization is used to set the GPU device for the later operations, per the current JNI implementation. But even so, moving these tests to Cuda would be better.

And I am wondering whether all the calls in CudaJni need to set the device first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this case to a newly-created test class CudaTest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there is no necessary to initialize RMM before running these APIs.

java/src/main/java/ai/rapids/cudf/CudaComputeMode.java Outdated Show resolved Hide resolved
@@ -399,6 +399,14 @@ public void testPoolLimitNonPoolMode() {
() -> Rmm.initialize(RmmAllocationMode.CUDA_DEFAULT, false, 1024, 2048));
}

@Test
public void testGetCudaRuntimeInfo() {
Rmm.initialize(RmmAllocationMode.POOL, false, 1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this initialization is used to set the GPU device for the later operations, per the current JNI implementation. But even so, moving these tests to Cuda would be better.

And I am wondering whether all the calls in CudaJni need to set the device first.

Signed-off-by: sperlingxx <[email protected]>
Signed-off-by: sperlingxx <[email protected]>
/**
* Compute-exclusive-thread mode
* Only one thread in one process will be able to use cudaSetDevice() with this device.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mode is deprecated. Better to add the same comment here.

(base) liangcail@liangcail-ubuntu18:~/work/projects/on_github/spark-rapids$ nvidia-smi -c 1
Warning: Exclusive_Thread was deprecated! Setting Exclusive_Process instead.
Unable to set the compute mode for GPU 00000000:01:00.0: Insufficient Permissions
Terminating early due to previous errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added warning message for the deprecation.

@sperlingxx
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit d3b440e into rapidsai:branch-21.08 Jun 11, 2021
@sperlingxx sperlingxx deleted the jni_cuda_info branch June 11, 2021 02:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request Java Affects Java cuDF API. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Java APIs to determine CUDA driver and runtime versions
4 participants