FEAT: MultiGPU for golang bindings #417
@jeremyfelder one thing I found really useful when designing the Rust-side device slices is the ability to look up the device id of any device pointer using the CUDA runtime cudaPointerGetAttributes function. This avoids having to explicitly store the device id alongside the actual data.
…ll device association managed implicitly
139fcab
to
afa9617
It's the same design as the Rust version of multi-GPU, so no comments on design.
The code looks good. No blocking comments, so I just added some suggestions.
}

// CheckDevice is used to ensure that the DeviceSlice about to be used resides on the currently set device
func (d DeviceSlice) CheckDevice() {
Honestly, I think the name is confusing.
	return convert{{if .IsG2}}G2{{end}}ProjectivePointsMontgomery(points, true)
}

func {{if .IsG2}}G2{{end}}ProjectiveFromMontgomery(points *core.DeviceSlice) cr.CudaError {
	points.CheckDevice()
	return convert{{if .IsG2}}G2{{end}}ProjectivePointsMontgomery(points, false)
}
{{end}}
missing extra line
	return convertScalarsMontgomery(scalars, true)
}

func FromMontgomery(scalars *core.DeviceSlice) cr.CudaError {
	scalars.CheckDevice()
	return convertScalarsMontgomery(scalars, false)
}{{- end}}
extra line
Describe the changes
This PR adds multi-GPU support to the Golang bindings.

The main changes are to `DeviceSlice`, which now includes a `deviceId` attribute specifying which device the underlying data resides on. Any operation that uses a `DeviceSlice` now checks that its `deviceId` matches the current device.

In Go, most concurrency is done via goroutines (described as lightweight threads; in reality, more of a thread pool manager). However, there is no guarantee that a goroutine stays on a specific host thread. Therefore, a function `RunOnDevice` was added to the `cuda_runtime` package. It locks a goroutine to a specific host thread, sets the current GPU device, runs a provided function, and unlocks the goroutine from the host thread after the provided function finishes. While the goroutine is locked to the host thread, the Go runtime will not assign other goroutines to that host thread.
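
For readers unfamiliar with thread pinning in Go, the lock / set device / run / unlock sequence described above can be sketched roughly as follows. This is not the PR's implementation: `runOnDevice` and its signature are illustrative, and `setDevice` is a hypothetical stand-in for a wrapper around the CUDA runtime's `cudaSetDevice`, stubbed here so the example runs without a GPU. Only `runtime.LockOSThread` and `runtime.UnlockOSThread` are real Go runtime calls.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// setDevice is a hypothetical stub standing in for a cudaSetDevice wrapper.
// A real binding would call into the CUDA runtime here.
func setDevice(deviceID int) error {
	if deviceID < 0 {
		return fmt.Errorf("invalid device id %d", deviceID)
	}
	return nil
}

// runOnDevice (illustrative name) starts a goroutine, pins it to one OS
// thread, selects the device on that thread, runs fn, and then releases
// the thread. It blocks until fn has returned.
func runOnDevice(deviceID int, fn func(deviceID int)) error {
	errCh := make(chan error, 1)
	go func() {
		// While locked, the Go runtime schedules no other goroutine on
		// this OS thread, so the per-thread device selection stays valid.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()

		if err := setDevice(deviceID); err != nil {
			errCh <- err
			return
		}
		fn(deviceID)
		errCh <- nil
	}()
	return <-errCh
}

func main() {
	var wg sync.WaitGroup
	results := make([]int, 2)

	// Dispatch one unit of work per (pretend) device, concurrently.
	for id := 0; id < 2; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := runOnDevice(id, func(d int) { results[d] = d * 10 }); err != nil {
				fmt.Println("error:", err)
			}
		}(id)
	}
	wg.Wait()
	fmt.Println(results) // prints [0 10]
}
```

The pinning matters because the CUDA runtime tracks the current device per host thread: if a goroutine migrated to another thread midway through, later calls could silently target a different device.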