[Vulkan] Device API Multi-streams, multi-queue, and initial multi-thread support #2802
…threaded multi stream API
a0045b5
to
384cdcf
/format
…into device-api-streams
Shall we fix the empty root buffer first to unblock CI?
There is also a segfault bug in this PR that needs to be solved. XD
/format
…threaded multi stream API
taichi/backends/device.h
virtual void command_sync() = 0;
// Each thread will acquire its own stream
virtual Stream *get_compute_stream() = 0;
virtual Stream *get_graphics_stream() = 0;
nit: only declare this for GraphicsDevice..?
right
TI_ERROR("cannot find a queue");
}
std::unique_ptr<CommandList> cmd = new_command_list(config);
Stream *stream = get_compute_stream();
Will get_compute_stream() always succeed?
compute_stream_[tid] = std::make_unique<VulkanStream>(
    *this, compute_queue_, compute_queue_family_index_);
compute_queue_ and compute_queue_family_index_ might not always be valid :((
Currently in EmbeddedVulkanDevice, if params.is_for_ui is true, the compute queue won't be created.
This might be a good chance to fix up EmbeddedVulkanDevice a bit.
Maybe just change EmbeddedVulkanDevice to always create a compute queue for now so it doesn't break? There might be a lot of cleanup work to be done in the embedded device, and that should probably be left for future PRs.
Or we could have a standalone queue for copying buffers and stuff. idk.
In the future we should create a dedicated transfer queue, but for all the devices and APIs Taichi supports, the compute queue will always be there.
I think let's create both queues in EmbeddedVulkanDevice for now and rename and rework it to VulkanDeviceLoader or something in the future.
Sounds good.
I'd also be happy if you leave it for now and add a TODO, since I don't think ggui uses memcpy anywhere yet (and if it does, we can create a cmd list). This PR is big enough as it is lol.
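For illustration, the suggestion above (always create a compute queue, and create the graphics queue only when the device is for UI) could look roughly like the sketch below. This is not the code in this PR: `SketchQueueFamilies` and `pick_queue_families` are hypothetical names, and the flag constants only mirror the values of `VK_QUEUE_GRAPHICS_BIT` and `VK_QUEUE_COMPUTE_BIT` so that the example compiles without the Vulkan headers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Local constants mirroring VK_QUEUE_GRAPHICS_BIT and VK_QUEUE_COMPUTE_BIT.
constexpr std::uint32_t kGraphicsBit = 0x1;
constexpr std::uint32_t kComputeBit = 0x2;

// Hypothetical result type: an empty optional means "no such queue family".
struct SketchQueueFamilies {
  std::optional<std::uint32_t> compute;
  std::optional<std::uint32_t> graphics;
};

// Hypothetical helper: always looks for a compute family, and looks for a
// graphics family only when the device is created for UI use.
SketchQueueFamilies pick_queue_families(
    const std::vector<std::uint32_t> &family_flags, bool is_for_ui) {
  SketchQueueFamilies picked;
  for (std::uint32_t i = 0; i < family_flags.size(); ++i) {
    if (!picked.compute && (family_flags[i] & kComputeBit)) {
      picked.compute = i;
    }
    if (is_for_ui && !picked.graphics && (family_flags[i] & kGraphicsBit)) {
      picked.graphics = i;
    }
  }
  return picked;
}
```

With this shape, a caller can check `picked.compute.has_value()` once at device creation and report an error there, so code that later asks for a compute stream does not have to handle a missing queue.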
Seems |
/format
@AmesingFlank This changes the command list acquire and submit API so GGUI needs adjustments.
In Vulkan, each command buffer and command pool must be limited to a single thread. To satisfy this, we will have a unique compute / graphics queue per thread, and each queue contains a command pool.
#2736
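The per-thread arrangement described in the comment above can be sketched as a map from thread id to a lazily created stream. This is a minimal sketch and not the PR's implementation: `SketchStream`, `SketchDevice`, and `stream_for` are hypothetical stand-ins, and the real stream would also own a Vulkan command pool.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a stream; the real one would own a command pool.
struct SketchStream {
  explicit SketchStream(int family_index) : queue_family_index(family_index) {}
  int queue_family_index;
};

// Hypothetical device that hands every thread its own stream.
class SketchDevice {
 public:
  explicit SketchDevice(int compute_queue_family_index)
      : compute_queue_family_index_(compute_queue_family_index) {}

  // Returns the calling thread's stream, creating it on first use.
  SketchStream *get_compute_stream() {
    return stream_for(std::this_thread::get_id());
  }

  // Looks up or creates the stream for a given thread id. The mutex guards
  // only the map; each stream is still used by a single thread.
  SketchStream *stream_for(std::thread::id tid) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = compute_streams_.find(tid);
    if (it == compute_streams_.end()) {
      it = compute_streams_
               .emplace(tid, std::make_unique<SketchStream>(
                                 compute_queue_family_index_))
               .first;
    }
    return it->second.get();
  }

  std::size_t stream_count() {
    std::lock_guard<std::mutex> lock(mutex_);
    return compute_streams_.size();
  }

 private:
  int compute_queue_family_index_;
  std::mutex mutex_;
  std::unordered_map<std::thread::id, std::unique_ptr<SketchStream>>
      compute_streams_;
};
```

Repeated calls from the same thread return the same pointer, and a call from a different thread creates a second entry. Storing the streams behind `std::unique_ptr` keeps the returned pointers stable when the map rehashes.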