[Perf] [opengl] Support 'ti.thread_dim(x)' to change invocations per thread for OpenGL #1693
Related issue = unrevert #1676
[Click here for the format server]
I now realize I had completely misunderstood the term `stride`, so I have changed it to `invocation`. I will do the naming nit on the C++ side in another PR. See #1691 for the new terms:
Taichi programs can be executed in parallel on the GPU.
The GPU's execution structure is defined hierarchically.
From small to large, the computation units are:
invocation < thread < block < grid.
invocation:
An invocation is one execution of the body of a for-loop.
Each invocation corresponds to a specific `i` value in the for-loop.
thread:
Invocations are grouped into threads.
Threads are the minimal unit that is parallelized.
All invocations within a thread are executed in serial.
We usually use 1 invocation per thread to maximize parallel performance.
block:
Threads are grouped into blocks.
All threads within a block are executed in parallel.
Threads within the same block can share their block local storage.
grid:
Blocks are grouped into grids.
A grid is the minimal unit that is launched from the host.
All blocks within a grid are executed in parallel.
In Taichi, each parallelized for-loop is a grid.
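To make the hierarchy above concrete, here is a minimal plain-Python sketch of how a loop index `i` could be mapped to a (block, thread, invocation) coordinate. It does not call Taichi or OpenGL. The names `Coord`, `locate`, and `threads_needed` are hypothetical, and packing consecutive `i` values into the same thread is an assumption made for illustration, not necessarily what the backend does.

```python
from typing import NamedTuple


class Coord(NamedTuple):
    block: int       # index of the block within the grid
    thread: int      # index of the thread within its block
    invocation: int  # serial position of the invocation within its thread


def locate(i: int, invocations_per_thread: int, threads_per_block: int) -> Coord:
    """Map loop index i to a (block, thread, invocation) coordinate.

    Assumption: consecutive i values are packed into the same thread,
    and consecutive threads are packed into the same block.
    """
    global_thread, invocation = divmod(i, invocations_per_thread)
    block, thread = divmod(global_thread, threads_per_block)
    return Coord(block, thread, invocation)


def threads_needed(n: int, invocations_per_thread: int) -> int:
    """Number of threads required to cover n invocations (ceiling division)."""
    return -(-n // invocations_per_thread)


if __name__ == "__main__":
    # With 1 invocation per thread, every i gets its own thread.
    print(locate(130, invocations_per_thread=1, threads_per_block=128))
    # With 4 invocations per thread, i = 4..7 run serially in one thread.
    print(locate(5, invocations_per_thread=4, threads_per_block=2))
```

Under these assumptions, raising the number of invocations per thread reduces the number of threads the grid needs, at the cost of running more work serially inside each thread.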