[Perf] [opengl] Support 'ti.thread_dim(x)' to change invocations per thread for OpenGL #1693

Closed
wants to merge 8 commits into from

Conversation

archibate
Collaborator

@archibate archibate commented Aug 13, 2020

Related issue = unrevert #1676



I now realize that I had completely misunderstood the term stride, so I have changed it to invocation. I will fix the naming in the C++ code in another PR.

See #1691 for the new terms:

Taichi programs can be executed in parallel on the GPU.
The execution structure of a GPU is defined hierarchically.

From small to large, the computation units are:
invocation < thread < block < grid.

  • invocation:
    An invocation is one execution of the body of a for-loop.
    Each invocation corresponds to a specific value of i in the for-loop.

  • thread:
    Invocations are grouped into threads.
    Threads are the minimal unit that is parallelized.
    All invocations within a thread are executed serially.
    We usually use 1 invocation per thread to maximize parallelism.

  • block:
    Threads are grouped into blocks.
    All threads within a block are executed in parallel.
    Threads within the same block can share their block local storage.

  • grid:
    Blocks are grouped into grids.
    A grid is the minimal unit that is launched from the host.
    All blocks within a grid are executed in parallel.
    In Taichi, each parallelized for-loop is a grid.
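
As a rough illustration of the hierarchy above, the following plain-Python sketch maps a loop index to its block, thread, and invocation slot. The function names `launch_shape` and `locate`, and the contiguous assignment of invocations to threads, are assumptions made for illustration; they are not taken from this PR or from Taichi's OpenGL backend.

```python
import math


def launch_shape(n, thread_dim, block_dim):
    """Return (num_threads, num_blocks) needed to cover n loop iterations.

    thread_dim: invocations executed serially by one thread
                (name borrowed from the PR title; hypothetical here).
    block_dim:  threads per block.
    """
    num_threads = math.ceil(n / thread_dim)
    num_blocks = math.ceil(num_threads / block_dim)
    return num_threads, num_blocks


def locate(i, thread_dim, block_dim):
    """Map loop index i to (block, thread_in_block, invocation_in_thread).

    Assumes each thread handles a contiguous run of thread_dim indices.
    """
    thread = i // thread_dim
    return thread // block_dim, thread % block_dim, i % thread_dim


print(launch_shape(1000, thread_dim=4, block_dim=128))
print(locate(517, thread_dim=4, block_dim=128))
```

With 1 invocation per thread, `launch_shape` reduces to the usual one-thread-per-iteration launch; larger values trade parallel threads for serial work inside each thread.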

@archibate added the wontfix label (We won't fix this issue or merge this PR) on Aug 17, 2020
@yuanming-hu
Member

Thanks. I believe you are using thread_dim as an unrolling factor for the for-loops? I'm not sure if this will help improve performance on GPUs.

I lean towards not introducing one more concept for people to learn.
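
To make the "unrolling factor" reading concrete, here is a minimal plain-Python emulation, assuming thread_dim means the number of loop iterations one thread executes serially. The helper `run_loop` is hypothetical and only models the control flow; it does not measure or imply any performance effect on a real GPU.

```python
def run_loop(n, thread_dim, body):
    """Emulate a parallel for-loop of n iterations with thread_dim invocations per thread."""
    num_threads = -(-n // thread_dim)  # ceiling division
    for thread in range(num_threads):  # on a GPU these iterations would run in parallel
        for k in range(thread_dim):    # these run serially inside one thread
            i = thread * thread_dim + k
            if i < n:                  # guard the tail when n is not a multiple of thread_dim
                body(i)


out = [0] * 10


def square(i):
    out[i] = i * i


run_loop(10, 4, square)
print(out)
```

Every index in `range(n)` is visited exactly once regardless of thread_dim, so the result matches the plain loop; only the split between parallel threads and serial work changes.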

@archibate
Collaborator Author

Closing. We need to agree on a systematic solution for all of these settings (block_dim, grid_dim, thread_dim) and make it easier to add new ones to DecoratorRecorder. Feel free to put this into the v0.8.0 roadmap.

@archibate archibate closed this Aug 24, 2020