Replace nodes with render_features and new jobs framework. #135

Merged
merged 6 commits into from
May 15, 2021

Conversation

DavidVonDerau
Collaborator

@DavidVonDerau DavidVonDerau commented May 12, 2021

If you are looking for an example of the new entry points, look at the Mesh feature in the demo.
RenderFeaturePlugin: https://github.com/DavidVonDerau/rafx/blob/dvd/jobs/demo/src/features/mesh/plugin.rs
RenderFeatureExtractJob: https://github.com/DavidVonDerau/rafx/blob/dvd/jobs/demo/src/features/mesh/jobs/extract.rs
RenderFeaturePrepareJob: https://github.com/DavidVonDerau/rafx/blob/dvd/jobs/demo/src/features/mesh/jobs/prepare.rs
RenderFeatureWriteJob: https://github.com/DavidVonDerau/rafx/blob/dvd/jobs/demo/src/features/mesh/jobs/write.rs
FramePacket, SubmitPacket: https://github.com/DavidVonDerau/rafx/blob/dvd/jobs/demo/src/features/mesh/internal/frame_packet.rs

Replace nodes with render_features and new jobs framework.

401a3e5

This is an alternative approach to handling the FramePacket, ViewPacket, and SubmitNode construction. I have tried to closely match the design of the Destiny slides[1]. Each feature now defines custom data for allocation in the frame packet, view packet, and submit packets explicitly. In order to avoid a proliferation of generics, there are type-erased traits (indicated by a RenderFeature prefix) that hide the use of the generic types behind a dyn trait.
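
A minimal sketch of the type-erasure idea described above. Every name here (`FeatureData`, `TypedFramePacket`, `ErasedFramePacket`, `build_packets`, `mesh_data`) is hypothetical and chosen for illustration; the actual rafx traits carry the `RenderFeature` prefix and are more involved. The point is only that the renderer can hold packets from different features in one collection without being generic itself:

```rust
use std::any::Any;

/// Hypothetical per-feature description: each feature picks its own packet data type.
trait FeatureData: 'static {
    type PerFrame: 'static;
    fn feature_name() -> &'static str;
}

/// Generic packet. Without erasure, `F` would leak into every caller's signature.
struct TypedFramePacket<F: FeatureData> {
    per_frame: F::PerFrame,
}

/// Type-erased view of a packet, usable as `Box<dyn ErasedFramePacket>`.
trait ErasedFramePacket {
    fn feature_name(&self) -> &'static str;
    fn as_any(&self) -> &dyn Any;
}

impl<F: FeatureData> ErasedFramePacket for TypedFramePacket<F> {
    fn feature_name(&self) -> &'static str {
        F::feature_name()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct MeshFeature;
struct MeshPerFrame {
    visible_meshes: u32,
}
impl FeatureData for MeshFeature {
    type PerFrame = MeshPerFrame;
    fn feature_name() -> &'static str {
        "mesh"
    }
}

struct SpriteFeature;
struct SpritePerFrame {
    atlas_pages: u32,
}
impl FeatureData for SpriteFeature {
    type PerFrame = SpritePerFrame;
    fn feature_name() -> &'static str {
        "sprite"
    }
}

/// The caller stores every feature's packet behind the same `dyn` trait.
fn build_packets() -> Vec<Box<dyn ErasedFramePacket>> {
    let mut packets: Vec<Box<dyn ErasedFramePacket>> = Vec::new();
    packets.push(Box::new(TypedFramePacket::<MeshFeature> {
        per_frame: MeshPerFrame { visible_meshes: 3 },
    }));
    packets.push(Box::new(TypedFramePacket::<SpriteFeature> {
        per_frame: SpritePerFrame { atlas_pages: 1 },
    }));
    packets
}

/// Only the feature itself downcasts back to its concrete packet type.
fn mesh_data(packet: &dyn ErasedFramePacket) -> Option<&MeshPerFrame> {
    packet
        .as_any()
        .downcast_ref::<TypedFramePacket<MeshFeature>>()
        .map(|typed| &typed.per_frame)
}
```

Only the feature's own code names the concrete type when it downcasts; everything else works with the erased trait object.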

This work had two primary goals:

  1. Features should not need to know about threads or tasks in order to be parallelized.
  2. Writing a new feature should be as ergonomic as possible while maintaining the 1st requirement.

Extract, Prepare, and Write jobs now support the same entry points as those defined by Destiny, with the exception of the per game object entry points mentioned as a perf & memory optimization. Each feature defines its data format in a FramePacket and SubmitPacket.

The FramePacket is the data extracted from the world. The SubmitPacket is the data prepared for the Write job. Both the FramePacket and SubmitPacket are allocated up-front to minimize allocations and both are accessible to the Write job.

Each FramePacket contains a ViewPacket for each RenderView and each SubmitPacket contains a ViewSubmitPacket for each RenderView. The FramePacket has a RenderObjectInstance for each entity and RenderObject in the current frame and the ViewPacket has a RenderObjectInstancePerView for each RenderObjectInstance in that particular RenderView's visibility. The SubmitPacket and ViewSubmitPacket follow the same layout, but the ViewSubmitPacket also has a pre-allocated list of SubmitNodeBlocks for each RenderPhase supported by that RenderFeature.
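
The frame/view layout can be sketched roughly as follows. This is a simplified illustration under assumptions, not the rafx definitions: the field names, the integer id types, and the helper methods (`with_capacity`, `push_instance`, `mark_visible`, `visible_instances`) are all hypothetical. It shows one instance list per frame, one per-view list that indexes into it, and up-front allocation from known counts:

```rust
/// Hypothetical stand-ins for the real handle types.
type EntityId = u32;
type RenderObjectId = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
struct RenderObjectInstance {
    entity: EntityId,
    render_object: RenderObjectId,
}

/// Per-view record pointing back at an instance in the owning frame packet.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RenderObjectInstancePerView {
    instance_index: usize,
}

struct ViewPacket {
    instances_per_view: Vec<RenderObjectInstancePerView>,
}

struct FramePacket {
    instances: Vec<RenderObjectInstance>,
    view_packets: Vec<ViewPacket>,
}

impl FramePacket {
    /// Allocate everything up-front from the instance count and the
    /// number of visible instances in each view.
    fn with_capacity(num_instances: usize, visible_per_view: &[usize]) -> Self {
        FramePacket {
            instances: Vec::with_capacity(num_instances),
            view_packets: visible_per_view
                .iter()
                .map(|&count| ViewPacket {
                    instances_per_view: Vec::with_capacity(count),
                })
                .collect(),
        }
    }

    /// Record an instance for the frame and return its index.
    fn push_instance(&mut self, instance: RenderObjectInstance) -> usize {
        self.instances.push(instance);
        self.instances.len() - 1
    }

    /// Record that an instance is visible in the given view.
    fn mark_visible(&mut self, view_index: usize, instance_index: usize) {
        self.view_packets[view_index]
            .instances_per_view
            .push(RenderObjectInstancePerView { instance_index });
    }

    /// Resolve a view's per-view records back to frame-level instances.
    fn visible_instances(&self, view_index: usize) -> Vec<RenderObjectInstance> {
        self.view_packets[view_index]
            .instances_per_view
            .iter()
            .map(|per_view| self.instances[per_view.instance_index])
            .collect()
    }
}

fn example_frame_packet() -> FramePacket {
    // Two instances in the frame; view 0 sees both, view 1 sees only the second.
    let mut packet = FramePacket::with_capacity(2, &[2, 1]);
    let a = packet.push_instance(RenderObjectInstance { entity: 10, render_object: 0 });
    let b = packet.push_instance(RenderObjectInstance { entity: 11, render_object: 0 });
    packet.mark_visible(0, a);
    packet.mark_visible(0, b);
    packet.mark_visible(1, b);
    packet
}
```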

All of the submit nodes from all features are unioned into ViewPhaseSubmitNodeBlocks according to a shared RenderView and RenderPhase and then sorted by the RenderPhase's defined SubmitNodeSortFunction.
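
As a rough illustration of the union-then-sort step for a single RenderView and RenderPhase, here is a sketch with a hypothetical `SubmitNode` type and hypothetical sort functions. The function-pointer shape mirrors the `SubmitNodeSortFunction` alias in the diff, but the node fields and the `union_and_sort` helper are assumptions:

```rust
#[derive(Debug, Clone, PartialEq)]
struct SubmitNode {
    feature_index: u32,
    distance: f32,
}

/// Same shape as a per-phase sort function: it sorts the merged list in place.
type SortFn = fn(&mut Vec<SubmitNode>);

/// Farthest nodes first, e.g. for a transparent phase.
fn sort_back_to_front(nodes: &mut Vec<SubmitNode>) {
    nodes.sort_by(|a, b| b.distance.total_cmp(&a.distance));
}

/// Group nodes by the feature that produced them.
fn sort_by_feature_index(nodes: &mut Vec<SubmitNode>) {
    nodes.sort_by_key(|node| node.feature_index);
}

/// Leave the nodes in submission order.
fn unsorted(_nodes: &mut Vec<SubmitNode>) {}

/// Merge the blocks produced by every feature for one (view, phase) pair,
/// then apply the phase's sort function to the combined list.
fn union_and_sort(blocks: Vec<Vec<SubmitNode>>, sort: SortFn) -> Vec<SubmitNode> {
    let mut merged: Vec<SubmitNode> = blocks.into_iter().flatten().collect();
    sort(&mut merged);
    merged
}
```

Because the sort runs on the merged list, nodes from different features interleave correctly within a phase instead of being drawn feature by feature.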


A feature's RenderObjects may be stored in a RenderObjectRegistry -- this replaces the NodeSet previously used by the demo's TileLayer, Sprite, and Mesh render features.

The Extract entry points are defined by ExtractJobEntryPoints and the Prepare entry points are defined by PrepareJobEntryPoints. Each entry point receives a Context argument (e.g. ExtractPerFrameContext) that carries only the data relevant to that entry point, so a render feature does not pay a performance cost for data it does not need. References to the World or other game resources can be cached while creating the Extract, Prepare, or Write jobs. Each job type is tied to an explicit lifetime (e.g. 'extract) representing that stage of the renderer pipeline.
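
A minimal sketch of the entry-point pattern, assuming a simplified trait and context types. `ExtractEntryPointsSketch`, `PerFrameContext`, `PerInstanceContext`, `FrameData`, and `run_extract` are hypothetical names, and the real `ExtractJobEntryPoints` has more entry points and richer contexts. It shows a job caching a `World` reference for the `'extract` lifetime and entry points that receive a small context each:

```rust
/// Hypothetical game world that the extract job reads from.
struct World {
    positions: Vec<[f32; 3]>,
}

/// Hypothetical per-frame output of the extract stage.
#[derive(Default)]
struct FrameData {
    instance_count: usize,
    extracted: Vec<[f32; 3]>,
}

/// Each entry point gets a small context with only what it needs.
struct PerFrameContext<'a> {
    frame: &'a mut FrameData,
}

struct PerInstanceContext<'a> {
    entity_index: usize,
    frame: &'a mut FrameData,
}

/// Entry points default to no-ops so a feature overrides only what it uses.
trait ExtractEntryPointsSketch<'extract> {
    fn begin_per_frame_extract(&self, _context: &mut PerFrameContext<'_>) {}
    fn extract_render_object_instance(&self, _context: &mut PerInstanceContext<'_>) {}
}

/// The job caches a reference to the world for the duration of `'extract`.
struct MeshExtractJob<'extract> {
    world: &'extract World,
}

impl<'extract> ExtractEntryPointsSketch<'extract> for MeshExtractJob<'extract> {
    fn begin_per_frame_extract(&self, context: &mut PerFrameContext<'_>) {
        context.frame.instance_count = self.world.positions.len();
    }

    fn extract_render_object_instance(&self, context: &mut PerInstanceContext<'_>) {
        let position = self.world.positions[context.entity_index];
        context.frame.extracted.push(position);
    }
}

/// The framework, not the feature, decides how and when entry points are called.
fn run_extract<'extract>(
    job: &dyn ExtractEntryPointsSketch<'extract>,
    num_instances: usize,
) -> FrameData {
    let mut frame = FrameData::default();
    job.begin_per_frame_extract(&mut PerFrameContext { frame: &mut frame });
    for entity_index in 0..num_instances {
        job.extract_render_object_instance(&mut PerInstanceContext {
            entity_index,
            frame: &mut frame,
        });
    }
    frame
}
```

Since the driver owns the loop, it is free to split the per-instance calls across tasks without the feature changing, which is the first goal listed above. This sketch runs them sequentially.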

[1] https://advances.realtimerendering.com/destiny/gdc_2015/Tatarchuk_GDC_2015__Destiny_Renderer_web.pdf

Add new collections to rafx-base.

ca335df

  • An OwnedPool<T> is a lightweight pool that moves a Pooled<T> to the user. Pooled<T> implements Deref and DerefMut for T. When the Pooled<T> is dropped, the T is moved back to the OwnedPool<T> via a channel.
  • Added a new key called RawDropSlabKey for the drop slab. The RawDropSlabKey is a type-erased variant of DropSlabKey so that it can be stored as a handle in a type or collection without requiring a generic T.
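
The pool-with-channel idea can be sketched with the standard library alone. This is an illustrative approximation, not the rafx-base implementation: `OwnedPoolSketch`, `PooledSketch`, `acquire`, `reclaim`, and `idle_count` are hypothetical names, and the sketch does not reset values when they come back:

```rust
use std::ops::{Deref, DerefMut};
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical pool: hands out owned values and gets them back over a channel.
struct OwnedPoolSketch<T> {
    free: Vec<T>,
    returned: Receiver<T>,
    return_tx: Sender<T>,
}

/// Owned handle. It carries no borrow of the pool, only a channel sender.
struct PooledSketch<T> {
    value: Option<T>,
    return_tx: Sender<T>,
}

impl<T> Deref for PooledSketch<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.value.as_ref().expect("value is present until drop")
    }
}

impl<T> DerefMut for PooledSketch<T> {
    fn deref_mut(&mut self) -> &mut T {
        self.value.as_mut().expect("value is present until drop")
    }
}

impl<T> Drop for PooledSketch<T> {
    fn drop(&mut self) {
        if let Some(value) = self.value.take() {
            // If the pool no longer exists, the send fails and the value is dropped.
            let _ = self.return_tx.send(value);
        }
    }
}

impl<T> OwnedPoolSketch<T> {
    fn new() -> Self {
        let (return_tx, returned) = channel();
        OwnedPoolSketch { free: Vec::new(), returned, return_tx }
    }

    /// Move everything that was dropped since the last call into the free list.
    fn reclaim(&mut self) {
        while let Ok(value) = self.returned.try_recv() {
            self.free.push(value);
        }
    }

    /// Reuse a returned value if one is available, otherwise create a new one.
    fn acquire(&mut self, make: impl FnOnce() -> T) -> PooledSketch<T> {
        self.reclaim();
        let value = self.free.pop().unwrap_or_else(make);
        PooledSketch { value: Some(value), return_tx: self.return_tx.clone() }
    }

    fn idle_count(&mut self) -> usize {
        self.reclaim();
        self.free.len()
    }
}
```

Because the handle owns its value and only holds a sender, it can be moved to another thread and dropped there (when `T: Send`) without borrowing the pool.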

There are 3 variants of a generic thread-safe storage that don't require default initialization.

  • AtomicOnceCell is a thread-safe variant of OnceCell.
  • AtomicOnceCellArray is an indexed variant representing a contiguous fixed-size array of the cells using an atomic bitvector to track accesses.
  • AtomicOnceCellStack is built on top of AtomicOnceCellArray and provides push / reserve / set semantics using an atomic to track the top of the stack.

All of the Atomic storages will panic if used incorrectly, e.g. by trying to set the same cell simultaneously from multiple threads, or by attempting to read a cell that has not yet been initialized via a call to set or push. AtomicOnceCell and AtomicOnceCellArray contain tests but AtomicOnceCellStack does not because the safety invariants of the AtomicOnceCellStack rely on the underlying AtomicOnceCellArray.
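
For readers unfamiliar with the pattern, a single set-once cell can be sketched with an atomic state flag as below. This is an assumption-laden illustration and not the rafx-base code: `OnceCellSketch` and its `try_set` / `try_get` helpers are hypothetical, and the real `AtomicOnceCell` may track state differently. The panicking `set` and `get` match the behavior described above:

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicU8, Ordering};

const EMPTY: u8 = 0;
const WRITING: u8 = 1;
const READY: u8 = 2;

/// Hypothetical set-once cell that needs no default value for `T`.
struct OnceCellSketch<T> {
    state: AtomicU8,
    value: UnsafeCell<MaybeUninit<T>>,
}

// SAFETY: only the thread that wins the EMPTY -> WRITING transition writes,
// and readers touch the value only after observing READY.
unsafe impl<T: Send + Sync> Sync for OnceCellSketch<T> {}

impl<T> OnceCellSketch<T> {
    fn new() -> Self {
        OnceCellSketch {
            state: AtomicU8::new(EMPTY),
            value: UnsafeCell::new(MaybeUninit::uninit()),
        }
    }

    /// Returns the value back if the cell was already set or is being set.
    fn try_set(&self, value: T) -> Result<(), T> {
        if self
            .state
            .compare_exchange(EMPTY, WRITING, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            return Err(value);
        }
        // SAFETY: we hold the exclusive WRITING claim, so no other reference exists.
        unsafe {
            (&mut *self.value.get()).write(value);
        }
        self.state.store(READY, Ordering::Release);
        Ok(())
    }

    /// Panics on misuse, as described for the Atomic storages.
    fn set(&self, value: T) {
        if self.try_set(value).is_err() {
            panic!("cell was already set or is being set");
        }
    }

    fn try_get(&self) -> Option<&T> {
        if self.state.load(Ordering::Acquire) == READY {
            // SAFETY: READY is stored only after the value was fully written.
            Some(unsafe { (&*self.value.get()).assume_init_ref() })
        } else {
            None
        }
    }

    /// Panics if the cell has not been initialized yet.
    fn get(&self) -> &T {
        self.try_get().expect("cell read before it was set")
    }
}

impl<T> Drop for OnceCellSketch<T> {
    fn drop(&mut self) {
        if *self.state.get_mut() == READY {
            // SAFETY: READY means the value is initialized and we have exclusive access.
            unsafe { self.value.get_mut().assume_init_drop() }
        }
    }
}
```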

Update Renderer to use new render_features code.

cfeae1a

  • The RenderFeaturePlugin has been expanded with additional functions for calculating and allocating frame or submit packets, creating each job type, and determining if a given RenderView and RenderViewVisibilityQuery is relevant to that feature.
  • Added RendererThreadPool to allow the application to control the parallelization of different stages of the renderer pipeline. RendererThreadPoolNone is provided as a single-threaded default if the application does not provide its own implementation.
  • Separated RendererAssetPlugin from what was previously just a RendererPlugin containing both feature code (now moved to RenderFeaturePlugin) and asset code.
  • Added many helper functions to Renderer and RenderFrameJob to make the implementation of the RendererThreadPool easier.
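
To illustrate why the thread pool is an application-provided trait, here is a sketch with a single-threaded implementation and one that uses scoped threads from the standard library. The trait `ThreadPoolSketch`, its `run_stage` method, and the `Job` alias are hypothetical; the real `RendererThreadPool` has a different, stage-specific interface:

```rust
use std::thread;

/// A unit of work produced by the renderer for one feature.
type Job<'a> = Box<dyn FnOnce() -> usize + Send + 'a>;

/// Hypothetical application-provided scheduling policy for one pipeline stage.
trait ThreadPoolSketch {
    fn run_stage<'a>(&self, jobs: Vec<Job<'a>>) -> Vec<usize>;
}

/// Default: run every job on the calling thread, in order.
struct SingleThreaded;

impl ThreadPoolSketch for SingleThreaded {
    fn run_stage<'a>(&self, jobs: Vec<Job<'a>>) -> Vec<usize> {
        jobs.into_iter().map(|job| job()).collect()
    }
}

/// Alternative: one scoped thread per job, joined in submission order.
struct ScopedThreads;

impl ThreadPoolSketch for ScopedThreads {
    fn run_stage<'a>(&self, jobs: Vec<Job<'a>>) -> Vec<usize> {
        thread::scope(|scope| {
            // Spawn everything first so the jobs can overlap, then join.
            let handles: Vec<_> = jobs.into_iter().map(|job| scope.spawn(job)).collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("job panicked"))
                .collect()
        })
    }
}

/// The feature-side work is identical regardless of which pool runs it.
fn run_extract_stage(pool: &dyn ThreadPoolSketch, per_feature_inputs: &[Vec<u32>]) -> Vec<usize> {
    let jobs: Vec<Job<'_>> = per_feature_inputs
        .iter()
        .map(|input| {
            let job: Job<'_> = Box::new(move || input.len());
            job
        })
        .collect();
    pool.run_stage(jobs)
}
```

Swapping `SingleThreaded` for `ScopedThreads` changes how the stage is scheduled without touching the jobs themselves.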

Replace nodes_api_design with renderer_triangle example.

9d11f1f

This example builds on the asset_triangle example to show how to use rafx-renderer and a RenderFeaturePlugin to drive rendering.

Update the demo to use the new render_features code.

320909a

  • Every feature has been rewritten to use the new extract and prepare entry points and to define their custom FramePacket or SubmitPacket data.
  • The demo includes an example implementation of RendererThreadPool using bevy-tasks.

@DavidVonDerau DavidVonDerau requested a review from aclysma May 12, 2021 03:03
demo/src/components.rs (outdated, resolved)
demo/src/demo_renderer_thread_pool.rs (outdated, resolved)
demo/src/demo_renderer_thread_pool.rs (resolved)
demo/src/features/debug3d/jobs/extract.rs (outdated, resolved)
demo/src/features/debug3d/jobs/prepare.rs (outdated, resolved)
rafx-renderer/src/render_frame_job.rs (resolved)
@aclysma
Owner

aclysma commented May 13, 2021

Great work! I think we should pick just a few of the "most important suggestions" now (like the addition of the rustc-hash crate as a dependency) and follow up on most of the feedback after merging. Almost all of it is very minor/optional suggestions or questions, and I want to make sure we merge this before any other changes so that we avoid merge conflicts on such a huge change.

@DavidVonDerau DavidVonDerau force-pushed the dvd/jobs branch 3 times, most recently from c204536 to 50429bf Compare May 14, 2021 02:18
@aclysma
Owner

aclysma commented May 14, 2021

Went through anything that was changed, I trust your judgement on what should be addressed now, and what can be done later. If there's anything you think we should do later but you don't want to do right away (or don't want to do yourself), we can just add an issue for it. It's not a bad thing to have a few issues that are first-contributor-friendly, and some of the minor suggestions (like changing the hashing algorithm) would be good candidates for that.

@DavidVonDerau DavidVonDerau force-pushed the dvd/jobs branch 2 times, most recently from e61ed8c to 81ac7d7 Compare May 15, 2021 00:13
@DavidVonDerau DavidVonDerau changed the title WIP: Replace nodes with render_features and new jobs framework. Replace nodes with render_features and new jobs framework. May 15, 2021
@DavidVonDerau DavidVonDerau marked this pull request as ready for review May 15, 2021 00:15
@DavidVonDerau DavidVonDerau requested a review from aclysma May 15, 2021 00:15
In the updated `Renderer`, the `extract` stage can be seen in `Renderer::try_create_render_job` and the `prepare` or `write` stages are in `RenderFrameJob::do_render_async`.
@@ -30,7 +29,26 @@ impl ImGuiRendererPlugin {
}
}

impl RendererPlugin for ImGuiRendererPlugin {
Owner

Should this be ImGuiRenderFeaturePlugin now?

Owner

(let's do any renames after we merge this)

/// 2. back-to-front
/// 3. by feature index
/// 4. unsorted
pub type SubmitNodeSortFunction = fn(&mut Vec<RenderFeatureSubmitNode>);
Owner

Maybe one day this would be an enum with a few pre-implemented options + a callback for doing something custom

@@ -0,0 +1,345 @@
# Renderer Architecture

`rafx-renderer` and the `rafx-framework` `render_features` were inspired by the 2015 GDC talk "[Destiny's Multithreaded Rendering Architecture](http://advances.realtimerendering.com/destiny/gdc_2015/Tatarchuk_GDC_2015__Destiny_Renderer_web.pdf)".
Owner

I'll probably read through this in more detail later

@aclysma aclysma merged commit a7cd227 into aclysma:master May 15, 2021
2 participants