-
Notifications
You must be signed in to change notification settings - Fork 282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use z-buffer for rejecting opaque fragments. #648
Conversation
This still needs a little bit of cleanup - but it implements the functionality and passes all tests. So it should be fairly close to the final version. |
@@ -286,6 +287,7 @@ struct PrimitiveInstance { | |||
int clip_task_index; | |||
int layer_index; | |||
int sub_index; | |||
int z; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could probably reduce the user_data
to a single int
in order to keep the 32 byte size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I think it's fine to do this as a follow up though.
@@ -352,6 +356,7 @@ Primitive load_primitive_custom(PrimitiveInstance pi) { | |||
prim.prim_index = pi.specific_prim_index; | |||
prim.sub_index = pi.sub_index; | |||
prim.user_data = pi.user_data; | |||
prim.z = pi.z; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some compilers may require an explicit cast here.
Moreover, what are you going to do with this Z value? We need to scale it down into [-1, 1]
region before passing into gl_Position
, and I don't see this happening anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the cast. The vertices are transformed by uTransform, which is an orthographic projection matrix at the end of the function. This converts the z values to NDC.
} | ||
} | ||
|
||
pub fn enable_depth(&self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could merge those enable_*/disable_*
for tinier interface with the device
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, perhaps as a follow up we should tidy up the whole device interface?
@@ -39,6 +39,7 @@ void main(void) { | |||
#ifdef WR_FEATURE_TRANSFORM | |||
TransformVertexInfo vi = write_transform_vertex(segment_rect, | |||
prim.local_clip_rect, | |||
prim.z, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at this point, we might as well want to just pass prim
there directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the shaders generate the local rect separately (as in the case above), which is why I wasn't passing in the prim directly - to avoid bugs where that code accesses the prim.local_rect accidentally. But we could probably tidy up those structure definitions in the shader code.
@@ -123,7 +122,6 @@ const CLIP_FEATURE: &'static str = "CLIP"; | |||
|
|||
enum ShaderKind { | |||
Primitive, | |||
Clear, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sweeet!
@@ -990,12 +961,87 @@ impl Renderer { | |||
self.profile_counters.draw_calls.inc(); | |||
} | |||
|
|||
fn submit_batch(&mut self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
let needs_clipping = batch.key.flags.needs_clipping(); | ||
debug_assert!(!needs_clipping || batch.key.blend_mode == BlendMode::Alpha); | ||
self.device.enable_depth(); | ||
self.device.enable_depth_write(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we set the depth function anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just had one question
@@ -120,7 +120,6 @@ fn main() { | |||
debug: false, | |||
enable_subpixel_aa: false, | |||
clear_framebuffer: true, | |||
clear_empty_tiles: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we not clear empty tiles anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's already guaranteed to be cleared in draw_target
. Not sure though, at what point it became redundant/useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have to clear the z-buffer now, I figure we might as well just do a clear of the color buffer as well. This also makes it perform much better on mobile / tiled GPUs which rely on a clear due to the way tiling works. We could definitely re-instate this later as an optimization if it makes sense to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glennw I agree with all the "let's do X as follow up" requests, the PR looks good!
@@ -120,7 +120,6 @@ fn main() { | |||
debug: false, | |||
enable_subpixel_aa: false, | |||
clear_framebuffer: true, | |||
clear_empty_tiles: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's already guaranteed to be cleared in draw_target
. Not sure though, at what point it became redundant/useless.
AMD GCN architecture, covering probably most of the AMD cards on the market, has meta-data for clears, fragment masks, and depth/stencil bounds - all per internal tile. So clearing on them is dirt cheap, you just pay a little before sampling from those targets for the first time, since the driver needs to fill up the untouched tiles. It basically does what is being removed from WR now, just more efficiently ;)
… On Dec 22, 2016, at 18:10, Glenn Watson ***@***.***> wrote:
@glennw commented on this pull request.
In replay/src/main.rs:
> @@ -120,7 +120,6 @@ fn main() {
debug: false,
enable_subpixel_aa: false,
clear_framebuffer: true,
- clear_empty_tiles: false,
Since we have to clear the z-buffer now, I figure we might as well just do a clear of the color buffer as well. This also makes it perform much better on mobile / tiled GPUs which rely on a clear due to the way tiling works. We could definitely re-instate this later as an optimization if it makes sense to.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
@bors-servo r+ |
📌 Commit 7eb458c has been approved by |
Use z-buffer for rejecting opaque fragments. <!-- Reviewable:start --> This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/648) <!-- Reviewable:end -->
☀️ Test successful - status-travis |
This change is