RFC: High-level GPU pipeline (early, much work remains) #7892

hrydgard · 2015-08-07T20:40:51Z

Very early draft of a new way of drawing PSP graphics. Instead of simply running the state machine, where our only way to do clever draw call optimizations is to add more flags and states, this takes the command stream and bundles it up into self-contained "CommandPackets", which consist of draw calls that point to state blocks.
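
To make the "draw calls that point to state blocks" idea concrete, here is a minimal sketch. It is not the code in this branch: `PacketSketch`, `BlendState`, `DepthState`, `DrawCommand` and all of their fields are hypothetical names, and real state blocks would carry far more fields.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical state blocks; the real split in the branch may differ.
struct BlendState { uint32_t srcFactor = 0, dstFactor = 0; bool enabled = false; };
struct DepthState { uint32_t func = 0; bool testEnabled = false, writeEnabled = false; };

inline bool operator==(const BlendState &a, const BlendState &b) {
	return a.srcFactor == b.srcFactor && a.dstFactor == b.dstFactor && a.enabled == b.enabled;
}
inline bool operator==(const DepthState &a, const DepthState &b) {
	return a.func == b.func && a.testEnabled == b.testEnabled && a.writeEnabled == b.writeEnabled;
}

// A draw call refers to its state by index instead of carrying a copy.
struct DrawCommand {
	uint32_t vertexAddr = 0;
	uint32_t vertexCount = 0;
	uint16_t blend = 0;  // index into PacketSketch::blendStates
	uint16_t depth = 0;  // index into PacketSketch::depthStates
};

struct PacketSketch {
	std::vector<BlendState> blendStates;
	std::vector<DepthState> depthStates;
	std::vector<DrawCommand> draws;

	// Reuse the most recent block if the state is unchanged since the last draw,
	// otherwise append a new block. Returns the index of the block to use.
	template <class T>
	static uint16_t Intern(std::vector<T> &blocks, const T &state) {
		if (blocks.empty() || !(blocks.back() == state))
			blocks.push_back(state);
		return static_cast<uint16_t>(blocks.size() - 1);
	}

	void AddDraw(uint32_t addr, uint32_t count, const BlendState &b, const DepthState &d) {
		DrawCommand cmd;
		cmd.vertexAddr = addr;
		cmd.vertexCount = count;
		cmd.blend = Intern(blendStates, b);
		cmd.depth = Intern(depthStates, d);
		draws.push_back(cmd);
	}
};
```

Because consecutive draws with identical state share a block index, a later pass can compare two draws' state by comparing small integers rather than diffing the whole register set.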

The division between the state blocks is pretty similar to what we see in the modern GPU APIs. However, the mapping will of course not be 1:1, as we sometimes must use shaders to simulate blend features, deal with the whole stencil/alpha mess, etc.

But anyway, the fun part is that now we can do optimizations across drawcalls using straight line code without building more and more impossible-to-understand state machines. For example, the water that GTA stupidly draws using 1000 draw calls, each a single polygon with a different transform matrix, can be trivially detected and bundled up into a single draw call, which should greatly improve performance on mobile.
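
A rough sketch of what such a straight-line pass could look like, under some loud assumptions: `stateKey` stands in for "all state except the world matrix", the merge is done by pre-transforming positions on the CPU, and the primitives are independent triangles. None of these names come from the branch, and other merge strategies are possible.

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical 4x3 world matrix stored as four columns:
// three basis vectors followed by the translation.
struct Mat4x3 { Vec3 col[4]; };

inline Vec3 Transform(const Mat4x3 &m, const Vec3 &v) {
	return Vec3{
		m.col[0].x * v.x + m.col[1].x * v.y + m.col[2].x * v.z + m.col[3].x,
		m.col[0].y * v.x + m.col[1].y * v.y + m.col[2].y * v.z + m.col[3].y,
		m.col[0].z * v.x + m.col[1].z * v.y + m.col[2].z * v.z + m.col[3].z,
	};
}

struct SmallDraw {
	int stateKey;             // hypothetical: identifies all non-matrix state
	Mat4x3 world;
	std::vector<Vec3> verts;  // model-space positions (triangle list)
};

struct MergedDraw {
	int stateKey;
	std::vector<Vec3> verts;  // world-space positions, to be drawn with an identity world matrix
};

// Merge runs of adjacent draws that differ only in their world matrix.
// Only neighbours are merged, so the original draw order is preserved.
std::vector<MergedDraw> MergeByState(const std::vector<SmallDraw> &draws) {
	std::vector<MergedDraw> out;
	for (const SmallDraw &d : draws) {
		if (out.empty() || out.back().stateKey != d.stateKey)
			out.push_back(MergedDraw{d.stateKey, {}});
		for (const Vec3 &v : d.verts)
			out.back().verts.push_back(Transform(d.world, v));
	}
	return out;
}
```

A real version would also have to transform normals when lighting is on, and handle strips and fans, which cannot simply be concatenated.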

Another fun thing that this enables is to take skinned meshes and convert them to use modern matrix palette techniques, where we upload the entire skeleton at once and skin the whole mesh on the GPU in a single draw call. Our "software skinning" kind of emulated this on the CPU, but it should be possible to get it a lot faster.
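
For reference, this is the per-vertex math a matrix palette vertex shader would evaluate, written as plain CPU code so it can be checked. The vertex layout (four indices and four weights per vertex) and all names are assumptions made for the sketch, not the PSP vertex format or anything in this branch.

```cpp
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical 4x3 bone matrix stored as four columns:
// three basis vectors followed by the translation.
struct Mat4x3 { Vec3 col[4]; };

inline Vec3 Transform(const Mat4x3 &m, const Vec3 &v) {
	return Vec3{
		m.col[0].x * v.x + m.col[1].x * v.y + m.col[2].x * v.z + m.col[3].x,
		m.col[0].y * v.x + m.col[1].y * v.y + m.col[2].y * v.z + m.col[3].y,
		m.col[0].z * v.x + m.col[1].z * v.y + m.col[2].z * v.z + m.col[3].z,
	};
}

// Hypothetical converted vertex: each influence names a bone in the shared palette.
struct SkinnedVertex {
	Vec3 pos;
	uint8_t boneIndex[4];
	float weight[4];
};

// Weighted sum of the position transformed by each referenced bone.
// The whole skeleton lives in `palette`, so one draw call can cover the mesh.
Vec3 SkinVertex(const SkinnedVertex &v, const std::vector<Mat4x3> &palette) {
	Vec3 out = {0.0f, 0.0f, 0.0f};
	for (int i = 0; i < 4; i++) {
		if (v.weight[i] == 0.0f)
			continue;  // unused influence slot
		Vec3 p = Transform(palette[v.boneIndex[i]], v.pos);
		out.x += p.x * v.weight[i];
		out.y += p.y * v.weight[i];
		out.z += p.z * v.weight[i];
	}
	return out;
}
```

The conversion step would have to work out which bone matrices each original draw call was using and rewrite its vertices to carry palette indices.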

This also makes new forms of caching possible, as well as decoding many draw calls into a single big buffer, binding it once, and then drawing with much less buffer switching than before.
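
A minimal sketch of the single-big-buffer idea, assuming a fixed decoded vertex layout: every draw appends into one shared array and remembers only its range. `FrameVertexBuffer`, `DecodedVertex` and `DrawRange` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical decoded vertex layout shared by all draws in a packet.
struct DecodedVertex { float x, y, z; float u, v; uint32_t color; };

// Where one draw call's vertices live inside the shared buffer.
struct DrawRange { uint32_t firstVertex; uint32_t vertexCount; };

class FrameVertexBuffer {
public:
	// Append one draw call's decoded vertices and return the range to draw later.
	DrawRange Append(const DecodedVertex *verts, uint32_t count) {
		DrawRange r{static_cast<uint32_t>(data_.size()), count};
		data_.insert(data_.end(), verts, verts + count);
		return r;
	}

	const std::vector<DecodedVertex> &Data() const { return data_; }

	// Called once the buffer has been uploaded and all ranges have been drawn.
	void Reset() { data_.clear(); }

private:
	std::vector<DecodedVertex> data_;
};
```

With OpenGL, for example, the backend could upload `Data()` once and then issue each draw as `glDrawArrays(mode, range.firstVertex, range.vertexCount)` without rebinding the vertex buffer in between.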

Also it becomes very easy to do things like vertex decoding and texture conversion in parallel.

Unfortunately this is only a very early sketch. What has been (mostly) implemented is the draw call bundling; I've only just started on the actual drawing backends. Another open question is when to split these CommandPackets and send them off for drawing.

Hopefully, this will, once finished, lead to much greater 3D performance on platforms with slow OpenGL drivers, like most Android devices.

Also MSVC project files have not yet been updated.

DonelBueno · 2015-08-10T14:07:37Z

Any chance to implement the feature discussed in issue #7306 (read back from the z-buffer) in this branch?

unknownbrackets · 2015-08-12T12:59:12Z

GPU/High/Command.cpp

+	}
+
+	Command *cmd = &cmdPacket->commands[cmdPacket->numCommands++];
+	if (cmdPacket->numCommands == cmdPacket->maxCommands) cmdPacket->full = true;


Should this be e.g. full = STATE_FRAGMENT;?

-[Unknown]

VIRGINKLM · 2015-10-13T20:54:32Z

Slightly unrelated, but talking about low-level APIs: Android picked Vulkan for Android L, and it's going to be mandatory for ARMv8 devices (ARMv7-A and older devices can use Vulkan if they get driver support, and fall back to OGL ES in case they don't).

unknownbrackets · 2015-10-14T04:25:39Z

Cool, I'm glad it's mandatory for ARMv8. That's pretty interesting.

-[Unknown]

VIRGINKLM · 2015-10-14T14:16:52Z

It seems they want to move on from older philosophy APIs to match Apple's Metal performance.

hrydgard · 2015-10-14T21:21:30Z

@VIRGINKLM Just curious, where did you see that it will be mandatory on ARMv8? Great news if so, I just hadn't heard it before.

And I'm sure you mean Android M or N, as L (Lollipop) has been out for a while.

VIRGINKLM · 2015-10-15T01:18:06Z

They are making some changes in the AOSP code submissions that indicate that. Things may change in the end, but it seems they are trying to do what they did with Dalvik and ART. Yeah, you are right, it was Android N, or at least a substantial update to Marshmallow, similar to 4.1/4.2 and 4.3. It seems they want to force SoC companies to create Vulkan drivers, because otherwise the vendors will procrastinate and keep using OGL for a long time, leaving Android quality and performance lagging behind the competition. So the smartest way to do it is to take a relatively new architecture and bind it to another, in this case ARMv8 and Vulkan: if a company wants to bring out a device that uses the ARMv8 architecture, it will need to make a Vulkan driver for it, otherwise it'll have to fall back to ARMv7-A, which allows OGL. Clever and fair, to be honest.

dmpe · 2015-10-17T20:25:27Z

Maybe off-topic, but https://www.khronos.org/assets/uploads/developers/library/2015-siggraph/3D-BOF-SIGGRAPH_Aug15.pdf, page 138 onward, might be interesting to you.

unknownbrackets · 2015-12-25T21:10:37Z

Crazy idea: if we queued most everything, could we keep the CPU going even during a vsync flip, and just block when something tries to render next time?

That's kinda what happens on the PSP itself, since the GPU runs independently. Ideally we want to block for vblank, but other threads can and do continue running.

This could potentially allow audio for example to be a bit smoother even when auto frameskip is happening.

-[Unknown]

VIRGINKLM · 2016-02-18T05:23:28Z

Vulkan 1.0 is now available!
https://www.khronos.org/vulkan/

PS: Qualcomm is starting support for its Adreno 4XX line, and Nvidia seems to be jumping on the "fast adoption" bandwagon too.
https://developer.nvidia.com/vulkan-driver

hrydgard · 2016-02-18T08:39:21Z

@VIRGINKLM fully aware of that :) I pushed my Vulkan work for PPSSPP here: https://github.com/hrydgard/ppsspp/tree/vulkan

As for this branch, I'm no longer sure this is the right approach.

hrydgard · 2016-05-09T09:22:07Z

I don't much believe in this approach any more; I think it can be done in an easier way.

Anuskuss · 2016-05-19T00:43:33Z

@hrydgard What about support for devices which haven't got Vulkan (yet)? I don't know how similar Vulkan and Metal are in terms of coding, but if porting isn't easy, this would be the way to go for iDevices.

hrydgard · 2016-05-19T07:45:39Z

That's definitely a consideration, but I no longer think this change takes the right approach; I think I can do better. Probably not gonna happen anytime soon though...

unknownbrackets reviewed Aug 12, 2015

hrydgard added 26 commits September 2, 2015 14:08

HighGPU WIP

0a088db

More HighGPU WIP. Builds now, sort of.

d2d83f2

Can now run the new backend. It crashes, unsurprisingly.

1d12328

Work towards running. Command translation: Fix index and vertex address calculations

b6f1258

Use the old framebuffer manager, it's nearly "gstate-less".

9a067eb

Things now "run" without crashing. Not really working though.

076aeb1

Implement HighGpu command buffer logging, flush the command buffer, fix a lot of things

2893051

Command block light fixes. Log what causes the block to be full.

7880e30

CLUT wip

0fbaada

highgpu: Fix crashes in framebuffermanager

5cca66d

Minor bugfixes

c3e0b02

Optimize HighGPU dl parser a bit. More dupe detection (bones, lights).

7790ac3

Start hooking up the new unimplemented texture cache. Improve command buffer disasm.

CLUT fixes

4543a03

Some cleanups, fix use of uninitialized values

d570d39

Fix setting the displayed framebuffer (dangerous to comment out too much)

afabffc

Start porting over the texture cache to the new framework.

2562098

Remove more gstate from new texcache. Fix some bugs in light state handling.

50aad41

More state fixes, more repetition detection

1842196

Fix issues with gllostmanager not working with the new backend

dd07f7b

Be more careful with alignment in MemoryArena

206b96a

Work towards porting the shader generators to the High framework.

4f78633

Get rid of more uses of gstate in shadergens

4dc7063

More of the same

6564e2b

Actually call the new frag/vert ID functions

b44a453

Intermediate commit

27b6775

Crashfixes and cleanup

9d6efe1

hrydgard force-pushed the high-gpu branch from 97f85c7 to a7ac675 Compare September 2, 2015 12:33

hrydgard added 3 commits September 2, 2015 14:33

Always load the viewport/region. Necessary for FBO heuristics.

a7ac675

Make the Vim autocompletion plugin "YouCompleteMe" happy

e60cc07

Buildfix

619c153

hrydgard closed this May 9, 2016

hrydgard deleted the high-gpu branch March 18, 2017 10:41
