RFC: High-level GPU pipeline (early, much work remains) #7892

Closed
wants to merge 29 commits into from

@hrydgard
Owner

hrydgard commented Aug 7, 2015

Very early draft of a new way of drawing PSP graphics. Instead of simply running the state machine, where our only way to do clever drawcall optimizations is to add more flags and states, this takes the command stream and bundles it up into self-contained "CommandPackets", which consist of draw calls that point to state blocks.
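As a rough illustration of the idea (not code from this branch), a packet could store deduplicated state blocks and have each draw call refer to them by index. All type and member names below (`BlendState`, `DrawCommand`, `InternBlendState`, and so on) are hypothetical, and only one kind of state block is shown:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical state block: a snapshot of one group of GPU state.
struct BlendState {
    bool enabled = false;
    uint8_t srcFactor = 0;
    uint8_t dstFactor = 0;

    bool operator==(const BlendState &o) const {
        return enabled == o.enabled && srcFactor == o.srcFactor && dstFactor == o.dstFactor;
    }
};

// A draw call points at its state by index instead of relying on whatever
// state happens to be "current" in a state machine.
struct DrawCommand {
    uint32_t vertexAddr;
    uint32_t vertexCount;
    uint32_t blendStateIndex;
};

struct CommandPacket {
    std::vector<BlendState> blendStates;
    std::vector<DrawCommand> commands;

    // Returns the index of an identical state block, appending one if needed.
    uint32_t InternBlendState(const BlendState &s) {
        for (uint32_t i = 0; i < blendStates.size(); i++) {
            if (blendStates[i] == s)
                return i;
        }
        blendStates.push_back(s);
        return (uint32_t)blendStates.size() - 1;
    }

    void AddDraw(uint32_t vertexAddr, uint32_t vertexCount, const BlendState &s) {
        commands.push_back({vertexAddr, vertexCount, InternBlendState(s)});
    }
};
```

Because every draw carries its own state references, a later pass can inspect or reorder the `commands` array with plain loops rather than tracking state transitions.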

The division between the state blocks is pretty similar to what we see in the modern GPU APIs. However, the mapping will of course not be 1:1, as we sometimes must use shaders to simulate blend features, and there is the whole stencil/alpha mess, etc.

But anyway, the fun part is that now we can do optimizations across drawcalls using straight line code without building more and more impossible-to-understand state machines. For example, the water that GTA stupidly draws using 1000 draw calls, each a single polygon with a different transform matrix, can be trivially detected and bundled up into a single draw call, which should greatly improve performance on mobile.
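A minimal sketch of that kind of cross-drawcall pass, assuming each small draw is a triangle list with its own world matrix and that a hypothetical `stateKey` summarizes all other state. Consecutive draws with the same key are merged by baking the matrix into the vertices on the CPU. The names and the row-major 3x4 matrix layout are illustrative assumptions, not taken from the branch:

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Row-major 3x4 affine matrix: each row is (rotation/scale | translation).
using Mat3x4 = std::array<float, 12>;

struct SmallDraw {
    uint32_t stateKey;        // hypothetical hash of all non-matrix state
    Mat3x4 world;             // per-draw transform
    std::vector<Vec3> verts;  // triangle list, object space
};

struct MergedDraw {
    uint32_t stateKey;
    std::vector<Vec3> verts;  // triangle list, already in world space
};

Vec3 Transform(const Mat3x4 &m, const Vec3 &v) {
    return {
        m[0] * v.x + m[1] * v.y + m[2] * v.z + m[3],
        m[4] * v.x + m[5] * v.y + m[6] * v.z + m[7],
        m[8] * v.x + m[9] * v.y + m[10] * v.z + m[11],
    };
}

std::vector<MergedDraw> MergeDraws(const std::vector<SmallDraw> &draws) {
    std::vector<MergedDraw> out;
    for (const SmallDraw &d : draws) {
        // Only consecutive draws are merged, so draw order is preserved.
        if (out.empty() || out.back().stateKey != d.stateKey) {
            out.push_back({d.stateKey, {}});
        }
        for (const Vec3 &v : d.verts) {
            out.back().verts.push_back(Transform(d.world, v));
        }
    }
    return out;
}
```

The merged draws would then be submitted with an identity world matrix. Triangle strips or fans would need converting to lists first, which this sketch does not attempt.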

Another fun thing that this enables is to take skinned meshes and convert them to use modern matrix palette techniques, where we upload the entire skeleton at once and skin the whole mesh on the GPU in a single draw call. Our "software skinning" kind of emulated this on the CPU, but it should be possible to get it a lot faster.
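For reference, the per-vertex math of matrix palette skinning can be written as a small CPU function that mirrors what a vertex shader would compute. This is only a sketch under assumptions: four influences per vertex, a row-major 3x4 bone matrix, and the names `SkinnedVertex` and `SkinVertex` are all chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Row-major 3x4 affine bone matrix.
using Mat3x4 = std::array<float, 12>;

struct SkinnedVertex {
    Vec3 pos;
    std::array<uint8_t, 4> bone;   // indices into the matrix palette
    std::array<float, 4> weight;   // expected to sum to 1
};

// Blends the vertex position through every bone that influences it.
// The palette is the whole skeleton, uploaded once for the mesh.
Vec3 SkinVertex(const std::vector<Mat3x4> &palette, const SkinnedVertex &v) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 4; i++) {
        const float w = v.weight[i];
        if (w == 0.0f)
            continue;
        assert(v.bone[i] < palette.size());
        const Mat3x4 &m = palette[v.bone[i]];
        out.x += w * (m[0] * v.pos.x + m[1] * v.pos.y + m[2] * v.pos.z + m[3]);
        out.y += w * (m[4] * v.pos.x + m[5] * v.pos.y + m[6] * v.pos.z + m[7]);
        out.z += w * (m[8] * v.pos.x + m[9] * v.pos.y + m[10] * v.pos.z + m[11]);
    }
    return out;
}
```

With the indices and weights stored per vertex, the whole mesh can share one palette, so no matrix changes are needed between parts of the mesh.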

New forms of caching also become possible, as does decoding many draw calls into a single big buffer, binding it once, and then drawing with much less buffer switching than before.
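The single-big-buffer idea could look roughly like this: every draw appends its decoded vertices to one shared buffer and remembers only its range. The `Vertex` layout and the `SharedVertexBuffer` class are hypothetical, and the actual upload and bind calls are left out because they depend on the backend.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical decoded vertex layout shared by every draw in the buffer.
struct Vertex {
    float x, y, z;
    float u, v;
};

// Where one draw's vertices live inside the shared buffer.
struct DrawRange {
    uint32_t firstVertex;
    uint32_t vertexCount;
};

class SharedVertexBuffer {
public:
    // Copies already-decoded vertices to the end of the buffer and returns
    // the range that a later draw call would reference.
    DrawRange Append(const Vertex *verts, std::size_t count) {
        DrawRange range{(uint32_t)data_.size(), (uint32_t)count};
        data_.insert(data_.end(), verts, verts + count);
        return range;
    }

    // The whole buffer would be uploaded and bound once, then each draw
    // issued with its own firstVertex/vertexCount.
    const std::vector<Vertex> &Data() const { return data_; }

    void Reset() { data_.clear(); }

private:
    std::vector<Vertex> data_;
};
```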

Also it becomes very easy to do things like vertex decoding and texture conversion in parallel.

Unfortunately this is only a very early sketch; what has been (mostly) implemented is the drawcall bundling. I've only just started on the actual drawing backends. Another open question is when to split these CommandPackets and send them off for drawing.

Hopefully, this will, once finished, lead to much greater 3D performance on platforms with slow OpenGL drivers, like most Android devices.

Also MSVC project files have not yet been updated.

@DonelBueno

Any chance to implement the feature discussed in issue #7306 (read back from the z-buffer) in this branch?

}

Command *cmd = &cmdPacket->commands[cmdPacket->numCommands++];
if (cmdPacket->numCommands == cmdPacket->maxCommands) cmdPacket->full = true;
Collaborator

Should this be e.g. full = STATE_FRAGMENT;?

-[Unknown]

Start hooking up the new unimplemented texture cache. Improve command buffer disasm.
@VIRGINKLM
Contributor

Slightly unrelated, but talking about low-level APIs: Android picked Vulkan for Android L, and it's going to be mandatory for ARMv8 devices (ARMv7-A and older devices can use Vulkan if they get driver support, and fall back to OGL ES in case they don't).

@unknownbrackets
Collaborator

Cool, I'm glad it's mandatory for ARMv8. That's pretty interesting.

-[Unknown]

@VIRGINKLM
Contributor

It seems they want to move on from APIs built on the older philosophy, to match the performance of Apple's Metal.

@hrydgard
Owner Author

@VIRGINKLM Just curious, where did you see that it will be mandatory on ARMv8? Great news if so, I just hadn't heard it before.

And I'm sure you mean Android M or N, as L (Lollipop) has been out for a while.

@VIRGINKLM
Contributor

They are making some changes in the AOSP code submissions that indicate that. Things may change in the end, but it seems they are trying to do what they did with Dalvik and ART. Yeah, you are right, it was Android N, or at least a substantial update to Marshmallow, similar to 4.1/4.2 and 4.3. It seems they want to force SoC companies to create Vulkan drivers, because otherwise the vendors will procrastinate and keep using OGL for a loooong time, leaving Android's quality and performance lagging behind the competition. So the smartest way to do it is to take a relatively new architecture and bind it to another technology, in this case ARMv8 and Vulkan: if a company wants to bring out a device that uses the ARMv8 architecture, it will need to make a Vulkan driver for it, otherwise it will have to fall back to ARMv7-A, which allows OGL. Clever and fair tbh.

@dmpe

dmpe commented Oct 17, 2015

Maybe off-topic, but https://www.khronos.org/assets/uploads/developers/library/2015-siggraph/3D-BOF-SIGGRAPH_Aug15.pdf page 138f. may be interesting to you.

@unknownbrackets
Collaborator

Crazy idea: if we queued most everything, could we keep the CPU going even during a vsync flip, and just block when something tries to render next time?

That's kinda what happens on the PSP itself, since the GPU runs independently. Ideally we want to block for vblank, but other threads can and do continue running.

This could potentially allow audio for example to be a bit smoother even when auto frameskip is happening.

-[Unknown]

@VIRGINKLM
Contributor

Vulkan 1.0 is now available!
https://www.khronos.org/vulkan/

PS: Qualcomm is starting support for its Adreno 4XX line, and Nvidia seems to be jumping on the "fast adoption" bandwagon too.
https://developer.nvidia.com/vulkan-driver

@hrydgard
Owner Author

@VIRGINKLM fully aware of that :) I pushed my Vulkan work for PPSSPP here: https://github.com/hrydgard/ppsspp/tree/vulkan

As for this branch, I'm no longer sure this is the right approach.

@hrydgard
Owner Author

hrydgard commented May 9, 2016

I don't much believe in this approach any more, I think it can be done in an easier way.

@hrydgard hrydgard closed this May 9, 2016
@Anuskuss
Contributor

@hrydgard What about support for devices which haven't got Vulkan (yet)? I don't know how similar Vulkan and Metal are in terms of coding, but if porting isn't easy, this would be the way to go for iDevices.

@hrydgard
Owner Author

That's definitely a consideration but I no longer think this change has the right approach, I think I can do better. Probably not gonna happen anytime soon though...

@hrydgard hrydgard deleted the high-gpu branch March 18, 2017 10:41
6 participants