Atlas texture and batched rendering #429
Conversation
- Reduces video memory usage
- Removes the hack that excluded large textures from the texture atlas
- Uses the texture atlas for GUI/Nuklear stuff as well and doesn't duplicate sprites anymore

Some GUI stuff is still a bit broken when the texture atlas has more than one layer and/or compression is enabled...
resources/shaders/basic.frag (outdated)
uniform float h_color_a;
uniform float imgW;
uniform float imgH;
uniform sampler2D tex[8]; // NOTE: The size of this array has a large impact on performance...
This array size has the main impact on performance. With a GL_MAX_TEXTURE_SIZE of 16384 this only needs to be 2; 8192 is still common though, so I went with 8. Using a GL_TEXTURE_2D_ARRAY avoids the need for this array (`uniform sampler2DArray tex;`).

Would be simpler with much better performance if a proper 2D texture array (GL_TEXTURE_2D_ARRAY) was used. I couldn't quite get it to work: parts of the Nuklear GUI would be missing if the array texture was allocated with > 1 depth (layers)...
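For reference, a minimal sketch (my illustration, not code from this PR) of how the array-texture path could be set up; the sizes and layer count are just examples:

```cpp
// Allocate a GL_TEXTURE_2D_ARRAY so one sampler2DArray can replace the
// sampler2D tex[8] array. glTexStorage3D needs GL 4.2 / ARB_texture_storage;
// on plain GL 3.x, glTexImage3D per mip level works instead.
GLuint atlasArray = 0;
glGenTextures(1, &atlasArray);
glBindTexture(GL_TEXTURE_2D_ARRAY, atlasArray);
glTexStorage3D(GL_TEXTURE_2D_ARRAY, 1, GL_RGBA8, 8192, 8192, /*layers=*/2);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

// Upload one sprite into a given layer at (x, y):
// glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, x, y, layer, w, h, 1,
//                 GL_RGBA, GL_UNSIGNED_BYTE, pixels);

// The fragment shader then needs only:
//   uniform sampler2DArray tex;
//   vec4 c = texture(tex, vec3(uv, float(layerIndex)));
```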
Tidy up added rendering code and gl shaders (force-pushed from cb9676f to 78bd75a)
Remove erroneous assertion (force-pushed from fad9e90 to be4f0b1)
Looks really great, I now get an FPS of ~130 in the dungeon at 4k!
I would suggest that we drop the SpriteCache altogether, and just preload all textures at startup.
For the GUI, I wouldn't worry about batching draws, just do whatever; the textures it generates dynamically (e.g. fonts) can just be their own textures. I'm not sure how these are working in this PR at the moment.
Left a few other misc comments scattered about as well. Overall though, amazing work! Thank you very much.
@@ -257,6 +257,9 @@ namespace FARender

void SpriteCache::evict()
{
    if (!Render::SpriteGroup::canDelete())
        return;
I think we can get rid of the whole concept of a Sprite Cache.
I would be happy with just preloading all the textures we're going to use into the atlas at startup, and if we have problems with this later, then we can figure it out later.
Ahh yep sweet, will give this a go.
That method name is also terrible; it was meant to be canDeleteSprites.
Edit: how do we know which sprites we're going to use at startup?
We could dump a list, by running the game and tracing each load. Or, honestly, we could just preload every CEL/CL2 file in DIABDAT.MPQ.
@wheybags I had a go at preloading all the sprites, it seems there are simply too many.
My debug outputs: 1402 Cel files, 110247 individual sprites.
The 2D bin packer must have poor big-O behaviour and gets slower and slower, but even with that disabled, the Cel decoding alone takes > 60s.
I don't see dumping a list of used Cel files as being very maintainable.
Might have to stick with loading files at runtime?
Also I had to disable decoding these 3 Cel files as they cause a crash when decoding (they also crash if clicked in the celview executable):
"monsters/fireman/firema.cl2"
"monsters/unrav/unravw.cel"
"monsters/darkmage/dmagew.cl2"
What I would do here is keep some list of sprites, and on first run, decode them all into atlases, and save the atlases as PNGs in some temp folder. Then we can just load the single PNGs on subsequent runs. OFC we would need to invalidate that if the sprite list changes.
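A minimal sketch of that invalidation idea (my illustration with hypothetical names, not code from this repo): hash the sprite list, store the hash next to the cached atlas PNGs, and regenerate when it changes.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// FNV-1a over the concatenated sprite paths; any stable hash would work.
static uint64_t hashSpriteList(const std::vector<std::string>& spritePaths)
{
    uint64_t h = 14695981039346656037ull;
    for (const auto& path : spritePaths)
        for (char c : path)
        {
            h ^= static_cast<uint8_t>(c);
            h *= 1099511628211ull;
        }
    return h;
}

// Returns true if the cached atlas PNGs can be reused as-is.
static bool atlasCacheIsValid(const std::vector<std::string>& spritePaths, const std::string& cacheDir)
{
    std::ifstream f(cacheDir + "/atlas.hash");
    uint64_t stored = 0;
    if (!(f >> stored))
        return false; // first run, or the cache was wiped
    return stored == hashSpriteList(spritePaths);
}
```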
The 3 mentioned cel files are corrupt so won't decode under normal conditions
Ahh thanks @AJenbo that makes sense
(force-pushed from 8688322 to 617c032)
(force-pushed from efadd7e to c70fa72)
MaxRects time increases exponentially: after adding 18,000 items, MaxRects takes 1000 times longer than Skyline. This is unacceptable with ~110,000 sprites. Skyline packing efficiency is pretty good too. (force-pushed from c70fa72 to 8d5c74f)
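For context, here is a toy bottom-left skyline packer (my sketch, not the PR's implementation) showing why inserts stay cheap: each placement only scans and patches a short list of horizontal skyline segments, instead of the global free-rectangle list that MaxRects maintains.

```cpp
#include <algorithm>
#include <climits>
#include <cstdint>
#include <vector>

// The skyline is a list of horizontal segments tiling the atlas width; each
// insert picks the placement with the lowest resulting top edge.
struct Segment { int x, y, width; };

class SkylinePacker
{
public:
    SkylinePacker(int w, int h) : mWidth(w), mHeight(h) { mSkyline.push_back({0, 0, w}); }

    // Returns true and sets (outX, outY) if a w*h rect fits.
    bool insert(int w, int h, int& outX, int& outY)
    {
        int bestY = INT_MAX, bestX = 0;
        size_t bestIndex = SIZE_MAX;
        for (size_t i = 0; i < mSkyline.size(); i++)
        {
            int y;
            if (fits(i, w, y) && y + h <= mHeight && y < bestY)
            {
                bestY = y;
                bestX = mSkyline[i].x;
                bestIndex = i;
            }
        }
        if (bestIndex == SIZE_MAX)
            return false;

        addSegment(bestIndex, {bestX, bestY + h, w});
        outX = bestX;
        outY = bestY;
        return true;
    }

private:
    // Resting y for a rect of width w starting at segment i: the max height
    // of all segments it would span.
    bool fits(size_t i, int w, int& y) const
    {
        if (mSkyline[i].x + w > mWidth)
            return false;
        y = mSkyline[i].y;
        int remaining = w;
        for (size_t j = i; remaining > 0 && j < mSkyline.size(); j++)
        {
            y = std::max(y, mSkyline[j].y);
            remaining -= mSkyline[j].width;
        }
        return remaining <= 0;
    }

    // Insert the new top segment and trim/remove the segments it covers.
    void addSegment(size_t index, Segment s)
    {
        mSkyline.insert(mSkyline.begin() + index, s);
        for (size_t i = index + 1; i < mSkyline.size();)
        {
            Segment& cur = mSkyline[i];
            int newEnd = s.x + s.width;
            if (cur.x >= newEnd)
                break;
            int overlap = newEnd - cur.x;
            if (overlap < cur.width)
            {
                cur.x += overlap;
                cur.width -= overlap;
                break;
            }
            mSkyline.erase(mSkyline.begin() + i);
        }
    }

    int mWidth, mHeight;
    std::vector<Segment> mSkyline;
};
```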
components/render/atlastexture.cpp (outdated)
GLint maxTextureSize;
GLint maxTextures = 8; // Hardcoded in fragment shader, GL 3.x minimum is 16.
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);

mTextureWidth = maxTextureSize;
mTextureHeight = maxTextureSize;

// Limit number of textures to a reasonable level (measured from testing).
// Note: Increasing this has a severe impact on performance.
GLint estimatedRequiredTextures = (1uLL << 29) / (mTextureWidth * mTextureHeight);
mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
I'm getting maxTextureSize = 32768 on my GTX 1050. That makes mTextureWidth * mTextureHeight equal to 1073741824, so the division in estimatedRequiredTextures yields 0, making mTextureLayers equal to 0. This leads to an assertion failure (release_assert(layer < mTextureLayers)).
components/render/atlastexture.cpp (outdated)
// Limit number of textures to a reasonable level (measured from testing).
// Note: Increasing this has a severe impact on performance.
GLint estimatedRequiredTextures = (1uLL << 29) / (mTextureWidth * mTextureHeight);
mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
Suggested change:
- mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
+ mTextureLayers = estimatedRequiredTextures > 0 ? std::min(estimatedRequiredTextures, maxTextures) : maxTextures;
Fixing this problem leads to a glError (freeablo/components/render/sdl2backend.cpp:694, 0x0506, i.e. GL_INVALID_FRAMEBUFFER_OPERATION).
Try 1 instead of maxTextures, e.g.:

mTextureLayers = estimatedRequiredTextures > 0 ? std::min(estimatedRequiredTextures, maxTextures) : 1;

Or just:

GLint estimatedRequiredTextures = std::max<GLint>(1, (1uLL << 29) / (mTextureWidth * mTextureHeight));
Now it is allocating so much memory it freezes my system (I have 8gb of RAM and a 19gb swap partition).
It might be allocating too much VRAM on your graphics card.
Can you try this instead:

glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);
maxTextureSize = std::min(maxTextureSize, 16384);
mTextureWidth = maxTextureSize;
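Putting the two fixes together, a hedged sketch of what the final computation might look like (my combination of the suggestions above, not necessarily the exact code that was merged):

```cpp
GLint maxTextureSize = 0;
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);
// Cap the atlas dimensions: huge atlases waste VRAM and, as seen above,
// 32768^2 makes the layer estimate below round down to zero.
maxTextureSize = std::min(maxTextureSize, 16384);
mTextureWidth = maxTextureSize;
mTextureHeight = maxTextureSize;

// Budget ~512M texels across all layers, but always allocate at least one
// layer, and never more than the count hardcoded in the fragment shader.
GLint estimatedRequiredTextures =
    std::max<GLint>(1, GLint((1uLL << 29) / (uint64_t(mTextureWidth) * mTextureHeight)));
mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
```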
Now it worked. Getting 370-400 fps. Wow.
If you have time, I think the proper answer is either to move to texture arrays, or to break the batch when the atlas texture changes. The second option is more scalable IMO, but it would require some intelligent allocation of atlases (at the very least, the ground textures should all be in one atlas).
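As a rough illustration of the second option (my sketch with hypothetical names, not code from this branch): the sprite batcher flushes whenever a draw references a different atlas page than the pending batch.

```cpp
#include <vector>

struct Vertex { float x, y, u, v; };

class Batcher
{
public:
    void draw(const Vertex quad[6], unsigned atlasPage)
    {
        // A page change invalidates the pending batch: submit it first.
        if (atlasPage != mCurrentPage && !mVertices.empty())
            flush();
        mCurrentPage = atlasPage;
        mVertices.insert(mVertices.end(), quad, quad + 6);
    }

    void flush()
    {
        if (mVertices.empty())
            return;
        bindAtlasPage(mCurrentPage);
        submit(mVertices);
        mVertices.clear();
    }

private:
    // Backend-specific stubs: in a real renderer these would do a
    // glBindTexture on the page's GL texture, and a buffer upload + draw call.
    void bindAtlasPage(unsigned page) { (void)page; }
    void submit(const std::vector<Vertex>& verts) { (void)verts; }

    unsigned mCurrentPage = 0;
    std::vector<Vertex> mVertices;
};
```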
I will get around to renderer work eventually myself; if I do before this is merged, I'll just start working on this branch myself. Either way, what's here already is a good chunk of what needs to be done, so thanks a lot!
Yeah, there's some good stuff in here, but it's not stable enough to merge. Storing all the textures is a bit heavy on GPU RAM. Splitting textures into drawing-layer categories (if possible) and only caching the current level's tiles would probably be better. All good, hopefully what's here helps you out.
Use a 2D array texture 8192*8192 with 2 layers
I had a thought that clearing the atlas texture when changing levels would be a simple way to reduce the large RAM requirements of storing all the sprites, so I decided to give it a go.
That's a direction we could go, but tbh I think we should be OK to just store the whole game's assets in VRAM. Factorio does this (normally it keeps it all on the GPU; it can swap out too if you're low on VRAM, but that's a relatively recent addition for lower-end cards) and we should have waaaaaaaay less image data to store.
…tchRender

# Conflicts:
#   apps/freeablo/engine/threadmanager.cpp
#   apps/freeablo/faworld/playerbehaviour.cpp
#   components/render/nuklear_sdl_gl3.h
#   components/render/sdl2backend.cpp
#   components/render/sdl_gl_funcs.cpp

# Conflicts:
#   apps/freeablo/farender/renderer.cpp
#   apps/freeablo/faworld/actoranimationmanager.cpp
I'm going to start working on a bit of an overhaul to the rendering code, so I think I will just merge this as-is for now, and use it as a base (also avoids my PR getting absolutely massive if I just branch off this and merge it all together :v)
Spent ages trying to get this working with a GL_TEXTURE_2D_ARRAY but couldn't quite get it to work; parts of the Nuklear GUI would be missing if the array texture was allocated with > 1 depth (layers).

Currently it uses an array of textures (up to 8) to store all the images; runs at ~130 FPS on my machine (GL_TEXTURE_2D_ARRAY runs at ~180).

Needs a test on a few different PCs