Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Fixes for Vulkan backend #5

Merged
merged 6 commits into from
Jun 14, 2021
Merged

[WIP] Fixes for Vulkan backend #5

merged 6 commits into from
Jun 14, 2021

Conversation

Spirrwell
Copy link
Contributor

As per suggestion, I'm making a WIP pull request for the Vulkan-related issues I've fixed/am fixing. So far, this should fix issues #2 and #3

I would appreciate verification that there are no regressions and that what's done so far is fine code-wise.

I do want to tackle any remaining critical issues. I've been mostly going by what @Yamagi mentioned in the ref_vk removal commit on yquake2.

If there are specific examples of critical issues that are reproducible I would appreciate any information on that.

There is one last critical issue that I know of that I THINK I've been able to reproduce, and that's some leak regarding models that Yamagi mentioned. I can trigger a "bad model" error almost every time going from a lower resolution windowed mode to fullscreen.

I want to focus on the critical issues for now, and later if I still feel up for it, I can maybe look into porting GL3's lighting as per issue #1

Spirrwell added 4 commits May 29, 2021 05:24
This fixes issue "3" from here: yquake2#3

This differs from GL3 in that it still supports vk_picmip.

Vk_Upload32 is no longer used, but I left it in case it has some use in the future.
Missed this when removing power of 2 scaling for issue yquake2#3
Could just remove Vk_Upload32, but it might be useful in the future.
@Spirrwell
Copy link
Contributor Author

I gotta admit, I feel like I'm going around in circles. I do think a lot of the problems could probably simply be solved by just rebuilding the swap chain instead of restarting/reloading the renderer in place.

I wanted to figure out what was going on before I did that. BUT when stepping through the debugger I might get thrown into different paths depending on where I put a particular breakpoint. I'm running KDE on Manjaro for example, and I think the windowing system will just resize the window whenever, before you even get to the point where SDL could even report a resize event. (which I know SDL isn't used like that here in yquake2)

I think the only appropriate way to handle it is to respond to the vkQueuePresent suboptimal/out of date errors and rebuild the swapchain. I don't think this will be difficult. I've just sort of been spinning my wheels and felt like it was worth mentioning what I've been up to.

So I'll continue to work on that.

@Yamagi
Copy link
Member

Yamagi commented Jun 6, 2021

I think the only appropriate way to handle it is to respond to the vkQueuePresent suboptimal/out of date errors and rebuild the swapchain.

I think that would be easiest way. The current restart / recovery code is a hard to understand spaghetti mess. And at least rebuilding the models after a partial renderer reload (without reloading the renderer dll) is hard and error prone.

Spirrwell added 2 commits June 8, 2021 03:54
To fully explain, for me on KDE, when a window gets created for exclusive fullscreen, the window will be hidden/shown/resized in a weird way that leads to out of data/suboptimal swapchain errors.

The problem with simply recreating the swap chain based on old code was that it was missing QVk_UpdateTextureSampler calls for vk_colorbuffer and vk_colorbufferWarp which update the descriptor sets, which probably lead to a whole lot of problems.
@Spirrwell
Copy link
Contributor Author

Spirrwell commented Jun 8, 2021

Alright, I THINK I've fixed the issues that were happening. I explained it in the commit, but I'll explain here.

I'm running KDE. When we create an SDL window with native/exclusive fullscreen, this CAN immediately lead to suboptimal/out of date KHR errors. If there's enough delay between when the window is created and when we present, it doesn't happen. So this was hard to pin down.

The solution obviously seems to be to recreate the swap chain. But the old QVk_RecreateSwapchain() function did not call QVk_UpdateTextureSampler() for vk_colorbuffer and vk_colorbufferWarp to update their descriptor sets so that they point to the new images. This would lead to those corruption issues.

AFAIK, it's all fixed and should work. I HOPE. That was not fun figuring out lol.

So please, test it, try to break it. If there's any code changes you want me to make, let me know, but this should be the last major issue.

@Yamagi
Copy link
Member

Yamagi commented Jun 12, 2021

AFAIK, it's all fixed and should work. I HOPE. That was not fun figuring out lol.

And it looks good :) Just to small questions:

  • Did you have the chance to test the swapchain recreation on an Intel GPU under Windows? Because most of the restart bugs and crashes triggered only on Intels crappy driver for Windows.
  • You've removed the restart loop prevention and replaced it by vk_recreateSwapchainNeeded = true;. is the new restart logic really loop save? Not that I've ever managed to provoke a restart loop, but it was theoretically possible. And under Windows it's nearly impossible to break out of such a loop...

@Spirrwell
Copy link
Contributor Author

I haven't tested with an Intel GPU, though I do have another system I could test on with an integrated HD530 if you want me to.

I couldn't say with absolute certainty that it's loop safe. I'd say that the best we could do is maybe check if re-creating the swap chain fails on vkCreateSwapchainKHR, then fallback to another backend.

If creating a swap chain succeeds and leads to out of date/suboptimal errors in an infinite loop (which would leave the game running, just with no video), that would indicate to me an issue with the window manager, not the driver.

Was the looping on an Intel GPU under Windows before or after QVk_RecreateSwapchain() was #ifd out and no longer used?

Because the original QVk_RecreateSwapchain() completely locked up my machine because of missing calls to QVk_UpdateTextureSampler()

@Spirrwell
Copy link
Contributor Author

Well, I went ahead and installed Windows on my Intel machine with integrated graphics. It runs pretty much perfectly for me.

The only minor issue I had was that alt+tabbing when fullscreen doesn't hide the window, unless I go into fullscreen using Alt+Enter. This isn't present in the OpenGL backends, though the software renderer stopped displaying anything after alt+tabbing.

I'll try to investigate the Vulkan alt+tab issue since it's inconvenient, but I'm not experiencing any major problems.

@Spirrwell
Copy link
Contributor Author

Spirrwell commented Jun 13, 2021

It looks like the alt+tabbing issues I was having stem from SDL_SetHint(SDL_HINT_VIDEO_MINIMIZE_ON_FOCUS_LOSS, "1"); in glimp_sdl.c from yquake2.

If I comment this out, it alt+tabs as it should. (though without minimizing the window) Which reading the comment there looks like it was put there to workaround the fact it's no longer default in new SDL2 versions.

So honestly, I don't know whether this is an SDL2 issue or a driver issue.

@Spirrwell
Copy link
Contributor Author

I did go ahead and update my drivers to the latest from Intel's website and the alt+tabbing weirdness went away. The ones shipped with Windows 10 were from like 2019, and the ones from Intel are from March of this year.

I never got any critical restart loop/bad behavior. Just the minor alt+tab inconvenience.

If there is a specific driver version that maybe ships on an older version of Windows, I could maybe test that. But otherwise, I don't know what else to do other than say whoever has those issues should update their drivers.

@j4reporting
Copy link

I did test the VK driver briefly on my Intel NUC6i5 (driver 100.9466 ) and could not find any issues. Great work!

Driver versions: Have you tried to update the drivers via Windows Update or Device Manager? Usually Microsoft provides some kind of current drivers, depending on the device of course.

@Spirrwell
Copy link
Contributor Author

I didn't think to do that since I did this from a brand new Windows 10 install and it auto-installed an Intel graphics driver. I could be wrong, but I would assume it's the latest Windows has.

However once I installed the latest from Intel, the weird alt+tab behavior went away. It was specific too, like if I had at least one minimized window open, it would alt+tab correctly. If they weren't minimized, yquake2 would stay visible.

But again getting the latest driver from Intel solved that.

@Yamagi
Copy link
Member

Yamagi commented Jun 14, 2021

Thank you both for testing it under Windows. I haven't had the time until know... I' merge it, tag a new release and upload a new Windows binary. That way everyone interested can test and give feedback.

Was the looping on an Intel GPU under Windows before or after QVk_RecreateSwapchain() was #ifd out and no longer used?

The restart loop was (at least as far as I know) always theoretically. I've never managed to trigger one. So we can leave things as they are. If anyone reports a loop, we can have another look and come up with a real fix.

However once I installed the latest from Intel, the weird alt+tab behavior went away. It was specific too, like if I had at least one minimized window open, it would alt+tab correctly. If they weren't minimized, yquake2 would stay visible.

But again getting the latest driver from Intel solved that.

That's more or less normal. Intels Windows drivers are known to be buggy. Over the years we had a lot of strange problem with them. Not only graphics, but also sound over DisplayPort and such fun things...

@Yamagi Yamagi merged commit a7e53dd into yquake2:master Jun 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants