Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finally AMD delivered the fix to support separate shader object !!! #1508

Merged
merged 1 commit into from
Sep 8, 2016

Conversation

gregory38
Copy link
Contributor

You need latest AMD driver 16.7.3 or 16.8...

@@ -276,9 +276,9 @@ namespace GLLoader {
mesa_amd_buggy_driver = intel_buggy_driver = true;

#ifdef _WIN32
buggy_sso_dual_src = intel_buggy_driver || fglrx_buggy_driver || legacy_fglrx_buggy_driver;
buggy_sso_dual_src = intel_buggy_driver || legacy_fglrx_buggy_driver;

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@Dokman
Copy link
Contributor

Dokman commented Aug 10, 2016

Finally

@mirh
Copy link

mirh commented Aug 13, 2016

So.. I can confirm 16.7.3 and newer works.
I have been testing dll swap, and the fix seems to technically reside inside atioglxx.dll (build is 6.14.10.13447).

Core profile version is reported to be 16.300.2311.0.
If it was to be for me, I'd put everybody before this on the fglrx_buggy_driver path.

Sure, GCN hardware was at least in the pcsx2 cases I could spot fixed since 16.5.2... But I don't see how those people couldn't update drivers.
To simplify things, I'd just merge this situation with non-GCN cards one.

@mirh mirh mentioned this pull request Aug 13, 2016
mirh referenced this pull request Aug 14, 2016
… driver

Legacy GPU:
Older driver will be broken.

Still supported GPU:
Please upgrade to the latest AMD driver 16.5.2 or 16.5.3 (and prey that future driver will still work)
@Nucleoprotein
Copy link

Nucleoprotein commented Aug 14, 2016

Test app works but I think this driver require more tests because of this issue: hrydgard/ppsspp#8698 (nice screenshot after TDR: http://imgur.com/8TuK2Us) This crash is present in 16.8.2, and it can also affects PCSX2.

PS: Crash free are only pre-dual blending fix drivers.
PS2: Test case for that crash will be nice for reporting this to AMD.

@mirh
Copy link

mirh commented Aug 15, 2016

That's another problem I think, if you have to rollback to 15.7 and if atio6axx.dll is needed too.

@gregory38
Copy link
Contributor Author

@Nucleoprotein what means TDR?

So they did 6 months of QA to deliver a bad driver. Well done guys. Feel free to report an AMD bug. On my side, I won't bother to work on it. It would be as fast as to wait the next month driver.

@Nucleoprotein
Copy link

TDR is Timeout Detection and Recovery: https://msdn.microsoft.com/en-us/library/windows/hardware/ff570087(v=vs.85).aspx ie. display driver hang/crash.
For reporting this to AMD I need a easy test case because I think they not check PPSSPP with specific games ... Last free of crash AMD OpenGL (atio6axx.dll) is 6.14.10.13411 (file version), I'm using that for PPSSPP, I will test some my PCSX2 games for crashes but crash is very specific so it can no affect PCSX2.

PS: I noticed many other bugs in Crimson drivers from first 16 version like tessellation bug in DX11 in Max Payne 3, I think they changed compiler and are using more aggressive optimizations and have some regressions. Vulkan driver seems to be stablest one but cannot test it much because lack of software using it - only Dolphin PR that runs about 25% faster than DX12 :p

@mirh
Copy link

mirh commented Aug 15, 2016

So they did 6 months of QA to deliver a bad driver.

I don't think it has anything to do with SSO. Besides, it has been _7_ months d:

like tessellation bug in DX11 in Max Payne 3

There's a "tesselation optimization" setting in Control Panel. You could try experimenting with that.

I think they changed compiler

Don't know for opengl, but at least for DX12 they did.

@Nucleoprotein
Copy link

@mirh

There's an "tesselation optimization" setting in Control Panel. You could try experimenting with that.

Only switching to older usermode DX11 driver (atidxx32.dll) fixes Max Payne 3 problem.

Don't know for opengl, but at least for DX12 they did.

SSE4.1/POPCNT topic on AMD forums is my topic they seems to ignore ...

@mirh
Copy link

mirh commented Aug 15, 2016

I wonder if now that Inspector is FOSS, one day we couldn't get a similar tool to play with AMD blb bits.
There could even possibly be the secret for multi-thread GL there (which I assume has to exist because they aren't getting abysmal performance compared to nvidia in games like NMS or DOOM)
EDIT2: ok perhaps not

EDIT: A nice thing that could be checked is whatever AMD might have done here.
EDIT3: this is very interesting

@Nucleoprotein
Copy link

Nucleoprotein commented Aug 15, 2016

@mirh
I think many compatibility thing lays in atiapfxx.blb, it can be decompiled to XML using atiapfxx.exe, run atiapfxx.exe -hiddenhelp for usage notes.

EDIT: Upps, it seems you already knows that.

@gregory38
Copy link
Contributor Author

Oh I see, so we don't say anymore my PC crash due to the stinky AMD driver but instead we say my PC got a new feature called TDR 😛

For reporting this to AMD I need a easy test case because I think they not check PPSSPP with specific games ...

May I suggest to advice AMD to install AMD's GPU in the driver team workstation ;) Honestly I won't bother, if they manage to reproduce/fix it, you will need to wait 6 months to have the bug fix released. If an AAA game reports the issue, you will get the fix next week.

@Nucleoprotein
Copy link

But I do not think this affect AAA games, most OpenGL programs works well, only PPSSPP crashes.
I think you not very familiar with Windows, mostly Linux programmer ? Because TDR is not new feature, it's feature of WDDM 1.0 ie. Windows Vista and it was created to prevent BSOD - it restarts driver and display notification popup with something like "Display driver stop responding bla bla bla" ie. this guards from BSOD - if works correctly, but I crashed 16.7.x hard (BSOD or black/corrupted dispaly + freeze), most unstable driver in this year I think, 16.8.x are much more stable, 16.8.2 are required for Overwatch players (because 16.8.1 crash ...). But there is your argument - if it affects AAA game like Overwatch is patched almost instantly.
TL;DR - I recommend 16.8.2 for now, most stable driver in this year I think.

@gregory38
Copy link
Contributor Author

Well my last win OS was win2000 (not milineum).

@sirdanielk
Copy link

Is it still valid to test this extension support by override_GL_ARB_separate_shader_objects=1 in actual git version? I wanna test nongc gpu without need to compile.

@Nucleoprotein
Copy link

You can use test app from here: https://community.amd.com/thread/194895 If it's green - it works correctly.

@sirdanielk
Copy link

i see yellow, so legacy gpus not supported :/

@Nucleoprotein
Copy link

Nucleoprotein commented Aug 19, 2016

You can try to use latest OGL by extracting this file: http://www21.zippyshare.com/v/VjXvA48K/file.html to test or PCSX2 directory.

@mirh
Copy link

mirh commented Aug 19, 2016

I already tested on my VLIW laptop, it just crash.
I previously thought then, this could be a trivial fix to "hex edit" port (of course for somebody knowledgeable)... But gregory wasn't wrong when he talked about "different working branches" or something along that.
16.7.3 had a ton of other stuff, aside of ogl fix: new AMF/TAN support, and ogl dll size going from 24MB circa to 26.

@danilaml
Copy link

Is this PR waiting for something?

@FlatOutPS2
Copy link
Contributor

Are you in some sort of rush?

@danilaml
Copy link

@FlatOutPS2, nah. Just wondering.

@gregory38
Copy link
Contributor Author

Well it can be merged. It would still be very interesting to have feedback on it from AMD users. On both performance and rendering quality.

@Nucleoprotein
Copy link

Quality is good but performance seems to be worse than before so better for AMD users is to port all OGL fixes to DX11 ...

@gregory38
Copy link
Contributor Author

Let's me correct you, I think what you mean is "better for AMD users to buy an Nvidia GPU next time".

By slower, you mean this PR. Or globally GSdx is too slow on AMD.

@danilaml
Copy link

danilaml commented Aug 29, 2016

@FlatOutPS2 And? Your arguments are irrelevant, it seem like you are missing the point. @gregory38 has suggested all AMD users to buy Nvidia GPU. I'm just debating with that statement. What "laptop users this" and "hardcore gamers that" has to do with this? If anything, most of this arguments serve in the favour of staying with AMD, since they usually provide much better value per $, both on new cards market and especially on used videocards market (I bought my current R9 290 after bitcoin craze settled down just for little over 100$ (if translating to $ by exchange rate at the time)).

@mirh
Copy link

mirh commented Aug 29, 2016

Soo.. Let's forget about this semantic flame

Can appveyor build be triggered again?
Overriding sso in builds without this commit doesn't seem to impact speed for the records.

@gregory38
Copy link
Contributor Author

@mirh
I rebased the branch on master. It is hard to benchmark the performance as games are generally GPU limited rather than CPU limited.

I manged to get a whole BSOD! And it's all totally reproducible.
I just need to press start on Ace Combat 5 menu, with blending unit accuracy set to none.
Then, bam. I wonder if gs dumps with the lovely reader could trigger this too, for those without the game?

It isn't nice. Do you know if Nvidia/Intel users get this issue too ? I'm not sure we can reproduce it with a gsdump. But we could try.

@mirh
Copy link

mirh commented Aug 29, 2016

I rebased the branch on master. It is hard to benchmark the performance as games are generally GPU limited rather than CPU limited.

I guess my Core 2 Duo does reverse-miracles then 😃
Anyway I can confirm this 40% performance drop is there.

I'm not sure we can reproduce it with a gsdump. But we could try.

Well, I surely could try. But first of all, I don't know where to look for the gsdump "reproducer".
EDIT: just tested this morning build (without this commit). It does still happen there.

@gregory38
Copy link
Contributor Author

perf impact: native or upscaled resolution ? (in short CPU or GPU?)

Well try to do a long gsdump with blending enabled. Start before "press start". Then we can replay it without blending.

@mirh
Copy link

mirh commented Aug 29, 2016

Native. And in short, definitively CPU.
Which-I-mean: EE is like 20%, while GS is 95% and gpu usage is 43%. So totally driver overhead as usual.

Gs dump is here.

@gregory38
Copy link
Contributor Author

@FlatOutPS2 could you try to reproduce the issue on your side with the dump. Thanks you.

@FlatOutPS2
Copy link
Contributor

No thanks, not looking for a BSOD right now. :p I'll give it a try tomorrow.

I don't need a dump btw, I've got all the AC games. ;)

@gregory38
Copy link
Contributor Author

sure, but this way I can try to spend time on my side. If I don't manage to reproduce the issue, I won't know why (nvidia, linux).

@mirh
Copy link

mirh commented Aug 29, 2016

No thanks, not looking for a BSOD right now.

Tbh that happened only after the sixth time I tried to reproduce the issue.
When, for reasons (which could be even CRC hacks set to none, or could just be the bare number or attempts) seemed to enter an unrecoverable TDR loop.
There wasn't literally a blue screen though, and even event viewer reports nothing. So I'm kind of.. dafaq?

But don't want to try further.

@FlatOutPS2
Copy link
Contributor

I didn't manage to reproduce the issue. I've tried both the game and the dump with AMD GPU and Intel iGPU.

@mirh
Copy link

mirh commented Aug 30, 2016

I'm using 16.7.3 and w10 build 10586.545 if it can help.

@gregory38
Copy link
Contributor Author

gregory38 commented Aug 30, 2016

@mirh could you look at the perf without geometry shader override_geometry_shader = 0 both in master and this PR.

Edit: if option is correct set, this message will be printed on the log Overriding geometry shaders detection

@mirh
Copy link

mirh commented Aug 30, 2016

Image is fucked up beyond all reason.
Both PR and and master, and already at "Playstation 2" screen.
Performance once in-game (thankfully spamming start and X is enough to start first level) is the same.

@gregory38
Copy link
Contributor Author

Recent regression. I have a fix but I don't have any internet connection to push it...
By same do you mean without geometry shader, perf is the same with/without SSO. Or perf is the same with/without geometry shader (and sso is still bad on AMD driver)

@mirh
Copy link

mirh commented Aug 31, 2016

I mean geometry shader on/off doesn't make any speed difference.
Sso in pcsx2 is fixed since 16.5 drivers as I said above (tried with the override switch in older builds) and as far as I can remember, there's no performance impact.
Only this branch affects it.

@Nucleoprotein
Copy link

@mirh
There is performance impact, this PR is about 5% slower for me in Kingdom Hearts, but I have not time to test more games - I'm having some health problems right now.

v2: blacklist AMD driver from the start of 2016
Please note that it is highly recommended to upgrade to a stable&working driver
@gregory38
Copy link
Contributor Author

Ok business is back. I fixed the issue with geometry shader.

@mirh this PR really enable SSO. I would see if I can do a couple of test on Nvidia. It used to be faster with SSO. Code has evolved since my latest benchmark. So it could be different now. Or much likely, AMD implementation is rather basic.

@mirh
Copy link

mirh commented Sep 1, 2016

New master seems a little little slower (20 FPS rather than 21) but I guess it's within margin of error.
Disabling geometry shader seems a tad faster on the other hand and now it's all fine.

@gregory38
Copy link
Contributor Author

I need to profile again the geometry shader. Unlike Dx, it is only used to convert 2 vertices sprites to 6 vertices triangles. Potentially the gain are small for few primitives draw calls. However, several driver operations (like a texture upload) touch the geometry shader which trigger a new revalidation on the next draw call.

upload_texture
    disable geometry shader
    do the upload
    enable back geometry shader
draw
   validate geometry shader

@mirh
Copy link

mirh commented Sep 5, 2016

Well, as I said I have CodeXL port always listening to any your call 💟
EDIT: uh, and if that isn't good, I guess PerfStudio could do the trick too

@Nucleoprotein
Copy link

Nucleoprotein commented Sep 8, 2016

16.9.1 drivers are out but nothing changed (OpenGL crashes in PPSSPP and DX12 has SSE 4.2)
// Upps, maybe AMD BSOD issue was better place for this post, sorry.

@danilaml
Copy link

danilaml commented Sep 8, 2016

@Nucleoprotein irrelevant to this PR.

@gregory38
Copy link
Contributor Author

After a lenghtly discussion with mirh, I concluded that we can merge the PR.

@gregory38 gregory38 merged commit f9d8cb9 into master Sep 8, 2016
@gregory38 gregory38 deleted the gsdx-allow-sso-amd branch September 13, 2016 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants