Meta: AMD Issues/Workarounds #1552

Closed
mirh opened this issue Sep 6, 2016 · 226 comments

@mirh

mirh commented Sep 6, 2016

Follows #1508 and hrydgard/ppsspp#8698
Gs dump is here.

Bisected down to either 16c2baa or 29c97a9.
Happens in Ace Combat 5 right after the "press start" screen, and only when Blending Unit Accuracy is set to None in the OpenGL hardware renderer.

I'd just complain over at AMD but I'd like to get a more straightforward testcase for them.
Wouldn't be bad if somebody added "Upstream | External" label

List of AMD issues:

  • SSO is broken. SSO is partially disabled in GSdx due to this issue.
  • Disabling blending causes a BSOD. This is related to SSO as the issue does not happen on v1.4 where SSO isn't mandatory and is disabled on AMD GPUs.
  • Dreadful performance. OpenGL can be slower by about 80% compared to Direct3D with same effects emulated.

Links to AMD forum issue threads:
https://community.amd.com/message/2748362
https://community.amd.com/message/2756964

Possible BSOD Citra Workaround
Merged from issue #2362
As Gregory requested so we don't forget about it.

Citra recently added a workaround for the amdfail driver that fixes the crash caused by SSO.
The commit is located here https://github.com/citra-emu/citra/pull/3499/commits/0cf6793622b01f3941fbc77fe04c3b68476004ca

Reddit post:
https://www.reddit.com/r/emulation/comments/88vva4/citra_on_twitter_new_update_to_the_hardware/

Idea would be for this to be checked out and maybe implemented.

Some useful info

You unbind everything, so you pay for extra invalidation.
What we need to do is:
Create a pipeline per shader combination,
so you only bind each stage to a pipeline once,
and then we bind and rebind pipelines.
But I'm not sure it will fix the crash.
Potentially the Citra workaround might not work on our side.
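
To make the "one pipeline per shader combination" idea concrete, here is a minimal sketch of such a cache. It is not GSdx or Citra code: `PipelineCache`, `ProgramId`, `PipelineId` and the `create` callback are hypothetical names, and the OpenGL calls are only mentioned in comments so the sketch stays runnable without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical stand-ins for GLuint program and pipeline names.
using ProgramId = std::uint32_t;
using PipelineId = std::uint32_t;

class PipelineCache {
public:
    // One key per (vertex, geometry, fragment) program combination.
    using Key = std::tuple<ProgramId, ProgramId, ProgramId>;

    // `create` builds a pipeline and attaches the three stages to it.
    // With real OpenGL this is where glGenProgramPipelines and
    // glUseProgramStages would run, once per combination.
    explicit PipelineCache(std::function<PipelineId(const Key&)> create)
        : m_create(std::move(create)) {
        assert(static_cast<bool>(m_create));
    }

    // Returns the cached pipeline for this combination, creating it on first
    // use. The caller would then only call glBindProgramPipeline on the result.
    PipelineId Get(ProgramId vs, ProgramId gs, ProgramId fs) {
        const Key key{vs, gs, fs};
        auto it = m_cache.find(key);
        if (it == m_cache.end())
            it = m_cache.emplace(key, m_create(key)).first;
        return it->second;
    }

    std::size_t Size() const { return m_cache.size(); }

private:
    std::function<PipelineId(const Key&)> m_create;
    std::map<Key, PipelineId> m_cache;
};
```

The point of the design is that stages are attached only when a combination is first seen; afterwards a draw just rebinds an existing pipeline. Whether that avoids the driver crash is, as said above, not certain.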

@gregory38
Contributor

There is a way to dump blending setup

https://github.com/PCSX2/pcsx2/blob/master/plugins/GSdx/GSRendererOGL.cpp#L675

Replace this line with if 1

Normally, tracing can be enabled with debug_opengl = 1 on a dev/dbg build. As a bonus, it will also validate OpenGL function calls.

However, I'm not sure tracing works on Windows. @FlatOutPS2 @ssakash @turtleli did one of you manage to make OpenGL tracing usable on Windows?

@turtleli
Member

turtleli commented Sep 7, 2016

I fixed it last year (cbd2417), I haven't used the tracing stuff since January/February though so I'm not aware of the current state (I assume it still works).

@Nucleoprotein

@mirh
You can use BlueScreenView to get basic info about the crash, such as the bugcheck code.

@lightningterror
Contributor

There is a similar issue with GT4: HW OpenGL, Blending Unit Accuracy set to none.

I didn't get a bsod but the display driver stops working.

There's all kinds of artifacts on the screen. The screen flickers and the display driver is restarted.
After the display driver restarts pcsx2 doesn't respond and it needs to be closed from processes.
GPU load stays at 100% even after closing pcsx2. The only solution to fix the gpu load is a system restart.
An event log is written "Display driver amdkmdap stopped responding and has successfully recovered."

@Nucleoprotein

Nucleoprotein commented Sep 9, 2016

@lightningterror
This seems to be exactly the same as my problem with Star Ocean First Departure and some other games in PPSSPP; maybe both trigger the same code in the AMD driver.

EDIT: Can you check this by unpacking this file to the PCSX2 directory and testing again? (This is the old AMD OpenGL driver from the 15.7 driver package.)
http://www36.zippyshare.com/v/0NmGmAOF/file.html

@gregory38
Contributor

So they fixed the SSO/dual blending issue, spent 6 months on tests, finally released it, and boom, the first test explodes the computer.

Did someone open a report with AMD, saying that using the dual-source blending unit crashes the whole system?

@mirh By the way, the commit that you found just disables accurate blending on some equations. Initially,
Cs*As + Cd, Cs*F + Cd, Cd - Cs*As and Cd - Cs*F were always partially done in software, i.e. the multiplication was done in SW and the addition/subtraction in HW.
For example, take Cs*As + Cd.
Before, the shader output Cs*As and the blending unit was set to Cfrag + Cd.
Now the shader outputs Cs and the blending unit is set to Cfrag * Afrag + Cd (note that Afrag comes from the 2nd source).
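
As a rough illustration of why the two setups are meant to give the same result, here is a small CPU-side sketch of both paths for one colour channel. The function names are made up for the example, and it assumes normalized inputs and ignores the clamping and precision behaviour of a real blending unit. In OpenGL terms, the old path would correspond to something like glBlendFunc(GL_ONE, GL_ONE) and the new one to glBlendFunc(GL_SRC1_ALPHA, GL_ONE), but that mapping is my reading of the description above, not a quote from the GSdx code.

```cpp
#include <cassert>

// Sketch assumption: all inputs are normalized to [0, 1].
inline void CheckNormalized(float v) {
    assert(v >= 0.0f && v <= 1.0f);
}

// Old path: the shader does the multiplication and outputs Cs*As,
// the blending unit only adds the destination: Cfrag + Cd.
inline float BlendSingleSource(float cs, float as, float cd) {
    CheckNormalized(cs); CheckNormalized(as); CheckNormalized(cd);
    const float cfrag = cs * as;  // done in the shader (SW)
    return cfrag + cd;            // done by the blending unit (HW)
}

// New path: the shader outputs Cs as the colour and As as the second source,
// the blending unit computes Cfrag * Afrag + Cd.
inline float BlendDualSource(float cs, float as, float cd) {
    CheckNormalized(cs); CheckNormalized(as); CheckNormalized(cd);
    const float cfrag = cs;       // first source colour
    const float afrag = as;       // alpha taken from the 2nd source
    return cfrag * afrag + cd;    // all done by the blending unit (HW)
}
```

For example, with Cs = 0.5, As = 0.5 and Cd = 0.25 both functions return 0.5; the difference is only where the multiplication happens.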

@mirh
Author

mirh commented Sep 9, 2016

Did someone open a report with AMD, saying that using the dual-source blending unit crashes the whole system?

I first wanted to have a trace they could use to reproduce the issue before opening a report, but I have no time atm.

@gregory38
Contributor

Might not be easy to have a trace. Did you try to replay your gs dump ?

Anyway, they will need 6 months to release a fix (potentially it is already fixed......).

@mirh
Author

mirh commented Sep 9, 2016

Did you try to replay your gs dump ?

I don't know how I can do it :p

@Nucleoprotein

@gregory38
This crash does not seem to be related to SSO at all. It's something else, because PPSSPP does not use SSO and crashes in the same way.

@gregory38
Contributor

I agree with you that the bug is related to dual-source blending. However, they fixed their code to support dual-source blending with SSO. There is a high probability that they introduced another bug/regression in the meantime.

Actually, with the bisected commit of mirh, you can be sure the issue is dual-source blending. Because the commit replaces some blending operation with single-source blending (old code) by dual-source blending (new code) when you disable accurate blending. The goal was to reduce the load on the GPU.

@mirh
Author

mirh commented Sep 12, 2016

If only I could have a testcase.. 😝
Does the "gs dump player" (whatever it is and whatever it works) require BIOS?
EDIT: uh, MFW I find tools/GSDumpGUI folder. Inb4 I'll be the first guy happy for a BSOD.

@gregory38
Contributor

The player is only an exe that loads the GSdx.so file, so no BIOS is needed. Technically, the gs dump contains game textures and vertices, but I think a couple of frames can be seen as fair use. Honestly, I'm not even sure you need a testcase; you can report that several projects are broken. Maybe someone will be clever enough to detect that the test quality on dual source is bad.

@FlatOutPS2 how do you replay on Windows? How do you update the ini option? Is it actually possible?

@mirh
Author

mirh commented Sep 12, 2016

Yes, yes, I just tested it. I guess it will be quite fine.

Reported.

Honestly, I'm not even sure you need a testcase; you can report that several projects are broken

It's just I was thinking that if we needed 7 months for something with sources and all, a dumb "closed" test would have been even less useful.

@lightningterror
Contributor

I can't really open the thread to check the report. Says access is restricted. Guess I can't see the staff comments on this.

@mirh
Author

mirh commented Sep 12, 2016

They still have to approve it prolly.

@Dokman
Contributor

Dokman commented Sep 13, 2016

Hey, I was testing it with Looney Tunes: Space Race, but in my case, with an R9 290X and driver version 16.7.3 beta, I don't have this bug.

@mirh
Author

mirh commented Sep 13, 2016

Try my testcase, then report back.
Of course not every game performs the same calls.

@lightningterror
Contributor

@mirh I tried it and the issues are the same as with GT4. Now we wait 6-7 months for a fix.

@gregory38
Contributor

No, the 6-7 month delay is only to deliver the fix. They first need to find the bug and then a solution.
At least you can use accurate blending to reduce the crash likelihood.

@mirh
Author

mirh commented Sep 15, 2016

OT, but w/e: just for the record, as of a month ago CodeXL supports cross-platform frame analysis (i.e. seeing which functions spend the most CPU or GPU time).

@Nucleoprotein

@mirh
Tested the BSOD package you posted on the AMD site. The TDR looks the same as in PPSSPP, so this seems to be the same issue.

@lightningterror
Contributor

"I can confirm we determined this to be a driver issue. Our GL driver team is now working on a fix."

At least they are working on a fix.

@lightningterror
Contributor

AMD fixed this. It will be available in the newest drivers.

@FlatOutPS2
Contributor

AMD fixed this. It will be available in the newest drivers.

Great, now all we have to do is wait 3 months. :p

@Nucleoprotein

Great, now all we have to do is wait 3 months. :p

@dwitczak from AMD:

We have fixed this issue internally. The bug should no longer reproduce in the next driver release, or the one that follows.

Seems so, or even longer...

@gregory38
Contributor

Technically, you can at least be sure that it will be integrated in the last release of an AMD driver (because none will follow) 😛
Whoever guesses the release version that will include the fix gets the privilege of reporting the next issue ;)

@Lithium64

@mirh On OpenGL extensions viewer the only missing extension is GLSL 4.60


@gregory38
Contributor

The issue isn't about missing extensions. AMD provides plenty of them. However, the driver is broken, period. It crashes or renders random colours... Drivers have been broken for 5 years (if not more), but people still dream that an angel will magically fix the next driver release...

@Lithium64

AMD fixed GL_ARB_separate_shader_objects extension
https://community.amd.com/thread/194895

@Nucleoprotein

Nucleoprotein commented Aug 23, 2018

Great, they fixed it right after I bought a new PC with a GTX 1060 🙄
Did they also fix the TDR, or are they still investigating it?
Also, any noticeable performance change?

@willkuer
Contributor

Seems like your time estimation was on point gregory...

@Lithium64

@Nucleoprotein
They are still investigating, from what I could understand. Luckily, it looks like there have been changes on the AMD dev team: they have hired more people and are giving decent support to their OpenGL and Vulkan drivers now.

@mirh
Author

mirh commented Aug 23, 2018

We are all following that bug, and it means nothing until they have released a fixed driver - crash-free too.

@Lithium64

Lithium64 commented Aug 24, 2018

@mirh The crash only happens if blending is disabled?

lightningterror changed the title from "Meta: AMD Issues" to "Meta: AMD Issues/Workarounds" on Oct 15, 2018
@mirh
Author

mirh commented Oct 15, 2018

https://community.amd.com/thread/194895#comment-2881932
Could be a good time to prep up that proper version detection thingy

@lightningterror
Contributor

I have low expectations. They probably wouldn't include any bsod/crash fixes.

@mirh
Author

mirh commented Oct 17, 2018

And instead...
Bets are taken if it'll be released before the 3 years anniversary of my first report or not

@Nucleoprotein

Nucleoprotein commented Nov 12, 2018

Seems they fixed the issue with SSO and the TDR, but the fix is not public yet.

@mirh
Author

mirh commented Nov 18, 2018

WTF
Didn't anyone notice SSO already got fixed in 18.10.2??
@lightningterror time to push mirh@fdb3a0d

AFAICT 18.10.1 was 4.5.13541 (25.20.14003.1012). The fixed one should be 4.5.13541 (25.20.14007.1000) (yet again, checking the accompanying kernel driver version seems the only viable path).
..unfortunately this means I'm not sure how to accurately do the check down to the .build. position.
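
Since the OpenGL version string (4.5.13541) is the same for both releases, a check would have to compare the four-part kernel driver version instead. Below is a minimal sketch of such a comparison. `ParseDriverVersion` and `HasSsoFix` are hypothetical helpers, the threshold 25.20.14007.1000 is just the value quoted above, and how the version string is obtained from the system is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstdio>

// major.minor.build.revision, e.g. 25.20.14007.1000
using DriverVersion = std::array<unsigned, 4>;

// Parses a dotted four-part version string.
// Returns false if the text is not exactly four dot-separated numbers.
inline bool ParseDriverVersion(const char* text, DriverVersion& out) {
    assert(text != nullptr);
    char trailing = 0;
    // The extra %c catches trailing garbage: a clean parse converts exactly 4 items.
    const int n = std::sscanf(text, "%u.%u.%u.%u%c",
                              &out[0], &out[1], &out[2], &out[3], &trailing);
    return n == 4;
}

// True if the given kernel driver version is at least the first release
// assumed to contain the SSO fix (value taken from the comment above).
inline bool HasSsoFix(const char* kernel_driver_version) {
    const DriverVersion first_fixed = {{25, 20, 14007, 1000}};
    DriverVersion v = {};
    if (!ParseDriverVersion(kernel_driver_version, v))
        return false;  // unknown format: stay on the safe side
    return v >= first_fixed;  // std::array compares lexicographically
}
```

With this, "25.20.14003.1012" (18.10.1) is treated as broken and "25.20.14007.1000" as fixed, so the comparison goes all the way down to the build and revision fields.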

@lightningterror
Contributor

Seems that both SSO and Dual Source Blending crashes have been fixed.

I haven't been able to reproduce the crash yet (I'm still not fully convinced but we'll see).

mirh added a commit to mirh/pcsx2-xp that referenced this issue Nov 19, 2018
Also account for latest release bug fixes (see PCSX2#1552 for details on the used values)
Legacy_driver variable was left there, because technically speaking that'd be supposed to be yet another case to remotely care
mirh added a commit to mirh/pcsx2-xp that referenced this issue Jan 4, 2019
Also account for latest release bug fixes (see PCSX2#1552 for details on the used values)
Legacy_driver variable was left there, because technically speaking that'd be supposed to be yet another case to remotely care
@Squall-Leonhart

Squall-Leonhart commented Feb 21, 2019

@mirh @Nucleoprotein
Does anyone know if the issues with AMD's driver mentioned here are why Hyperdimension Neptunia (PC) would have rainbow textures?

@MrCK1
Member

MrCK1 commented Feb 21, 2019

That's really not related to our project, nor is it our issue. You'd be better off asking on the AMD forums or searching Google.

@Squall-Leonhart

Issue is titled Issues and Workarounds, step off MrCK1.

@RedPanda4552
Contributor

You are asking about a PC game in a meta thread about AMD driver problems for PCSX2, a PS2 emulator. It is as CK stated, a separate issue for a different place. Please stay on topic and refrain from attempting to tell, of all people, a community moderator, to step off. Thanks.

@Squall-Leonhart

Squall-Leonhart commented Feb 21, 2019

I'm also one of the community moderators (Facebook)

My question was to the OpenGL experts, AMD is not one of them.

Also, I know who MrCK1 is, and if I was being anything more than sarcastic in my tone I'd have used stronger language, and he knows it.

@MrCK1 I'll get mirh on irc or Nucleo on Steam.

@lightningterror
Contributor

Closing as the issues are on the driver, plus we already have a wiki page documenting AMD GPU driver issues and workarounds.

@mirh
Author

mirh commented Jul 30, 2022

Ladies and gents, after countless time of performance sucking hard, the latest drivers seem to have finally addressed it.
To pick up again my benchmark (unfortunately they archived my thread on the amd forums):

Core 2 Duo E8400 + Radeon 7750: ~19 FPS
Phenom II X4 965 + Radeon RX 480: ~25 FPS
Core 2 Duo 6320 + GeForce GT 430: ~58 FPS
Core i5 6600K + Radeon RX 580 (22.2.1): ~35 FPS
Core i5 6600K + Radeon RX 580 (22.7.1): ~100 FPS

I also feel like d3d11 may have benefited from some optimization in between my two tested releases, but I don't have hard numbers.

@stenzek
Contributor

stenzek commented Jul 30, 2022

The latest AMD drivers aren't really relevant for us. Vulkan is still significantly faster, and didn't change at all with them (I actually saw a very slight regression).

@HarGabt

HarGabt commented Aug 4, 2022

Looks like AMD has finally taken care of OpenGL performance, but some games still perform poorly.
Need for Speed: Most Wanted (SLES-53557), even when running at native resolution on a Core i5 10400F + Radeon RX 580 with 22.7.1 drivers, yields 30 FPS out of the required 50. GPU load is 100%.

@refractionpcsx2
Member

As sten said above, Vulkan is still faster and you should use that or DX. Why AMD still lags behind Nvidia for speed I don't know, but it's not anything we're doing, it's driver issues.
