Black screen at startup (bisected to e715fe0) #12049

Closed
hissingshark opened this issue May 19, 2019 · 14 comments
Labels
SDL2 Issue on SDL (or Qt in SDL code) but not all ports.

@hissingshark
Contributor

What happens?

Black screen at start up.
Audio and controls continue to work.

What should happen?

Image on screen.

What hardware, operating system, and PPSSPP version? On desktop, GPU matters for graphical issues.

The OSMC box Vero4k (Amlogic S905x). EGL GLES2 FBDEV

We had this a year ago. A workaround, which still works here, was to first run:
fbset -fb /dev/fb0 -g 1920 1080 1920 2160 16

It was finally fixed by heuristics in #11188.

This is my first time building this year, and the bug is back. I bisected it to commit e715fe0.
Specifically this change (sorry can't seem to quote from commits...):

SDL/SDLGLGraphicsContext.cpp
Line 151
- score += (colorScore + alphaScore) * 10;
+ score += colorScore * 10 + alphaScore * 2;

I never could get my head around the heuristics. Would you accept a PR to bypass this for the platform, or can I help in testing alternative code?

Many Thanks

@hrydgard
Owner

Apparently, then, it's more important on your platform to get alpha right than color.

We really should add some solid debug logging so we can figure out what's chosen and why. I'd very much appreciate a pull request with that.

@hissingshark
Contributor Author

I'll set about adding that output to the console.

@unknownbrackets
Collaborator

The idea behind the heuristics is, for better or worse -

Every device has various possible options, but not always the same ones. So device A could support 5551/4444/8880, device B might support 4444/4440, and device C might support 5650/8888. And there are other settings they could support too.

PPSSPP can run on a wide variety of these possible configs. Any of those above would work, but some better than others.

If we pick the first config that "would work" we might pick the worst colors possible on a device. Or one that doesn't support the latest GL features.

If we pick the first config meeting unnecessarily high requirements, we'll exclude devices that are weaker.

If we try to catalog every possible device any person might try to run PPSSPP on, we'll end up with a short list of the most popular devices. Less popular (or newer) devices won't be able to run PPSSPP, because no one will have decided manually the best config.

Obviously, those aren't options, so we're left with writing code to make the decision, instead of making a human do it.

So looking at the formula, before it was:

R * 10 + G * 10 + B * 10 + A * 10
8888 = 320 points
444X = 120 points
4444 = 160 points
5650 = 160 points

The change in e715fe0 made it so we care less about alpha. So instead:

R * 10 + G * 10 + B * 10 + A * 2
8888 = 256 points
444X = 120 points
4444 = 128 points
5650 = 160 points

As you can see, before we might've picked 4444, but now we would pick 5650 (since we pick whatever has the most total points.) In this specific case, I think it was actually to avoid a Mali issue we've seen on Android where the compositor doesn't work when the backbuffer has alpha.

In other words, it's a contest and each config is a contestant. We judge them in a bunch of categories, and announce a winner based on best total score. Maybe our judging criteria sucks, though.
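
The arithmetic above can be checked with a small sketch. The struct and function names here are hypothetical and only mirror the two weightings quoted from the diff; this is not PPSSPP's actual code, which scores many more attributes.

```cpp
// Sketch only: Format, OldScore and NewScore are hypothetical names.
struct Format {
    int r, g, b, a;  // bits per channel
};

// Weighting before e715fe0: (colorScore + alphaScore) * 10.
constexpr int OldScore(Format f) {
    return (f.r + f.g + f.b + f.a) * 10;
}

// Weighting after e715fe0: colorScore * 10 + alphaScore * 2.
constexpr int NewScore(Format f) {
    return (f.r + f.g + f.b) * 10 + f.a * 2;
}

constexpr Format k8888{8, 8, 8, 8};
constexpr Format k4444{4, 4, 4, 4};
constexpr Format k5650{5, 6, 5, 0};

// Compile-time checks against the point totals listed above.
static_assert(OldScore(k8888) == 320 && NewScore(k8888) == 256, "8888");
static_assert(OldScore(k4444) == 160 && NewScore(k4444) == 128, "4444");
static_assert(OldScore(k5650) == 160 && NewScore(k5650) == 160, "5650");
```

Under the old weighting 4444 and 5650 tie at 160, so either could win depending on the other categories; under the new one 5650 is ahead by 32 points.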

-[Unknown]

@hissingshark
Contributor Author

@unknownbrackets Thanks for the explanation. Got it now.

"We judge them in a bunch of categories, and announce a winner based on best total score. Maybe our judging criteria sucks, though."

Looks like that's a given!
"...a heuristic, is any approach to problem solving or self-discovery that employs a practical method, not guaranteed to be optimal, perfect, logical, or rational..." (Source: Wikipedia, 2019).

If you're mixing contestants from different weight categories, or even different sports, it's going to fall over eventually. Maybe add a weighting for broad platform types to bias it. Or use the dreaded #ifdefs to outright force the issue.

I'll finish my debug output and see what we have anyway.

@hissingshark
Contributor Author

hissingshark commented Jun 8, 2019

I've added some debug output. Is this what you had in mind? I could add colour for highest scores etc if you like - perhaps use ILOG if that matters.

Anyway, the output from my system for example is:

EGL Config Scores:
Config:	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	
Colour:	16	16	16	15	15	15	12	12	12	24	24	24	24	24	24	0	0	16	15	24	24	24	
Alpha:	8	8	8	7	7	7	4	4	4	0	8	8	8	0	0	0	8	0	1	8	8	0	
Depth:	24	24	24	24	24	24	24	24	24	24	24	24	24	24	24	0	0	0	0	0	0	24	
Stencl:	8	8	8	8	8	8	8	8	8	8	8	8	8	8	8	0	0	0	0	0	0	8	
Level:	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	100	
Sample:	100	0	0	100	0	0	100	0	0	100	100	0	0	0	0	100	100	0	0	0	0	100	
Buffer:	100	0	0	100	0	0	100	0	0	100	100	0	0	0	0	100	100	100	100	100	100	100	
Trans:	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	50	
Caveat:	100	100	50	100	100	50	100	100	50	100	50	50	50	100	50	50	50	100	100	100	100	100	
Surf:	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	
GLES:	80	80	80	80	80	80	80	80	80	80	80	80	80	80	80	0	0	0	0	0	0	80	
GL:	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	
TOTAL:	834	634	584	822	622	572	786	586	536	898	864	664	664	698	648	400	416	510	502	606	606	898	

Heuristics chose config #10
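
For anyone reading the table, the TOTAL row appears to be a weighted sum of the rows above it. The sketch below reproduces the posted totals under these assumptions: Colour x10 and Alpha x2 come from the diff quoted earlier, while Depth x5 and x1 for every other row are inferred from the numbers in this output, not taken from the source. All names are hypothetical.

```cpp
// Sketch only: Row and Total are hypothetical names. The depth weight (x5)
// and the x1 weights are inferred from the posted table, not from PPSSPP.
struct Row {
    int colour, alpha, depth, stencil, level, sample, buffer, trans, caveat, surf, gles, gl;
};

constexpr int Total(Row r) {
    return r.colour * 10 + r.alpha * 2 + r.depth * 5 + r.stencil + r.level +
           r.sample + r.buffer + r.trans + r.caveat + r.surf + r.gles + r.gl;
}

// Columns copied from the debug output above.
constexpr Row kConfig1{16, 8, 24, 8, 100, 100, 100, 50, 100, 0, 80, 0};
constexpr Row kConfig10{24, 0, 24, 8, 100, 100, 100, 50, 100, 0, 80, 0};
constexpr Row kConfig11{24, 8, 24, 8, 100, 100, 100, 50, 50, 0, 80, 0};
constexpr Row kConfig22{24, 0, 24, 8, 100, 100, 100, 50, 100, 0, 80, 0};

// Compile-time checks against the posted TOTAL row.
static_assert(Total(kConfig1) == 834, "config 1");
static_assert(Total(kConfig10) == 898, "config 10");
static_assert(Total(kConfig11) == 864, "config 11");
static_assert(Total(kConfig22) == 898, "config 22");
```

Configs 10 and 22 have identical rows, which is why they tie at 898 and the first one seen wins.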

@unknownbrackets
Collaborator

unknownbrackets commented Jun 8, 2019

Trimmed out the < 700 values then sorted and removed identical rows:

Config:	10	22	11	1	4
Colour:	24	24	24	16	15
Alpha:	0	0	8	8	7
Caveat:	100	100	50	100	100
TOTAL:	898	898	864	834	822

It's interesting that two are, from our heuristics perspective, identical. There ought to be something different about them, so maybe we want to pick 22 rather than 10.

As a hack, you could try changing `if (score > bestScore) {` to `if (score >= bestScore) {`. This will make it pick the last identical score, rather than the first. If this works, it's not a fix, but it would verify that we're picking the "wrong" 898 and missing a weight/disqualification for something that matters to us.

That said, maybe your device does want alpha (notice that the two highest scores lack alpha.) Maybe a non-conformant config is not so bad after all. In that case, maybe we could try replacing `int caveatScore = caveat == EGL_NONE ? 100 : (caveat == EGL_NON_CONFORMANT_CONFIG ? 50 : 0);` with `int caveatScore = caveat == EGL_NONE ? 100 : (caveat == EGL_NON_CONFORMANT_CONFIG ? 99 : 0);`. That would make it select config 11, which looks like maybe the best config there (depending on what "non-conformant" means.)
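
Both experiments can be worked through on the five trimmed configs. This is a rough sketch, not the real selection loop: the types and function names are hypothetical, and each `scoreWithoutCaveat` is simply the posted TOTAL minus the posted Caveat value.

```cpp
#include <cstddef>

// Sketch only: stand-ins for the EGL caveat values, so no EGL header is needed.
enum class Caveat { None, NonConformant, Slow };

struct Candidate {
    int id;
    int scoreWithoutCaveat;  // posted TOTAL minus posted Caveat
    Caveat caveat;
};

// Same shape as the caveatScore expression, with the middle value adjustable.
constexpr int CaveatScore(Caveat c, int nonConformantScore) {
    return c == Caveat::None ? 100 : (c == Caveat::NonConformant ? nonConformantScore : 0);
}

// lastTieWins == false models `score > bestScore`; true models `score >= bestScore`.
template <std::size_t N>
constexpr int PickConfig(const Candidate (&configs)[N], int nonConformantScore, bool lastTieWins) {
    int bestId = -1;
    int bestScore = -1;
    for (std::size_t i = 0; i < N; ++i) {
        const int score = configs[i].scoreWithoutCaveat +
                          CaveatScore(configs[i].caveat, nonConformantScore);
        if (lastTieWins ? score >= bestScore : score > bestScore) {
            bestScore = score;
            bestId = configs[i].id;
        }
    }
    return bestId;
}

// The five configs from the trimmed table, in their original enumeration order.
constexpr Candidate kTop[] = {
    {1, 734, Caveat::None},
    {4, 722, Caveat::None},
    {10, 798, Caveat::None},
    {11, 814, Caveat::NonConformant},
    {22, 798, Caveat::None},
};
```

With these inputs, `PickConfig(kTop, 50, false)` gives 10, `PickConfig(kTop, 50, true)` gives 22, and `PickConfig(kTop, 99, false)` gives 11 (814 + 99 = 913, ahead of 898).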

-[Unknown]

@hissingshark
Contributor Author

Presently it selects config #10 and black screen.


With `if (score >= bestScore)` it selects #22 with a black screen and:

Heuristics chose config #22

ERROR: EGL Error EGL_BAD_MATCH detected in file /home/osmc/ppsspp/SDL/SDLGLGraphicsContext.cpp at line 272 (0x3009)
EGL ERROR: Unable to create EGL surface!
EGL_Init() failed
GL init error ''
E: SDLMain.cpp:590: Failed to open audio: ALSA: Couldn't open audio device: Device or resource busy
...
Segmentation fault

With `EGL_NON_CONFORMANT_CONFIG ? 99` we get #11 and a working display.

@unknownbrackets
Collaborator

Hm. EGL_NON_CONFORMANT_CONFIG is not very specifically defined. It just means the implementation doesn't pass conformance tests - this could mean it's for speed and just misses one, or that it's a broken pre-alpha implementation and barely works.

I'd be perfectly happy to have it count as 99/100. Mostly I'd want to continue avoiding EGL_SLOW_CONFIG where possible (which would stay 0.)

That 10 and 22 behave differently means that we're missing something about 22 that makes it a non-option, though. It'd be nice to figure out which other egl config attribute it has that is different from 10...

-[Unknown]

@unknownbrackets unknownbrackets added the SDL2 Issue on SDL (or Qt in SDL code) but not all ports. label Jun 10, 2019
@hissingshark
Contributor Author

I'll have a look at my Ladybird Book of OpenGLES Programming and see what other attributes I can test for. Will let you know what I find.

@hissingshark
Contributor Author

Nailed it. The only difference is in the bitmasks for EGL_SURFACE_TYPE:


Config 10

  • EGL_PBUFFER_BIT
  • EGL_PIXMAP_BIT
  • EGL_SWAP_BEHAVIOR_PRESERVED_BIT
  • EGL_WINDOW_BIT

Config 22

  • EGL_PIXMAP_BIT
  • EGL_SWAP_BEHAVIOR_PRESERVED_BIT
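
A config without EGL_WINDOW_BIT cannot back a window surface, which would be consistent with the EGL_BAD_MATCH seen when #22 was selected. A minimal sketch of the check follows. The bit values mirror the Khronos EGL header and are copied here only so the example compiles on its own; the function name is hypothetical. In real code the mask would come from `eglGetConfigAttrib(display, config, EGL_SURFACE_TYPE, &value)`.

```cpp
// Bit values as defined in the Khronos EGL header, duplicated so this sketch
// does not need <EGL/egl.h>. Real code should use the header's macros.
constexpr int kPbufferBit = 0x0001;                // EGL_PBUFFER_BIT
constexpr int kPixmapBit = 0x0002;                 // EGL_PIXMAP_BIT
constexpr int kWindowBit = 0x0004;                 // EGL_WINDOW_BIT
constexpr int kSwapBehaviorPreservedBit = 0x0400;  // EGL_SWAP_BEHAVIOR_PRESERVED_BIT

// Hypothetical helper: true if the EGL_SURFACE_TYPE mask allows window surfaces.
constexpr bool CanBackWindowSurface(int surfaceType) {
    return (surfaceType & kWindowBit) != 0;
}

// Masks as reported for the two tied configs.
constexpr int kConfig10Surface =
    kPbufferBit | kPixmapBit | kSwapBehaviorPreservedBit | kWindowBit;
constexpr int kConfig22Surface = kPixmapBit | kSwapBehaviorPreservedBit;

static_assert(CanBackWindowSurface(kConfig10Surface), "config 10 supports windows");
static_assert(!CanBackWindowSurface(kConfig22Surface), "config 22 does not");
```

A heuristic could use such a check either to disqualify a config outright or to give window-capable configs a large score bonus.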

unknownbrackets added a commit to unknownbrackets/ppsspp that referenced this issue Jun 11, 2019
See hrydgard#12049:
 * Require EGL_WINDOW_BIT more strongly.
 * Allow EGL_NON_CONFORMANT_CONFIG (but still not EGL_SLOW_CONFIG.)
@unknownbrackets
Collaborator

Ah, we definitely want an EGL_WINDOW_BIT config. Boosted its priority significantly in #12097.

For more on why we use heuristics here, this seems to explain it well:
http://directx.com/2014/06/egl-understanding-eglchooseconfig-then-ignoring-it/

-[Unknown]

@hissingshark
Contributor Author

Thanks for the reading and the issue resolution!
Your #12097 is now tested and selecting correctly on my platform. With that I shall close this issue.
I just PR'd #12098, which is only a fix for a typo.

So with all this behind us, is it still worth me putting the extra debugging that was used here into a PR, with the config attributes and bitmask breakdowns? Perhaps as a command line option, e.g. --egldebug, to avoid spamming the console on regular startups?

@hrydgard
Owner

Yeah, I think that's a good idea, if it's not too bothersome.

@hissingshark
Contributor Author

Not at all.

@unknownbrackets unknownbrackets added this to the v1.9.0 milestone Jun 23, 2019