Steamwebhelper burns a few CPU cores and doesn't render on ARM64 #1127

Closed
Sonicadvance1 opened this issue Jun 26, 2021 · 13 comments · Fixed by #1404
Labels
64bit Guest AArch64 Host bug Something isn't working help wanted Extra attention is needed high priority JIT

Comments

@Sonicadvance1
Member

Sonicadvance1 commented Jun 26, 2021

steamwebhelper maxes out 4+ CPU cores on ARM64 and fails to render any content.
It looks like, for some reason, it is constantly spinning while waiting for work and never getting any.
This doesn't happen on x86-64. It even renders some parts of the UI correctly.

This doesn't seem to be related to the optimization passes. The same problem occurs even with them disabled.
From what I can tell it doesn't look like it is syscall related. I've been combing through syscalls finding weird edge cases, but nothing that has affected the behaviour.
Switching to llvmpipe doesn't resolve the issue, so it isn't freedreno/turnip related.
Switching to interpreter /might/ resolve the issue, but it runs so slowly that it may just not be passing jobs between threads?
So it looks like it might be a JIT bug? It's being a pain to nail down.

This happens with the steamwebhelper 64-bit processes. The 32-bit steam side seems to not care.
It is hard to tell whether this also happens in a standalone Chromium process, since that uses a lot of CPU time just idling.

@Sonicadvance1 Sonicadvance1 added bug Something isn't working help wanted Extra attention is needed 64bit Guest high priority JIT AArch64 Host labels Jun 26, 2021
@Sonicadvance1 Sonicadvance1 changed the title Steamwebhelper burns a few CPU cores on ARM64 Steamwebhelper burns a few CPU cores and doesn't render on ARM64 Jun 26, 2021
@Sonicadvance1
Member Author

Sonicadvance1 commented Jun 26, 2021

index e58d65c9,e58d65c9..9316571b
--- a/Source/Tests/FEXLoader.cpp
+++ b/Source/Tests/FEXLoader.cpp
@@@ -352,8 -352,8 +352,15 @@@ int main(int argc, char **argv, char **
    std::string Program = Args[0];

    // These layers load on initialization
--  FEXCore::Config::AddLayer(std::make_unique<FEX::Config::AppLoader>(std::filesystem::path(Program).filename(), true));
--  FEXCore::Config::AddLayer(std::make_unique<FEX::Config::AppLoader>(std::filesystem::path(Program).filename(), false));
++  auto ProgramName = std::filesystem::path(Program).filename();
++  FEXCore::Config::AddLayer(std::make_unique<FEX::Config::AppLoader>(ProgramName, true));
++  FEXCore::Config::AddLayer(std::make_unique<FEX::Config::AppLoader>(ProgramName, false));
++
++  if (ProgramName == "steamwebhelper") {
++    while (1) {
++      select(0, nullptr, nullptr, nullptr, nullptr);
++    }
++  }

Apply this patch if you just want steamwebhelper to stop spinning.
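
For readers unfamiliar with the trick in the patch: calling select() with no file descriptors and a null timeout blocks until a signal arrives, so the process sleeps instead of burning CPU. Below is a minimal standalone sketch of that idea. The helper names `StallFor` and `StallForever` and the optional timeout argument are illustrative assumptions, not part of FEX.

```cpp
#include <sys/select.h>

// Hypothetical helper (not part of FEX): sleep inside select() with no file
// descriptors. Passing nullptr blocks until a signal interrupts the call,
// which is the form the patch above loops on. A non-null timeout is accepted
// here only so the behaviour can be exercised without hanging.
// Returns 0 when the timeout expires, or -1 (errno == EINTR) when interrupted.
int StallFor(timeval *Timeout) {
  return select(0, nullptr, nullptr, nullptr, Timeout);
}

// Hypothetical wrapper mirroring the patch: never returns, and the thread
// stays blocked in the kernel rather than spinning.
[[noreturn]] void StallForever() {
  while (true) {
    StallFor(nullptr);
  }
}
```

The loop matters because a delivered signal makes select() return early; re-entering it keeps the process parked.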

@Sonicadvance1
Member Author

Sonicadvance1 commented Jun 27, 2021

Double checked the Interpreter. Still has the CPU usage problem there as well.

@Sonicadvance1
Member Author

Ran the interpreter a couple of times and managed to get a run where it didn't max out a bunch of threads?

@Sonicadvance1
Member Author

Enabling JIT globally and only having the interpreter enabled for a steamwebhelper appconfig also resolves the problem.
This proves that:

  1. It's a problem on the steamwebhelper side (Not the communication on the Steam side)
  2. Interpreter works around the issue so it is definitely an AArch64 JIT problem
  3. Optimization passes don't change behaviour
  4. It takes a while for the steamwebhelper process to calm down, especially on interpreter, but it gets there.

@Sonicadvance1
Member Author

Sonicadvance1 commented Jun 27, 2021

Additionally.
Disabling SSE4.1 and SSSE3 in CPUID doesn't work around the issue.
Trying to disable SSE3 makes Steam refuse to start, since that seems to be the min-spec on the Linux side now?

Running Steam on ubuntu 21.04 64-bit
STEAM_RUNTIME is enabled automatically
Pins up-to-date!
Error: Sorry, this computer's CPU is too old to run Steam.

Steam requires at least an Intel Pentium 4 or AMD Opteron, with the following features:
        - x86-64 (AMD64) instruction set (lm in /proc/cpuinfo flags)
        - CMPXCHG16B instruction support (cx16 in /proc/cpuinfo flags)
        - SSE3 instruction support (pni in /proc/cpuinfo flags)
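
The three flags named in that message can be checked against a `flags` line from /proc/cpuinfo with a few lines of code. The sketch below only illustrates the requirement as stated in the error text; it is not Steam's actual check, and the function name `MissingSteamFlags` is an assumption made for this example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Flags named in the error message above: x86-64, CMPXCHG16B, SSE3.
const std::vector<std::string> RequiredFlags = {"lm", "cx16", "pni"};

// Hypothetical helper: given the whitespace-separated value of a "flags" line
// from /proc/cpuinfo, return the required flags that are absent.
// Whole tokens are compared, so "lm" does not match a longer flag name.
std::vector<std::string> MissingSteamFlags(const std::string &FlagsLine) {
  std::istringstream Stream(FlagsLine);
  std::vector<std::string> Present;
  for (std::string Flag; Stream >> Flag;) {
    Present.push_back(Flag);
  }

  std::vector<std::string> Missing;
  for (const auto &Required : RequiredFlags) {
    if (std::find(Present.begin(), Present.end(), Required) == Present.end()) {
      Missing.push_back(Required);
    }
  }
  return Missing;
}
```

With SSE3 hidden in CPUID, `pni` would be the one entry reported missing, which matches the refusal seen here.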

@Sonicadvance1
Member Author

More frustrating results.
Added back in interpreter fallback for bisecting differences between JIT and interpreter.
Found a bug with spilling on AArch64 but fixing that doesn't resolve the issue.
Trying to bisect our implementation led me down a red-herring path, thinking that x87 or vector ops were causing the issue.
Turns out that block size directly relates to this failing or not. So bisecting with interpreter fallback was giving erroneous results.

Setting the max instruction count to 1 works around the issue, but it is of course slow and consumes a ton of RAM.
Setting the max instruction count to 2 is enough to still break it.
This implies that spilling doesn't really matter here.
Disabling Optimization passes doesn't resolve anything.
Disabling Inline Constants doesn't fix anything.
Seems like we can't disable static register allocation for testing to see if there is a bug there? @skmp How hard would that be to fix for testing?

And a basic app profile for steamwebhelper to do this

{"Config": {"MaxInst": "1"}}

@skmp
Contributor

skmp commented Jun 28, 2021

@Sonicadvance1 disabling SRA is a couple of hours of work at best, more involved if you want to mix SRA and non-SRA blocks.

@Sonicadvance1
Member Author

I mostly just want a compile-time or runtime config flag that controls whether SRA is used. It shouldn't need to be mixed.

@Sonicadvance1
Member Author

Looks like we might be hitting multiple issues here.
I believe SMC is causing one issue, where sometimes the UI comes up and sometimes it doesn't.
Then I believe there might be an issue with the L1 cache lookup, potentially with some sort of invalidation or aliasing problem.
I have the L1 cache lookup entirely disabled for testing, and it has made things more consistent.
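
To make "invalidation or aliasing problem" concrete, here is a generic sketch of a direct-mapped lookup cache from guest RIP to host code. This is not FEX's actual L1 cache; the type `LookupCache`, its size, and its methods are assumptions made purely for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical direct-mapped cache (NOT FEX's real structure).
struct LookupCache {
  struct Entry {
    uint64_t GuestRIP;
    uintptr_t HostCode;
  };

  static constexpr size_t Size = 1024; // power of two, so masking selects a slot
  std::array<Entry, Size> Entries{};   // zero-initialised: HostCode == 0 means empty

  static size_t Index(uint64_t RIP) { return RIP & (Size - 1); }

  // Overwrites whatever previously lived in the slot.
  void Insert(uint64_t RIP, uintptr_t Host) { Entries[Index(RIP)] = {RIP, Host}; }

  // Returns 0 on a miss. The GuestRIP comparison is the aliasing guard:
  // two RIPs that share a slot must not return each other's host code.
  uintptr_t Lookup(uint64_t RIP) const {
    const Entry &E = Entries[Index(RIP)];
    return (E.HostCode != 0 && E.GuestRIP == RIP) ? E.HostCode : 0;
  }

  // Must run whenever the guest code at RIP changes (for example after
  // self-modifying code), or Lookup keeps handing out stale host code.
  void Invalidate(uint64_t RIP) {
    Entry &E = Entries[Index(RIP)];
    if (E.GuestRIP == RIP) {
      E = {};
    }
  }
};
```

In a structure like this, dropping the tag compare gives an aliasing bug and skipping `Invalidate` gives a stale-entry bug; both would show up as jumps to the wrong translated block.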

@Sonicadvance1
Member Author

Confirmed that setting thread affinity of steamwebhelper to 1 core doesn't change behaviour.
Still spins and doesn't render.
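
The thread does not say how the affinity was set. One way to reproduce the experiment on Linux is `sched_setaffinity`; the sketch below pins a thread to the first CPU in its current mask. The helper name `PinToSingleCore` is an assumption made for this example.

```cpp
#include <sched.h>
#include <sys/types.h>

// Hypothetical helper: restrict the thread identified by Pid (0 means the
// calling thread) to a single CPU chosen from its current affinity mask.
// Returns the chosen CPU index, or -1 on failure.
int PinToSingleCore(pid_t Pid) {
  cpu_set_t Current;
  CPU_ZERO(&Current);
  if (sched_getaffinity(Pid, sizeof(Current), &Current) != 0) {
    return -1;
  }

  for (int Cpu = 0; Cpu < CPU_SETSIZE; ++Cpu) {
    if (!CPU_ISSET(Cpu, &Current)) {
      continue;
    }
    // Build a mask containing only this CPU and apply it.
    cpu_set_t Single;
    CPU_ZERO(&Single);
    CPU_SET(Cpu, &Single);
    return sched_setaffinity(Pid, sizeof(Single), &Single) == 0 ? Cpu : -1;
  }
  return -1;
}
```

Note that `sched_setaffinity` acts on one thread; threads created afterwards inherit the mask, but threads that already exist in a process need to be pinned individually.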

@Sonicadvance1
Member Author

Additional confirmation that setting thread affinity to 1 core and disabling TSO also doesn't change behaviour.
Unaligned LOCK ops will still hit the signal handler in this case, but will be uncontended and won't tear.

@Sonicadvance1
Member Author

Additionally, inserting an mfence after every instruction and disabling SRA doesn't resolve the issue.

@neobrain
Member

neobrain commented Oct 17, 2021

And a basic app profile for steamwebhelper to do this

{"Config": {"MaxInst": "1"}}

For reference, this now also needs to include StallProcess for steamwebhelper to work:

{"Config": {"MaxInst": "1", "StallProcess": "0"}}
