[arm64] membarrier causing silent crashes/freezes #12605
Comments
I wonder if the syscall could have a different number in the kernel compiled for Android.
The syscall number matches the system headers distributed with the Ubuntu container (which may not be correct) and it is decoded properly by strace.
There must be something strange going on. If you look at the implementation of the syscall in the Linux kernel https://elixir.bootlin.com/linux/v4.14.85/source/kernel/sched/membarrier.c#L152, the only case when it could fail with bad system call would be when the flags passed in were non-zero.
I think it could be the container that Linux is running in, but there's not much info available about it. If you have a hint about what to look for, I can check it, or I can try to share access to it.
Further inspection suggests that it's caused by Seccomp, which is turned on in the container (SECCOMP_MODE_FILTER in /proc/self/status). It could be part of the default Android policy. I will try to get more detailed logs, but it's non-trivial.
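For anyone who wants to repeat the Seccomp check, here is a minimal sketch in C. It assumes the documented layout of /proc/<pid>/status, where the "Seccomp:" field is 0 (disabled), 1 (strict) or 2 (filter, i.e. SECCOMP_MODE_FILTER). The function and constant names are hypothetical and not taken from CoreCLR or from the test app mentioned in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values of the "Seccomp:" field in /proc/<pid>/status (see proc(5)).
 * The names are made up for this sketch. */
enum { STATUS_SECCOMP_OFF = 0, STATUS_SECCOMP_STRICT = 1, STATUS_SECCOMP_FILTER = 2 };

/* Extract the Seccomp mode from the text of a status file.
 * Returns the mode, or -1 if the field is absent or malformed. */
int parse_seccomp_mode(const char *status_text)
{
    assert(status_text != NULL);
    const char *line = status_text;
    while (line != NULL && *line != '\0') {
        int mode;
        /* "Seccomp_filters:" on newer kernels does not match this prefix. */
        if (strncmp(line, "Seccomp:", 8) == 0)
            return sscanf(line + 8, "%d", &mode) == 1 ? mode : -1;
        line = strchr(line, '\n');
        if (line != NULL)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* Read /proc/self/status and report the calling process's Seccomp mode.
 * Returns -1 if the file cannot be read or has no Seccomp field. */
int current_seccomp_mode(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_seccomp_mode(buf);
}
```

A result of STATUS_SECCOMP_FILTER from current_seccomp_mode() only says that some filter is installed. It does not say which syscalls the filter rejects.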
Here's a snippet from the log (from my test app to avoid too much output):
There's something really weird going on. AOSP has whitelist for
Update: Turns out the answer is obvious. It was actually added only in Android Q and this device is still on Android P. This is what the ART source code says
Turns out that I hit the same problem with
On an unrelated note, I rebuilt CoreCLR without the
Issues due to the FlushProcessWriteBuffers would manifest mostly during heavy multi-threaded stress, so reproing in regular apps may take a lot of runs of an application to repro.
I understand that but it begs the question why it was not used in .NET Core 2.2 or 2.1 and/or why it was not backported if it's such an issue.
ARM64 Linux is not supported in 2.1 and 2.2.
Ah, that explains a lot. Maybe it would make sense to mark the fallback path with some warning/assert on ARM64 if it's known not to work correctly. I'm fine with running somewhat broken builds locally until there is a better alternative.
Samsung just announced that they are killing Linux on DeX so this is a dead end. |
I was trying to run CoreCLR on a Galaxy S10 phone in the Linux on DeX environment (essentially an Ubuntu 16.04 Docker container). Almost every non-trivial operation results in a silent crash, including running dotnet --version.
Running under strace seems to suggest that the problem is the membarrier syscall introduced with PRs dotnet/coreclr#20949 and dotnet/coreclr#23778. The last line I can see in the log is the following:
After that the process freezes and it's listed as <defunct> in ps.
I wrote a test application to verify the assumption about membarrier being the culprit. Calling it through syscall in the same way CoreCLR does results in a SIGSYS signal:
The underlying kernel version reported by uname -a is Linux localhost 4.14.85-15820661 #1 SMP PREEMPT Tue Apr 16 17:32:20 KST 2019 aarch64 aarch64 aarch64 GNU/Linux.
/cc @VSadov @janvorli @tmds
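A rough sketch of the kind of membarrier check described in the report. This is not the test application from the report, whose source is not shown here. It assumes a query-style call, syscall(__NR_membarrier, 0, 0), and it runs that call in a forked child so that a SIGSYS raised by a Seccomp filter ends the child instead of the caller. All names prefixed with PROBE_ are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* MEMBARRIER_CMD_QUERY is 0 in <linux/membarrier.h>; defined locally so the
 * sketch builds even where that header is missing. */
#define PROBE_CMD_QUERY 0

enum probe_result {
    PROBE_USABLE,      /* syscall returned a mask of supported commands */
    PROBE_UNSUPPORTED, /* syscall failed with an errno, e.g. ENOSYS */
    PROBE_KILLED,      /* child was terminated by a signal such as SIGSYS */
    PROBE_ERROR        /* fork or waitpid itself failed */
};

/* Try membarrier in a child process. If the child dies from a signal, the
 * signal number is stored in *term_signal (when non-NULL); otherwise 0. */
enum probe_result probe_membarrier(int *term_signal)
{
    if (term_signal != NULL)
        *term_signal = 0;

    pid_t pid = fork();
    if (pid < 0)
        return PROBE_ERROR;

    if (pid == 0) {
        /* Child: make sure SIGSYS has its default (fatal) disposition. */
        signal(SIGSYS, SIG_DFL);
#ifdef __NR_membarrier
        long r = syscall(__NR_membarrier, PROBE_CMD_QUERY, 0);
        _exit(r >= 0 ? 0 : 1);
#else
        _exit(1); /* headers do not know the syscall number */
#endif
    }

    int status;
    pid_t w;
    do {
        w = waitpid(pid, &status, 0);
    } while (w < 0 && errno == EINTR);
    if (w < 0)
        return PROBE_ERROR;

    if (WIFSIGNALED(status)) {
        if (term_signal != NULL)
            *term_signal = WTERMSIG(status);
        return PROBE_KILLED;
    }
    return WEXITSTATUS(status) == 0 ? PROBE_USABLE : PROBE_UNSUPPORTED;
}
```

On a system that behaves like the one in this report, the expected outcome would be PROBE_KILLED with *term_signal equal to SIGSYS, but that depends on how the Seccomp filter is configured.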