Reporting error spawning Chromium browser on centos aarch64 #89382

Closed
liza-mae opened this issue Jan 27, 2021 · 13 comments
Labels
bug Fixes for quality problems that affect the customer experience (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:medium Medium Level of Effort triage_needed

Comments

@liza-mae
Contributor

Kibana version: 7.11.0 BC4

Elasticsearch version: 7.11.0 BC4

Server OS version: CentOS aarch64

Browser version: Chrome latest

Original install method (e.g. download page, yum, from source, etc.):
Staging

Description of the problem including expected versus actual behavior:
Reporting can't spawn Chromium. I installed everything we list as system dependencies except ipa-gothic-fonts, which was not found; I'm not sure whether that is the cause of the crash. The same test works on Ubuntu.

Steps to reproduce:

  1. Install ES/Kibana
  2. Generate a report on a dashboard and notice that the error occurs.

Provide logs and/or server output (if relevant):

 puppeteer:launcher Calling /home/centos/kibana-7.11.0-linux-aarch64/x-pack/plugins/reporting/chromium/headless_shell-linux_arm64/headless_shell --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=TranslateUI --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --enable-blink-features=IdleDetection --user-data-dir=/tmp/chromium-YhuVC7 --headless --hide-scrollbars --mute-audio --disable-translate --disable-extensions --disable-background-networking --safebrowsing-disable-auto-update --disable-sync --metrics-recording-only --disable-default-apps --mute-audio --no-first-run --user-data-dir=/tmp/chromium-YhuVC7 --disable-gpu --headless --hide-scrollbars --window-size=2483,5616 --no-sandbox --disable-setuid-sandbox about:blank --remote-debugging-pipe +0ms
  puppeteer:protocol:SEND ► {"method":"Target.setDiscoverTargets","params":{"discover":true},"id":1} +0ms
  puppeteer:error Error: write EPIPE
  puppeteer:error     at afterWriteDispatched (internal/stream_base_commons.js:156:25)
  puppeteer:error     at writeGeneric (internal/stream_base_commons.js:147:3)
  puppeteer:error     at Socket._writeGeneric (net.js:785:11)
  puppeteer:error     at Socket._write (net.js:797:8)
  puppeteer:error     at writeOrBuffer (internal/streams/writable.js:358:12)
  puppeteer:error     at Socket.Writable.write (internal/streams/writable.js:303:10)
  puppeteer:error     at PipeTransport.send (/home/centos/kibana-7.11.0-linux-aarch64/node_modules/puppeteer/lib/cjs/puppeteer/node/PipeTransport.js:37:25)
  puppeteer:error     at Connection._rawSend (/home/centos/kibana-7.11.0-linux-aarch64/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:78:25)
  puppeteer:error     at Connection.send (/home/centos/kibana-7.11.0-linux-aarch64/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:69:25)
  puppeteer:error     at Function.create (/home/centos/kibana-7.11.0-linux-aarch64/node_modules/puppeteer/lib/cjs/puppeteer/common/Browser.js:95:26)
  puppeteer:error     at ChromeLauncher.launch (/home/centos/kibana-7.11.0-linux-aarch64/node_modules/puppeteer/lib/cjs/puppeteer/node/Launcher.js:101:56)
  puppeteer:error     at Observable._subscribe (/home/centos/kibana-7.11.0-linux-aarch64/x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.js:158:19) +0ms
{"type":"log","@timestamp":"2021-01-27T00:11:33+00:00","tags":["error","plugins","reporting","PNG","execute","kkeob8y40p7f77324b2exwmc"],"pid":32667,"message":"Error: Error spawning Chromium browser!\n    at Observable._subscribe (/home/centos/kibana-7.11.0-linux-aarch64/x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.js:174:24)"}

@elasticmachine
Contributor

Pinging @elastic/kibana-app-services (Team:AppServices)

@liza-mae liza-mae added the bug Fixes for quality problems that affect the customer experience label Jan 27, 2021
@tsullivan
Member

The error is a segmentation fault, which I found in the Kibana server logs:

  log   [18:38:40.791] [error][plugins][reporting] Browser exited abnormally, received code: ,SIGSEGV
  log   [18:38:45.791] [info][plugins][reporting] Successfully sent 'SIGKILL' to browser process (PID: 4818)

This happens when we run the Reporting diagnostic tool, and Kibana just tries to open Chromium to a blank page.

@tsullivan
Member

Unfortunately, I haven't been able to run headless_shell in a way that produces a chromium.log file.

I'm attempting to investigate this with gdb on CentOS 8.3: https://www.gnu.org/software/gcc/bugs/segfault.html

  1. sudo dnf group install "Development Tools"

gdb headless_shell
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-12.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from headless_shell...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/centos/kibana-7.11.0-linux-aarch64/x-pack/plugins/reporting/chromium/headless_shell-linux_arm64/headless_shell
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb)

@liza-mae
Contributor Author

I also did not get anything from gdb. I tried valgrind; the output is below in case it helps:

valgrind -v ./headless_shell 
==51249== Memcheck, a memory error detector
==51249== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==51249== Using Valgrind-3.16.0-bf5e647edb-20200519X and LibVEX; rerun with -h for copyright info
==51249== Command: ./headless_shell
==51249== 
--51249-- Valgrind options:
--51249--    -v
--51249-- Contents of /proc/version:
--51249--   Linux version 4.18.0-240.1.1.el8_3.aarch64 ([email protected]) (gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)) #1 SMP Thu Nov 19 22:13:39 UTC 2020
--51249-- 
--51249-- Arch and hwcaps: ARM64, LittleEndian, baseline
--51249-- Page sizes: currently 65536, max supported 65536
--51249-- Valgrind library directory: /usr/libexec/valgrind
--51249-- Reading syms from /home/centos/kibana-7.11.0-linux-aarch64/x-pack/plugins/reporting/chromium/headless_shell-linux_arm64/headless_shell
--51249-- ELF section outside all mapped regions
--51249-- Reading syms from /usr/lib64/ld-2.28.so
--51249-- Reading syms from /usr/libexec/valgrind/memcheck-arm64-linux
--51249--    object doesn't have a symbol table
--51249--    object doesn't have a dynamic symbol table
--51249-- Scheduler: using generic scheduler lock implementation.
--51249-- Reading suppressions file: /usr/libexec/valgrind/default.supp
==51249== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-51249-by-centos-on-ip-10-0-0-6.us-west-2.compute.internal
==51249== embedded gdbserver: writing to   /tmp/vgdb-pipe-to-vgdb-from-51249-by-centos-on-ip-10-0-0-6.us-west-2.compute.internal
==51249== embedded gdbserver: shared mem   /tmp/vgdb-pipe-shared-mem-vgdb-51249-by-centos-on-ip-10-0-0-6.us-west-2.compute.internal
==51249== 
==51249== TO CONTROL THIS PROCESS USING vgdb (which you probably
==51249== don't want to do, unless you know exactly what you're doing,
==51249== or are doing some strange experiment):
==51249==   /usr/libexec/valgrind/../../bin/vgdb --pid=51249 ...command...
==51249== 
==51249== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==51249==   /path/to/gdb ./headless_shell
==51249== and then give GDB the following command
==51249==   target remote | /usr/libexec/valgrind/../../bin/vgdb --pid=51249
==51249== --pid is optional if only one valgrind process is running
==51249== 
--51249-- REDIR: 0x7bf7d40 (ld-linux-aarch64.so.1:strlen) redirected to 0x580ce3c4 (???)
--51249-- REDIR: 0x7bf7ac0 (ld-linux-aarch64.so.1:strcmp) redirected to 0x580ce418 (???)
--51249-- REDIR: 0x7bf79b0 (ld-linux-aarch64.so.1:index) redirected to 0x580ce3ec (???)
--51249-- Reading syms from /usr/libexec/valgrind/vgpreload_core-arm64-linux.so
--51249-- Reading syms from /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so
==51249== 
==51249== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==51249==  Access not within mapped region at address 0x8
==51249==    at 0x7BEAE2C: _dl_relocate_object (in /usr/lib64/ld-2.28.so)
==51249==    by 0x7BE3AA3: dl_main (in /usr/lib64/ld-2.28.so)
==51249==    by 0x7BF563B: _dl_sysdep_start (in /usr/lib64/ld-2.28.so)
==51249==    by 0x7BE1903: _dl_start_final (in /usr/lib64/ld-2.28.so)
==51249==    by 0x7BE1B9F: _dl_start (in /usr/lib64/ld-2.28.so)
==51249==    by 0x7BE1087: ??? (in /usr/lib64/ld-2.28.so)
==51249==  If you believe this happened as a result of a stack
==51249==  overflow in your program's main thread (unlikely but
==51249==  possible), you can try to increase the size of the
==51249==  main thread stack using the --main-stacksize= flag.
==51249==  The main thread stack size used in this run was 8388608.
./headless_shell: symbol lookup error: /usr/libexec/valgrind/vgpreload_core-arm64-linux.so: undefined symbol: __libc_freeres
==51249== 
==51249== HEAP SUMMARY:
==51249==     in use at exit: 0 bytes in 0 blocks
==51249==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==51249== 
==51249== All heap blocks were freed -- no leaks are possible
==51249== 
==51249== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
--51249-- 
--51249-- used_suppression:      1 dl-hack4-64bit-addr-1 /usr/libexec/valgrind/default.supp:1263
==51249== 
==51249== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
Segmentation fault (core dumped)

@tsullivan
Member

I'm going to try running a new build in CentOS

https://github.com/elastic/kibana/blob/master/x-pack/build_chromium/README.md

@liza-mae liza-mae changed the title Reporting error spawning Chromium browser on aarch64 Reporting error spawning Chromium browser on centos aarch64 Feb 2, 2021
@liza-mae
Contributor Author

liza-mae commented Feb 3, 2021

@tsullivan with the 7.11.0 release coming up, how should we handle reporting this issue?

@droberts195
Contributor

Same test works on Ubuntu.

@liza-mae was that Ubuntu on aarch64? So this problem is specific to CentOS 8 on aarch64, not aarch64 in general?

Did 7.10 pass this test on CentOS 8 on aarch64? And does 7.10 still pass on this exact same aarch64 VM?

I am just trying to understand if something changed in CentOS 8, or in Chromium, or if this has always been a problem but we didn't have the test coverage before.

@droberts195
Contributor

droberts195 commented Feb 4, 2021

It looks like the same problem as electron/electron#21395. The very last comment in that issue also tallies with microsoft/vscode#108509 (comment), and would explain why it affects RHEL/CentOS 8 but not some other distributions.

Search for "the process has hardcoded page size in code" in "The definitive guide to make software fail on ARM64". It looks like Chromium does exactly that: https://chromium.googlesource.com/chromium/src/base/+/master/allocator/partition_allocator/page_allocator_constants.h
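
To make the hardcoded page size problem concrete, here is a minimal sketch of the arithmetic. It is not Chromium's allocator code. The 4096 value is an assumed compile-time page size, and 65536 is the page size valgrind reported on the CentOS host earlier in this thread. The helper names are made up for illustration.

```python
# Illustrative only: this is not Chromium's allocator code.
ASSUMED_PAGE_SIZE = 4096   # page size baked in at build time (assumption)
HOST_PAGE_SIZE = 65536     # page size valgrind reported on the CentOS host


def is_page_aligned(address: int, page_size: int) -> bool:
    """Return True if address is a multiple of page_size."""
    return address % page_size == 0


def round_up_to_page(size: int, page_size: int) -> int:
    """Round size up to the next multiple of page_size."""
    return -(-size // page_size) * page_size


# An offset the program treats as page aligned under its 4 KiB assumption.
address = 3 * ASSUMED_PAGE_SIZE

print(is_page_aligned(address, ASSUMED_PAGE_SIZE))  # True
print(is_page_aligned(address, HOST_PAGE_SIZE))     # False
print(round_up_to_page(address, HOST_PAGE_SIZE))    # 65536
```

An offset that looks aligned to the program is not aligned from the kernel's point of view on a 64 KiB host, which is the kind of mismatch the linked guide describes.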

@liza-mae
Contributor Author

liza-mae commented Feb 4, 2021

@droberts195 we did not test CentOS in 7.10, so the problem exists there too. Thanks for finding the electron issue; yes, that seems to be the same problem. We can try to do a rebuild of chromium after 7.11.0. For now we will update the docs to make it a known issue; I will put up a PR for it.

@tsullivan
Member

we can try to do a rebuild of chromium after 7.11.0

I think that on Linux, Chromium does not allow the page size to be defined at run time, only at compile time. That means that recompiling Chromium to support ARM on CentOS/Red Hat would break support for ARM on Ubuntu.

On Mac, the system page size is a run-time property for Chromium: https://chromium.googlesource.com/chromium/src/base/+/93bade42cb25ed443d67fb61eb14dda49122cdb4

I have found this issue that requests Chromium not force page size to be 4K on ARM:
https://bugs.chromium.org/p/chromium/issues/detail?id=1127980&q=system%20page%20size%20linux%20arm&can=2
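
For comparison, reading the page size at run time is straightforward from a launcher script. The sketch below is an assumption-laden illustration, not something Kibana does: `BUILT_FOR_PAGE_SIZE` and `page_size_problem` are hypothetical names, and the 4096 value is an assumed build setting. It only shows how a mismatch could be reported up front instead of surfacing as a SIGSEGV.

```python
import mmap
from typing import Optional

# Assumption: the bundled browser binary was built for 4 KiB pages.
BUILT_FOR_PAGE_SIZE = 4096


def page_size_problem(host_page_size: int,
                      built_for: int = BUILT_FOR_PAGE_SIZE) -> Optional[str]:
    """Return a description of the mismatch, or None if the sizes agree."""
    if host_page_size == built_for:
        return None
    return (f"host page size is {host_page_size} bytes, but the browser "
            f"binary is assumed to be built for {built_for} bytes")


if __name__ == "__main__":
    # mmap.PAGESIZE is the page size of the machine running this script.
    problem = page_size_problem(mmap.PAGESIZE)
    print(problem or "page size matches the assumed build setting")
```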

@liza-mae
Contributor Author

liza-mae commented Mar 3, 2021

We discussed on Slack that we could generate different binaries for the different page sizes.
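
One way the idea of separate binaries per page size could look is sketched below. The file names in the table are hypothetical; no such builds are named anywhere in this thread, and this is not how Kibana selects its browser.

```python
# Hypothetical file names; no such binaries are mentioned in this thread.
BINARIES_BY_PAGE_SIZE = {
    4096: "headless_shell-linux_arm64-4k",
    65536: "headless_shell-linux_arm64-64k",
}


def pick_binary(page_size: int) -> str:
    """Return the build matching page_size, or fail with a clear error."""
    try:
        return BINARIES_BY_PAGE_SIZE[page_size]
    except KeyError:
        raise RuntimeError(
            f"no browser build available for page size {page_size}"
        ) from None
```

A caller would pass the page size read at run time, for example `pick_binary(mmap.PAGESIZE)` after `import mmap`, and handle the `RuntimeError` on hosts with an unexpected page size.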

@streamich streamich added (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. triage_needed labels May 4, 2021
@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort and removed impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. labels May 13, 2021
@tsullivan
Member

The Chromium issue has been moved to: https://issuetracker.google.com/issues/186639159?pli=1

@tsullivan
Member

#90385 added documentation that Kibana Reporting is not supported for ARM

Closing this as Kibana is blocked on working on this due to the upstream issue with Chromium.
