Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Add retries + improve logging for bundled package build task #125959

Merged

Conversation

kpollich
Copy link
Member

Fixes #125952
Ref #125851

Makes a few changes to the bundle_fleet_packages build step to improve stability and lower the chances of transient network errors resulting in broken builds or corrupted Fleet packages.

  • Add a skipChecksumCheck option to downloadToDisk so we can leverage Kibana's existing retry and cleanup logic around downloads during builds without the headaches of sorting out checksums for package archives
  • Adds additional logging that outputs the set of configured packages/version and the size in bytes of the downloaded file

@kpollich kpollich added enhancement New value added to drive a business result release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team auto-backport Deprecated - use backport:version if exact versions are needed v8.2.0 ci:build-os-packages labels Feb 17, 2022
@kpollich kpollich requested a review from a team as a code owner February 17, 2022 15:56
@kpollich kpollich self-assigned this Feb 17, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@kpollich kpollich changed the title Add retries + improve logging for bundled package build task [Fleet] Add retries + improve logging for bundled package build task Feb 17, 2022
Copy link
Contributor

@paul-tavares paul-tavares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code around retry and logging some known error conditions looks good, so I'm 👍 it.

Just wanted to point a few things out. I don't know the context from where this build script runs, so perhaps these are ok, but:

  1. Using several sync** fs functions which means they are event loop blocking
  2. I saw log.debug() calls, which might not be logged to a CI's job. I don't know what level of verbosity it runs with, so keep that in mind or just change them to log.info() calls

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @kpollich

@kpollich
Copy link
Member Author

Thanks, @paul-tavares - both good call outs.

  1. Using several sync** fs functions which means they are event loop blocking

Definitely valid - I think especially with the stable release of fs/promises in Node v14.0 it'd be worth moving to the Promise API where possible, but I'd consider that out of scope for this PR.

  1. I saw log.debug() calls, which might not be logged to a CI's job. I don't know what level of verbosity it runs with, so keep that in mind or just change them to log.info() calls

Excellent point, and I just wanted to confirm the log level so I went to the jobs in the referenced ES promotion issue and pulled a sample of logs around the Fleet job:

https://buildkite.com/elastic/kibana-elasticsearch-snapshot-verify/builds/850#32d257d0-da93-42d3-9866-ac6e94329909

info [  kibana  ] Cleaning all empty folders recursively
--
  | │ debg Deleting all empty folders and their children recursively starting on  /opt/local-ssd/buildkite/builds/kb-c2-16-16e85f3cfce4cab1/elastic/kibana-elasticsearch-snapshot-verify/kibana/build/kibana
  | │ debg Deleted 1053 empty folders
  | │ succ ✓ 5 sec
  |  
  | info [  kibana  ] Bundling fleet packages
  | │ info Fetching fleet packages from package registry
  | │ info Downloading bundled package from https://epr.elastic.co/epr/apm/apm-8.0.0.zip
  | │ info Downloading bundled package from https://epr.elastic.co/epr/elastic_agent/elastic_agent-1.3.0.zip
  | │ info Downloading bundled package from https://epr.elastic.co/epr/endpoint/endpoint-1.4.1.zip
  | │ info Downloading bundled package from https://epr.elastic.co/epr/fleet_server/fleet_server-1.1.0.zip
  | │ info Downloading bundled package from https://epr.elastic.co/epr/synthetics/synthetics-0.9.0.zip
  | │ succ ✓ 0 sec
  | │
  | │ info [  kibana  ] Creating platform-specific archive source directories
  | │ debg Generic build source copied into linux-x64 specific build directory
  | │ debg Node.js copied into linux-x64 specific build directory
  | │ succ ✓ 2 sec
  | │
  | │ info [  kibana  ] Patching platform-specific native modules
  | │ debg Patching re2 binaries from https://github.com/uhop/node-re2/releases/download/1.16.0/linux-x64-93.gz to /opt/local-ssd/buildkite/builds/kb-c2-16-16e85f3cfce4cab1/elastic/kibana-elasticsearch-snapshot-verify/kibana/build/default/kibana-8.2.0-SNAPSHOT-linux-x86_64/node_modules/re2/build/Release/re2.node
  | │ debg Deleting patterns: [
  | │        '/opt/local-ssd/buildkite/builds/kb-c2-16-16e85f3cfce4cab1/elastic/kibana-elasticsearch-snapshot-verify/kibana/build/default/kibana-8.2.0-SNAPSHOT-linux-x86_64/node_modules/re2/build/Release/re2.node'
  | │      ]
  | │ debg Deleted 1 files/directories
  | │ debg [1/3] Attempting download of https://github.com/uhop/node-re2/releases/download/1.16.0/linux-x64-93.gz sha256
  | │ debg Downloaded https://github.com/uhop/node-re2/releases/download/1.16.0/linux-x64-93.gz and verified checksum
  | │ succ ✓ 0 sec
│ info [  kibana  ] Installing Chromium
--
  | │ info Installing Chromium for linux-x64
  | │ warn Found browser binary checksum for linux/x64 in /opt/local-ssd/buildkite/builds/kb-c2-16-16e85f3cfce4cab1/elastic/kibana-elasticsearch-snapshot-verify/kibana/build/default/kibana-8.2.0-SNAPSHOT-linux-x86_64/x-pack/plugins/screenshotting/chromium/headless_shell-linux_x64/headless_shell is MISSING but 82e80f9727a88ba3836ce230134bd126 was expected. Re-installing...

Looks like we are getting debg and info and warn logs from my cursory look through these logs, so I think we're okay with log.debug here. Thanks again for your review!

@kpollich kpollich merged commit 4c20ea7 into elastic:main Feb 17, 2022
@kpollich kpollich deleted the 125952-improve-bundled-build-stability branch February 17, 2022 20:53
@kibanamachine
Copy link
Contributor

💔 Backport failed

The pull request could not be backported due to the following error:
There are no branches to backport to. Aborting.

How to fix

Re-run the backport manually:

node scripts/backport --pr 125959

Questions ?

Please refer to the Backport tool documentation

@kibanamachine
Copy link
Contributor

Friendly reminder: Looks like this PR hasn’t been backported yet.
To create backports run node scripts/backport --pr 125959 or prevent reminders by adding the backport:skip label.

@kibanamachine kibanamachine added the backport missing Added to PRs automatically when the are determined to be missing a backport. label Feb 21, 2022
@kpollich kpollich added the backport:skip This commit does not require backporting label Feb 22, 2022
@kibanamachine kibanamachine removed the backport missing Added to PRs automatically when the are determined to be missing a backport. label Feb 22, 2022
kpollich added a commit to kpollich/kibana that referenced this pull request Mar 1, 2022
…lastic#125959)

* Add retries + improve logging for bundled package build task

* Update snapshots

* Remove unused imports in bundle task

* Address PR feedback

* Replace throw w/ reject

* Add explicit returns

(cherry picked from commit 4c20ea7)
kpollich added a commit that referenced this pull request Mar 1, 2022
…125959) (#126550)

* Add retries + improve logging for bundled package build task

* Update snapshots

* Remove unused imports in bundle task

* Address PR feedback

* Replace throw w/ reject

* Add explicit returns

(cherry picked from commit 4c20ea7)
lucasfcosta pushed a commit to lucasfcosta/kibana that referenced this pull request Mar 2, 2022
…lastic#125959)

* Add retries + improve logging for bundled package build task

* Update snapshots

* Remove unused imports in bundle task

* Address PR feedback

* Replace throw w/ reject

* Add explicit returns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed backport:skip This commit does not require backporting ci:build-os-packages enhancement New value added to drive a business result release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Fleet] Improve stability and debugging experience around package bundling build step
6 participants