[Reporting] Fix performance of CSV generation #120309

Merged: tsullivan merged 5 commits into elastic:main from reporting/fix-csv-event-blocking on Dec 3, 2021

@tsullivan (Member) commented Dec 2, 2021

Summary

Closes #120275

Problem: CSV generation is a CPU-intensive process. Rows were generated in a way that did not let Node.js give control to other pending events, so the export was blocking the event loop.

Solution: This PR changes the implementation to use the "partitioning" strategy to yield control back to the event loop after every CSV row is generated.

Reference: https://nodejs.org/en/docs/guides/dont-block-the-event-loop/#complex-calculations-without-blocking-the-event-loop
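
For readers unfamiliar with the partitioning strategy, here is a minimal sketch of the idea, not the code from this PR. `formatRow`, `yieldToEventLoop`, and `generateCsv` are hypothetical names chosen for illustration, and the quoting rules are an assumption. The loop awaits a `setImmediate` callback after each row, so callbacks already waiting on the event loop get a turn before the next row is generated.

```typescript
type Row = Record<string, unknown>;

// Hypothetical formatter: quotes every cell and doubles embedded quotes.
const formatRow = (row: Row, columns: string[]): string =>
  columns.map((c) => `"${String(row[c] ?? '').replace(/"/g, '""')}"`).join(',') + '\n';

// Resolves on a later turn of the event loop, after pending callbacks have had a chance to run.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setImmediate(() => resolve()));

async function generateCsv(rows: Row[], columns: string[]): Promise<string> {
  let csv = columns.join(',') + '\n';
  for (const row of rows) {
    csv += formatRow(row, columns);
    // One partition per row: hand control back before generating the next row.
    await yieldToEventLoop();
  }
  return csv;
}
```

Total CPU work is unchanged (slightly higher, given the scheduling overhead); what changes is that no single synchronous stretch lasts longer than one row.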

Testing

  1. Vagrant Ubuntu instance running Kibana with 2 cores and 1GB of RAM
  2. Using Kibana config:
logging:
  appenders:
    file:
      type: file
      fileName: logs/kibana.log
      layout:
        type: json
  root:
    appenders: [default, file]
  loggers:
    - name: plugins.reporting
      appenders: [file]
      level: debug
    - name: metrics.ops
      appenders: [file]
      level: debug
xpack.reporting:
  encryptionKey: nonrandomencryptionkey
  csv.maxSizeBytes: 500000000
  csv.scroll.size: 1000
  queue.timeout: 15m
  3. Try to export metricbeat* data that contains 806k hits

CPU utilization and event loop delay get very high when generating a large CSV export, but Kibana is still able to process other events.
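
To observe event loop delay outside of Kibana's own logging, one option is Node's built-in `perf_hooks.monitorEventLoopDelay`. The sketch below is an assumption about how one might measure it, not how the numbers in this PR were collected; `measureLoopDelay` is a hypothetical helper name.

```typescript
import { monitorEventLoopDelay } from 'perf_hooks';

// Samples event loop delay while `work` runs and reports mean and max in milliseconds.
// The histogram records nanoseconds, hence the division by 1e6.
async function measureLoopDelay(
  work: () => Promise<void>
): Promise<{ meanMs: number; maxMs: number }> {
  const histogram = monitorEventLoopDelay({ resolution: 10 }); // sample every 10 ms
  histogram.enable();
  try {
    await work();
  } finally {
    histogram.disable();
  }
  return { meanMs: histogram.mean / 1e6, maxMs: histogram.max / 1e6 };
}
```

Wrapping a blocking and a partitioned version of the same workload in this helper gives a rough comparison of how long other events had to wait.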

@tsullivan tsullivan added 8.0.0 v7.15.3 v7.16.1 v7.17.0 v8.1.0 (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead release_note:fix labels Dec 3, 2021
@tsullivan tsullivan marked this pull request as ready for review December 3, 2021 00:15
@tsullivan tsullivan requested review from a team as code owners December 3, 2021 00:15
@elasticmachine (Contributor)

Pinging @elastic/kibana-reporting-services (Team:Reporting Services)

@elasticmachine (Contributor)

Pinging @elastic/kibana-app-services (Team:AppServicesUx)

@jloleysens (Contributor) left a comment

Great work @tsullivan 🍻

Approving to unblock, but I think we should consider adding the suggested perf code change too.


const asyncGenerateRow = async (dataTableRow: Record<string, any>): Promise<string> => {
  return new Promise((resolve) => {
    setImmediate(() => {

Could we add a comment explaining why we are using setImmediate here? Perhaps a link to the Node.js documentation too.
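
As background for why `setImmediate` matters here: awaiting an already-resolved promise only schedules a microtask, which runs before the event loop moves on, so it does not give other pending callbacks a turn. The sketch below illustrates the difference; `demo` is a hypothetical name and the first `setImmediate` call simply stands in for other work waiting on the loop.

```typescript
async function demo(): Promise<string[]> {
  const order: string[] = [];

  // Stands in for some other event already waiting on the event loop.
  setImmediate(() => order.push('pending event'));

  // Microtask only: execution resumes before the pending event gets to run.
  await Promise.resolve();
  order.push('after microtask');

  // Macrotask: execution resumes on a later loop turn, after the pending event.
  await new Promise<void>((resolve) => setImmediate(() => resolve()));
  order.push('after setImmediate');

  return order;
}

// demo() resolves to ['after microtask', 'pending event', 'after setImmediate']
```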

@kibana-ci (Collaborator)

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

* 1. Partition the synchronous process with fewer partitions, by using
* the loop counter to call setImmediate only every N amount of rows.
* Testing is required to see what the best N value for most data will
* be.

Member Author

I think this is a feasible option. Filed #120426
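
A rough sketch of what option 1 could look like, under the assumption that rows are formatted in a plain loop. `generateCsvBatched`, `formatRow`, and the default of 100 rows per yield are hypothetical; as the comment above notes, the right N would need testing.

```typescript
type Row = Record<string, unknown>;

// Hypothetical formatter: quotes every cell and doubles embedded quotes.
const formatRow = (row: Row, columns: string[]): string =>
  columns.map((c) => `"${String(row[c] ?? '').replace(/"/g, '""')}"`).join(',') + '\n';

const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setImmediate(() => resolve()));

// Yields to the event loop once every `rowsPerYield` rows instead of after every row.
// The default of 100 is an arbitrary placeholder, not a tested value.
async function generateCsvBatched(
  rows: Row[],
  columns: string[],
  rowsPerYield = 100
): Promise<{ csv: string; yields: number }> {
  let csv = columns.join(',') + '\n';
  let yields = 0;
  for (let i = 0; i < rows.length; i++) {
    csv += formatRow(rows[i], columns);
    if ((i + 1) % rowsPerYield === 0) {
      await yieldToEventLoop();
      yields++;
    }
  }
  return { csv, yields };
}
```

A larger N reduces scheduling overhead but lengthens each blocking stretch, so the trade-off is between export throughput and responsiveness to other events.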

@tsullivan tsullivan merged commit 000ca76 into elastic:main Dec 3, 2021
@tsullivan tsullivan deleted the reporting/fix-csv-event-blocking branch December 3, 2021 22:52
tsullivan added a commit to tsullivan/kibana that referenced this pull request Dec 3, 2021
* [Reporting] Fix performance of CSV generation

* use a for loop with 1 operation instead of 3 chained maps

* do without the callback

* update comment
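
To illustrate the "for loop with 1 operation instead of 3 chained maps" change in general terms: the sketch below is not the code from the commit, and the three steps (`toText`, `escapeQuotes`, `quote`) are hypothetical stand-ins. Chaining three `map` calls allocates two intermediate arrays per row, while a single loop applies all steps to each cell in one pass.

```typescript
// Hypothetical per-cell steps, for illustration only.
const toText = (v: unknown): string => (v == null ? '' : String(v));
const escapeQuotes = (s: string): string => s.replace(/"/g, '""');
const quote = (s: string): string => `"${s}"`;

// Before: three passes over the cells, with two intermediate arrays.
const rowWithChainedMaps = (cells: unknown[]): string =>
  cells
    .map((v) => toText(v))
    .map((s) => escapeQuotes(s))
    .map((s) => quote(s))
    .join(',');

// After: one pass, one output array.
function rowWithSingleLoop(cells: unknown[]): string {
  const out: string[] = [];
  for (const cell of cells) {
    out.push(quote(escapeQuotes(toText(cell))));
  }
  return out.join(',');
}
```

Both functions produce the same string for the same input; the difference is only in how many passes and allocations each row costs.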
tsullivan added a commit that referenced this pull request Dec 4, 2021
* [Reporting] Fix performance of CSV generation

* use a for loop with 1 operation instead of 3 chained maps

* do without the callback

* update comment
@jportner jportner added v8.0.0 auto-backport Deprecated - use backport:version if exact versions are needed and removed 8.0.0 labels Dec 4, 2021
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 4, 2021
* [Reporting] Fix performance of CSV generation

* use a for loop with 1 operation instead of 3 chained maps

* do without the callback

* update comment
@kibanamachine (Contributor)

💔 Backport failed

Status Branch Result
8.0
7.17 The branch "7.17" is invalid or doesn't exist

Successful backport PRs will be merged automatically after passing CI.

To backport manually run:
node scripts/backport --pr 120309

kibanamachine added a commit that referenced this pull request Dec 4, 2021
* [Reporting] Fix performance of CSV generation

* use a for loop with 1 operation instead of 3 chained maps

* do without the callback

* update comment

Co-authored-by: Tim Sullivan <[email protected]>
TinLe pushed a commit to TinLe/kibana that referenced this pull request Dec 22, 2021
* [Reporting] Fix performance of CSV generation

* use a for loop with 1 operation instead of 3 chained maps

* do without the callback

* update comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead release_note:fix v7.15.3 v7.16.1 v7.17.0 v8.0.0 v8.1.0
Development

Successfully merging this pull request may close these issues.

[Reporting] CSV Export causes Kibana to be unstable
7 participants