
fix: inefficient form metadata query leading to excessive memory usage #5763

Merged 1 commit into develop on Feb 14, 2023

Conversation

@tshuli (Contributor) commented on Feb 14, 2023

Problem

Solution

  • For a load-test example, see forms 63e9b7e27e26fb0012795ca8 (200k small responses) and 63e9d46e6d6d6f0012d14402 (20k large responses of >700KB each) on staging (log in using the form team email; key in 1pw).
  • It turns out that response download is already efficient. The current implementation returns a cursor that iterates over the matching documents, and MongoDB appears to implement this in a non-blocking fashion: in testing, multiple downloads of large forms proceeded concurrently without a spike in disk utilisation.
  • Instead, the high disk utilisation was caused by an inefficient query in the /metadata endpoint, which returns response metadata for storage mode submissions. This endpoint is hit once the admin navigates to the responses tab.
    • In submission.server.model.ts, the code below retrieves all submission documents for the form and stores them all in memory (or on disk, because of allowDiskUse(true)) just to count the number of responses. This led to excessive disk utilisation and a slow query (see NOTE below).
    • To fix this, the count has been replaced with a simpler countDocuments query.
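
The point above about cursor-based downloads staying efficient can be pictured with an async iterator. This is a minimal in-memory sketch, not the actual FormSG code: the names `fakeCursor` and `streamDownload` are illustrative, and a real MongoDB cursor fetches documents in server-side batches rather than from an array.

```typescript
// In-memory sketch of why a cursor keeps memory flat: each iteration
// yields one document, so memory use is independent of result size.
// (A real MongoDB cursor pulls documents in server-side batches.)
async function* fakeCursor<T>(source: T[]): AsyncGenerator<T> {
  for (const doc of source) {
    yield doc
  }
}

async function streamDownload(): Promise<number> {
  const responses = Array.from({ length: 1000 }, (_, i) => ({ _id: i }))
  let written = 0
  // Only the current document is held here, never the full result set.
  for await (const _doc of fakeCursor(responses)) {
    written += 1 // stand-in for writing one response to the download
  }
  return written
}
```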

Existing code

```ts
.facet({
  pageResults: [
    { $skip: numToSkip },
    { $limit: pageSize },
    { $project: { _id: 1, created: 1 } },
  ],
  allResults: [
    { $group: { _id: null, count: { $sum: 1 } } },
    { $project: { _id: 0 } }, // NOTE: means project ALL fields except _id
  ],
})
// prevents out-of-memory for large search results (max 100MB).
// NOTE: since the above query required excessive memory, results were
// written to disk instead, which led to high disk utilisation.
.allowDiskUse(true)
.then((result: MetadataAggregateResult[]) => {
  const [{ pageResults, allResults }] = result
  const [numResults] = allResults
  const count = numResults?.count ?? 0
```
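
The fix swaps the `allResults` facet for a dedicated count. The sketch below shows the pattern with hypothetical names and a tiny in-memory stand-in for the mongoose model, so it runs without a database; the real code issues `countDocuments` and a paged query against MongoDB instead.

```typescript
// Sketch of the replacement pattern (hypothetical names). An in-memory
// stand-in plays the role of the mongoose Submission model.
interface SubmissionDoc {
  _id: string
  form: string
  created: string
}

const docs: SubmissionDoc[] = [
  { _id: 'a', form: 'f1', created: '2023-02-01' },
  { _id: 'b', form: 'f1', created: '2023-02-02' },
  { _id: 'c', form: 'f1', created: '2023-02-03' },
  { _id: 'd', form: 'f2', created: '2023-02-01' },
]

const SubmissionModel = {
  // countDocuments: MongoDB counts server-side; no documents are buffered.
  countDocuments: async (q: { form: string }): Promise<number> =>
    docs.filter((d) => d.form === q.form).length,
  // paged find with skip/limit: only one page of documents is returned.
  findPage: async (q: { form: string }, skip: number, limit: number) =>
    docs.filter((d) => d.form === q.form).slice(skip, skip + limit),
}

async function getMetadata(formId: string, page: number, pageSize: number) {
  const numToSkip = (page - 1) * pageSize
  // Two cheap queries replace the single $facet aggregation: the count
  // no longer requires materialising every matching document.
  const count = await SubmissionModel.countDocuments({ form: formId })
  const pageResults = await SubmissionModel.findPage(
    { form: formId },
    numToSkip,
    pageSize,
  )
  return {
    count,
    metadata: pageResults.map(({ _id, created }) => ({ _id, created })),
  }
}
```

Separating the count from the page fetch means neither query has to hold the full result set, which is what drove the disk spill in the `$facet` version.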

Improvements:

Before & After Screenshots

BEFORE:

  • Timeout (504) on the metadata endpoint for forms with many submissions

[screenshot: timeout]

  • 100% disk utilisation on the DB

[screenshot: diskutil]

AFTER:

  • Metadata endpoint resolves in 200+ ms

[screenshot: fasteresolve]

  • No impact on DB disk utilisation

[screenshot: lowdiskutil]

Tests

  • Create a storage mode form and submit 20 responses. Check that metadata and pagination show correctly on the results tab.

@tshuli tshuli temporarily deployed to staging-al2 February 14, 2023 02:24 — with GitHub Actions Inactive
@tshuli tshuli temporarily deployed to staging-al2 February 14, 2023 02:26 — with GitHub Actions Inactive
@tshuli tshuli force-pushed the fix/metadata-query branch from bd228b8 to 1528c65 Compare February 14, 2023 02:56
@tshuli tshuli temporarily deployed to staging-al2 February 14, 2023 02:56 — with GitHub Actions Inactive
@mergify mergify bot mentioned this pull request Feb 14, 2023
@tshuli tshuli changed the base branch from develop to release-al2 February 14, 2023 03:23
@tshuli tshuli changed the base branch from release-al2 to develop February 14, 2023 03:24
@timotheeg (Contributor) left a comment:
👍 Great work!

@tshuli tshuli merged commit 38abbc9 into develop Feb 14, 2023
@tshuli tshuli deleted the fix/metadata-query branch February 14, 2023 06:46
@justynoh justynoh mentioned this pull request Feb 15, 2023
Development

Successfully merging this pull request may close these issues.

[8] Database read time spikes due to response downloads