Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Backend implementation for bulk attachment downloading #555

Merged
merged 2 commits into from
Nov 11, 2020

Conversation

frankchn
Copy link
Contributor

@frankchn frankchn commented Nov 1, 2020

Backend implementation for bulk attachment downloading using existing CSV decryption workers.

@frankchn frankchn requested a review from liangyuanruo November 1, 2020 05:43
Copy link
Contributor

@liangyuanruo liangyuanruo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the Transform will cause issues because of a lack of backpressure management.

We might be able to resolve this by using pipeline, but let me test the implementation with a form with sizeable submission.

@frankchn
Copy link
Contributor Author

frankchn commented Nov 2, 2020

I am not sure if it will create significant backpressure because signing URLs doesn't require additional network calls when you have a valid IAM key. We can test this out in staging though.

@liangyuanruo liangyuanruo force-pushed the frank-refactor-download branch from b1f6365 to 9772217 Compare November 4, 2020 15:31
Copy link
Contributor

@liangyuanruo liangyuanruo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, the implementation looks good. Immediate follow-up is to memory profile the implementation using the node inspector so that we know it's safe, then put in the frontend.

@frankchn
Copy link
Contributor Author

@liangyuanruo, just tested. For signing and downloading 10,000 submissions, the heap usage went from 96 MB up to ~115 MB according to the profiler. The entire download took 15 seconds. I think this means that there are no significant memory pressure issues.

@frankchn frankchn force-pushed the frank-refactor-download branch from 9772217 to 90c5507 Compare November 11, 2020 07:39
@frankchn frankchn force-pushed the frank-refactor-download branch from 90c5507 to 5ce593f Compare November 11, 2020 07:50
@frankchn frankchn merged commit 202587e into develop Nov 11, 2020
@frankchn frankchn deleted the frank-refactor-download branch November 11, 2020 08:08
karrui added a commit that referenced this pull request Nov 16, 2020
this is a refactor of the function recently added in #555
karrui added a commit that referenced this pull request Nov 17, 2020
* feat(EncryptSubModel): add getSubmissionCursorByFormId static method

* feat(EncryptSubSvc): add getSubmissionCursor service fn

* feat(EncryptSubCtl): add handler for streaming encrypted response api

* ref(AdminFormsRoutes): use migrated handler for streaming responses

* feat(AdminFormsRoutes): add celebrate validtn for stream query params

* build: update tsconfig.build to allow inconsistent casing

due to transpiling code, some packages with lowercased type files will encounter some errors like:

TS1261: Already included file name 'path/to/node_modules/@types/JSONStream/index.d.ts' differs from file name 'path/to/node_modules/@types/jsonstream/index.d.ts' only in casing.

* test(EncryptSubSvc): add tests for getSubmissionCursor

* fix: use local copy of jsonstream types

* test(EncryptSubModel): add tests for getSubmissionCursorByFormId

* ref: use StatusCodes enums instead of hardcoded numbers

* feat: remove `end` event handler in stream

the close handler should be enough to handle everything. From testing, the end event is not called anyways

* ref(AdminFormsRoutes): extract out YYYYMMDD regex into variable

* fix(AdminFormsRoutes): use Joi.object.and instead of object.with

`with` allows for a query with only `endDate` since only `startDate` is required, which is wrong

* ref(EncryptSubSvc): add transformAttachmentMetaStream function

this is a refactor of the function recently added in #555

* feat(EncryptSubCtl): insert transform stream into download pipeline

* feat: remove unused stream responses handler

* feat(EncryptSubSvc): add logging to error

also use for of loop for correct callback return

* feat(EncryptSubCtl): add error handler for attachment pipe errors

* test(EncryptSubSvc): add tests for transformAttachmentMetaStream

* chore: trigger travis rebuild
@karrui karrui mentioned this pull request Nov 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants