Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sm/zip reads readables earlier #683

Merged
merged 22 commits into from
Aug 11, 2022

Conversation

mshanemc
Copy link
Contributor

@mshanemc mshanemc commented Aug 4, 2022

What does this PR do?

zip operation is probably the most common thing SDR does (every push/deploy)
ZipWriter passed all the WriteInfos to Archiver as Readable
Before this PR, all those Readable remained open until the zip finalizes.
So that would hit windows file limits but only on the zip operations (source->metadata and metada->source formats worked fine).

This PR moves the reading of those files left in the pipeline, so that they're read into buffers as they come into ZipWriter (using existing utility code) and then passed to Archiver. Once a file is read into buffer, the stream closes and the file doesn't have to stay open.

Besides solving the bug for windows users, this also has some perf benefits for all users (compare the passing scale nut from the 2 circle jobs shown below, or the checked in perf stats for linux/mac) on the larger projects' sourceToZip

fix #578 .

What issues does this PR fix or reference?

@W-11492994@
fix forcedotcom/cli#1555

Functionality Before

https://app.circleci.com/pipelines/github/forcedotcom/source-deploy-retrieve/4345/workflows/e4e5f01c-5eaa-4b9c-becb-1d23dcf358a0/jobs/19752)

Functionality After

https://app.circleci.com/pipelines/github/forcedotcom/source-deploy-retrieve/4351/workflows/f664546c-54fd-430e-a8ab-4a16dfec1d87/jobs/19817

@mshanemc mshanemc requested review from a team as code owners August 4, 2022 17:08
@mshanemc mshanemc requested a review from angela-young August 4, 2022 17:08
@salesforce-cla
Copy link

salesforce-cla bot commented Aug 4, 2022

Thanks for the contribution! Unfortunately we can't verify the commit author(s): Ubuntu <c***@i***.e***.internal>. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, sign the Salesforce.com Contributor License Agreement and this Pull Request will be revalidated.

@mshanemc mshanemc changed the base branch from main to sm/perf-scale-nuts August 4, 2022 17:09
Copy link
Member

@WillieRuemmele WillieRuemmele left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't get those tests working?

any idea why the sourceToMdapi time increased across the board? While the other three decreased?

@mshanemc
Copy link
Contributor Author

mshanemc commented Aug 4, 2022

couldn't get those tests working?

got them back now. It's the typical UT scenario--the tests are deeply tied to the implementation. I just wanted the NUTs to run to tell me I wasn't crazy (and they caught a real bug in folder types) before investing the time rewriting the UT

@mshanemc
Copy link
Contributor Author

mshanemc commented Aug 4, 2022

any idea why the sourceToMdapi time increased across the board? While the other three decreased?

whatever it is, it's not affected by this change to the zip writer. anything that's < 1 second of difference on these larger projects I'm gonna consider noise (after observing these for a while).

It's probably that the initial run had some good luck to be on the lower end and everything looks (slightly) worse by comparison.

We could probably run the tests a few times to increase precision. I'm looking more for "wow, that's 2x or 50% less" vs. "didn't change much" on these.

@mshanemc mshanemc requested a review from WillieRuemmele August 4, 2022 22:05
@mshanemc mshanemc mentioned this pull request Aug 10, 2022
Copy link
Contributor

@RodEsp RodEsp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was successfully able to run the perf NUTs with the new code and would intermittently get the "too many files open" error with the old code. Something to note though is that in both scenarios they took 30+ minutes to run 😕

I just have two very small comments but after those are addressed feel free to merge this at your convenience @mshanemc.

.circleci/config.yml Show resolved Hide resolved
src/convert/streams.ts Outdated Show resolved Hide resolved
mshanemc and others added 4 commits August 11, 2022 07:29
Co-authored-by: Rodrigo Espinosa de los Monteros <[email protected]>
Co-authored-by: Rodrigo Espinosa de los Monteros <[email protected]>
@RodEsp RodEsp merged commit 602b78d into sm/perf-scale-nuts Aug 11, 2022
@RodEsp RodEsp deleted the sm/zip-reads-readables-earlier branch August 11, 2022 13:46
@mshanemc mshanemc restored the sm/zip-reads-readables-earlier branch August 11, 2022 14:02
@mshanemc mshanemc mentioned this pull request Aug 11, 2022
@mshanemc mshanemc deleted the sm/zip-reads-readables-earlier branch August 19, 2022 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EMFILE error when trying to push source changes EMFILE: too many open files
3 participants