Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(pacmak): EMFILE error when running jsii-pacmak #1891

Merged
merged 4 commits into from
Aug 18, 2020

Conversation

RomainMuller
Copy link
Contributor

The findLocalBuildDirs function did not have a protection against
inspecting the same package.json file multiple times, and
asynchronously traverses the whole dependency tree under the built
package. In certain pathological cases, dependencies could be processed
many times around (think about how @aws-cdk/aws-iam is a direct or
transitive dependency of nearly every other @aws-cdk/aws-* module).
The asynchronous nature of the process means that many instances of
the same file could be opened at the same time, occasionally inching
above the maximum file descriptor count limit, hence causing EMFILE (
which is the standard error code for "too many files open").

This change adds a visitedDirectories set to prevent re-visiting the
same dependency instance multiple times (based on the absolute path of
the package root). This incidentally also improves the performance of
the process, since making promises incurs overhead, and not
re-processing directories multiple times cut out a significant chunk of
the promises made in extreme cases.

Special thanks to @richardhboyd for having me look into this particular
problem.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

The `findLocalBuildDirs` function did not have a protection against
inspecting the same `package.json` file multiple times, and
asynchronously traverses the whole dependency tree under the built
package. In certain pathological cases, dependencies could be processed
many times around (think about how `@aws-cdk/aws-iam` is a direct or
transitive dependency of nearly every other `@aws-cdk/aws-*` module).
The asynchronous nature of the process means that *many* instances of
the same file could be opened at the same time, occasionally inching
above the maximum file descriptor count limit, hence causing `EMFILE` (
which is the standard error code for "too many files open").

This change adds a `visitedDirectories` set to prevent re-visiting the
same dependency instance multiple times (based on the absolute path of
the package root). This incidentally also improves the performance of
the process, since making promises incurs overhead, and not
re-processing directories multiple times cut out a significant chunk of
the promises made in extreme cases.

Special thanks to @richardhboyd for having me look into this particular
problem.
@RomainMuller RomainMuller added bug This issue is a bug. effort/small Small work item – less than a day of effort module/pacmak Issues affecting the `jsii-pacmak` module labels Aug 14, 2020
@RomainMuller RomainMuller requested a review from a team August 14, 2020 15:40
@RomainMuller RomainMuller self-assigned this Aug 14, 2020
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Aug 14, 2020
Copy link
Contributor

@nija-at nija-at left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to pull out the recursion and caching logic as a functional method that can be independently tested from the 'result collection' passed as a callback, so that these can be unit tested independently

@RomainMuller RomainMuller requested a review from nija-at August 18, 2020 13:51
@RomainMuller RomainMuller force-pushed the rmuller/limit-fd-usage branch from cea4368 to 9988142 Compare August 18, 2020 13:52
@RomainMuller RomainMuller force-pushed the rmuller/limit-fd-usage branch from 9988142 to 0adfe53 Compare August 18, 2020 13:54
Copy link
Contributor

@nija-at nija-at left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@mergify
Copy link
Contributor

mergify bot commented Aug 18, 2020

Thank you for contributing! ❤️ I will now look into making sure the PR is up-to-date, then proceed to try and merge it!

@mergify mergify bot added the pr/ready-to-merge This PR is ready to be merged. label Aug 18, 2020
@mergify
Copy link
Contributor

mergify bot commented Aug 18, 2020

Merging (with squash)...

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-Blkkw9bQFn8A
  • Commit ID: 99b1464
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify mergify bot merged commit 7316b44 into master Aug 18, 2020
@mergify mergify bot deleted the rmuller/limit-fd-usage branch August 18, 2020 21:12
@mergify
Copy link
Contributor

mergify bot commented Aug 18, 2020

Merging (with squash)...

@mergify mergify bot removed the pr/ready-to-merge This PR is ready to be merged. label Aug 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. contribution/core This is a PR that came from AWS. effort/small Small work item – less than a day of effort module/pacmak Issues affecting the `jsii-pacmak` module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants