Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resource/github_repository_file: Add path parameter to reduce Github API calls #589

Merged
merged 2 commits into from
Nov 13, 2020

Conversation

cnelissen
Copy link
Contributor

If you have a repository with a very high number of commits, and you try to add a managed file using this provider, you will hit the Github API limit before the state for the file can be refreshed. The code currently returns ALL repository commits (https://github.com/terraform-providers/terraform-provider-github/blob/900aa92730cea0f5438063bfd410330749e89be3/github/resource_github_repository_file.go#L342), and then iterates through them to see if the commit has the managed file in its tree, and if so, returns the commit info. Since Github limits API calls to 5,000 per hour, any repository with ~5,000 commits in its history will not be able to complete even one state refresh before hitting the limit.

Luckily, the Github API supports a path parameter (/repos/:owner/:repo/commits?path=path/to/file), which will only return commits for that file. This will significantly reduce the number of API calls that the provider needs to make to return the correct information for state.

If you have a repository with a high number of commits, you will quickly hit the Github API limit when trying to get all the commits for the repo. Adding the path parameter will limit the commits to only include the file in question, which will significantly reduce the number of API calls.
@ghost ghost added the size/XS label Nov 5, 2020
@jcudit jcudit added this to the v4.0.1 milestone Nov 9, 2020
@jcudit
Copy link
Contributor

jcudit commented Nov 9, 2020

Thanks for this optimization! Seems like a great candidate for v4.0.1. I'll return here with some measurements to confirm the speedup, but this looks promising!

/cc https://github.com/terraform-providers/terraform-provider-github/pull/589

Copy link
Contributor

@jcudit jcudit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v4.0.0 took around 8 minutes to create 100 files in a repo. Destroying failed due to being rate limited for a duration longer than the command timeout.

Apply complete! Resources: 101 added, 0 changed, 0 destroyed.

real	8m44.234s
user	0m4.589s
sys	0m2.522s

Error: GET https://api.github.com/repos/terraformtesting/repository-file-test/branches/main: 403 API rate limit of 5000 still exceeded until 2020-11-12 17:45:30 -0500 EST, not making remote request. [rate reset in 14m41s]

real	30m31.630s
user	0m5.956s
sys	0m3.690s

This new version clocked better on both creates and destroys:

Apply complete! Resources: 101 added, 0 changed, 0 destroyed.

real	4m26.469s
user	0m3.410s
sys	0m1.522s

@cnelissen thanks for this. Welcome any other optimizations you can spot in the future!

@jcudit jcudit merged commit 4c72648 into integrations:master Nov 13, 2020
kfcampbell pushed a commit to kfcampbell/terraform-provider-github that referenced this pull request Jul 26, 2022
…API calls (integrations#589)

* Update resource_github_repository_file.go

If you have a repository with a high number of commits, you will quickly hit the Github API limit when trying to get all the commits for the repo. Adding the path parameter will limit the commits to only include the file in question, which will significantly reduce the number of API calls.

* Fix formatting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants