Add option to control PR description max length + set reasonable defaults for each platform #7487

Merged

dwc0011
Contributor

@dwc0011 dwc0011 commented Jun 26, 2023

Add option to set the max character length of the PR description. Any descriptions longer than the max will get truncated with a clear message to the user that the remaining description was truncated.

This is mostly a refactor: it moves existing code that had been copy/pasted across several PR creators into a clean, generic version that can be reused everywhere.

In addition to being configurable, this also sets reasonable defaults for some of the platforms that :dependabot: supports.

Fix #6976

Contributor

@yeikel yeikel left a comment

Please add tests

@dwc0011
Contributor Author

dwc0011 commented Jun 30, 2023

@deivid-rodriguez @jeffwidman if you have a chance to review, I would appreciate it. There is a major change for Terraform providers that produces a PR description exceeding the CodeCommit limit, and it is blowing up on us. Thanks!

@dwc0011
Contributor Author

dwc0011 commented Jul 7, 2023

@deivid-rodriguez @jeffwidman @jurre Any estimate on when you all will be able to look at this?

jeffwidman
jeffwidman previously approved these changes Jul 15, 2023
Member

@jeffwidman jeffwidman left a comment

Looks great!

Nice thorough test cases too (🎩 💁‍♂️ to @yeikel for the nudge).

common/lib/dependabot/clients/codecommit.rb Outdated
@jeffwidman jeffwidman force-pushed the dwc0011/truncate_codecommit_pr_description branch from 14c6d16 to 6d8a525 Compare July 15, 2023 17:37
@jeffwidman jeffwidman changed the title Add truncate_pr_description and align rescue properly in fetch_file_c… Truncate CodeCommit PR descriptions longer than 10,240 characters Jul 15, 2023
@jeffwidman
Member

jeffwidman commented Jul 15, 2023

Hmm, I just noticed this already exists in 2 other places:

def create_pull_request
  # Limit PR description to MAX_PR_DESCRIPTION_LENGTH (65,536) characters
  # and truncate with message if over. The API limit is 262,144 bytes
  # (https://github.community/t/maximum-length-for-the-comment-body-in-issues-and-pr/148867/2).
  # As Ruby strings are UTF-8 encoded, this is a pessimistic limit: it
  # presumes the case where all characters are 4 bytes.
  pr_description = @pr_description.dup
  if pr_description && pr_description.length > MAX_PR_DESCRIPTION_LENGTH
    truncated_msg = "...\n\n_Description has been truncated_"
    truncate_length = MAX_PR_DESCRIPTION_LENGTH - truncated_msg.length
    pr_description = (pr_description[0, truncate_length] + truncated_msg)
  end
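
The arithmetic behind that code comment can be checked directly. This is a minimal sketch that assumes only the two figures quoted above (a 262,144-byte API limit and a 65,536-character cap); the variable names and sample strings are illustrative.

```ruby
# Figures quoted in the comment above; the variable names are illustrative.
api_limit_bytes = 262_144
max_description_chars = 65_536

# A UTF-8 character occupies at most 4 bytes, so a string of
# max_description_chars characters can never exceed the byte limit.
worst_case_bytes = max_description_chars * 4
puts worst_case_bytes == api_limit_bytes

# String#length counts characters, String#bytesize counts bytes.
ascii = "a" * 10
emoji = "\u{1F600}" * 10
puts "ascii: #{ascii.length} chars, #{ascii.bytesize} bytes"
puts "emoji: #{emoji.length} chars, #{emoji.bytesize} bytes"
```

For mostly-ASCII descriptions the character cap is therefore far stricter than the byte limit requires, which is what "pessimistic" means here.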

def truncate_pr_description(pr_description)
  # Azure DevOps only support descriptions up to 4000 characters in UTF-16
  # encoding.
  # https://developercommunity.visualstudio.com/content/problem/608770/remove-4000-character-limit-on-pull-request-descri.html
  pr_description = pr_description.dup.force_encoding(Encoding::UTF_16)
  if pr_description.length > MAX_PR_DESCRIPTION_LENGTH
    truncated_msg = (+"...\n\n_Description has been truncated_").force_encoding(Encoding::UTF_16)
    truncate_length = MAX_PR_DESCRIPTION_LENGTH - truncated_msg.length
    pr_description = (pr_description[0..truncate_length] + truncated_msg)
  end
  pr_description.force_encoding(Encoding::UTF_8)
end
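
The two excerpts slice the string differently: the first uses `[0, truncate_length]` (start and length), the second `[0..truncate_length]` (an inclusive range). Here is a small sketch of the difference, using a made-up sample string and limit. Whether the extra character matters in practice for the Azure path is not established in this thread.

```ruby
text = "abcdefghij"
keep = 4

by_length = text[0, keep]  # start at index 0, take `keep` characters
by_range  = text[0..keep]  # indices 0 through `keep`, inclusive

puts by_length.length  # keep
puts by_range.length   # keep + 1

# An exclusive range is equivalent to the (start, length) form.
puts text[0...keep] == by_length
```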

And this PR will add a third... And all the other platforms (BitBucket, GitLab, etc) also have PR limits:

This is going to become a mess quickly if each client has their own implementation.

So could we instead pull this truncation helper into the common PR description creator functionality?

Azure may need to remain a special case since it calculates character limits in UTF-16, but the others can all be DRY'd up. What about this: you implement the CodeCommit solution via a common/generic solution, and then I'll handle migrating the existing instances to it? No need for you to do that.

A quick peek looks like all the clients in https://github.com/dependabot/dependabot-core/blob/221a6e5599ea1e8c30f6167b0a1bbe48c0b8e15f/common/lib/dependabot/pull_request_creator.rb build their PR descriptions using MessageBuilder.pr_message:

def pr_message
  suffixed_pr_message_header + commit_message_intro +
    metadata_cascades + prefixed_pr_message_footer
rescue StandardError => e
  Dependabot.logger.error("Error while generating PR message: #{e.message}")
  suffixed_pr_message_header + prefixed_pr_message_footer
end

So could you take this truncate_message() method and:

  1. move it to the MessageBuilder class
  2. give it a configurable param for the length at which to truncate. And maybe if nil then no truncation...
  3. Then within each client we'd pass in the param for the length for that platform...
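
The three steps above might look roughly like the following. This is a sketch, not the code that was merged: the method name `truncate_description`, its keyword arguments, and the guard for very small limits are assumptions made for illustration.

```ruby
# Hypothetical helper; name and signature are assumptions, not the
# MessageBuilder API. The notice text matches the excerpts quoted earlier.
def truncate_description(description, max_length: nil,
                         truncated_msg: "...\n\n_Description has been truncated_")
  # nil means "no limit": return the description untouched.
  return description if max_length.nil?
  return description if description.length <= max_length

  keep = max_length - truncated_msg.length
  # If the limit is shorter than the notice itself, fall back to a hard cut.
  return description[0, max_length] if keep.negative?

  description[0, keep] + truncated_msg
end

long = "x" * 50
puts truncate_description(long).length                  # no limit given
puts truncate_description(long, max_length: 40).length  # capped at the limit
```

Each client would then pass its own platform limit as `max_length`, and a caller that passes nothing gets the untruncated text.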

You can see a similar idea in this PR that wires up a common param for truncating the length of the PR branch name:

The one bit I'm a little unclear on is how to set a default value, because there's no point in each caller of this code having to manually set the value... most everyone will simply want the default for their platform.

My guess is this call:

pr_description: message.pr_message,

Would look more like:

pr_description: message.pr_message(CODECOMMIT.MAX_PR_DESCRIPTION_LENGTH),

Or you may need to hook it into the memoized MessageBuilder; my Ruby foo isn't strong enough to know the best way to do this w/o spending a bit more time on it:

def message
  @message ||=
    MessageBuilder.new(
      source: source,
      dependencies: dependencies,
      files: files,
      credentials: credentials,
      commit_message_options: commit_message_options,
      pr_message_header: pr_message_header,
      pr_message_footer: pr_message_footer,
      vulnerabilities_fixed: vulnerabilities_fixed,
      github_redirection_service: github_redirection_service,
      dependency_group: dependency_group
    )
end

Overall though this doesn't look like too much more work here... your existing set of tests should be fine, just move them over to the common code.

If you want to just wire it up for CodeCommit, then I can handle the cleanup of moving the GitHub truncation code to using this new common param.

@jeffwidman jeffwidman dismissed their stale review July 15, 2023 18:44

Realized after I approved that this should be a common feature to all PRs, not just codecommit

@jeffwidman
Member

As illustrated by #7564 (comment), some users will probably have valid reasons for truncating the PR description even shorter than their platform's limit. So ideally we both set the per-platform default and also make it a configurable param.
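
One way to combine a per-platform default with a user override is sketched below. The provider keys and method names are hypothetical. The numbers are the limits quoted elsewhere in this thread (65,536 for GitHub, 4,000 for Azure DevOps, 10,240 for CodeCommit), and providers without a known limit fall through to `nil`, meaning no truncation.

```ruby
# Hypothetical lookup; provider keys and method names are assumptions.
def default_max_length(provider)
  case provider
  when "github"     then 65_536
  when "azure"      then 4_000
  when "codecommit" then 10_240
  end # any other provider yields nil, i.e. no truncation
end

# A caller-supplied value takes precedence over the platform default.
def effective_max_length(provider, user_override = nil)
  user_override || default_max_length(provider)
end

puts effective_max_length("codecommit")
puts effective_max_length("codecommit", 5_000)
puts effective_max_length("gitlab").inspect
```

This sketch lets the override exceed the platform default. A stricter variant could clamp it to the smaller of the two values.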

@dwc0011
Contributor Author

dwc0011 commented Jul 17, 2023

I will take a look at this later today or tomorrow. Thanks!

@dwc0011
Contributor Author

dwc0011 commented Jul 21, 2023

@jeffwidman I have moved the truncate and updated the calls.

"my Ruby foo isn't strong enough to know the best way to do this" - My Ruby fu was all gained contributing here to dependabot, I am learning as I go 😁

@dwc0011
Contributor Author

dwc0011 commented Jul 21, 2023

@jeffwidman I also tested this using the dry-run script against CodeCommit to check that the default values were properly selected, that overriding them worked, and that passing the encoding param worked. After a couple of tweaks it did exactly what I expected.

Excited to get this one merged. Hopefully this passes muster.

Member

@jeffwidman jeffwidman left a comment

This looks really solid to me! Nice job. I appreciate you going the extra mile to make the encoding arg generic so we can use it beyond Azure.

A few small nits, feel free to push back on any of them if you think I'm missing something or just plain wrong. 😁

What do you think about also setting default values for BitBucket and GitLab?

It'd be super simple to tack onto this PR since it's just setting a few more values / case statements. And I'm not too particular about our inability to test them because if they happen to be wrong other folks who are using :dependabot: on those platforms can always come along and correct it since this is OSS.

common/lib/dependabot/pull_request_creator/azure.rb Outdated
common/lib/dependabot/pull_request_creator.rb Outdated
common/lib/dependabot/pull_request_creator/github.rb Outdated
@@ -88,6 +90,8 @@ def initialize(source:, base_commit:, dependencies:, files:, credentials:,
@provider_metadata = provider_metadata
@message = message
@dependency_group = dependency_group
@pr_message_max_length = pr_message_max_length
Member

@jeffwidman jeffwidman Jul 24, 2023

Any reason why you picked pr_message... rather than pr_description...?

My thinking is there are multiple messages tied to a PR, for example comments, reviews, etc but only one main description tied to it... Just checked and GitHub, AWS CodeCommit, BitBucket, GitLab, ADO all call it PR/MR Description, so appears that's relatively consistent terminology...

Contributor Author

I named it that because the method being called is pr_message in MessageBuilder.

Member

Ah, gotcha, yeah, it probably wouldn't hurt to rename that to DescriptionBuilder (totally out of scope of this PR)... I'll look into that after this is merged... let's leave this comment unresolved as a reminder to me.

common/lib/dependabot/clients/codecommit.rb Outdated
@jeffwidman
Member

Just remembered that @mburumaxwell had noticed that force_encoding has a problem:

My inclination is to first land this PR as-is, as it's semi-generic so will have a larger impact, then circle back and rebase/try to his PR after this is merged.
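
The linked report is not reproduced here, so the exact problem is not stated. As general background that may or may not be the issue referred to: `force_encoding` only relabels a string's existing bytes, while `encode` transcodes them. The sketch below uses `Encoding::UTF_16LE` rather than `Encoding::UTF_16`, because Ruby treats the latter as a dummy encoding. The sample string is illustrative.

```ruby
text = "abcd"

# force_encoding relabels the existing bytes; nothing is converted.
relabelled = text.dup.force_encoding(Encoding::UTF_16LE)
puts relabelled.bytes == text.bytes  # same four bytes as before
puts relabelled.length               # each pair of bytes is read as one character

# encode transcodes, producing UTF-16 bytes for the same four characters.
transcoded = text.encode(Encoding::UTF_16LE)
puts transcoded.bytesize
puts transcoded.length

# Transcoding back recovers the original text.
puts transcoded.encode(Encoding::UTF_8) == text
```

A length check done after relabelling therefore counts something different from one done after transcoding.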

@jeffwidman jeffwidman changed the title Truncate CodeCommit PR descriptions longer than 10,240 characters Add option to control PR description max length + set reasonable defaults for each platform Jul 24, 2023
@dwc0011
Contributor Author

dwc0011 commented Jul 24, 2023

> This looks really solid to me! Nice job. I appreciate you going the extra mile to make the encoding arg generic so we can use it beyond Azure.
>
> A few small nits, feel free to push back on any of them if you think I'm missing something or just plain wrong. 😁
>
> What do you think about also setting default values for BitBucket and GitLab?
>
> It'd be super simple to tack onto this PR since it's just setting a few more values / case statements. And I'm not too particular about our inability to test them because if they happen to be wrong other folks who are using :dependabot: on those platforms can always come along and correct it since this is OSS.

I was going to do this, but I actually use Bitbucket and was able to create a PR with a description longer than the value in that ticket, so I was not sure what the real limit was. Since GitLab and Bitbucket didn't have values before, I figured I would leave it as is; if someone comes across an issue, they can add the limit.

@dwc0011
Contributor Author

dwc0011 commented Jul 24, 2023

> Just remembered that @mburumaxwell had noticed that force_encoding has a problem:
>
> My inclination is to first land this PR as-is, as it's semi-generic so will have a larger impact, then circle back and rebase/try to his PR after this is merged.

Yeah, changing that in this PR changes the scope even more. Hopefully this gets merged and then he can address with the new changes.

common/lib/dependabot/clients/azure.rb Outdated
common/lib/dependabot/pull_request_creator/azure.rb Outdated
common/lib/dependabot/pull_request_creator/github.rb Outdated
common/lib/dependabot/pull_request_creator/codecommit.rb Outdated
common/lib/dependabot/clients/azure.rb Outdated
Member

@jeffwidman jeffwidman left a comment

I applied a few more mostly cosmetic changes, we've worked together enough that I was pretty sure you wouldn't mind if it helps this land sooner. 😁

This is my last remaining question--once you address this I'll approve/merge:
https://github.com/dependabot/dependabot-core/pull/7487/files#r1273880291

Member

@jeffwidman jeffwidman left a comment

Thank you again!

@dwc0011
Contributor Author

dwc0011 commented Jul 25, 2023

> I applied a few more mostly cosmetic changes, we've worked together enough that I was pretty sure you wouldn't mind if it helps this land sooner. 😁
>
> This is my last remaining question--once you address this I'll approve/merge: https://github.com/dependabot/dependabot-core/pull/7487/files#r1273880291

Is this one of the changes you made?
common/lib/dependabot/pull_request_creator/azure.rb:16:49: C: [Correctable] Layout/TrailingWhitespace: Trailing whitespace detected.
PR_DESCRIPTION_ENCODING = Encoding::UTF_16
^
common/lib/dependabot/pull_request_creator/codecommit.rb:14:6: C: [Correctable] Layout/CommentIndentation: Incorrect indentation detected (column 5 instead of 6).
# https://docs.aws.amazon.com/codecommit/latest/APIReference/API_PullRequest.html

@jeffwidman jeffwidman force-pushed the dwc0011/truncate_codecommit_pr_description branch from 1ee9489 to dbfd0bd Compare July 25, 2023 17:59
@jeffwidman
Member

Yeah, I screwed up the whitespace using the GitHub UI ```suggestion tool... I just fixed it.

@jeffwidman jeffwidman enabled auto-merge (squash) July 25, 2023 18:21
@jeffwidman jeffwidman merged commit ce32701 into dependabot:main Jul 25, 2023
@dwc0011 dwc0011 deleted the dwc0011/truncate_codecommit_pr_description branch July 25, 2023 18:52
mburumaxwell added a commit to tinglesoftware/dependabot-azure-devops that referenced this pull request Aug 1, 2023
Fix the character length limit error by setting it in the `MessageBuilder`.

This reacts to changes made in dependabot/dependabot-core#7487
Successfully merging this pull request may close these issues.

Pull request description exceeds AWS Code Commit max size
4 participants