Add command option for output the codeowners directly. #2245

Merged: 3 commits, Nov 19, 2021

@sima-zhu
Contributor

sima-zhu commented Nov 10, 2021

The test is in the pipeline with the custom test run.

Here is the test without the custom test run:
https://dev.azure.com/azure-sdk/internal/_build/results?buildId=1203228&view=results

@sima-zhu sima-zhu requested a review from a team as a code owner November 10, 2021 05:45
@sima-zhu sima-zhu requested review from weshaggard and removed request for a team November 10, 2021 17:46
@azure-sdk
Collaborator

The following pipelines have been queued for testing:
java - template
java - template - tests
js - template
net - template
net - template - tests
python - template
python - template - tests
You can sign off on the approval gate to test the release stage of each pipeline.
See eng/common workflow

eng/common/scripts/get-codeowners.ps1: six outdated review threads (resolved)
@benbp
Member

benbp commented Nov 11, 2021

I think that adding all this data parsing and vso variable setting logic outside of the direct tool introduces some unnecessary complexity, and will also make it hard for us to detect and make breaking changes to the underlying tool in the future.

I would suggest either:

  1. Move ALL the logic into the codeowners parser tool and support some sort of flag like --set-code-owners-pipeline-variable
  2. If we don't want to set pipeline variables in the C# tool, I would still like to avoid having to parse non-structured output via magic fields in the logs. We could follow a similar approach to what git does with its porcelain vs. plumbing strategy, meaning the tool could take a flag like --porcelain or -o json to specify the output needs to be structured. Then you could have properties like owners and log, and you can parse the json from owners and write-host the log.

With the dotnet tool approach we're moving towards, I think a local or remote script user should be able to use the tools directly and not have to rely too much on wrapper scripts, otherwise we lose some of the benefits of centralized CLI tooling.

@weshaggard
Member

@benbp we should chat more about options and trade-offs here. @sima-zhu is implementing it this way based on guidance from me and exploring the options in different worlds.

The biggest trouble comes when we want to share this code between our DevOps steps and other tools that need to parse codeowners. If we have the tool set the DevOps variable, then we still have to parse that variable's contents, and we also have to run it as an independent DevOps step, which blocks scenarios such as calling it in a loop in another PowerShell script context. So while I agree with you that parsing the output isn't great, it is essentially the same as setting the DevOps variable, because that is how those get handled as well (although by DevOps, which gives us less flexibility).

@benbp
Member

benbp commented Nov 11, 2021

The biggest trouble comes when we want to share this code between our DevOps steps and other tools that need to parse codeowners. If we have the tool set the DevOps variable, then we still have to parse that variable's contents, and we also have to run it as an independent DevOps step, which blocks scenarios such as calling it in a loop in another PowerShell script context. So while I agree with you that parsing the output isn't great, it is essentially the same as setting the DevOps variable, because that is how those get handled as well (although by DevOps, which gives us less flexibility).

@weshaggard As you say it's not realistic to do away entirely with wrapper scripts from a pipelines perspective. I guess my issue is more around introducing too much business logic and tight coupling between scripts and the tool output, especially with parsing structured data out of log lines. I think the reverse (printing log lines and extracting info from structured data) is a better pattern to follow for us.

In this scenario we would have minimal script code:

function getCodeOwnersEntryFromCommand() {
  return & "$ToolPath/retrieve-code-owners" `
        --target-directory "$PathToOwners" `
        --root-directory "$WorkingDirectory" `
        -o json
}

function getCodeOwners() { 
  $result = getCodeOwnersEntryFromCommand

  if ($LASTEXITCODE -ne 0) {
    Write-Host $result.Output
    return $null
  }

  $codeOwners = ($result.CodeOwners | ConvertFrom-Json) -join ","
  Write-Host "##vso[task.setvariable variable=$VsoVariable;]$codeOwners"
  return $codeOwners
}

Alternatively, instead of parsing $result.Output you could return the JSON directly as stdout and print all potential issues to stderr, which you can then handle separately and print to the pipeline.

@weshaggard
Member

@sima-zhu, @benbp and I chatted more offline and came to the conclusion that it would be best if we simply have the tool write only the JSON content to the console in any case where there are no errors. If there is an error, then dump whatever you need to and exit non-zero. That means we can simplify our consumption a little, but we will want to go ahead and remove our console logging from the codeowners library. We should also make sure we handle any errors in our conversion from JSON in case some output other than the JSON ends up in the output.
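
A minimal sketch of the consumption side of that agreement, assuming the tool prints a JSON object with an `Owners` array on success. The function name `Get-OwnersFromToolOutput` and its parameters are hypothetical and not part of the PR; the tool's output and exit code are passed in as plain values so the sketch runs without the tool installed.

```powershell
# Hypothetical helper: the JSON shape ({"Owners": [...]}) is assumed from this discussion.
function Get-OwnersFromToolOutput {
  param(
    [string] $ToolOutput,  # captured stdout of the tool
    [int] $ExitCode        # the tool's exit code ($LASTEXITCODE in a real script)
  )

  # Non-zero exit: the tool already dumped its diagnostics, so just bail out.
  if ($ExitCode -ne 0) {
    Write-Host "Tool failed with exit code $ExitCode"
    return ''
  }

  # Guard the conversion in case something other than JSON ended up in the output.
  try {
    $entry = $ToolOutput | ConvertFrom-Json -ErrorAction Stop
  }
  catch {
    Write-Host "Could not parse tool output as JSON: $($_.Exception.Message)"
    return ''
  }

  return ($entry.Owners -join ',')
}

# Usage with stand-in values instead of a real tool invocation.
Get-OwnersFromToolOutput -ToolOutput '{"Owners":["alice","bob"]}' -ExitCode 0
Get-OwnersFromToolOutput -ToolOutput 'warning: not json' -ExitCode 0
```

Passing the output and exit code as parameters is only a convenience for the sketch; a real script would read them from the tool invocation directly.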

@benbp
Member

benbp commented Nov 12, 2021

but we will want to go ahead and remove our console logging from the codeowners library.

Or just change it to stderr and print it out regardless of exit code (i.e. get rid of the 2>&1 redirect).

@weshaggard
Member

Let's start by removing it, and if we find it is useful in some scenario we can figure out how to plumb through a logger.

@sima-zhu sima-zhu requested review from benbp and weshaggard November 12, 2021 19:12
Comment on lines 27 to 30
if (!$codeOwnersJson) {
Write-Host "No code owners returned from the path: $CodeOwnerPathExpression"
return ""
}
Member

Nit: it's not a huge deal here, but as a larger principle, this log statement uses knowledge of how the underlying GetCodeOwnersEntryFromCommand function is implemented (using $CodeOwnerPathExpression), a function which isn't even called by this function. It would be better to log this message from within the GetCodeOwnersEntryFromCommand function since it's specific to its implementation.

I would prefer to delete these lines and throw from the GetCodeOwnersEntryFromCommand function instead, which makes the code simpler.

Contributor Author

@sima-zhu sima-zhu Nov 12, 2021

We rarely get into this condition.
The tool command returns either an error or JSON with real content.

if (codeOwnerEntry == null)
{
    Console.Error.WriteLine(String.Format("We cannot find any closest code owners from the target path {0}", targetDirectory));
    return 1;
}
else
{
    var codeOwnerJson = JsonSerializer.Serialize<CodeOwnerEntry>(codeOwnerEntry);
    Console.WriteLine(codeOwnerJson);
    return 0;
}

The error handling here is for this line

$codeOwnersJson = $codeOwnersString | ConvertFrom-Json

ConvertFrom-Json could go wrong.

Member

My thinking is we could set $ErrorActionPreference to Stop for the script context to handle these cases since ConvertFrom-Json will throw. If the caller has this set in their shell, they'll skip your log statement anyway.

}
else
{
var codeOwnerJson = JsonSerializer.Serialize<CodeOwnerEntry>(codeOwnerEntry);
Member

@benbp benbp Nov 12, 2021

Do you need to do something like below to get pretty printing of the json, or is it already handled?

Suggested change
- var codeOwnerJson = JsonSerializer.Serialize<CodeOwnerEntry>(codeOwnerEntry);
+ var codeOwnerJson = JsonSerializer.Serialize<CodeOwnerEntry>(codeOwnerEntry, new JsonSerializerOptions { WriteIndented = true });

Contributor Author

Nice to have.

$VsoVariable = "" # target devops output variable
[string]$CodeOwnerPathExpression, # Code path to code owners. e.g sdk/core/azure-amqp
[string]$ToolVersion = "", # Placeholder. Will update in next PR
[string]$ToolPath = "$env:AGENT_TOOLSDIRECTORY", # The place to check the tool existence. Put $(Agent.ToolsDirectory) as default
Member

We might want to actually consider a temp directory for this. That is what @danieljurek is doing for cspell, that would allow for easier running locally as well. We can always change it in DevOps if we want.

Member

I've been using this for cross-platform temp directory lookup.

Member

I've been using something similar:

[Parameter()]
[string] $WorkingDirectory = (Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())),

This generates a temporary folder name which can then be created later if it doesn't already exist.
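
The create-if-missing step mentioned above might look like the following sketch. The function name `New-TemporaryDirectory` is hypothetical; only the `GetTempPath`/`GetRandomFileName` combination comes from the snippet in this thread.

```powershell
# Hypothetical helper: builds a random folder path under the platform temp
# directory and creates it. -Force makes the call succeed even if it exists.
function New-TemporaryDirectory {
  $path = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
  New-Item -ItemType Directory -Path $path -Force | Out-Null
  return $path
}

$workingDirectory = New-TemporaryDirectory
Write-Host "Tool working directory: $workingDirectory"

# Clean up once the tool is no longer needed.
Remove-Item -Recurse -Force $workingDirectory
```

Because `GetTempPath` resolves the temp location per platform, the same default should work on a local machine and on a pipeline agent.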

$TargetDirectory, # should be in relative form from root of repo. EG: sdk/servicebus
$RootDirectory, # ideally $(Build.SourcesDirectory)
$VsoVariable = "" # target devops output variable
[string]$CodeOwnerPathExpression, # Code path to code owners. e.g sdk/core/azure-amqp
Member

This isn't the path to code owners; it should be called "relativePathInRepoToFindOwners" or something like that.

}
return & "$ToolPath/retrieve-code-owners" --target-directory "$CodeOwnerPathExpression" --code-owner-file-path "$CodeOwnerFileLocation"
Member

Does this even work? I ask because I don't see any processing that would handle the --target-directory, I would expect you would need to pass the parameters via position instead of by name.

Contributor Author

[image]

We have the parameter and I tested locally

Member

@benbp benbp Nov 18, 2021

Seems like the Dragonfruit dependency is what's providing the magic conversion from the function argument names to the CLI flag syntax?

Dragonfruit magic.

Member

I think this is pretty cool but it might be good to put a code comment above the main method describing that the arguments will be magically handled by the Dragonfruit dependency.

Member

Oh, this is interesting, and I agree we probably should add a comment; there isn't anything to really stop us from removing this dependency and breaking this command line arg parsing.


$codeOwners = $codeOwnersJson.Owners -join ","
Member

Move this join inside of the VSOVariable block.

Contributor Author

The line is still useful even when the VSO variable is not set.
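
One way to reconcile the two positions, as a sketch only: compute the joined string once, emit the pipeline logging command only when a variable name was supplied, and return the string either way. The function name `Publish-CodeOwners` and its parameters are hypothetical and not taken from the PR.

```powershell
# Hypothetical helper: the join happens unconditionally so callers always get
# the comma-separated string; the ##vso logging command is optional.
function Publish-CodeOwners {
  param(
    [string[]] $Owners,
    [string] $VsoVariable = ''  # empty means "do not set a pipeline variable"
  )

  $codeOwners = $Owners -join ','

  if ($VsoVariable) {
    # Write-Host goes to the host stream, so it does not pollute the return value.
    Write-Host "##vso[task.setvariable variable=$VsoVariable;]$codeOwners"
  }

  return $codeOwners
}

# Usage: with and without a target pipeline variable.
Publish-CodeOwners -Owners @('alice', 'bob') -VsoVariable 'CodeOwners'
Publish-CodeOwners -Owners @('alice', 'bob')
```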


InstallRetrieveCodeOwnersTool

$codeOwnerToolOutput = GetCodeOwnersEntryFromCommand
Member

Why not do this in GetCodeOwners function?

Member

Given you aren't really calling these functions more than once you can probably eliminate the functions and just have the code inline.

Member

I agree with moving this code inside GetCodeOwners. Re: eliminating functions, I disagree. I think a lot of our scripts have grown larger over time, and the lack of functions at the start makes our scripts tend to evolve into spaghetti code (New-TestResources.ps1 as the best example of this). I think if we try to keep all logic in functions except for a single entrypoint function call at the end of the script, it's easier to modify the code AND easier to test because you can dot source individual functions and execute them locally (provided you filter on invocation name).
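
The "single entrypoint, filter on invocation name" layout described above could be sketched like this. The function names `Join-Owners` and `Main` are hypothetical placeholders; the check relies on `$MyInvocation.InvocationName` being `.` when a script is dot sourced.

```powershell
# All logic lives in functions so each one can be tested in isolation.
function Join-Owners {
  param([string[]] $Owners)
  return ($Owners -join ',')
}

function Main {
  # Placeholder data; a real script would call the tool here.
  Write-Host (Join-Owners -Owners @('alice', 'bob'))
}

# Single entrypoint: skipped when the file is dot sourced (". ./script.ps1"),
# so the functions can be loaded and exercised locally without side effects.
if ($MyInvocation.InvocationName -ne '.') {
  Main
}
```

With this layout, dot sourcing the file makes `Join-Owners` available to call directly while leaving `Main` unexecuted.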

Member

Yes testing is definitely a great reason to have these functions.


$codeOwnerToolOutput = GetCodeOwnersEntryFromCommand
# Failed at the command of fetching code owners.
if ($LASTEXITCODE -ne 0) {
return ""
Member

We should probably return an empty list.

Contributor Author

Why don't we return a string joined with ","?
I feel like a string is much easier to pass between YAML and scripts. It is also easy to parse.

Member

I think to keep both the object format and the yaml<-->script format, this could be a JSON stringified empty list, i.e. "[]" or @() | ConvertTo-Json.
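
A sketch of the JSON-stringified list idea, with one caveat worth checking in your PowerShell version: piping an empty array sends no items down the pipeline, so `@() | ConvertTo-Json` may produce no output at all. Passing the array through `-InputObject` avoids that. The function name `ConvertTo-OwnersJson` is hypothetical.

```powershell
# Hypothetical helper: always returns a JSON array string, including for an
# empty owner list, so YAML and scripts can exchange one consistent format.
function ConvertTo-OwnersJson {
  param([string[]] $Owners = @())

  # -InputObject keeps the array intact; @(...) guards against a single value
  # being treated as a scalar.
  return ConvertTo-Json -InputObject @($Owners) -Compress
}

# Usage: the failure path returns an empty JSON list instead of "".
ConvertTo-OwnersJson -Owners @()
ConvertTo-OwnersJson -Owners @('alice', 'bob')
```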

return ""
}
GetCodeOwners $codeOwnerToolOutput
Member

Probably should be a return statement so it is clear.

@check-enforcer-staging

This pull request is protected by Check Enforcer.

What is Check Enforcer?

Check Enforcer helps ensure all pull requests are covered by at least one check-run (typically an Azure Pipeline). When all check-runs associated with this pull request pass then Check Enforcer itself will pass.

Why am I getting this message?

You are getting this message because Check Enforcer did not detect any check-runs being associated with this pull request within five minutes. This may indicate that your pull request is not covered by any pipelines and so Check Enforcer is correctly blocking the pull request being merged.

What should I do now?

If the check-enforcer check-run is not passing and all other check-runs associated with this PR are passing (excluding license-cla) then you could try telling Check Enforcer to evaluate your pull request again. You can do this by adding a comment to this pull request as follows:
/check-enforcer evaluate
Typically evaluation only takes a few seconds. If you know that your pull request is not covered by a pipeline and this is expected you can override Check Enforcer using the following command:
/check-enforcer override
Note that using the override command triggers alerts so that follow-up investigations can occur (PRs still need to be approved as normal).

@sima-zhu sima-zhu merged commit 7724333 into Azure:main Nov 19, 2021
@sima-zhu sima-zhu deleted the output_psmodule branch November 19, 2021 00:59
6 participants