ARROW-17621: [CI] Audit workflows #14155

assignUser · 2022-09-16T17:58:18Z

In this PR I:

- Reduced the scope of the automatically generated `GITHUB_TOKEN` as much as possible. (Technically `contents: none` would be the minimum, but it is a bit unintuitive because it does not prevent checkout of public repos, so I set `contents: read` in those cases.)
- Updated all actions to their newest versions, checking for breaking changes. The only exception is `actions/github-script`, which remains on v3 because of breaking changes (to be handled in a follow-up).
- Moved the creation of envvars containing secrets as close to their usage as possible (into the step they are used in). This duplicates them in workflows with multiple jobs but is safer.
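As a rough illustration of the first and third points, a workflow could look like the sketch below. This is not taken from the Arrow workflows: the workflow name, the secret names (`EXAMPLE_DOCKER_USER`, `EXAMPLE_DOCKER_PASSWORD`) and the script path are hypothetical placeholders.

```yaml
name: Example

on:
  pull_request:

# Restrict the automatically generated GITHUB_TOKEN for all jobs in this workflow.
permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Push image
        # The secrets are exposed only to this step, not to the whole job or workflow.
        # Secret names and script path are hypothetical.
        env:
          EXAMPLE_DOCKER_USER: ${{ secrets.EXAMPLE_DOCKER_USER }}
          EXAMPLE_DOCKER_PASSWORD: ${{ secrets.EXAMPLE_DOCKER_PASSWORD }}
        run: ./example/push_image.sh
```

With a workflow-level or job-level `env:` block, every step (including third-party actions) would see the secrets in its environment; scoping them to the step limits that exposure at the cost of repeating the block wherever it is needed.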

I have opted NOT to pin the actions by SHA, although this is recommended in some places, because the cons outweigh the possible protection in my opinion. The main danger with pinning tags or branches is that a malicious actor changes the commit the tag points to and exfiltrates secrets (either repository secrets or, in the case of private repos, code/IP) or takes some other damaging action such as deleting branches or rewriting history.

We only ever pass actions the `GITHUB_TOKEN`, which is ephemeral (deleted after the workflow has finished) and limited in scope, so in the worst case exfiltration of that token would allow an attacker to create/delete labels and PR comments and to modify PR branches (if the submitter ticked the checkbox for maintainer access). Actions cannot access secrets unless the workflow author explicitly passes them as input (envvars might reveal them, though).

The Apache org limits the actions that can be used in its repos, so we only use well-known, allow-listed actions. While this does of course not prevent malicious actions, it reduces the risk substantially.

Pinning SHAs would mitigate these risks (provided the action at that SHA was audited...) but would also require regularly checking and re-auditing the actions so as not to miss security patches in them (e.g. [here](https://github.com/matlab-actions/setup-matlab/releases/tag/v1.1.1)). IMHO that would be a considerable effort (plus needing real expertise in TypeScript/Node to spot any malicious additions beyond blatant secret exfiltration or nuking) resulting in a small gain.
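For reference, the two pinning styles being compared look roughly like this. The fragment is only a sketch: `<full-40-character-commit-sha>` is a placeholder, not a real commit.

```yaml
steps:
  # Pinned to a tag: whoever controls the action's repository can move the tag
  # to a different commit.
  - uses: actions/checkout@v3
  # Pinned to a full commit SHA: the reference cannot be moved, but it has to be
  # bumped (and ideally re-audited) by hand to pick up fixes.
  # <full-40-character-commit-sha> is a placeholder.
  - uses: actions/checkout@<full-40-character-commit-sha>
```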

assignUser · 2022-09-16T18:02:00Z

It looks like Dependabot can also scan & update actions used in workflows: https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot

This would remove most of the workload of updating the actions. While this would still not protect against properly obfuscated attacks, combined with the other mitigations in this PR it might be a good compromise. @amol- @raulcd thoughts?

github-actions · 2022-09-16T18:06:57Z

https://issues.apache.org/jira/browse/ARROW-17621

github-actions · 2022-09-16T18:06:58Z

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

amol- · 2022-09-20T11:58:14Z

> It looks like Dependabot can also scan & update actions used in workflows

Dependabot makes a PR to suggest the update, right?
My only concern would be containing the risk of sudden breaking changes, but if those updates are made in a dedicated PR, then the PR would prove that the CI still passes before getting merged.

assignUser · 2022-09-20T12:05:42Z

Yes, Dependabot opens a PR. I think we can even configure a PR title format via a config YAML to match our needs.
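A minimal sketch of such a config, placed in `.github/dependabot.yml`, might look as follows. The weekly interval and the prefix value are assumptions chosen for illustration (the prefix mirrors the `MINOR: [...]` title style used in this repo), and my understanding is that the `commit-message` prefix is also applied to the PR title, which should be checked against the Dependabot docs.

```yaml
version: 2
updates:
  # Scan the workflows under .github/workflows and open PRs for outdated actions.
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"   # assumed cadence
    commit-message:
      prefix: "MINOR: [CI] "   # hypothetical prefix
```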

assignUser · 2022-09-20T17:29:46Z

(I'll create a follow-up Jira for Dependabot)

kou

Could you rebase on master to resolve the known CI failures?

.github/workflows/dev_pr.yml

.github/workflows/go.yml

.github/workflows/java.yml

.github/workflows/r.yml

.github/workflows/cpp.yml

Set envvars with secrets only in the step they are needed in

kou

+1

kou · 2022-09-23T04:25:01Z

.github/workflows/dev_pr.yml

-        uses: actions/[email protected]
+          (github.event.action == 'opened' ||
+           github.event.action == 'synchronize')
+        uses: actions/labeler@4


We need to use @v4 here...
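In other words, the tag needs its `v` prefix. Only the affected line is shown here; the rest of the step is omitted:

```yaml
        # "@4" does not match a release tag of the action; "@v4" does.
        uses: actions/labeler@v4
```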


ursabot · 2022-09-23T10:32:20Z

Benchmark runs are scheduled for baseline = 44ae852 and contender = 36928ec. 36928ec is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️1.02% ⬆️0.07%] test-mac-arm
[Failed ⬇️0.82% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.25% ⬆️0.04%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 36928ec3 ec2-t3-xlarge-us-east-2
[Failed] 36928ec3 test-mac-arm
[Failed] 36928ec3 ursa-i9-9960x
[Finished] 36928ec3 ursa-thinkcentre-m75q
[Finished] 44ae8523 ec2-t3-xlarge-us-east-2
[Failed] 44ae8523 test-mac-arm
[Failed] 44ae8523 ursa-i9-9960x
[Finished] 44ae8523 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

In this PR I:

- reduced the scope of the automatically generated `GITHUB_TOKEN` as much as possible (technically `contents:none` would be the minimum but it is a bit unintuitive as it does not prevent checkout of public repos, I set `contents:read` in those cases)
- update all actions used to the newest version (checking for breaking changes, only case is actions/github-script which remains on v3 for that reason -> follow up)
- move the creation of envvars containing secrets as close to their usage as possible (-> the step they are used in), this duplicates them in workflows with multiple jobs but is safer.

I have opted **NOT** to pin the different actions by SHA as recommended in some places as the con outweigh the possible protection in my opinion. The main danger with pinning tags or branches is that a malicious actor changes the commit the tag points to and exfiltrates secrets (either repository secrets or in case of private repos code/ip) or takes some other damaging action like deleting branches, rewriting history etc..

We only ever pass actions the `GITHUB_TOKEN` which is ephemeral (deleted after workflow is finished) and scope limited so exfiltration of that token would worst case allow an attacker to create/delete labels and pr comments as well as modify PR branches (if the submitter activated the checkbox for maintainer access). Actions can not access secrets without the workflow author explicitly passing them as input (envvars might reveal them though)

The Apache Org limits the actions that can be used in repos, so we only use well known allow-listed actions, while this does of course not prevent malicious actions it reduces the risk substantially.

Pinning SHAs would mitigate these risks (provided the action at that sha was audited...) but would also necessitate regularly checking + re-auditing the actions as to not miss security patches in these actions (e.g. [here](https://github.com/matlab-actions/setup-matlab/releases/tag/v1.1.1)). IMHO that would be a considerable effort (+ needing real expertise in typescript/node to spot any malicious additions outside of blatant secret exfiltration or nuking) resulting in a small gain.

Lead-authored-by: Jacob Wujciak-Jens <[email protected]>
Co-authored-by: assignUser <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>


assignUser marked this pull request as ready for review September 16, 2022 18:16

pitrou requested a review from kou September 20, 2022 09:08

kou reviewed Sep 20, 2022


assignUser added 17 commits September 22, 2022 13:44

- restrict token permissions (b1efb22)
- remove matrix strategy from non-matrix job (1d49133)
- fix trigger for dev_pr (de9bf34)
- remove secrets from global env (36eb35d): Set envvars with secrets only in the step they are needed in
- enable concurrency for dev_pr (2cdf90e)
- fix key (55dbad5)
- move workflow to front (cf2ce3c)
- update actions/cache to v3 (25c675f)
- update actions/setup-java to v3 (9eef89e)
- update actions/setup-node to v3 (e2b3a9f)
- update actions/setup-dotnet to v2 (c2fc66b)
- update actions/upload-artifacts to v3 (4556922)
- update actions/download-artifact to v3 (08ab41c)
- unindent (4cf9146)
- add docker creds to run steps to avoid api restriction (09bdb89)
- fix line endings (d7fca31)
- alphabetize (88c73ce)

assignUser force-pushed the arrow-wf-audit branch from 0162b03 to 88c73ce Compare September 22, 2022 11:45

assignUser requested a review from kou September 22, 2022 11:46

kou approved these changes Sep 22, 2022


kou merged commit 36928ec into apache:master Sep 22, 2022

kou reviewed Sep 23, 2022


kou added a commit to kou/arrow that referenced this pull request Sep 23, 2022

MINOR: [Dev] Fix actions/labeler's tag

d9c1560

This is a follow-up of ARROW-17621 / apache#14155.

kou mentioned this pull request Sep 23, 2022

MINOR: [Dev] Fix actions/labeler's tag #14215

Merged

kou added a commit that referenced this pull request Sep 23, 2022

MINOR: [Dev] Fix actions/labeler's tag (#14215)

45cbb58

This is a follow-up of ARROW-17621 / #14155. Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>

lidavidm mentioned this pull request Oct 4, 2022

ci: update action versions, restrict token scopes apache/arrow-adbc#146

Merged

zagto pushed a commit to zagto/arrow that referenced this pull request Oct 7, 2022

MINOR: [Dev] Fix actions/labeler's tag (apache#14215)

b364f5f

This is a follow-up of ARROW-17621 / apache#14155. Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
