Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Producing attestations fails when attempting publishing from unsupported reusable workflows (when enabled) #283

Closed
efriis opened this issue Oct 30, 2024 · 43 comments
Labels
invalid This doesn't seem right question Further information is requested

Comments

@efriis
Copy link

efriis commented Oct 30, 2024

Howdy! We just had to turn off release attestations to get our releases running in langchain-ai/langchain. Haven't had a chance to dig deeper into attestation configuration in order to see what we need to fix, and thought I'd file an issue in case others run into the same thing!

langchain-ai/langchain#27765

Release Workflow

We run releases from the two workflow files edited in ^ that PR

  • _release.yml, which calls _test_release.yml, and then publishes to pypi
  • _test_release.yml for publishing to test.pypi

Error

We were seeing errors in your releases, e.g. in this workflow run: https://github.com/langchain-ai/langchain/actions/runs/11602468120/job/32307568692

Configuration of test release - 2 main things that look weird are /legacy/ and repository_url (we configure repository-url per docs)

Run pypa/gh-action-pypi-publish@release/v1
  with:
    packages-dir: libs/core/dist/
    verbose: true
    print-hash: true
    repository-url: https://test.pypi.org/legacy/
    skip-existing: true
    user: __token__
    repository_url: https://upload.pypi.org/legacy/
    packages_dir: dist
    verify_metadata: true
    skip_existing: false
    print_hash: false
    attestations: true
  env:
    POETRY_VERSION: 1.7.1
    PYTHON_VERSION: [3](https://github.com/langchain-ai/langchain/actions/runs/11602468120/job/32307568692#step:5:3).10

Logs - partially redacted

Checking libs/core/dist/langchain_core-0.3.14-py3-none-any.whl: PASSED
Checking libs/core/dist/langchain_core-0.3.14.tar.gz: PASSED
Notice: Generating and uploading digital attestations
Fulcio client using URL: https://fulcio.sigstore.dev
TUF metadata: /root/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev
TUF targets cache: /root/.cache/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev
Found and verified trusted root
Generating ephemeral keys...
Requesting ephemeral certificate...
Retrieving signed certificate...
Found <Name(O=sigstore.dev,CN=sigstore-intermediate)> as issuer, verifying if it is a ca
attempting to verify SCT with key ID xxx
Successfully verified SCT...
DSSE PAE: xxx
proposed: xxx
integrated: xxx
Transparency log entry created with index: 145293525
DSSE PAE: xxx
proposed: xxx
integrated: xxx
Transparency log entry created with index: 145293526
Showing hash values of files to be uploaded:
/github/workspace/libs/core/dist/langchain_core-0.3.14-py3-none-any.whl

SHA256: xxx
MD5: xxx
BLAKE2-256: xxx

/github/workspace/libs/core/dist/langchain_core-0.3.14.tar.gz

SHA256: xxx 
MD5: xxx
BLAKE2-256: xxx

/github/workspace/libs/core/dist/langchain_core-0.3.14-py3-none-any.whl.publish.attestation

SHA256: xxx
MD5: xxx
BLAKE2-256: xxx

/github/workspace/libs/core/dist/langchain_core-0.3.14.tar.gz.publish.attestation

SHA256: xxx
MD5: xxx
BLAKE2-256: xxx

Uploading distributions to https://test.pypi.org/legacy/
INFO     libs/core/dist/langchain_core-0.3.14-py3-none-any.whl (399.1 KB)       
INFO     libs/core/dist/langchain_core-0.3.14.tar.gz (320.2 KB)                 
INFO     password set by command options                                        
INFO     username: __token__                                                    
INFO     password: <hidden>                                                     
Uploading langchain_core-0.3.14-py3-none-any.whl
INFO     Response from https://test.pypi.org/legacy/:                           
         400 Bad Request                                                        
INFO     <html>                                                                 
          <head>                                                                
           <title>400 Could not verify the uploaded artifact using the included 
         attestation: Verification failed: 0 of 2 policies succeeded</title>    
          </head>                                                               
          <body>                                                                
           <h1>400 Could not verify the uploaded artifact using the included    
         attestation: Verification failed: 0 of 2 policies succeeded</h1>       
           The server could not comply with the request since it is either      
         malformed or otherwise incorrect.<br/><br/>                            
         Could not verify the uploaded artifact using the included attestation: 
         Verification failed: 0 of 2 policies succeeded                         
                                                                                
                                                                                
          </body>                                                               
         </html>                                                                
ERROR    HTTPError: 400 Bad Request from https://test.pypi.org/legacy/

Temporary Fix

langchain-ai/langchain#27765

We turned off attestations with attestations: false

@webknjaz
Copy link
Member

FYI reusable workflows are still not properly supported on the PyPI side. Some things happen to work almost by accident. There's a tracking issue somewhere in the Warehouse repo discussing this.

cc @woodruffw

@webknjaz webknjaz added invalid This doesn't seem right question Further information is requested labels Oct 30, 2024
@kesara
Copy link

kesara commented Oct 31, 2024

This worked for me when pubslishing for test PyPI, but I got the same error when the GHA tries to publish to real PyPI.

Test PyPI:

Checking dist/svgcheck-0.9.0.tar.gz: PASSED
Notice: Generating and uploading digital attestations
Fulcio client using URL: https://fulcio.sigstore.dev/
TUF metadata: /root/.local/share/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev
TUF targets cache: /root/.cache/sigstore-python/tuf/https%3A%2F%2Ftuf-repo-cdn.sigstore.dev
Found and verified trusted root
Generating ephemeral keys...
Requesting ephemeral certificate...
Retrieving signed certificate...
Found <Name(O=sigstore.dev,CN=sigstore-intermediate)> as issuer, verifying if it is a ca
attempting to verify SCT with key ID dd3d306ac6c7113263191e1c99673702a24a5eb8de3cadff878a72802f29ee8e
Successfully verified SCT...

PyPI:

Checking dist/svgcheck-0.9.0.tar.gz: PASSED
Checking dist/svgcheck-0.9.0.tar.gz.publish.attestation: ERROR    InvalidDistribution: Unknown distribution format:                      
         'svgcheck-0.9.0.tar.gz.publish.attestation'

kesara added a commit to ietf-tools/svgcheck that referenced this issue Oct 31, 2024
@webknjaz
Copy link
Member

Thanks for the report. It would be useful to have a link to the log. And I even more useful if the job is restarted in debug mode.
Meanwhile, we'll have to wait for @woodruffw to see this and take a look from his side.

@kesara
Copy link

kesara commented Oct 31, 2024

@webknjaz
Copy link
Member

Thanks! I wonder where this is coming from..

Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'coverage'
Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'coverage'

@webknjaz
Copy link
Member

Extracting the artifacts from that job for history (they will be garbage-collected in a while) and further inspection:
artifacts.zip

@kesara
Copy link

kesara commented Oct 31, 2024

Thanks! I wonder where this is coming from..

Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'coverage'
Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'coverage'

Looks that's from a left over old test file, It's unrelated to this issue.

@woodruffw
Copy link
Member

woodruffw commented Oct 31, 2024

Thanks for the details @kesara! This one is weird: that error is coming from twine itself, not from PyPI.

From a quick look, I think what happened here is that I forgot to add a skip for attestations in twine check. I can root cause that a bit more in a moment.

Edit: Yep, that's what it looks like. I'll work on a fix now.

woodruffw added a commit to woodruffw-forks/twine that referenced this issue Oct 31, 2024
This fixes a bug that I accidentally introduced with
attestations support: `twine upload` learned the difference
between distributions and attestations, but `twine check`
didn't.

As a result, `twine check dist/*` would fail with
an `InvalidDistribution` error whenever attestations are
present in the dist directory, like so:

```
Checking dist/svgcheck-0.9.0.tar.gz: PASSED
Checking dist/svgcheck-0.9.0.tar.gz.publish.attestation: ERROR    InvalidDistribution: Unknown distribution format:
         'svgcheck-0.9.0.tar.gz.publish.attestation'
```

This fixes the behavior of `twine check` by having it
skip attestations in the input list, like it does with
`.asc` signatures. To do this, I reused the `_split_inputs`
helper that was added with pypa#1095, meaning that `twine upload`
and `twine check` now have the same input splitting/filtering
logic.

See pypa/gh-action-pypi-publish#283
for some additional breakage context.

Signed-off-by: William Woodruff <[email protected]>
@woodruffw
Copy link
Member

pypa/twine#1172 should fix the behavior observed by @kesara!

For @efriis: @webknjaz is right that reusable workflow support is currently unfortunately limited on PyPI. My colleague @facutuesca and I are currently working on improving that, but in the mean time most users should stick to non-reusable workflows for Trusted Publishing.

sigmavirus24 pushed a commit to pypa/twine that referenced this issue Oct 31, 2024
This fixes a bug that I accidentally introduced with
attestations support: `twine upload` learned the difference
between distributions and attestations, but `twine check`
didn't.

As a result, `twine check dist/*` would fail with
an `InvalidDistribution` error whenever attestations are
present in the dist directory, like so:

```
Checking dist/svgcheck-0.9.0.tar.gz: PASSED
Checking dist/svgcheck-0.9.0.tar.gz.publish.attestation: ERROR    InvalidDistribution: Unknown distribution format:
         'svgcheck-0.9.0.tar.gz.publish.attestation'
```

This fixes the behavior of `twine check` by having it
skip attestations in the input list, like it does with
`.asc` signatures. To do this, I reused the `_split_inputs`
helper that was added with #1095, meaning that `twine upload`
and `twine check` now have the same input splitting/filtering
logic.

See pypa/gh-action-pypi-publish#283
for some additional breakage context.

Signed-off-by: William Woodruff <[email protected]>
@efriis
Copy link
Author

efriis commented Oct 31, 2024

Got it. Would the recommendation be to combine the release and test release workflows just into a single one? I believe we separated them to make sure the test release step could never publish to pypi.org. When we registered our publishers, we have the test.pypi entries locked to _test_release.yml, and pypi entries locked to _release.yml.

@efriis
Copy link
Author

efriis commented Oct 31, 2024

appreciate the quick look @woodruffw !

@facutuesca
Copy link
Contributor

facutuesca commented Oct 31, 2024

Got it. Would the recommendation be to combine the release and test release workflows just into a single one? I believe we separated them to make sure the test release step could never publish to pypi.org. When we registered our publishers, we have the test.pypi entries locked to _test_release.yml, and pypi entries locked to _release.yml.

@efriis I think you can keep them in separate files, and make _release.yml depend on _test_release.yml by using the workflow_run event. That way, _test_release is no longer called by _release, but rather _release runs after _test_release is successful, which means _test_release is no longer used as a reusable workflow.

That way you should be able to keep the scope for the PyPI Trusted Publisher to just _release.yml, and still be able to run the test workflow before it.

@facutuesca
Copy link
Contributor

There's a tracking issue somewhere in the Warehouse repo discussing this

For visibility: pypi/warehouse#11096

@webknjaz
Copy link
Member

Got it. Would the recommendation be to combine the release and test release workflows just into a single one? I believe we separated them to make sure the test release step could never publish to pypi.org. When we registered our publishers, we have the test.pypi entries locked to _test_release.yml, and pypi entries locked to _release.yml.

@efriis you can achieve this by naming the environments pypi and testpypi, which is something I strongly recommend.

@webknjaz
Copy link
Member

webknjaz commented Nov 1, 2024

Thanks! I wonder where this is coming from..

Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'coverage'
Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'coverage'

Looks that's from a left over old test file, It's unrelated to this issue.

@kesara I'm confused since it appears under the step running this action. And it's running in a container. Are you saying that you're leaving something on disk that ends up in the container as well?

Oh, wait… You're running the build within the same job as publishing. Giving OIDC privileges to your build tool chain is insecure and may lead to various impersonation / privilege elevation scenarios. Using trusted publishing this way is discouraged.

FWIW, this suggests that you may be leaving various incompatible artifacts in the dist/ folder, which is known to break twine in the past. So it's kinda related.

Try following the guide closely, only keeping the download step and the upload step in the publishing job: https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/#publishing-the-distribution-to-pypi. Perhaps, using a clean environment would help you avoid the issue of illegal files in dist/ breaking things.

@webknjaz
Copy link
Member

webknjaz commented Nov 1, 2024

@kesara if what William suggests is correct, https://github.com/marketplace/actions/pypi-publish#disabling-metadata-verification might be a workaround too.

@webknjaz
Copy link
Member

webknjaz commented Nov 1, 2024

@woodruffw so is this report caused only caused by reusable workflows? Can you confirm that the logs are consistent with this explanation, so I could go ahead and close the issue?

jnooree added a commit to seoklab/nurikit that referenced this issue Nov 1, 2024
@kesara
Copy link

kesara commented Nov 1, 2024

@webknjaz Thanks. I'll review publishing workflow.

kesara added a commit to ietf-tools/svgcheck that referenced this issue Nov 1, 2024
@woodruffw
Copy link
Member

@woodruffw so is this report caused only caused by reusable workflows? Can you confirm that the logs are consistent with this explanation, so I could go ahead and close the issue?

Yep, that's correct -- what's happening here is that attestations against the "build config" identity, which is the top-level initiating workflow. In non-reusable cases that might be something like release-event -> release.yml, but in the reusable workflow case it might be something like release-event -> release.yml -> pypi-publish.yml, where the attestation expects release.yml but the user has pypi-publish.yml configured as their trusted publisher.

@woodruffw
Copy link
Member

Thanks @davidism! Yeah, this should be fixed in twine on main, and just needs a new release. I can ask to see if they'd be willing to make one.

@webknjaz
Copy link
Member

@woodruffw that wouldn't be enough since you added an explicit check for this in our action to also fail on pre-existing attestations on disk.

@davidism in general, I recommend having separate jobs. But an immediate workaround would be to add a step between the action removing the attestation files that the first step produced.

@jpmckinney
Copy link

in general, I recommend having separate jobs. But an immediate workaround would be to add a step between the action removing the attestation files that the first step produced.

In the case of using the action twice in the same job – will this workaround still be needed once there is a new release of twine? (>5.1.0)

jpmckinney added a commit to open-contracting/lib-cove-oc4ids that referenced this issue Nov 25, 2024
@asmeurer
Copy link

asmeurer commented Nov 25, 2024

According to the previous comment yes because there's also a check in this workflow itself.

It would be great if you could fix the workflow so that this just works here, even if it isn't recommended. I have a lot of release workflows that upload to testpypi then upload to real pypi that used to work just fine and it would be nice if they could continue to work.

@kesara
Copy link

kesara commented Nov 25, 2024

I agree with @asmeurer. If a workflow going to break things that used to work, major version update should have been better.

@webknjaz
Copy link
Member

Maybe if we'd known before. But that ship has sailed. This wasn't a public API change, so SemVer expectations don't apply. You can clean the artifacts between the runs if you want, but that's the extent of it. The validation exists to catch a security condition of accidentally leaving incorrect artifacts on disk produced by something outside the action invocation. And it did its job — it found misconfiguration. So it works as one would expect. The only action item I see is documenting this, as it's probably not very obvious.

kesara added a commit to ietf-tools/xml2rfc that referenced this issue Nov 25, 2024
@asmeurer
Copy link

I'm not following how the ship has sailed, especially if it's not a public API change.

I will say that if your goal here is to push attestations as a "good thing" for the ecosystem, then this problem goes against that goal, because from a high level, the appearance here is that things worked before attestations and they broke after attestations. It's not a great feeling or a great look. Telling people that they now need to rewrite their complex workflows that used to work "just because" isn't exactly meeting people where they are.

I freely admit I don't understand how attestations work, and your description doesn't even make sense to me (how does something get "left on disk" in a GitHub action? Aren't github actions runs self contained?). But that's the point. I'm sure most people are like me in that their eyes glaze over when faced with technical descriptions of why their GitHub Actions started breaking for no reason. It just feels like things used to work and now they don't, and it's clearly because of a change made in this repo, because they stopped working when dependabot updated this action.

kesara added a commit to ietf-tools/xml2rfc that referenced this issue Nov 26, 2024
* ci: Refactor publish input

* ci: Add ability skip test PyPI publishing

* ci: Set attestations: false

See pypa/gh-action-pypi-publish#283
@woodruffw
Copy link
Member

I freely admit I don't understand how attestations work, and your description doesn't even make sense to me (how does something get "left on disk" in a GitHub action?

FWIW, this is because the steps within a single job on GitHub actions all share the same filesystem (and all other machine state). The filesystem only disappears in between jobs, because each job gets scheduled on its own VM.

I think @webknjaz has a fair point about public API surface here (and I'll note that the change here isn't intrinsically related to attestations; it's just the first feature to surface it). At the same time, I think it's unfortunate that we caused any degree of public workflow breakage here.

In the short term, you can work around this by setting attestations: false on the TestPyPI step -- attestations don't really matter for TestPyPI anyways, and that'll prevent the file conflict. I'll think some more about a longer-term solution tonight (I have ideas around enforcing domain separation between PyPI and TestPyPI artifacts that would potentially allow us to handle these kinds of cases).

@webknjaz
Copy link
Member

webknjaz commented Nov 26, 2024

(timeline race condition note: I started typing this in before William's comment but sent later)

@asmeurer I hear you! Yes, it's frustrating. Ironically, the post you linked was written by the very same person who added this check...

I think it's fair to want things to be explained more simply and clearly. Although, from the position of a maintainer, I feel like there's a lot of pressure to “get things right”, which turns into an unreasonably long list of feature requests demands; especially for stuff that should've never worked in the first place / worked by an unhappy accident. This is why I'm so protective of my world-writable TODO-list.

The official guide has existed from the very first day of this action's release: https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/. It changed with the action's requirements and reflected what needed to be done as new ways appeared. William also made an effort to update GitHub's docs (https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-pypi#updating-your-github-actions-workflow) and starter workflows (https://github.com/actions/starter-workflows/blob/main/ci/python-publish.yml), which is a cumbersome process, so that people wouldn't get incorrect recommendations from GH itself.

I understand that maybe the guide wasn't explaining the necessity to have job separation at first, but that improved over time. There's been discussions / issues in-repo about this for so long that I was under the impression that the guide mentioned this too...

I guess my point is that many people chose not to follow the documented way of use, and I don't feel like they should be in position to request that things work a certain way if they opted out of the supported method.

As for the possible improvement, perhaps we could turn it into a big fat warning for some while and delete the attestations from disk instead. But give some deadline for people to migrate. That is, if @woodruffw would say it's secure enough.

I freely admit I don't understand how attestations work, and your description doesn't even make sense to me (how does something get "left on disk" in a GitHub action?

Addressing this separately: we check if there's no attestations in the dist folder, we call pypi_attestations to do the signing magic, which then puts attestation files on disk, right next to the dists, into the same folder. And then, we upload all that.
So when you call the action for the second time, the attestations are present from the previous run and that first sanity check fails. I think William thought that this would indicate that the end-users attempted to create these files by hand somehow, in ways that wouldn't work.

Aren't github actions runs self contained?

To an extent. Most of the actions have a ton of side effects. I initially chose the Docker-based action type to reduce the number of side effects. So it wouldn't normally affect your Python install / virtualenv. But still, the workdir is mounted into the container, allowing the action to read the dists you're feeding into it and put more “output” files in there.

@woodruffw
Copy link
Member

As for the possible improvement, perhaps we could turn it into a big fat warning for some while and delete the attestations from disk instead. But give some deadline for people to migrate. That is, if @woodruffw would say it's secure enough.

I was discussing this with @facutuesca, who reached a similar conclusion as a workaround. I think this is fine from a security perspective (since the action itself is responsible for producing attestations), but it could be a source of user confusion in terms of why files are disappearing/being clobbered.

So, I'd be okay with this as a warning + a fixed disablement timeline, as long as there's consensus that this will minimize any user confusion and doesn't make it harder than necessary for people to consume attestations 🙂. I think that latter part should be fine, since we're planning on stuffing the attestations inside of output variables anyways.

I think William thought that this would indicate that the end-users attempted to create these files by hand somehow, in ways that wouldn't work.

Yep, that was the majority of the rationale! The other rationale is that even if it would work, having a conflict between attestations indicates a confused state somewhere (as it does in this case!), and that we'd be better off failing and trying to understand it rather than silently letting it through.

That's what ended up happening here and now we understand why it's happening, so I'm OK with tweaking the behavior now that it's a "known known" instead of a "known unknown."

@webknjaz
Copy link
Member

@woodruffw do we want to keep producing the artifacts as outputs in the dist directory, provided we don't currently set any outputs? If not, we could even go as far as writing them into a temporary directory instead of the workdir and then passing that to twine upload so they wouldn't leak into the host filesystem. That's what I mentioned to @facutuesca like 15 minutes ago on IRC... I just didn't remember if we discussed files being “promised outputs” explicitly.

I think we should make said warning very verbose and outline very specific steps to take to “migrate”. This would tie nicely with #305 (comment), I imagine.

@ThiefMaster
Copy link

So I just got bitten in the ass by this bug as well, and to quote myself (even after having read about attestations a while ago), my rant to my coworker was literally "what was this attestation BS again that just broke my release".

Would have been nice, if twine had had a release containing the fix for this...

TL;DR: Attestations are probably a useful feature, but bugs like this just make things annoying for everyone. Especially since fixing those require cleaning up and force-pushing the tag, and likely even enabling skipping of existing releases for the test pypi publish job since that one succeeded of course...

ThiefMaster added a commit to indico/flask-multipass that referenced this issue Nov 27, 2024
@woodruffw
Copy link
Member

This project doesn't control twine's release schedule.

TL;DR: Attestations are probably a useful feature, but bugs like this just make things annoying for everyone.

I understand your frustration, and I'm sorry that this bug bit you. At the same time, I don't think it's true that this makes things "annoying for everything": public upload statistics indicate that >99% of workflows were upgraded seamlessly.

Of the workflows that are experiencing problems, the majority are of a shape that was already known to be problematic (reusable workflows), and a minority are of a new shape (stacking publish steps on top of each other). It's worth noting that the top-level issue here is for the majority case, not the case you're experiencing.

We're going to improve the failure mode here as well (I'm going to be working on it in the coming days), but I don't think there's much value in overstating the case.

@webknjaz webknjaz unpinned this issue Dec 10, 2024
jtbandes added a commit to foxglove/schemas that referenced this issue Dec 11, 2024
Running a publish to TestPyPI immediately followed by a publish to the
"real" PyPI triggered [an
error](https://github.com/foxglove/schemas/actions/runs/12283736365/job/34277883747)
due to a file already existing. The workaround recommended in
pypa/gh-action-pypi-publish#283 (comment)
is to set `attestations: false` on the TestPyPI publish.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid This doesn't seem right question Further information is requested
Projects
None yet
Development

No branches or pull requests

9 participants