Allow external dependencies #61
Conversation
Resolves python/typeshed#5768 (on the stub_uploader side).

In python/typeshed#5768 (comment), consensus was reached on Akuli's idea that external dependencies for stub packages are fine, as long as we validate that the external dependency is a dependency of the upstream package.

The most important part of the implementation is the validation we perform. This happens in `verify_typeshed_req` and `verify_external_req`. To help lock things down, Metadata does not expose access to elements of `requires` without validation.

We rely on PyPI's API to find the dependencies of the upstream. I believe this might not work if our stub has external dependencies and the upstream does not publish wheels. This is not the case currently (as proven by test), does not seem too likely, and it's unclear how to handle safely, so I will leave it as a TODO for the future.

We use uploaded_packages.txt as the source of truth for what counts as a typeshed dependency. This is important to avoid potential badness around addition and deletion, for instance, if something is added to typeshed but the distribution is created by someone else before stub_uploader uploads it.

The other set of changes is that I delete most of the graph code that existed previously. The graph code was added in typeshed-internal#1 and was previously load bearing for security: the idea was to ensure that the graph of transitive dependencies was fully contained within typeshed. This is no longer the case, so we can remove most of it.

I still have some graph code in here, but it's no longer load bearing for security. I keep it around to better preserve uploading semantics, since it seems like it could matter in some edge cases (such as multiple packages that depend on each other being uploaded for the first time). Since we don't have custom needs, we can get away with using the new-ish graphlib from the stdlib.

While most of the graph code has been removed, note that we do still run validation on the transitive dependencies of each package. This is accomplished by `recursive_verify`. I think the non-transitive validation is sufficient, but running this before uploading each package doesn't hurt.

I added some special-casing for types-gdb. As Akuli pointed out, this avoids us accidentally trusting a gdb package on PyPI if someone ever attempts to add external dependencies to types-gdb.

New code paths have tests. I audited test coverage to make sure of this.
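To make the core idea concrete, here is a rough sketch of the "external dependency must also be an upstream dependency" check. The helper names, the simplified requirement parsing, and the in-memory `upstream_requires` list are assumptions for illustration; they are not stub_uploader's actual implementation, and in practice the upstream's dependencies would come from PyPI's JSON API (`https://pypi.org/pypi/<project>/json`, `info.requires_dist`).

```python
import re


def canonical_name(name: str) -> str:
    # PEP 503 normalization: collapse runs of -, _, . to a single dash, lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def base_distribution(requirement: str) -> str:
    # Crudely strip extras, version specifiers, and environment markers from
    # a requirement string like "Foo_Bar[extra]>=1.0; python_version>='3.8'".
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"invalid requirement: {requirement!r}")
    return match.group(1)


def verify_external_req(requirement: str, upstream_requires: list[str]) -> None:
    # The external dependency is allowed only if the same distribution
    # appears among the upstream package's own dependencies.
    upstream = {canonical_name(base_distribution(r)) for r in upstream_requires}
    if canonical_name(base_distribution(requirement)) not in upstream:
        raise ValueError(
            f"{requirement!r} is not a dependency of the upstream package"
        )
```

Comparing canonicalized names matters here: PyPI treats `Foo_Bar`, `foo-bar`, and `foo.bar` as the same distribution, so a naive string comparison would be bypassable.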
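For the remaining ordering concern (packages uploaded for the first time that depend on each other), the stdlib graphlib mentioned above is enough. A minimal illustration with made-up package names:

```python
from graphlib import TopologicalSorter

# Order stub packages so dependencies upload before their dependents.
ts = TopologicalSorter()
ts.add("types-requests", "types-urllib3")  # types-requests depends on types-urllib3
ts.add("types-urllib3")                    # no typeshed dependencies
order = list(ts.static_order())            # dependencies come first
```

`graphlib` has been in the stdlib since Python 3.10, which is presumably why custom graph code is no longer needed for this.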
    def read(self) -> set[str]:
        if self._cached is not None:
            return set(map(canonical_name, self._cached))
Why not cache the canonicalized set?
Currently uploaded_packages.txt uses pretty case and I figured we might want to keep it that way
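For what it's worth, both goals are compatible: the file can stay pretty-cased on disk while the canonicalized set is computed once and cached. A hypothetical sketch (class and helper names are illustrative, not stub_uploader's actual code):

```python
import re


def canonical_name(name: str) -> str:
    # PEP 503 name normalization.
    return re.sub(r"[-_.]+", "-", name).lower()


class UploadedPackages:
    """Keeps uploaded_packages.txt pretty-cased, but caches the
    canonicalized set so read() doesn't re-normalize on every call."""

    def __init__(self, lines: list[str]) -> None:
        self._lines = lines          # pretty-cased contents, preserved as-is
        self._canonical = None       # lazily computed canonical set

    def read(self) -> set[str]:
        if self._canonical is None:
            pretty = [line.strip() for line in self._lines if line.strip()]
            self._canonical = set(map(canonical_name, pretty))
        return self._canonical
```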
    def upstream_distribution(self) -> Optional[str]:
        # TODO: add a field to METADATA.toml if the stubs are for a package
        # that does not exist on PyPI
        if self._alleged_upstream_distribution == "gdb":
This makes me think about another possible attack. What if someone creates a package `foo` and adds `types-foo` to typeshed? It can be a totally benign package. Then later they make a PR that makes `types-foo` a dependency of something commonly used, like `types-requests` (or maybe something less suspicious), e.g. because the library has some custom type that works as an argument for some `requests` function. Much later they add some malicious package `bad` as a dependency of `foo`, and submit a "trivial" PR adding it also as a dependency of `types-foo`. Now this malicious package `bad` will be installed for a lot of users (executing arbitrary code during install), despite the fact that no one may actually use `foo`.
Of course this is a tricky scenario, but someone may still try it, taking into account that the potential impact would be very high. You may say that something similar is also possible for a regular runtime package, but the difference is that maintainers of runtime packages usually deal with only a few dependencies, and adding another one is a really big deal, while typeshed maintainers will (hopefully) deal with hundreds or thousands of packages, each with their own deps.
So, I would say we will still need an explicit allowlist in this repo, in addition to the logic proposed in this PR.
I think that can be dealt with in a similar way to the current logic for external dependencies: in order to add `types-xyz` as a dependency of `types-requests`, `xyz` needs to be a dependency of `requests`.
That's probably more robust too, because whatever process we have for adding things to the allowlist will be used to add things to the allowlist. To spell things out:
1. Package `foo` exists, and depends on `bar`
2. We add `types-foo`, with a dependency on `bar`
3. In order to make `types-foo` useful, we need `bar`, so we do some security review of `bar` / the `bar` maintainer and add it to the allowlist
4. `bar` (gets hacked and) adds a dependency on `evil`
5. `types-foo` users are affected; while this isn't good, we're okay with it, since those are assumed to be a subset of `foo` users
6. Someone sneaks a dependency on `types-foo` into `types-requests` past review
7. `types-requests` users are affected; this is really bad

So adding a step 3 can help in practice, but we should prevent step 6. I can write this check; it should be pretty easy to extend the existing logic.
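The check proposed above could be sketched roughly as follows. The function name and the in-memory `upstream_deps` set are hypothetical, used here only to make the rule concrete (in practice the upstream's dependencies would be fetched from PyPI):

```python
def verify_stub_dependency(stub_dep: str, upstream_deps: set[str]) -> None:
    # types-xyz may only be a dependency of types-foo if xyz is itself
    # a dependency of the upstream foo.
    if not stub_dep.startswith("types-"):
        raise ValueError(f"expected a types- package, got {stub_dep!r}")
    runtime_dep = stub_dep[len("types-"):]
    if runtime_dep not in upstream_deps:
        raise ValueError(
            f"cannot depend on {stub_dep}: "
            f"{runtime_dep} is not an upstream dependency"
        )
```

This mirrors the existing external-dependency rule, so a `types-foo` dependency on `types-requests` would be rejected unless `foo` actually depends on `requests`.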
Note that I'm not against adding an allowlist, but I'd like to think of it more as a fallback. I'd also want us to clearly document the criteria for additions to the allowlist.
Okay, I wrote the logic for preventing step 6 locally, but it doesn't pass cleanly against existing packages. There are four failures, two of which are addressed by python/typeshed#8792 and python/typeshed#8793.
The other two are trickier, so I'll need to make that change in a later PR (it might need new METADATA.toml fields). In the meantime I've added the allowlist as proposed, which should unblock this PR.
> but it doesn't pass cleanly against existing packages

Yeah, I felt that might happen. Anyway, the PR looks good now.
Bump :-)
@hauntsaninja Hm, it looks like one of the added tests is broken on master, could you please check?