universal-lock: clean up lock file format #3611

Closed
9 tasks done
Tracked by #3347
BurntSushi opened this issue May 15, 2024 · 7 comments
Labels
preview Experimental behavior

Comments

@BurntSushi
Member

BurntSushi commented May 15, 2024

In #3314, an initial version of a universal lock file format was added. It came with a lot of TODOs in the code and some uncertainties about the data model. Before we declare the lock file ready for users, we should take another pass over it and smooth things out. Here is a non-exhaustive list of things:

  • Audit handling of hashes in the lockfile #4924
  • There are some redundancies in the current format where we repeat information. For example, if we have a path dependency to a wheel, the path is encoded in the source (part of the distribution's ID) and then again in a wheel table. The same happens for sdists I believe. I think we would ideally not have wheel or sdist tables for distributions that have a path or directory or git source. I think the rule should be that an sdist and wheel table are only present if it's possible for more than one of them to exist. I think that only happens with registry dependencies.
  • Don't store absolute paths for path dependencies (iron out how to represent path dependencies in the universal lock file #3506)
  • The actual TOML serialization format is a bit bulky right now. I think it uses table arrays a bit too much. I'd love to, for example, squash wheels down into an inline table or an array of strings.
  • Consider indenting [distribution.dependencies] to make the file easier to scan.
  • Improve the formatting of source, specifically paths, e.g. we currently have source = "editable+." which looks bad.
  • The git source could use an audit and a potential refactor. We need to decide on how we want to encode the various git information (revision, tag, branch, sub-directory, URL).
  • We should be able to drop the source (and possibly also version) from a distribution.dependency entry when there is only one distribution with that package name.
  • Explore whether merge conflicts are created in basic cases. For example, if two different PRs add different dependencies and one gets merged, does the other PR wind up with conflicts that must be merged by hand?
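
As an illustration of the item about dropping `source` and `version` from dependency entries, here is a minimal Python sketch. It is not uv's implementation: the function name `dependency_entry`, the dictionary layout, and the `package-b` version are hypothetical, chosen only to show the rule "disambiguate only when more than one distribution shares a package name".

```python
from collections import Counter


def dependency_entry(dep, name_counts):
    """Return the fields to serialize for one dependency reference."""
    entry = {"name": dep["name"]}
    # Version and source are only needed to tell apart several locked
    # distributions that share the same package name.
    if name_counts[dep["name"]] > 1:
        entry["version"] = dep["version"]
        entry["source"] = dep["source"]
    return entry


# Hypothetical resolved distributions; package-a is locked twice (a fork).
SOURCE = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/"
distributions = [
    {"name": "package-a", "version": "4.3.0", "source": SOURCE},
    {"name": "package-a", "version": "4.4.0", "source": SOURCE},
    {"name": "package-b", "version": "1.0.0", "source": SOURCE},
]

name_counts = Counter(d["name"] for d in distributions)
entries = [dependency_entry(d, name_counts) for d in distributions]

for entry in entries:
    print(entry)
```

With this rule, the two `package-a` entries keep their `version` and `source`, while `package-b` is reduced to its name.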
@charliermarsh
Member

charliermarsh commented May 18, 2024

Another piece here: we currently lose the file size (reported by the registry), which is used in downloading to facilitate prioritization (i.e., we start larger downloads earlier).

(Done in #3652.)
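
To show why the recorded size matters, here is a small Python sketch of size-based prioritization. The thread does not describe uv's actual scheduler, so the `download_order` function and the handling of missing sizes are assumptions; the file names and sizes are taken from the lock file examples in this thread.

```python
def download_order(files):
    """Order files so that larger downloads start first.

    Files without a known size (None) are placed last, since they
    cannot be prioritized. This fallback is an assumption of the sketch.
    """
    return sorted(files, key=lambda f: (f["size"] is None, -(f["size"] or 0)))


files = [
    {"name": "typing_extensions-4.10.0-py3-none-any.whl", "size": 33926},
    {"name": "flask-3.0.2.tar.gz", "size": 675248},
    {"name": "unknown-size.whl", "size": None},  # hypothetical entry
    {"name": "flask-3.0.2-py3-none-any.whl", "size": 101300},
]

ordered_names = [f["name"] for f in download_order(files)]
print(ordered_names)
```

If the size is lost when the lock file is written, every entry falls into the "unknown" case and the ordering carries no information.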

charliermarsh added a commit that referenced this issue May 19, 2024
ibraheemdev added a commit that referenced this issue May 30, 2024
## Summary

This PR changes the lock-file format to use inline tables for wheels and
source distributions, which currently use separate tables that make the
file harder to follow.

```diff
[[distribution]]
name = "typing-extensions"
version = "4.10.0"
source = "registry+https://pypi.org/simple"

- [distribution.sdist]
- url = "https://files.pythonhosted.org/packages/16/3a/0d26ce356c7465a19c9ea8814b960f8a36c3b0d07c323176620b7b483e44/typing_extensions-4.10.0.tar.gz"
- hash = "sha256:b0abd7c89e8fb96f98db18d86106ff1d90ab692004eb746cf6eda2682f91b3cb"
- size = 77558
-
- [[distribution.wheel]]
- url = "https://files.pythonhosted.org/packages/f9/de/dc04a3ea60b22624b51c703a84bbe0184abcd1d0b9bc8074b5d6b7ab90bb/typing_extensions-4.10.0-py3-none-any.whl"
- hash = "sha256:69b1a937c3a517342112fb4c6df7e72fc39a38e7891a5730ed4985b5214b5475"
- size = 33926

+ sdist = { url = "https://files.pythonhosted.org/packages/16/3a/0d26ce356c7465a19c9ea8814b960f8a36c3b0d07c323176620b7b483e44/typing_extensions-4.10.0.tar.gz", hash = "sha256:b0abd7c89e8fb96f98db18d86106ff1d90ab692004eb746cf6eda2682f91b3cb", size = 77558 }
+ wheel = [{ url = "https://files.pythonhosted.org/packages/f9/de/dc04a3ea60b22624b51c703a84bbe0184abcd1d0b9bc8074b5d6b7ab90bb/typing_extensions-4.10.0-py3-none-any.whl", hash = "sha256:69b1a937c3a517342112fb4c6df7e72fc39a38e7891a5730ed4985b5214b5475", size = 33926 }]
```

The downside is that the inline tables end up quite long, and TOML
doesn't yet support line breaks in inline tables.

Part of #3611.
konstin added a commit that referenced this issue Jun 27, 2024
Use indented inline tables for `distribution.dependencies`, `distribution.optional-dependencies` and `distribution.dev-dependencies`.

The new style is more concise (see examples below) and it makes the association between a distribution and its dependencies clearer (previously, the distribution and each of its dependencies were individual `[[...]]` blocks separated by newlines). The style is optimized for small, meaningful diffs by placing each dependency on a single line with a final trailing comma. Whenever a dependency is added, removed or changed, there should be a one-line diff in `distribution.dependencies`. The final trailing comma ensures that adding a dependency doesn't change the preceding line.
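
The effect of the trailing comma on diffs can be checked with a short Python sketch built on the standard `difflib` module. The helpers `render_dependencies` and `changed_lines` are hypothetical and only mimic the layout shown in the examples; sorting the names alphabetically and the added `zipp` dependency are assumptions of the sketch.

```python
import difflib


def render_dependencies(names, trailing_comma=True):
    """Render a dependencies array with one inline table per line."""
    entries = [f'    {{ name = "{name}" }}' for name in sorted(names)]
    if trailing_comma:
        body = [entry + "," for entry in entries]
    else:
        # Without a trailing comma, the last entry has no comma.
        body = [entry + "," for entry in entries[:-1]] + entries[-1:]
    return ["dependencies = [", *body, "]"]


def changed_lines(before, after):
    """Return only the added and removed lines of a unified diff."""
    diff = difflib.unified_diff(before, after, lineterm="", n=0)
    return [
        line for line in diff
        if line[:1] in "+-" and not line.startswith(("+++", "---"))
    ]


old, new = ["anyio", "seeds"], ["anyio", "seeds", "zipp"]

with_comma = changed_lines(
    render_dependencies(old), render_dependencies(new)
)
without_comma = changed_lines(
    render_dependencies(old, trailing_comma=False),
    render_dependencies(new, trailing_comma=False),
)

print(len(with_comma), len(without_comma))
```

With the trailing comma, only the new `zipp` line shows up in the diff. Without it, the `seeds` line is also rewritten because it gains a comma.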

Part of #3611

## Examples

### Simple workspace package

Before:
```toml
[[distribution]]
name = "bird-feeder"
version = "1.0.0"
source = "editable+packages/bird-feeder"

[[distribution.dependencies]]
name = "anyio"

[[distribution.dependencies]]
name = "seeds"
```

After:
```toml
[[distribution]]
name = "bird-feeder"
version = "1.0.0"
source = "editable+packages/bird-feeder"
dependencies = [
    { name = "anyio" },
    { name = "seeds" },
]
```

### Flask

Before:
```toml
[[distribution]]
name = "flask"
version = "3.0.2"
source = "registry+https://pypi.org/simple"
sdist = { url = "https://files.pythonhosted.org/packages/3f/e0/a89e8120faea1edbfca1a9b171cff7f2bf62ec860bbafcb2c2387c0317be/flask-3.0.2.tar.gz", hash = "sha256:822c03f4b799204250a7ee84b1eddc40665395333973dfb9deebfe425fefcb7d", size = 675248 }
wheels = [{ url = "https://files.pythonhosted.org/packages/93/a6/aa98bfe0eb9b8b15d36cdfd03c8ca86a03968a87f27ce224fb4f766acb23/flask-3.0.2-py3-none-any.whl", hash = "sha256:3232e0e9c850d781933cf0207523d1ece087eb8d87b23777ae38456e2fbe7c6e", size = 101300 }]

[[distribution.dependencies]]
name = "blinker"

[[distribution.dependencies]]
name = "click"

[[distribution.dependencies]]
name = "itsdangerous"

[[distribution.dependencies]]
name = "jinja2"

[[distribution.dependencies]]
name = "werkzeug"

[distribution.optional-dependencies]

[[distribution.optional-dependencies.dotenv]]
name = "python-dotenv"
```

After:
```toml
[[distribution]]
name = "flask"
version = "3.0.2"
source = "registry+https://pypi.org/simple"
sdist = { url = "https://files.pythonhosted.org/packages/3f/e0/a89e8120faea1edbfca1a9b171cff7f2bf62ec860bbafcb2c2387c0317be/flask-3.0.2.tar.gz", hash = "sha256:822c03f4b799204250a7ee84b1eddc40665395333973dfb9deebfe425fefcb7d", size = 675248 }
dependencies = [
    { name = "blinker" },
    { name = "click" },
    { name = "itsdangerous" },
    { name = "jinja2" },
    { name = "werkzeug" },
]
wheels = [{ url = "https://files.pythonhosted.org/packages/93/a6/aa98bfe0eb9b8b15d36cdfd03c8ca86a03968a87f27ce224fb4f766acb23/flask-3.0.2-py3-none-any.whl", hash = "sha256:3232e0e9c850d781933cf0207523d1ece087eb8d87b23777ae38456e2fbe7c6e", size = 101300 }]

[distribution.optional-dependencies]
dotenv = [
    { name = "python-dotenv" },
]
```

### Forking

Before:
```toml
[[distribution]]
name = "project"
version = "0.1.0"
source = "editable+."

[[distribution.dependencies]]
name = "package-a"
version = "4.3.0"
source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/"
marker = "sys_platform == 'darwin'"

[[distribution.dependencies]]
name = "package-a"
version = "4.4.0"
source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/"
marker = "sys_platform == 'linux'"

[[distribution.dependencies]]
name = "package-b"
marker = "sys_platform == 'linux'"

[[distribution.dependencies]]
name = "package-c"
marker = "sys_platform == 'darwin'"
```

After:
```toml
[[distribution]]
name = "project"
version = "0.1.0"
source = "editable+."
dependencies = [
    { name = "package-a", version = "4.3.0", source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/", marker = "sys_platform == 'darwin'" },
    { name = "package-a", version = "4.4.0", source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/", marker = "sys_platform == 'linux'" },
    { name = "package-b", marker = "sys_platform == 'linux'" },
    { name = "package-c", marker = "sys_platform == 'darwin'" },
]
```
@BurntSushi
Member Author

For testing whether conflicts happen with independent updates to the lock file, I did something like this. To start:

$ cat > pyproject.toml <<EOF
[project]
name = 'project'
version = '0.1.0'
requires-python = '>=3.12'
dependencies = [
  'anyio<4.4',
  'click',
  'Flask<3.0.3',
]
EOF
$ uv lock

Then I made two branches from this point: one where I changed anyio<4.4 to anyio<5 and re-locked, and another where I changed Flask<3.0.3 to Flask<4 and re-locked. In both cases, the lock file is updated to new versions of packages. I then went back to master and tried to merge each of the branches, one after the other (in both orders). The result is that they both merge cleanly.

I then tried this same example with Poetry. Starting with:

$ cat > pyproject.toml <<EOF
[tool.poetry]
name = "project"
version = "0.1.0"
description = ""
authors = ["someone"]

[tool.poetry.dependencies]
python = "^3.12"
anyio = "<4.4"
click = "*"
Flask = "<3.0.3"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
EOF
$ poetry lock

I then repeated the same process as above: creating two different branches from master, updating anyio in one and Flask in another and then re-locking. I then went back to master and tried to merge the anyio update branch, which worked, but then trying to merge the Flask update branch results in a merge conflict:

$ git merge --no-ff ag/update-flask
Auto-merging poetry.lock
CONFLICT (content): Merge conflict in poetry.lock
Auto-merging pyproject.toml
Recorded preimage for 'poetry.lock'
Automatic merge failed; fix conflicts and then commit the result.

$ git diff
diff --cc poetry.lock
index d5a5b19,acd9943..0000000
--- a/poetry.lock
+++ b/poetry.lock
@@@ -217,4 -217,4 +217,8 @@@ watchdog = ["watchdog (>=2.3)"
  [metadata]
  lock-version = "2.0"
  python-versions = "^3.12"
++<<<<<<< HEAD
 +content-hash = "a84aedf401179b44c4e263992f3be48a231859fbe39edb2d8e315db13cc7f2ce"
++=======
+ content-hash = "5ebe45498202bf09fe0279c1096870f7f6e766ff0b09cd255f80d4aa31f723a4"
++>>>>>>> ag/update-flask
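
The conflict above is confined to the `content-hash` line. A minimal Python sketch of why a single whole-project digest conflicts on independent updates follows. The exact inputs Poetry hashes are not stated in this thread, so the `content_hash` function below (SHA-256 over the sorted dependency table) is an assumption used only to illustrate the mechanism.

```python
import hashlib
import json


def content_hash(dependencies):
    """Digest of the full dependency table (assumed hashing scheme)."""
    payload = json.dumps(dependencies, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


base = {"anyio": "<4.4", "click": "*", "Flask": "<3.0.3"}
anyio_branch = {**base, "anyio": "<5"}
flask_branch = {**base, "Flask": "<4"}
merged = {**base, "anyio": "<5", "Flask": "<4"}

hashes = {
    "base": content_hash(base),
    "anyio_branch": content_hash(anyio_branch),
    "flask_branch": content_hash(flask_branch),
    "merged": content_hash(merged),
}

for name, digest in hashes.items():
    print(name, digest[:12])
```

Both branches rewrite the same line with different values, so git cannot merge them automatically. The value that is correct after the merge also matches neither side, so picking one side by hand is not enough under this scheme; the digest has to be recomputed.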

konstin added a commit that referenced this issue Jun 27, 2024
Use indented inline tables for `distribution.dependencies`,
`distribution.optional-dependencies` and
`distribution.dev-dependencies`.

The new style is more concise (see examples below) and it makes the
association between a distribution and its dependencies clearer
(previously, they were both individual `[[...]]` blocks separated by
newlines). The style is optimized for small, meaningful diffs by placing
each dependency on a single line with a final trailing comma. Whenever a
dependency is added, removed or changed, there should be a one line diff
in `distribution.dependencies`. The final trailing comma ensures that
adding a dependency doesn't change the line ahead.

Part of #3611

## Examples

### Simple workspace package

Before:
```toml
[[distribution]]
name = "bird-feeder"
version = "1.0.0"
source = "editable+packages/bird-feeder"

[[distribution.dependencies]]
name = "anyio"

[[distribution.dependencies]]
name = "seeds"
```

After:
```toml
[[distribution]]
name = "bird-feeder"
version = "1.0.0"
source = "editable+packages/bird-feeder"
dependencies = [
    { name = "anyio" },
    { name = "seeds" },
]
```

### Flask

Before:
```toml
[[distribution]]
name = "flask"
version = "3.0.2"
source = "registry+https://pypi.org/simple"
sdist = { url = "https://files.pythonhosted.org/packages/3f/e0/a89e8120faea1edbfca1a9b171cff7f2bf62ec860bbafcb2c2387c0317be/flask-3.0.2.tar.gz", hash = "sha256:822c03f4b799204250a7ee84b1eddc40665395333973dfb9deebfe425fefcb7d", size = 675248 }
wheels = [{ url = "https://files.pythonhosted.org/packages/93/a6/aa98bfe0eb9b8b15d36cdfd03c8ca86a03968a87f27ce224fb4f766acb23/flask-3.0.2-py3-none-any.whl", hash = "sha256:3232e0e9c850d781933cf0207523d1ece087eb8d87b23777ae38456e2fbe7c6e", size = 101300 }]

[[distribution.dependencies]]
name = "blinker"

[[distribution.dependencies]]
name = "click"

[[distribution.dependencies]]
name = "itsdangerous"

[[distribution.dependencies]]
name = "jinja2"

[[distribution.dependencies]]
name = "werkzeug"

[distribution.optional-dependencies]

[[distribution.optional-dependencies.dotenv]]
name = "python-dotenv"
```

After:
```toml
[[distribution]]
name = "flask"
version = "3.0.2"
source = "registry+https://pypi.org/simple"
sdist = { url = "https://files.pythonhosted.org/packages/3f/e0/a89e8120faea1edbfca1a9b171cff7f2bf62ec860bbafcb2c2387c0317be/flask-3.0.2.tar.gz", hash = "sha256:822c03f4b799204250a7ee84b1eddc40665395333973dfb9deebfe425fefcb7d", size = 675248 }
dependencies = [
    { name = "blinker" },
    { name = "click" },
    { name = "itsdangerous" },
    { name = "jinja2" },
    { name = "werkzeug" },
]
wheels = [{ url = "https://files.pythonhosted.org/packages/93/a6/aa98bfe0eb9b8b15d36cdfd03c8ca86a03968a87f27ce224fb4f766acb23/flask-3.0.2-py3-none-any.whl", hash = "sha256:3232e0e9c850d781933cf0207523d1ece087eb8d87b23777ae38456e2fbe7c6e", size = 101300 }]

[distribution.optional-dependencies]
dotenv = [
    { name = "python-dotenv" },
]
```

### Forking

Before:
```toml
[[distribution]]
name = "project"
version = "0.1.0"
source = "editable+."

[[distribution.dependencies]]
name = "package-a"
version = "4.3.0"
source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/"
marker = "sys_platform == 'darwin'"

[[distribution.dependencies]]
name = "package-a"
version = "4.4.0"
source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/"
marker = "sys_platform == 'linux'"

[[distribution.dependencies]]
name = "package-b"
marker = "sys_platform == 'linux'"

[[distribution.dependencies]]
name = "package-c"
marker = "sys_platform == 'darwin'"
```

After:
```toml
[[distribution]]
name = "project"
version = "0.1.0"
source = "editable+."
dependencies = [
    { name = "package-a", version = "4.3.0", source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/", marker = "sys_platform == 'darwin'" },
    { name = "package-a", version = "4.4.0", source = "registry+https://astral-sh.github.io/packse/0.3.29/simple-html/", marker = "sys_platform == 'linux'" },
    { name = "package-b", marker = "sys_platform == 'linux'" },
    { name = "package-c", marker = "sys_platform == 'darwin'" },
]
```
@sbidoul

sbidoul commented Jul 13, 2024

Hi, do you have any plans regarding locking of build dependencies (build-system.requires) for sdists and source trees?

@charliermarsh
Member

Not currently, but it would be a natural thing for us to support... It's somewhat expensive because it means we have to download the source distribution for all packages regardless of whether we can extract the metadata from a wheel alone.

@sbidoul

sbidoul commented Jul 13, 2024

And will there be a means for the sync/install command to abort if it has to download a build dependency that is not locked?

@konstin
Member

konstin commented Jul 15, 2024

@sbidoul Could you add a separate issue with some background on the motivation?

@charliermarsh
Member

All the issues above have been crossed off. Let's close this out for now and we can track new issues separately as they come up.
