-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CI] Migrate to pyproject.toml and use poetry for deterministic builds #201
base: master
Are you sure you want to change the base?
[CI] Migrate to pyproject.toml and use poetry for deterministic builds #201
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
f7bec24
to
7489e74
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
2 similar comments
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
b7ddd49
to
2855fda
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
2855fda
to
82deefb
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
82deefb
to
2a376c6
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
2a376c6
to
030fe75
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
1 similar comment
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
c3830b4
to
39f857c
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
1 similar comment
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
- name: "Install pre-commit" | ||
run: pip install pre-commit | ||
|
||
- name: "Run pre-commit checks" | ||
run: pre-commit run --all-files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally keep linting and static analysis jobs separate from jobs which run tests. Makes it easier to see what failed.
It looks like now pre-commit runs as part of the test jobs themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashhar the reason is to try to adhere to the DRY principle and ensure that local and remote CI are consistent. In this instance rather than explicitly calling out each test/step in the GitHub actions this is now all defined by tox. Any updates to tox.ini
will be reflected in the remote CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While that is a reasonable benefit it has it's downsides as well - namely that in case of failures you need to go to the CI run logs and see what exactly failed and unless you know what to search/grep for it's painful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are pros and cons with everything. In general adhering to the KISS principle—simplifying logic and ensure consistency—likely out weights the cons. It's worth noting the errors are present in the logs, albeit maybe having to toggle an additional chevron.
run: | | ||
pytest -s tests/ | ||
poetry run tox --parallel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is --parallel
needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashhar rather than multiple GitHub jobs this workflow is configured to have only one job and thus from a wall-clock perspective having the relevant tox environments run in parallel is more desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having single job makes it harder to inspect which env is failing without having to go to the actions page, and check the logs. Any way we can preserve current matrix behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #201 (comment). I'm really not certain whether the hypothesized concerns are problematic in reality. There's very little friction in introspecting the GitHub workflow logs to determine why CI is failing.
README.md
Outdated
engine = create_engine("trino://<username>@<host>:<port>/<catalog>/<schema>?access_token=<jwt_token>") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract no-op changes to separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashhar this is likely a default from my editor. I'll revert it. Adding the prettier
pre-commit hook helps to ensure files are formatted in a consistent manner which long term would prevent these types of no-op changes.
only unit tests, type: | ||
|
||
``` | ||
$ pytest tests/unit | ||
$ poetry run tox -e <environment> -- tests/unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to have reasonable default so that when developing locally contributors don't need to think about what env
to use. CI will do the exhaustive tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashhar local development likely wants to run multiple environments, i.e., py39
, pre-commit
, etc. Personally having used tox
for a number of years I have no issue with explicitly invoking the Python specific environment when running the unit/integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should reduce friction for contributors (another benefit being that everyone running tox
will see same env unless they specify explicitly). Generally people shouldn't need to care what env they are testing against unless the issue is env specific (then they can provide their own choice of env).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure explicitly specifying an environment is friction. It's worth noting that local CI likely requires multiple environments, i.e., a common workflow could be to first run tox -e pre-commit
to ensure linting, typing, etc. is correct before running the test via tox -e py39
say.
One could argue having a these environments—a subset of which can be run via the -e
argument—actually reduces friction as it obfuscates the actual commands from the user, i.e., rather than the user needing to know they need to run the pytest
, pre-commit
, etc. commands (along with the relevant flags/arguments) they simply invoke the respective environment through the tox
CLI which standardizes testing workflows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps adding some examples like poetry run tox -e py39 -- tests/unit
could be a reasonable consensus? Contributors usually follow README first, or we can have a sample local testing workflow described like run pre-commit then running tests etc.
rm -rf dist/ && | ||
./setup.py sdist bdist_wheel && | ||
twine upload dist/* && | ||
poetry publish --build && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any possibility of clearing old state/artifacts for safety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashhar I'm unsure. There's python-poetry/poetry#1329, however it has been open for some time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the rm -rf dist/
in the steps - I verified that Poetry indeed doesn't clean up old artifacts.
.pre-commit-config.yaml
Outdated
@@ -13,3 +13,8 @@ repos: | |||
additional_dependencies: | |||
- "types-pytz" | |||
- "types-requests" | |||
|
|||
- repo: "https://github.com/python-poetry/poetry" | |||
rev: "1.2.0b3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's switch to stable release before merging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hovaesco the reason for including a pre-release it it exposes dependency groups. It's not clear from python-poetry/poetry#5586 when the 1.2.0 release is scheduled.
Are there any major concerns with using a pre-release in the interim especially if it seems to fulfill our needs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-releases can always change and impact our current approach. Testing with pre-release makes lots of sense but I would opt to use official release in the project.
README.md
Outdated
$ python3 -m venv .venv | ||
$ . .venv/bin/activate | ||
$ pip install . | ||
$ curl -sSL https://install.python-poetry.org | python3 - --preview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wait for the official release before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comment.
pyproject.toml
Outdated
@@ -0,0 +1,66 @@ | |||
[build-system] | |||
requires = ["poetry-core>=1.1.0b3"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be 1.2.0b3
as for now?
pyproject.toml
Outdated
|
||
[tool.poetry] | ||
name = "trino-python-client" | ||
version = "0.314.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some changes in versioning guides in README.md.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look here
pytz = "*" | ||
requests = "*" | ||
requests_kerberos = { version = "*", optional = true } | ||
sqlalchemy = { version = "~1.4", optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was ~1.3 in setup.py previously, is there any specific reason for that change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hovaesco this is precisely the reason for this change, i.e., to ensure deterministic builds. Specifically when this was set to ~1.3
running,
poetry run tox -e py39 -- tests/unit
would result in the following error:
Traceback:
../../../.pyenv/versions/3.9.4/lib/python3.9/importlib/__init__.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/unit/sqlalchemy/test_dialect.py:5: in <module>
from sqlalchemy.engine import make_url
E ImportError: cannot import name 'make_url' from 'sqlalchemy.engine'
In SQLAlchemy v1.4 they exposed the make_url
function to the to the sqlachemy.engine module as previously in v1.3 one would need to import this via,
from sqlalchemy.engine.url import make_url
i.e., v1.3 is not viable given how the current code is written and thus the lower bound was wrong. This logic was added in da1441f which defined the sqlalechemy~=1.3
requirement (note the inclusion of the =
which differs from the Poetry dependency), which—per PEP 440—implies,
For a given release identifier V.N , the compatible release clause is approximately equivalent to the pair of comparison clauses:
>= V.N, == V.*
thus the sqlalechemy~=1.3
requirement is equivalent to sqlachemy>=1.3, == 1.*
. If you look at a GitHub workflow CI run (example) you can see,
Collecting sqlalchemy~=1.3
Downloading SQLAlchemy-1.4.39-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)
that the CI is installing the latest version (v1.4.39) which adheres to the requirement specification.
Note in Poetry the tilde requirements (which don't include the =
) differ slightly, so ~1.3
is equivalent to >= 1.3.0, < 1.4.0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the detailed explanation.
ac8dd71
to
83938b0
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
83938b0
to
4932a79
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
4932a79
to
d2437ef
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
d2437ef
to
8cca8e3
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
8cca8e3
to
f3e84eb
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
f3e84eb
to
70cb0b1
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
- name: "Install pre-commit" | ||
run: pip install pre-commit | ||
|
||
- name: "Run pre-commit checks" | ||
run: pre-commit run --all-files | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If those steps are removed, then I assume checks
jobs is redundant. Outstanding jobs step be moved to build
job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, it would be nice to still execute pre-commit checks in checks
job. Then build
could be run only if checks
job is successful. It will reduce runners usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hovaesco is it necessary to split out the checks vs the build? I agree that one could argue that pre-commit
performs pre-flight checks, however i) it does mean the DRY principle isn't really adhered to, and ii) you could argue (per the name) that these should actually be first run locally (alongside all the other local CI checks) at the time of the commit. This is actually the most efficient use of resources and results in faster iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot control behaviour of people who submit PRs. It's still useful to let people submit PRs in an imperfect shape and guide them to the finish line so that their next contribution is easier.
Also keeping the checks split has two benefits:
- Lesser wall-time for each check.
- Easy to see at a glance what failed with zero additional clicks (that's also why a matrix was used instead of tox for the version based tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with arguments given @hashhar that CI should still have those jobs split out + it will reduce runners usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One general comment, could you please squash commits (if possible) and rename commit messages?
70cb0b1
to
3d2464c
Compare
3d2464c
to
f97d3da
Compare
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: John Bodley.
|
f97d3da
to
1226d39
Compare
1226d39
to
4545d7c
Compare
f4d114e
to
11858f7
Compare
11858f7
to
ecc5a3b
Compare
Somewhat of a yak shaving exercise as I was initially going to look into #199 though realized that the
trino-python-client
package has no deterministic dependency management. See the Slack discussion here for more details.There's a slew of frameworks for handling Python dependencies including Pipenv, pip-tools, etc. however it seems like Poetry is well supported and integrates with
pyproject.toml
—a replacement for the legacysetup.py
. Per #201 (comment), the importance is having a dependency manager rather than the dependency manager. If necessary is should be fairly trivial to later migrate to hatch or similar.This PR:
setup.py
in favor ofpyproject.toml
.poetry.lock
file.tox
—ensuring a consistent local and remote CI workflow.to: @bitsondatadev @hashhar