
chore: [proposal] de-matrix python-version in GHAs #27906

Merged · 9 commits · Apr 10, 2024
Conversation

@mistercrunch (Member) commented Apr 4, 2024

SUMMARY

In this PR:

  • introducing the notion of python-version names 'current' and 'next' for the setup-backend GHA, and using those semantics in the matrix + required checks. setup-backend is the only place where we install python now, so it's DRY and allows us to bump the version in one place in the future, with required-check names staying consistent
  • running pytest against the next version of python as @villebro suggested, same as we've been doing for integration tests
  • simplifying the single-item matrix (python-version) to NOT use a matrix. I'm guessing the single-item matrix is an artifact of supporting multiple versions in the past, and/or of making it easy to go multi-python-version in the future, but there's a burden associated with it, especially around how it relates to the "required checks" specified in .asf.yml
  • fixing/simplifying the related no-op workflows. We'll need new ones, but we'll be able to deprecate a bunch and simplify things. For instance, when we migrate to 3.11 in the future, we won't have to manage a bunch of python-version-specific no-ops

About supporting multiple/future versions of python, I'd argue that we should focus on a single one for a given CI run, and that if/when we need to CI against multiple versions, we run a FULL test suite punctually in a dedicated PR/branch/ref. Point being, it's expensive for every commit to validate multiple versions of python, and in many ways it's not necessary.

Currently our multi-python-version support is dubious at best, with only a few checks that run against multiple versions. I really think we should pick a single version and support it very well. If/when we want to upgrade the python version, we'd cut a PR and run CI for that purpose.

If we want to continuously, actively support multiple python versions (and I don't think we should!), I'd suggest either a release-specific procedure (the release manager using a release branch and running full CI for that version/release) or a nightly job that keeps an eye on that version of python.
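To illustrate the de-matrixing, here's a hedged before/after sketch; job and step layout are illustrative, based on the PR description, not the actual diff:

```yaml
# Before: a single-item matrix bakes the version into the required-check name
jobs:
  test-postgres:
    strategy:
      matrix:
        python-version: ["3.10"]
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

# After (sketch): no matrix; setup-backend resolves 'current' to a concrete
# version in one place, so check names stay stable across python bumps
jobs:
  test-postgres:
    steps:
      - uses: ./.github/actions/setup-backend/
        with:
          python-version: "current"
```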

```diff
 pre-commit:
   strategy:
     matrix:
-      python-version: ["3.9"]
+      python-version: ["3.9", "3.10", ""]
```
A Member commented:

Do we need to include 3.9? Let's just do 3.10

@craig-rueda (Member)

I'm all for just "officially" supporting a single PY version going forward. Most other projects do this AFAIK, so what's stopping us? :)

@raphaelauv

the only benefit of supporting multiple python versions is to increase the chance of offering more compatible client/driver libraries for source data to end users

example: I use database XXX, the python client is not compatible with python 3.11, and I want to use the latest version of Apache Superset X.Y.Z that only supports 3.11

a policy like supporting the last 2 python versions with active support could be a first step, wdyt?

@mistercrunch (Member, Author)

Right, though we need to define what "support" means. Some criteria:

  • providing an official Docker release for that version of python - probably running the CI/test suite at release time
  • running all CI unit/integration tests on that version of python at all times, ensuring that, say, master is compatible with that version most of the time
  • not doing proactive tests, but also not restricting the python package, letting people "run at their own risk" (this never seems desirable, and is kind of what we're doing currently)

What we're doing now:

  • the python package lets users install on 3.9, 3.10 and 3.11 I believe, based on the pyproject.toml
  • we run the full test suite on 3.10
  • we run one or two test suites on 3.11
  • official docker images are all 3.10

@pull-request-size pull-request-size bot added size/L and removed size/M labels Apr 9, 2024
@mistercrunch (Member, Author)

Capturing the conversation from the Release Strategy Group this morning:

  • let's support a single python version officially per release, and run CI against that one mainly
  • let's keep preventing "future regressions" on future python version by running some minimal CI against the future version of python (currently test-postgres against 3.11). For now this is the only place where we need a proper matrix. I might try to rename to "current" and "future" to address the required check issues
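A sketch of the one remaining matrix described above, assuming the 'current'/'next' naming lands as proposed (the exact job layout is illustrative):

```yaml
jobs:
  test-postgres:
    strategy:
      matrix:
        # 'current' resolves to 3.10 and 'next' to 3.11 inside setup-backend,
        # so required-check names survive future python bumps
        python-version: ["current", "next"]
    steps:
      - uses: ./.github/actions/setup-backend/
        with:
          python-version: ${{ matrix.python-version }}
```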

```yaml
shell: bash
run: |
  if [ "${{ inputs.python-version }}" = "current" ]; then
    echo "PYTHON_VERSION=3.10" >> $GITHUB_ENV
  # the 'next' branch is inferred from the thread (next = 3.11); the quoted
  # snippet was truncated here
  elif [ "${{ inputs.python-version }}" = "next" ]; then
    echo "PYTHON_VERSION=3.11" >> $GITHUB_ENV
  fi
```
A Member commented:

No problem for this PR, but I'm thinking at some point we might want to keep our npm/node/python/whatever version numbers in a JSON file at the repo's root, so they can be pulled into scripts, documentation, etc. as needed.
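One way that could look (hypothetical file name and layout; a workflow can't read repo files directly, so a first job would have to surface the values as outputs):

```yaml
# Hypothetical versions.json at the repo root: {"python": "3.10", "node": "18"}
jobs:
  read-versions:
    runs-on: ubuntu-latest
    outputs:
      python: ${{ steps.v.outputs.python }}
    steps:
      - uses: actions/checkout@v4
      - id: v
        run: echo "python=$(jq -r .python versions.json)" >> "$GITHUB_OUTPUT"
  tests:
    needs: read-versions
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ needs.read-versions.outputs.python }}
```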

@mistercrunch (Member, Author) replied:

OMG YES! But GHA doesn't play nicely with that kind of parameter injection. Composition through the "reusable action" approach we have here is the best we can do with basic actions (calling a reusable action that hard-codes the values).

If we wanted more flexibility I'd suggest a Jinja2-based approach, where we'd keep source templates in, say, a .github/jinja folder and render them into .github/ via a commit hook or similar. It seemed like overkill for now, but I'd be supportive of moving in that direction.

Compared to where we were a few months back, we're in a much better place with this setup-backend and setup-supersetbot approach.
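If the Jinja2 route were ever taken, a pre-commit hook could drive the rendering; this is only a sketch (the script name and paths are made up):

```yaml
# Hypothetical .pre-commit-config.yaml entry
repos:
  - repo: local
    hooks:
      - id: render-workflows
        name: Render .github/ workflows from .github/jinja/ templates
        entry: python scripts/render_workflows.py  # hypothetical script
        language: system
        files: ^\.github/jinja/
        pass_filenames: false
```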

```yaml
steps:
  - name: No-op for frontend-build
    run: exit 0
```

A Member commented:

There are some inconsistent breaks between blocks if we want to get all OCD here.

I wonder: would it be a problem to add ALL required checks here? If so, I wonder if we could ingest and iterate through the list from asf.yaml rather than having to maintain this.
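Sketching what ingesting that list could look like, assuming .asf.yaml follows the usual ASF Infra layout for branch protection (the exact key path in this repo may differ):

```yaml
steps:
  - uses: actions/checkout@v4
  - name: List required checks declared in .asf.yaml
    run: |
      # yq v4 syntax; the key path is an assumption about this repo's .asf.yaml
      yq '.github.protected_branches.master.required_status_checks.contexts[]' .asf.yaml
```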

@rusackas (Member) left a review:

Left a couple nits/suggestions, but seems fine to me in general.

@mistercrunch mistercrunch merged commit dea4306 into master Apr 10, 2024
38 checks passed
@mistercrunch mistercrunch deleted the no-op branch April 10, 2024 21:32
mistercrunch added a commit that referenced this pull request Apr 11, 2024
After merging #27906, we shouldn't need this no-op file that was a bit of a hack in the first place. For more information as to why we needed this originally, view the comments/docs in the file I'm deleting in this PR
EnxDev pushed a commit to EnxDev/superset that referenced this pull request Apr 15, 2024
qleroy pushed a commit to qleroy/superset that referenced this pull request Apr 28, 2024
jzhao62 pushed a commit to jzhao62/superset that referenced this pull request May 16, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
@mistercrunch added the 🏷️ bot and 🚢 4.1.0 labels Nov 27, 2024