Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: check potential db migration conflict for open PRs #13498

Merged
merged 7 commits into from
Mar 11, 2021

Conversation

ktmud
Copy link
Member

@ktmud ktmud commented Mar 6, 2021

SUMMARY

Today master was broken for a couple of hours because a merge conflict of db migrations. This can happen when two PRs with db migrations that point to the same revision merge separately right after one another---since we don't rerun CI after merges, it's hard to catch this before merging.

This PR adds a new workflow that checks and notifies the PR author of potential db migration conflicts. It runs when an update of the db migration scripts is pushed to any branch, scans all open PRs that are using said branch as the base branch, and checks whether they also update the migration scripts. If yes, then a comment is sent to remind the PR author to rebase the branch.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TEST PLAN

Tested on my personal fork: ktmud#303

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

```

2. Pick one of them as the parent revision, open the script for the other revision
and update `Revises` and `down_revision` to the new parent revision. E.g.:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also updating the contribution guide on how to avoid adding a new empty db migration merge file.

@@ -0,0 +1,59 @@
name: Check DB migration conflict
on:
push:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this run only on pushes to master or is the intention to use this for release branches also?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking of feature branches, but this could be useful to release branches, too. We could potentially do some name matching here but since this workflow is relatively simple and fast, I think running on all branches is safer.

- name: Check and notify
uses: actions/github-script@v3
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure secrets will be available on push events from forked repositories.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think push events only triggers for the base repo. Forked repos will run the push event with their own secret, which does not have access to the base repo---in this case, the forked repo is considered its own base repo.

Either way, it shouldn't matter because we only want to run the workflow in the base branch anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GITHUB_TOKEN provided to forks is read-only. I believe a write token is required to publish a comment on a PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is, when this workflow is run in a fork's push event, it runs in the forked repo itself. It will check PRs within the fork. So it will be write token, but it's only writable to the PRs in the fork repo---which is why it worked in my test PR.

When this is merged into the base repo, then the push event will run on both the base repo and forks. When it's running on the base repo's push event, which is triggered only on PR merge and direct push, it will use a GITHUB_TOKEN that has write access to the base repo, because the code is considered safe since they are in the repo already.

Does this make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the confusion was related to push events running on PRs from forked repos, but looking at recent action runs, this doesn't appear to be the case. The GitHub Actions docs leave something to be desired.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of a check for migration conflicts. I was wondering if instead of posting a comment, we could just make a workflow that retriggers a workflow in all open PRs, and then fails if the head is different on the PR vs the branch it is opened against? Could of course be expensive to perform on all open PRs, but if we could check early on if there are migrations or not, we could avoid the expensive steps.


```bash
superset db merge {HASH1} {HASH2}
```

1. Upgrade the DB to the new checkpoint
3. Upgrade the DB to the new checkpoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you should be able to always use number 1. and it will automatically increment it on the next bullet (avoids the need for updating all bullets when inserting/deleting bullets)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am aware of this markdown feature, just thought it reads nicer in the source code. The numbers doesn't have to be always 1 either, so one can also just choose not to update them while inserting/deleting bullets. I happen to think it's not a big deal to update it here.

E.g. this also works:

1.
2.
2.
5.

@ktmud
Copy link
Member Author

ktmud commented Mar 10, 2021

I like the idea of a check for migration conflicts. I was wondering if instead of posting a comment, we could just make a workflow that retriggers a workflow in all open PRs, and then fails if the head is different on the PR vs the branch it is opened against?

Sometimes updates in the migrations folder don't always create a new version (e.g. fixing a bug in an existing migration), so we'd also need to check for new files if we are to do this.

I also don't really know a clean way to re-trigger workflows from within a workflow that is guaranteed to work so was worried that even if we can manage to pull it off, it may require much more testing, which I'm not sure is gonna worth it.

After some digging, it seems Github actually does have a limitation on this:

When you use the repository's GITHUB_TOKEN to perform tasks on behalf of the GitHub Actions app, events triggered by the GITHUB_TOKEN will not create a new workflow run. This prevents you from accidentally creating recursive workflow runs.

@ktmud ktmud merged commit 5fca19d into apache:master Mar 11, 2021
@ktmud ktmud deleted the check-db-migration-conflict branch March 11, 2021 00:02
allanco91 pushed a commit to allanco91/superset that referenced this pull request May 21, 2021
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.2.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/M 🚢 1.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants