-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TriblerDatabaseMigrationChain
#7622
Conversation
b098927
to
ae3e20c
Compare
ae3e20c
to
ef744cb
Compare
f"Function {func.__name__} is not executed because DB version is not equal to {target_version}. " | ||
f"The current db version is {db.version}" | ||
) | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks dangerous to silently skip the migration in case of a version mismatch and can lead to incorrect migrations.
As a possible example:
- We have migration chain
[a, b, c]
, and after the migrationc
, the database has version 3 - Two developers make two independent PRs with migrations
d
ande
, both require db version 3 - After both PRs were successfully merged, the merge conflict in the migration chain was resolved to
[a, b, c, d, e]
, but bothd
ande
still mistakenly require version 3 of the database. - The execution of the migration chain silently skips the migration
e
For that reason, it is better to raise an error instead of skipping the migration so the database will not be left in a surprising state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were to raise an exception as you suggested, then almost all runs would result in unsuccessful execution.
Example:
- We have a migration chain [a, b, c] with the corresponding versions [1, 2, 3]. Let's represent them as [a(1), b(2), c(3)].
- Our DB currently has a version set to 2.
- We begin executing the migration chain starting with migration a(1).
- This results in an exception because the DB version is not equal to 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can first select suitable migrations, as described in the previous comment, and execute only them. This way, the migration decorator can throw an exception if the migration was performed at an improper moment. It looks safer than silently skipping migrations with incorrect db version numbers.
Also, consider the migrations [a, b, c, d, e] that mistakenly expect the schema version [1, 3, 2, 4, 5]. With silent skipping of the incorrect migrations, only migrations 1 and 2 will be executed. It is better to prevent such errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm not sure that it is a good idea to automatically fix potential merge conflict errors.
Also, I'm not sure that the Migration Chain should have a monopoly on changing the version value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm not sure that it is a good idea to automatically fix potential merge conflict errors.
I agree, and what I like in the current approach is that the merge conflicts for migrations always require explicit conflict resolution, as migrations are explicitly listed in the same list.
What I'm concerned about is that the current PR approach allows silent skipping of migration if the schema number specified in the decorator does not agree with the order of migrations in the explicit list. Raising an error looks like a safer approach.
|
||
return True | ||
|
||
def steps(self) -> Iterator: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the terminology: I'd say we have a migration chain that consists of migrations, and a single migration can possibly consist of several steps. Here, in the for
loop, we iterate over migrations and not over the steps of an individual migration. So, it is better to rename it from steps
to migrations
to avoid confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a property named migrations
that stores all the migrations for this chain.
def steps(self)
represents the execution steps for the migration chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, it is pretty common to say that an individual migration may consist of several steps. For that reason, using the word "steps" when referring to migrations looks confusing. Maybe we can rename steps
to something like find_suitable_migrations(current_schema_version)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you prove it?
it is pretty common to say that an individual migration may consist of several steps.
logger = logging.getLogger('Migration (TriblerDB)') | ||
|
||
|
||
def migration(execute_only_if_version: int): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: It looks correct, but it may be more convenient to specify the resulting version after the migration and assume the schema version before the migration should be one less. In that case, @migration(schema_version=10)
would mean that the version after the migration is executed successfully should be 10, and the version before the migration should be 9. In that case, it will be a bit more obvious that after applying @migration(schema_version=10)
, the resulting schema has version 10. But it is a matter of personal preferences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is "a matter of personal preferences". Let's ask @xoriole for a second opinion.
A single place where the new format is documented would be great. Now lots of stuff is scattered across multiple issues. Fine for now, but please don't invest too much time into a generic migration tool with sub-migrations and concurrent developer conflict resolution fanciness. Please give me a new stable Tribler release in 2 weeks, with the merged Randevous certificates!
We only have 1 responsible developer for tribler.db migration, so this is not a scenario we need right? |
@synctext, the PR has been pending for about a week. I haven't received any feedback from the development team. But before I can merge, I need approval. In my view, the PR was already in good shape.
It's not the actual scenario. Any team member can write migrations, as has been done in the past. Unfortunately, there's no dedicated role as 'the developer responsible for tribler.db migration'. But regarding the
I agree that it's a low-probability scenario that we shouldn't focus on. |
But it was marked as ready for review just today. So, it looks reasonable that after the PR is ready for review, it can receive some review comments and not be accepted immediately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After additional discussion with @drew2a, we decided that we can handle version conflicts between different migrations in the future, and it is not necessary to handle it right now, as the chance of this error is low.
33f163e
to
3bda1f7
Compare
This PR introduces a more strict and formal way of DB migrations.
The main idea behind this change is to standardize the migration procedure into a series of functions that resemble:
first migration
second migration
This PR should fix some of the problems described in #7382
Related to #7398