
Current upgrader's implementation is a time bomb #7382

Closed
drew2a opened this issue Apr 21, 2023 · 1 comment
drew2a (Contributor) commented Apr 21, 2023

The current upgrade procedure works as follows.

No matter what, it runs all available migrations on every Tribler start:

TriblerUpgrader(377) - Upgrading GigaChannel DB from version 8 to 10
TriblerUpgrader(197) - Upgrade Pony DB 10 to 11
tribler.core.upgrade.config_converter(12) - Upgrade config to 7.6
TriblerUpgrader(213) - Upgrade bandwidth accounting DB 8 to 9
TriblerUpgrader(181) - Upgrade Pony DB 11 to 12
TriblerUpgrader(166) - Upgrade Pony DB 12 to 13
TriblerUpgrader(146) - Upgrade Pony DB from version 13 to version 14
TriblerUpgrader(130) - Upgrade tags to knowledge
TriblerUpgrader(109) - Remove old logs
TriblerUpgrader(135) - Upgrade Pony DB from version 14 to version 15

This approach is historically established. We, as developers, keep following it in current Tribler development and just add more and more migrations on top of the existing ones.

However, this approach suffers from at least two problems. The first one is easier to understand and avoid, so I will start with it.

The first problem: a developer has to be very careful with the guard conditions inside a migration.

There are no guidelines, or even tips and hints, on how a migration should be written. From the code's perspective, it is just a function without any prerequisites.

Here is an example of the two very first migrations:

class TriblerUpgrader:
    ...
    def run(self):
        self.upgrade_pony_db_8to10()
        self.upgrade_pony_db_10to11()

    def upgrade_pony_db_8to10(self):
        database_path = self.state_dir / STATEDIR_DB_DIR / 'metadata.db'
        if not database_path.exists() or get_db_version(database_path) >= 10:
            return
        ... # code of the migration

    def upgrade_pony_db_10to11(self):
        database_path = self.state_dir / STATEDIR_DB_DIR / 'metadata.db'
        if not database_path.exists():
            return
        ... # code of the migration

From the code above, the conditions for executing each migration are:

upgrade_pony_db_8to10:

        if not database_path.exists() or get_db_version(database_path) >= 10:
            return

upgrade_pony_db_10to11:

        database_path = self.state_dir / STATEDIR_DB_DIR / 'metadata.db'
        if not database_path.exists():
            return

Since these migrations execute on every Tribler start, false positives may occur if the conditions are not chosen carefully.

For example, the guard of upgrade_pony_db_10to11 only checks that metadata.db exists and misses any version check, so its body runs on every Tribler start where the database is present.
This can lead to unexpected consequences, because a programmer writing a migration usually assumes it will run only once, and only while upgrading the application.

With the current upgrade approach, this problem can only be mitigated by being extremely careful when choosing the conditions for each migration. And it is very easy to make a mistake.

Since developers are humans, they will make this mistake sooner or later.
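
For illustration, here is a minimal sketch of a stricter guard, assuming a helper that reads SQLite's user_version pragma and a hypothetical state-dir layout (the real code uses get_db_version and STATEDIR_DB_DIR; the names below are illustrative):

import sqlite3
from pathlib import Path

def get_user_version(db_path: Path) -> int:
    # Read the schema version stored inside the database itself.
    with sqlite3.connect(db_path) as connection:
        return connection.execute("PRAGMA user_version").fetchone()[0]

def upgrade_pony_db_10to11(state_dir: Path):
    database_path = state_dir / "sqlite" / "metadata.db"
    # Run only when the database exists AND is exactly at the version this
    # migration upgrades from. A guard that only checks existence makes the
    # body run on every start where metadata.db is present, whatever its version.
    if not database_path.exists() or get_user_version(database_path) != 10:
        return
    ...  # code of the migration

With such a guard the migration behaves like a one-shot step even though the upgrader calls it on every start.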

The second problem: a migration uses models that may be inconsistent with their previous versions.

This problem is harder to understand and avoid.

Some migrations use models like:

  • MetadataStore
  • TriblerConfig
  • BandwidthDatabase
    ...
  • TagDatabase

These models constantly change over time: new fields are added, new __init__ logic is introduced, default values are changed, and so on.

When writing a migration, the developer has in mind the model as it exists in the code at that point in time, and at that point the migration may work perfectly. But after some time, let's say a year, another developer may change the model. That change also implicitly changes the behaviour of the migration, which is not obvious at all. It may not be detected by tests, but later on it can lead to strange errors.
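
To make the failure mode concrete, here is a deliberately simplified, hypothetical sketch (the class and field names are illustrative, not Tribler's real model):

from dataclasses import dataclass

@dataclass
class MetadataEntry:            # the model as it looks in the code today (2021)
    infohash: bytes
    title: str                  # 2021: renamed from `name`

def upgrade_10to11(old_rows):
    # Written in 2020, when the field was still called `name`.
    # After the 2021 rename this constructor call raises TypeError at runtime,
    # yet no test around the model change points at this function.
    return [MetadataEntry(infohash=row[0], name=row[1]) for row in old_rows]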

Another example (MetadataStore).

Imagine we have three versions of MetadataStore that appeared during development:

  1. MetadataStore 2019 (v1)
  2. MetadataStore 2020 (v2)
  3. MetadataStore 2021 (v3)

For these changes, two migrations were written:

  1. MetadataStoreMigration 2020 (v1->v2)
  2. MetadataStoreMigration 2021 (v2->v3)

As migrations stay in place in the Tribler code, they all refer to MetadataStore as the model to which the migration should be applied. With the naive approach, the first migration would most likely be broken by the year 2021, because the old version of the model (v1) no longer exists. That's why Tribler developers use pure SQL for DB migrations.
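
For comparison, a pure-SQL migration in this style could look roughly like the sketch below (the table and column names are illustrative, not Tribler's actual schema); it talks to SQLite directly, so it stays valid even if the Python model is renamed, reshaped, or deleted:

import sqlite3
from pathlib import Path

def upgrade_metadata_db_13to14(db_path: Path):
    with sqlite3.connect(db_path) as connection:
        version = connection.execute("PRAGMA user_version").fetchone()[0]
        if version != 13:
            return
        # Plain SQL: no dependency on the current MetadataStore Python model.
        connection.execute("ALTER TABLE ChannelNode ADD COLUMN example_flag INTEGER DEFAULT 0")
        connection.execute("PRAGMA user_version = 14")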

But this approach (semi-pure SQL migrations, as sketched above) will not save us from the inconsistency of other models, as we keep using Python code as the migration scenario.

Another example (BandwidthDatabase).

Let's say we remove BandwidthDatabase in the next release. What should we do with all the migrations that use the BandwidthDatabase model? The most straightforward approach would be to remove them too, but that leads to two consequences:

  1. We reduce the number of Tribler versions from which you can upgrade to the current one.
  2. We can break the chain of upgrades if the BandwidthDatabase upgrade sits somewhere in the middle (1->2->BDUpgrade->3->4), as the toy sketch below illustrates.

The answer to this question will have to be found by Tribler developers in the future.
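
A toy sketch of that chain problem, with hypothetical version numbers and step names: the upgrader walks one-step migrations until it reaches the current version, so deleting the middle step strands every user whose state dir is still at the version just before it.

def upgrade_1_to_2(): ...
def upgrade_2_to_3(): ...   # the step that uses BandwidthDatabase
def upgrade_3_to_4(): ...

UPGRADE_CHAIN = {1: upgrade_1_to_2, 2: upgrade_2_to_3, 3: upgrade_3_to_4}

def upgrade(state_version: int, target: int = 4):
    while state_version < target:
        step = UPGRADE_CHAIN.get(state_version)
        if step is None:
            # If upgrade_2_to_3 is deleted together with BandwidthDatabase,
            # a user whose state dir is still at version 2 has no path to 4.
            raise RuntimeError(f"no upgrade path from version {state_version}")
        step()
        state_version += 1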

Conclusion

This issue describes the two problems that are most obvious to me and that can lead to bugs.
I haven't touched here on how convenient it is to write migrations (it is not convenient).

The risks of the first problem can be managed by paying more attention when writing migrations, but the risks of the second problem cannot be managed at all with the current approach.

So, with the second problem, it is just a matter of time before a single programmer's action, not directly related to migrations, breaks them.

We should discuss these problems and find a solution, as each new release increases the probability that the errors described above will occur.

Migrations (and the problems around them) are very well known in the wider developer world. We can simply adopt the best practices that already work in hundreds of other projects instead of reinventing the wheel.
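
As one illustration of such a practice (a minimal sketch, not a proposal for Tribler's actual schema; tools like Alembic or Django migrations implement the same idea with more machinery): keep an ordered list of self-contained migration steps, record the applied version inside the database, and run each step exactly once.

import sqlite3
from pathlib import Path

# Ordered, self-contained steps; the SQL here is illustrative only.
MIGRATIONS = [
    (1, "CREATE TABLE IF NOT EXISTS example (id INTEGER PRIMARY KEY)"),
    (2, "ALTER TABLE example ADD COLUMN created_at INTEGER DEFAULT 0"),
]

def migrate(db_path: Path):
    with sqlite3.connect(db_path) as connection:
        current = connection.execute("PRAGMA user_version").fetchone()[0]
        for target_version, sql in MIGRATIONS:
            if target_version <= current:
                continue                      # already applied, never re-run
            connection.execute(sql)
            connection.execute(f"PRAGMA user_version = {target_version}")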

qstokkink (Contributor) commented:
Migrations have now been removed.
