Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always run pulpcore-manager migrate #351

Merged
merged 2 commits into from
Aug 14, 2024
Merged

Always run pulpcore-manager migrate #351

merged 2 commits into from
Aug 14, 2024

Conversation

evgeni
Copy link
Member

@evgeni evgeni commented Aug 13, 2024

Pulpcore uses post migration hooks to create data in the database, which
sometimes changes even if there are no real migrations to be run.

This is fine from a Django PoV -- post migration hooks run after every
migrate call, even if there have been no changes.

However, this means that we need to run migrate unconditionally as
otherwise we might miss changes to the hooks (or their data) when there
is no migration attached.

An example of such a change is
pulp/pulpcore@20cffca
where a new role was introduced and our setup started failing as it was
expecting the role to exist after the upgrade, but it did not as we did
not call migrate.

Pulpcore uses post migration hooks to create data in the database, which
sometimes changes even if there are no real migrations to be run.

This is fine from a Django PoV -- post migration hooks run after every
`migrate` call, even if there have been no changes.

However, this means that we need to run `migrate` unconditionally as
otherwise we might miss changes to the hooks (or their data) when there
is no migration attached.

An example of such a change is
pulp/pulpcore@20cffca
where a new role was introduced and our setup started failing as it was
expecting the role to exist after the upgrade, but it did not as we did
not call `migrate`.
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes our module non-idempotent so I'm not sure what to do here. It feels to me that they broke --check and that would be the correct fix.

It reminds me of Foreman's db:seed, which is also ugly.

@evgeni
Copy link
Member Author

evgeni commented Aug 13, 2024

This makes our module non-idempotent so I'm not sure what to do here. It feels to me that they broke --check and that would be the correct fix.

Technically correct. But p-m migrate is already idempotent in itself, so there is not much winning here to double it via Puppet.

It reminds me of Foreman's db:seed, which is also ugly.

Yeah.

@ekohl
Copy link
Member

ekohl commented Aug 13, 2024

From the Puppet perspective it looks really hard to make it idempotent so I wanted to understand what was going on.

This is the code that actually triggers the population:
https://github.com/pulp/pulpcore/blob/fa1931d780d9b1b1e409e387add38e42d8ec28b8/pulpcore/app/apps.py#L122-L136

Reading the post_migrate documentation there's an interesting note:

Sent at the end of the migrate (even if no migrations are run) and flush commands. It’s not emitted for applications that lack a models module.

Another thing that's interesting:

sender
An AppConfig instance for the application that was just installed.

Or is that the AppConfig for the file it's imported from (in this case PulpAppConfig)? It can't be the various other apps.

I had always assumed people in Django used data migrations to ensure records were present. At this point I'm not sure how to best proceed. I'm a bit tempted to at least implement a parameter to toggle the behavior so users can make it idempotent.

@evgeni
Copy link
Member Author

evgeni commented Aug 14, 2024

I had always assumed people in Django used data migrations to ensure records were present.

For a general "in Django", this is correct: https://docs.djangoproject.com/en/5.0/topics/migrations/#data-migrations

However, Pulp does not allow backports of migrations (pulp/pulpcore#952) and in the particular case of data migrations flush kills the data but does not re-apply the migration afterwards (as it's marked as run): https://pulp.plan.io/issues/7710

At this point I'm not sure how to best proceed. I'm a bit tempted to at least implement a parameter to toggle the behavior so users can make it idempotent.

You mean a pulpcore::database::always_run_migrations that is true by default but CI and some users can set to false?

Edit: mhh, pulpcore::database is marked as private API, should this boolean be pulpcore::always_run_database_migrations? Or should we just not care to make this officially public?

@evgeni evgeni force-pushed the always-migrate branch 2 times, most recently from 037ff43 to 7fdb9d5 Compare August 14, 2024 09:18
@evgeni
Copy link
Member Author

evgeni commented Aug 14, 2024

🍏

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some further consideration: making it non-idempotent will always restart Pulp services, meaning the module becomes really unreliable for people who use them as Puppet was intended (constantly running).

So I see 3 options:

  1. Default the new option to false here and set it to true in the installer
  2. Keep the current code but simply run the migration as a hook in the installer (I'd assume roles can be created after the application is already running.)
  3. Somehow write code that makes --check work again

I think now I'm leaning to option 1 and investigate option 3. Looking at the code that didn't look like an easy option so it's certainly not a short term solution. It also makes me think Pulp should list it as a known issue.

manifests/database.pp Show resolved Hide resolved
@evgeni
Copy link
Member Author

evgeni commented Aug 14, 2024

Some further consideration: making it non-idempotent will always restart Pulp services, meaning the module becomes really unreliable for people who use them as Puppet was intended (constantly running).

What makes you think it will restart pulp services? I see nothing that would notify the service objects after a migration has happened?

@ekohl ekohl added the Bug Something isn't working label Aug 14, 2024
@@ -2,6 +2,7 @@
# @api private
class pulpcore::database (
Integer[0] $timeout = 3600,
Boolean $always_run_migrations = true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[question] should this default to false (idempotent by default for people using the puppet module directly), with hiera in foreman-installer providing a true value for our purposes?

as an added benefit, the acceptance test here wouldn't rely on a non-default value... but is there a major downside that I'm overlooking?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's option 1 from #351 (review) and yes we can certainly make that change.

The thing is, that will make the default installation by definition broken.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I missed that review comment... thanks!

The thing is, that will make the default installation by definition broken.

since this is the first time the issue has occurred, it seems reasonable to me to assume that the types of changes requiring a --migrate run despite no pending migrations would be infrequent.

it also occurs to me now that the pulp folks could accompany such future changes with an empty dummy migration if we ask nicely... in theory, I think we could even submit such a patch to pulp and then simply revert this one here -- what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah nevermind my idea about dummy migrations, I see the comment about pulp/pulpcore#952 now

Copy link
Member Author

@evgeni evgeni Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is the first time the issue has occurred, it seems reasonable to me to assume that the types of changes requiring a --migrate run despite no pending migrations would be infrequent.

While I agree such changes are probably infrequent, those are exactly the things you want to have smoothly as otherwise you'll forget or never learn about the special cases before things are broken ;-)

In this particular case we only caught it because the upgrade happened between 3.49.16 and 3.49.17. If it'd be a 3.39.z to 3.49.z, there would have been migrations and it would have went unnoticed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if Pulp could somehow extend https://github.com/django/django/blob/main/django/core/management/commands/migrate.py to just always return true on --check instead of lying. But it does not look easily extensible. Hooking into the planner is also something not for the faint of heart.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it make it non-idempotent again (for Puppet) if --check always returns true?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would need to verify if all roles that need to exist actually exist

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, interesting you're right that we don't have a specific notify:

Anchor <| title == 'pulpcore::repo' |> ~> Class['pulpcore::install']
Class['pulpcore::install'] ~> Class['pulpcore::config', 'pulpcore::database', 'pulpcore::service']
Class['pulpcore::config'] ~> Class['pulpcore::database', 'pulpcore::static', 'pulpcore::service']
Class['pulpcore::database', 'pulpcore::static'] -> Class['pulpcore::service'] -> Class['pulpcore::apache']
# lint:ignore:spaceship_operator_without_tag
Class['pulpcore::install']
~> Pulpcore::Plugin <| |>
~> Class['pulpcore::database', 'pulpcore::service']
# lint:endignore

I think I designed it that way because whenever there was a migration, the service restart was already triggered by the other notifications.

I'm not happy with this, but let's try this out.

@ekohl ekohl merged commit 0004d05 into master Aug 14, 2024
17 checks passed
@ekohl ekohl deleted the always-migrate branch August 14, 2024 15:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants