Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(migrations): apply command #7099

Merged
merged 9 commits into from
Mar 2, 2021
Merged

Conversation

jzaralim
Copy link
Contributor

@jzaralim jzaralim commented Feb 25, 2021

Description

Migrations tool apply command. Follows the algorithm outlined here. Also includes helper functions for writing metadata, loading all migration files and extracting the name of the migration from the file name, and introduces a new class, Migration

Testing done

unit and integration tests

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@jzaralim jzaralim marked this pull request as ready for review February 26, 2021 20:43
@jzaralim jzaralim requested a review from a team as a code owner February 26, 2021 20:43
Copy link
Contributor

@vcrfxia vcrfxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jzaralim this is really great! A few concerns and the usual boatload of nits. A lot of my comments are around supporting non-consecutive migration versions and (in the future) out-of-order migrations. Feel free to defer any of the comments related specifically to out-of-order migrations, though I do think we should support non-consecutive version numbers out of the box.

this.version = Integer.toString(version);
this.name = Objects.requireNonNull(name);
this.checksum = Objects.requireNonNull(checksum);
this.previous = version == 1 ? MetadataUtil.NONE_VERSION : Integer.toString(version - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is incorrect. Migration numbers don't have to be consecutive. "Previous" refers to the previous version with state MIGRATED. We shouldn't try to find this at the time of loading the files, instead we should let the "apply" algorithm figure this out (via MetadataUtil.getLatestMigratedVersion()).

final String migrationsDir
) {
int end = start;
while (getFilePathForVersion(Integer.toString(end), migrationsDir).isPresent()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, this logic doesn't work since version numbers don't need to be consecutive. Instead we should open all files in the directory, check if they match the expected name format (log a warning if not), and then apply a filter for the versions we care about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, getFilePathForVersion() already loads all the files, so we'd rather not call this helper multiple times.

return 1;
}

if (ValidateMigrationsCommand.validate(config, migrationsDir, ksqlClient)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's add log message saying that we're validating the currently applied migrations as part of preparing to applying new migrations, else users might get confused since the logger for any potential validation errors is the ValidateMigrationCommand logger rather than the ApplyMigrationCommand logger.

LOGGER.info("Loading migration files");

if (next) {
migrations = getAllMigrations(start, start, migrationsDir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, this logic doesn't quite work since the next version does not have to be consecutive. The "next" version can use the same logic as "all" to find future migration files (from the latest applied version) but should only apply the first version found.

We might also need to check whether the version(s) to be applied have already been applied or not. It's possible that we went "backwards" in version due to an out of order migration. We should check with @colinhicks what the expected user experience is here. Specifically, suppose a user applies migrations with versions 1, 2, and 6, then applies version 3 out of order. If the user then tries to apply the "next" migration, do we apply the migration with version 4 or 7?

Copy link
Contributor Author

@jzaralim jzaralim Feb 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design doc suggests this option:

  --out-of-order     Allow out of order migrations. Only applicable for `apply`. This does not affect the behavior of `next` but will affect `all` and `until`.``

So if we follow this, then next doesn't support out of order migrations and would always apply 7.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still ambiguous. For example, if you apply 1 then 2 then 6 then 3, does "CURRENT" now point to 6 or 3? I think it should point to 3, and that the "previous" state linked to from 3 should be 6. If that's the case then "next" from the "CURRENT" of 3 is still 4.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @vcrfxia above.

Specifically, suppose a user applies migrations with versions 1, 2, and 6, then applies version 3 out of order. If the user then tries to apply the "next" migration, do we apply the migration with version 4 or 7?

As a user, I would expect version 4.

I think it's still ambiguous. For example, if you apply 1 then 2 then 6 then 3, does "CURRENT" now point to 6 or 3? I think it should point to 3, and that the "previous" state linked to from 3 should be 6. If that's the case then "next" from the "CURRENT" of 3 is still 4.

This also sounds good to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colinhicks Won't that behavior cause confusions with users? If I migrated to 6, and for some reason I applied an out-of-order migration 3, why would we continue from 4 after that? Btw, I don't understand out-of-order migrations yet, but if users use it, then those should be in special cases, i.e. to apply a migration that they forgot (oops!). I would expect the tool continues applying migrations from the highest version migrated, in this case next one should be 7.

Btw, do we need to support out-of-order migrations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spena can you clarify the line here?

--out-of-order Allow out of order migrations. Only applicable for apply. This does not affect the behavior of next but will affect all and until.``

I think I might be misinterpreting it. Are we actually saying that --out-of-order handles the case where the current migration is 6 and a new migration, 4, was added late, after we had applied 1, 2, 3, 5, and 6?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we actually saying that --out-of-order handles the case where the current migration is 6 and a new migration, 4, was added late, after we had applied 1, 2, 3, 5, and 6?

I added that line to the KLIP. My intent in saying "This does not affect the behavior of next" was that if the current migration is 6, then next will not apply 4 even if --out-of-order is set. Upon thinking about it more, I think we should either disallow setting --out-of-order for the next mode (to avoid confusion) or allow it with the semantics of finding the first, unapplied migration, which may be before the "current" version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, do we need to support out-of-order migrations?

We've already decided to not support this for the first version of the migrations tool, so I assume you're asking about the long-term plan instead? I thought it was a useful feature to have in the long run but maybe not if it's generating so much confusion. I'll let @colinhicks make the call here.

A related question is whether we want to allow non-consecutive migration versions (e.g., as a user can I apply version 1 and then version 3 without applying version 2). Doing so without supporting out of order migrations means I'll never be able to apply version 2 (on this cluster). This may still be valuable to users who, for example, start using the migrations tool from a non-clean slate and create migration 1 to capture the current state but cannot apply it (because those streams and tables already exist), and want to start from migration 2. Does this make sense, or would supporting non-consecutive migrations introduce more confusion than it's worth, if we don't also support out of order migrations (at least in v1 of the migrations tool)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.

I think we should either disallow setting --out-of-order for the next mode (to avoid confusion) or allow it with the semantics of finding the first, unapplied migration

Based on the conversation above, my vote is to disallow setting it for next in order to avoid confusion :). How do others feel about this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think we want to keep the option open to support other out-of-order migration scenarios in future versions. Ideally we can be informed by user feedback when considering cases.

.filter(s -> s.length() > 1)
.collect(Collectors.toList());
for (final String command : commands) {
ksqlClient.executeStatement(command + ";").get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work for INSERT VALUES statements. We can handle this case in a follow-up PR, but INSERT VALUES statements need to be submitted via the insertInto() method rather than executeStatement().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me that translating INSERT VALUES statements into insertInto() calls is actually really annoying. Let me think some more about the best way to handle this. We might want to make another minor update to the client.

return false;
} else {
migrations = getAllMigrations(start, migrationsDir);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add another variant of the "apply" command that just applies a single, specific version. (We can use the "-v" flag.) Let's do this in a follow-up PR (I can take it up if you're tired of working on "apply" haha).

Copy link
Contributor

@vcrfxia vcrfxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jzaralim ! A few minor things inline. LGTM otherwise!

@jzaralim jzaralim merged commit a75c355 into confluentinc:master Mar 2, 2021
@jzaralim jzaralim deleted the metadata-util branch March 2, 2021 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants