-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(migrations): apply command #7099
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jzaralim this is really great! A few concerns and the usual boatload of nits. A lot of my comments are around supporting non-consecutive migration versions and (in the future) out-of-order migrations. Feel free to defer any of the comments related specifically to out-of-order migrations, though I do think we should support non-consecutive version numbers out of the box.
this.version = Integer.toString(version); | ||
this.name = Objects.requireNonNull(name); | ||
this.checksum = Objects.requireNonNull(checksum); | ||
this.previous = version == 1 ? MetadataUtil.NONE_VERSION : Integer.toString(version - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is incorrect. Migration numbers don't have to be consecutive. "Previous" refers to the previous version with state MIGRATED. We shouldn't try to find this at the time of loading the files, instead we should let the "apply" algorithm figure this out (via MetadataUtil.getLatestMigratedVersion()
).
final String migrationsDir | ||
) { | ||
int end = start; | ||
while (getFilePathForVersion(Integer.toString(end), migrationsDir).isPresent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this logic doesn't work since version numbers don't need to be consecutive. Instead we should open all files in the directory, check if they match the expected name format (log a warning if not), and then apply a filter for the versions we care about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, getFilePathForVersion()
already loads all the files, so we'd rather not call this helper multiple times.
return 1; | ||
} | ||
|
||
if (ValidateMigrationsCommand.validate(config, migrationsDir, ksqlClient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's add log message saying that we're validating the currently applied migrations as part of preparing to applying new migrations, else users might get confused since the logger for any potential validation errors is the ValidateMigrationCommand logger rather than the ApplyMigrationCommand logger.
LOGGER.info("Loading migration files"); | ||
|
||
if (next) { | ||
migrations = getAllMigrations(start, start, migrationsDir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this logic doesn't quite work since the next version does not have to be consecutive. The "next" version can use the same logic as "all" to find future migration files (from the latest applied version) but should only apply the first version found.
We might also need to check whether the version(s) to be applied have already been applied or not. It's possible that we went "backwards" in version due to an out of order migration. We should check with @colinhicks what the expected user experience is here. Specifically, suppose a user applies migrations with versions 1, 2, and 6, then applies version 3 out of order. If the user then tries to apply the "next" migration, do we apply the migration with version 4 or 7?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design doc suggests this option:
--out-of-order Allow out of order migrations. Only applicable for `apply`. This does not affect the behavior of `next` but will affect `all` and `until`.``
So if we follow this, then next
doesn't support out of order migrations and would always apply 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's still ambiguous. For example, if you apply 1 then 2 then 6 then 3, does "CURRENT" now point to 6 or 3? I think it should point to 3, and that the "previous" state linked to from 3 should be 6. If that's the case then "next" from the "CURRENT" of 3 is still 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @vcrfxia above.
Specifically, suppose a user applies migrations with versions 1, 2, and 6, then applies version 3 out of order. If the user then tries to apply the "next" migration, do we apply the migration with version 4 or 7?
As a user, I would expect version 4.
I think it's still ambiguous. For example, if you apply 1 then 2 then 6 then 3, does "CURRENT" now point to 6 or 3? I think it should point to 3, and that the "previous" state linked to from 3 should be 6. If that's the case then "next" from the "CURRENT" of 3 is still 4.
This also sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@colinhicks Won't that behavior cause confusions with users? If I migrated to 6, and for some reason I applied an out-of-order migration 3, why would we continue from 4 after that? Btw, I don't understand out-of-order migrations yet, but if users use it, then those should be in special cases, i.e. to apply a migration that they forgot (oops!). I would expect the tool continues applying migrations from the highest version migrated, in this case next one should be 7.
Btw, do we need to support out-of-order migrations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spena can you clarify the line here?
--out-of-order Allow out of order migrations. Only applicable for
apply
. This does not affect the behavior ofnext
but will affectall
anduntil
.``
I think I might be misinterpreting it. Are we actually saying that --out-of-order
handles the case where the current migration is 6 and a new migration, 4, was added late, after we had applied 1, 2, 3, 5, and 6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we actually saying that --out-of-order handles the case where the current migration is 6 and a new migration, 4, was added late, after we had applied 1, 2, 3, 5, and 6?
I added that line to the KLIP. My intent in saying "This does not affect the behavior of next" was that if the current migration is 6, then next
will not apply 4 even if --out-of-order
is set. Upon thinking about it more, I think we should either disallow setting --out-of-order
for the next
mode (to avoid confusion) or allow it with the semantics of finding the first, unapplied migration, which may be before the "current" version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, do we need to support out-of-order migrations?
We've already decided to not support this for the first version of the migrations tool, so I assume you're asking about the long-term plan instead? I thought it was a useful feature to have in the long run but maybe not if it's generating so much confusion. I'll let @colinhicks make the call here.
A related question is whether we want to allow non-consecutive migration versions (e.g., as a user can I apply version 1 and then version 3 without applying version 2). Doing so without supporting out of order migrations means I'll never be able to apply version 2 (on this cluster). This may still be valuable to users who, for example, start using the migrations tool from a non-clean slate and create migration 1 to capture the current state but cannot apply it (because those streams and tables already exist), and want to start from migration 2. Does this make sense, or would supporting non-consecutive migrations introduce more confusion than it's worth, if we don't also support out of order migrations (at least in v1 of the migrations tool)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
I think we should either disallow setting --out-of-order for the next mode (to avoid confusion) or allow it with the semantics of finding the first, unapplied migration
Based on the conversation above, my vote is to disallow setting it for next
in order to avoid confusion :). How do others feel about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we want to keep the option open to support other out-of-order migration scenarios in future versions. Ideally we can be informed by user feedback when considering cases.
.filter(s -> s.length() > 1) | ||
.collect(Collectors.toList()); | ||
for (final String command : commands) { | ||
ksqlClient.executeStatement(command + ";").get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work for INSERT VALUES statements. We can handle this case in a follow-up PR, but INSERT VALUES statements need to be submitted via the insertInto()
method rather than executeStatement()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just occurred to me that translating INSERT VALUES
statements into insertInto()
calls is actually really annoying. Let me think some more about the best way to handle this. We might want to make another minor update to the client.
return false; | ||
} else { | ||
migrations = getAllMigrations(start, migrationsDir); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add another variant of the "apply" command that just applies a single, specific version. (We can use the "-v" flag.) Let's do this in a follow-up PR (I can take it up if you're tired of working on "apply" haha).
...ols/src/test/java/io/confluent/ksql/tools/migrations/commands/ApplyMigrationCommandTest.java
Outdated
Show resolved
Hide resolved
...ols/src/test/java/io/confluent/ksql/tools/migrations/commands/ApplyMigrationCommandTest.java
Show resolved
Hide resolved
...ols/src/test/java/io/confluent/ksql/tools/migrations/commands/ApplyMigrationCommandTest.java
Outdated
Show resolved
Hide resolved
...ols/src/test/java/io/confluent/ksql/tools/migrations/commands/ApplyMigrationCommandTest.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jzaralim ! A few minor things inline. LGTM otherwise!
Description
Migrations tool
apply
command. Follows the algorithm outlined here. Also includes helper functions for writing metadata, loading all migration files and extracting the name of the migration from the file name, and introduces a new class,Migration
Testing done
unit and integration tests
Reviewer checklist