Rewrite Migration parser using conventional FromJSON #43
Conversation
This patch defines a `MigrationYaml` record that represents the shape of a migration's Yaml file directly, then decodes to it. This obviates the manual "parsing" from a decoded `Object`. It also produces better error messages. The tests don't pass because we have actually been using Show/Read for the `UTCTime` fields instead of a JSON representation. I'll have to introduce a new type to account for that without breaking compatibility.
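For illustration, here is a minimal sketch of the kind of record-plus-`FromJSON` approach described above. The field names and types are assumptions based on a typical migration file, not necessarily what the PR's actual `MigrationYaml` looks like:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), withObject, (.:), (.:?))
import Data.Text (Text)

-- Hypothetical shape; the real record in the PR may differ.
data MigrationYaml = MigrationYaml
  { myDescription :: Maybe Text
  , myCreated :: Text      -- later refined to a UTCTime newtype (see below)
  , myDepends :: Maybe Text
  , myApply :: Text
  , myRevert :: Maybe Text
  }

instance FromJSON MigrationYaml where
  parseJSON = withObject "MigrationYaml" $ \o ->
    MigrationYaml
      <$> o .:? "Description"
      <*> o .: "Created"
      <*> o .:? "Depends"
      <*> o .: "Apply"
      <*> o .:? "Revert"
```

Decoding to a record like this lets aeson report missing or mistyped fields with its usual, more descriptive error messages instead of hand-rolled ones.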
There's no functional need for this; using the JSON representation would've been readable enough, but there are going to be tons of migration files in the wild with these simplified values. We could write the encoding as JSON and a decoder that accepts either, but I'm not sure it's worth it vs just using a completely custom newtype.
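For what it's worth, the "decoder that accepts either" alternative could be sketched roughly like this: try the old Show/Read-style string first and fall back to aeson's standard `UTCTime` decoding. The `"%F %T UTC"` format string and the newtype name are assumptions based on the timestamps shown elsewhere in this thread:

```haskell
import Control.Applicative ((<|>))
import Data.Aeson (FromJSON (..), withText)
import qualified Data.Text as T
import Data.Time (UTCTime, defaultTimeLocale, parseTimeM)

newtype LenientUTCTime = LenientUTCTime UTCTime

instance FromJSON LenientUTCTime where
  parseJSON v = legacy v <|> (LenientUTCTime <$> parseJSON v)
    where
      -- Accept the old Show/Read-style rendering, e.g. "2009-04-15 10:02:06 UTC"
      legacy = withText "UTCTime" $ \t ->
        LenientUTCTime <$> parseTimeM True defaultTimeLocale "%F %T UTC" (T.unpack t)
```

The PR instead keeps a single custom newtype that both encodes and decodes the legacy format, which avoids existing files being rewritten into a new representation.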
It was listed in that `requiredFields` array, but was being defaulted to the empty string elsewhere. There is a test that confirms it's optional, which makes sense anyway.
It seems the `Depends` _key_ is indeed required, but a `Null` as the _value_ is acceptable (and defaulted to no dependencies). I don't know if we need to preserve this too, but the tests fail if we don't. It was super fun to see one test assert it was optional and fail, only to have the test that asserts it as required then fail (:
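A small sketch of how that quirk can be expressed with aeson: require the `Depends` key, but tolerate an explicit `null` value. The surrounding details (the `Maybe Text` result, the helper name) are made up for illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), Object, Value (Null), (.:))
import Data.Aeson.Types (Parser)
import Data.Text (Text)

-- Require the "Depends" key to be present, but accept an explicit null value,
-- treating it the same as "no dependencies".
parseDepends :: Object -> Parser (Maybe Text)
parseDepends o = do
  v <- o .: "Depends"      -- fails if the key is absent
  case v of
    Null -> pure Nothing   -- key present with a null value: no dependencies
    _ -> Just <$> parseJSON v
```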
I just remembered that this was still open. Let me know when it's ready for me to review.
It is (was always) ready for review 😄
Okay, great! For what it's worth, an explicit declaration that it's ready would be helpful for a couple of reasons. First, you created the pull request before you were done clarifying its purpose and pushing patches, which makes it a bit ambiguous as to whether it's ready. Second, you made changes to the PR description and I got the impression that the purpose of the PR had not settled (so, in other words, don't review yet), and PR description edits don't generate email notifications, so I had no idea that a new message had been edited into the description. I'll see if I can take a look at the changes in the next few days.
I took a look and the changes look okay to me. As far as the test expectations are concerned, I think the new errors are more descriptive than the old ones, but since the error message text contains the offending file path, checking for those errors requires doing a bit of string manipulation in the tests to compare only the part of the error that comes after the file path. That works for me, so if you are willing to make that change, too, then I'd welcome it. Thanks for your work on this!
Awesome thanks!
I will usually open as Draft, then clicking Ready for Review is my "explicit declaration". I guess I should've also actually formally requested you as reviewer -- I'm so used to my team's auto-assign feature that I didn't think to.
Yeah, I can see how that's confusing. I mis-fired my tooling. Going forward, I'll be careful to use it correctly.
I see, makes sense! I also see now that you did mark this as ready for review above, but not only is it a bit hard to see in GitHub's UI, it also doesn't generate an email notification. I really wish it did, since that's what I'm depending on when PRs are evolving. That's why I was thinking an explicit comment would be helpful. Perhaps if I used some kind of app, I'd get a notification when the ready state changed?
I usually react right away and come look to see what the PR is all about, even if I can't get to a deeper review of the code until later.
I'm happy to just try this out and see. However, FWIW, my ideal preference is not to have draft PRs at all, i.e., if discussion is warranted, I'd much prefer to do it in an open ticket that captures the problem statement. That way, we can discuss it and it may turn out that the solution is not a PR. And if it turns out code changes are warranted, then having discussion in advance will really help ensure that the PR contains no big surprises. I find code changes much easier to review if they're the end of a conversation, not the beginning. :)
All sounds good, but for just a little bit about how I use Draft PRs: a Draft PR is purely for me. It's just that I'd like to get my code pushed (in the backup sense) and start working on the non-code details. In the case of a single-commit PR, I can certainly just do that.

I'll also often do a final self-review in the GitHub UI, where I find the change in head-space will often surface something I'd not otherwise see. Only when I flip from Draft to non-draft do I expect to involve the upstream / reviewer at all. Until then, the Draft should not be on their radar at all. So I'm definitely not using them as a conversation tool. I'd open an Issue before that, then a PR at the end.

Do you get notified of Draft PRs opening, or only when they flip to non-draft? I assumed the latter, and if that's not the case that would certainly invalidate this approach a bit 😬
Oh and Draft PRs are usually a good time to see if CI passes and address it, before involving any reviewers.
I don't know for sure. If you want, we could test it. Looking at GitHub's material on this topic, I assume that I'd get notified since Draft status does not imply privacy/visibility, just non-readiness for merging.
I wouldn't go to any trouble testing. I think the behavior just comes down to your own watching settings. A 1-dev OSS project is likely watching "all activity", so naturally you'll get notified of the draft PR opening. If the contributor explicitly requests you as a reviewer, you'd definitely get another notification then. In such a setup, the Draft state could still be useful as an indicator of something not yet ready -- but (assuming there is no notification for the draft-to-ready state-change) you would need your contributors to use explicit reviewer requests to then see when it has become ready. This isn't historically how OSS projects tend to work, I think.

The workflow I've internalized comes from projects with a larger team. In such a case, you want to watch "participating and mentioned" only. And you always want to request reviewers (either explicitly or via CODEOWNERS). With such a system in place, the "quiet" draft state is useful in the ways I've started using it, since it notifies nobody.

I think I just need to remember all this and adjust my own behavior accordingly. Thanks for engaging on this -- it's been interesting!
toJSON = toJSON . formatTime defaultTimeLocale utcTimeYamlFormat . unUTCTimeYaml
toEncoding = toEncoding . formatTime defaultTimeLocale utcTimeYamlFormat . unUTCTimeYaml

-- Keeps things as the old Show/Read-based format, e.g. "2009-04-15 10:02:06 UTC"
👋 @jtdaugherty I just wanted to mention: I finally updated my own app to this version of this library and I found it failed because my migrations all had fractional seconds, such as
Created: 2009-04-15 10:02:06.23345 UTC
I'm not sure where that came from, or why the parser won't accept it, because I swear I tested things when I wrote this PR.
In any event, I ran the following to truncate them before my migrations would run with the new version:
sed -i 's/\.[0-9]\+ UTC$/ UTC/' db/migrations/**/*
I'm not sure: should we make an update to make the parser more flexible?
Yikes! Sorry to hear about this. It seems to me that a relaxation to the parser ought to be okay. If you want to make that change, let me know - otherwise I'll see to it.
> It seems to me that a relaxation to the parser ought to be okay
Yeah, agreed. If I'm reading the docs correctly, it seems like using `...%T%Q UTC` could do it:
> `%Q`
> decimal point and fraction of second, up to 12 second decimals, without trailing zeros. For a whole number of seconds, %Q omits the decimal point unless padding is specified

(Emphasis mine)
I'm not quite sure about my availability, so I'll comment again if I get a chance to work on it; otherwise, feel free.
I think we need to do something, because new migrations still get created with fractional seconds, so the problem will keep coming back. @pbrisbin any chance you could create a PR with this fix, as you seem quite familiar with the problem and relevant code?
> because new migrations still get created with fractional seconds
I didn't realize this. You're saying this version creates files with fractional seconds that it can't then parse? That's much worse.
> you seem quite familiar with the problem and relevant code?
I may have been at one time, but that was a long time ago. Still, it seems like a relatively straightforward thing. The decoder needs to accept with or without fractional seconds. Adding a unit test and using the format-string I mentioned above should do it.
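A possible sketch of that change, treating `UTCTimeYaml` and `utcTimeYamlFormat` as stand-ins for whatever the library actually defines (the names and exact format string here are assumptions). The idea is that `%Q` makes the fractional part optional when parsing, while still rendering whole seconds without a decimal point:

```haskell
import Data.Aeson (FromJSON (..), withText)
import qualified Data.Text as T
import Data.Time (UTCTime, defaultTimeLocale, parseTimeM)

newtype UTCTimeYaml = UTCTimeYaml { unUTCTimeYaml :: UTCTime }

-- Stand-in for the real constant; "%T%Q" accepts "10:02:06" as well as
-- "10:02:06.23345".
utcTimeYamlFormat :: String
utcTimeYamlFormat = "%F %T%Q UTC"

instance FromJSON UTCTimeYaml where
  parseJSON = withText "UTCTimeYaml" $
    fmap UTCTimeYaml
      . parseTimeM True defaultTimeLocale utcTimeYamlFormat
      . T.unpack
```

A unit test could then assert that both `2009-04-15 10:02:06 UTC` and `2009-04-15 10:02:06.23345 UTC` parse successfully, covering the regression described here.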
The code as it was was decoding the Yaml as a type-less `Object`, then digging around manually in the key-values to "parse" it further, to the actual `Migration` type. This resulted in a brittle situation (relying on `read`) with poor error messages ("unrecognized field" was used for a lot of distinct errors). It also required Aeson internals such that we'd need CPP to build against 1.x and 2.x.

This series of commits moves to a more idiomatic approach of a record type with a `FromJSON` that is decoded directly. The decoding then handles required-vs-optional, parsing, etc.
I recommend by-commit review, as the final result is somewhat complex, but this was due to some landmines encountered along the way, and there may exist a simpler approach depending on external constraints. Personally, I'd say it's worth keeping the custom `UTCTime` en/decoder, but I would not preserve the quirky "`Depends` must be present but can be `Null`" behavior.

At this point, we are behavior-neutral except for the following changes in error message:
Some of this is the improvement we intended (e.g. no longer lumping distinct errors under "unrecognized field"), but the messages are a bit noisy without any extra polishing. Therefore, if you'd like to see them tweaked in some way, do let me know. Otherwise, I can just update the test expectations and we'll be green.