dataclasses-jsonschema types #1589
8678189 to 3dc1aff
@beckjake what's the state of this rn? I looked through a bunch of the changes which look fine individually. i agree with your point that this is already huge and we should try to push further work into future PRs
    Replaceable,
    metaclass=abc.ABCMeta
):
    _ALIASES: ClassVar[Dict[str, str]] = field(default={}, init=False)
don't you want field(default_factory=dict, ...) here
Nope! This is a ClassVar, as in a static class-level attribute, so default is appropriate. It's not called on instantiation by the dataclass layer (hence init=False). Though I do think I could maybe do this instead?:
_ALIASES: ClassVar[Dict[str, str]] = {}
It seems like dataclasses correctly figure out when they're class vars that they don't get an __init__ entry
Ok, I was pretty wrong about doing the above instead! When that happens, everything looks great until you call dataclasses.replace(credentials, ...) - then Python throws a TypeError complaining that it got an unexpected keyword argument _ALIASES. That seems like a bug to me.
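To make the exchange above concrete, here is a minimal sketch using only the standard library. The Credentials, CredentialsWithField and Broken classes are hypothetical stand-ins, not dbt's actual definitions. It illustrates three points from the thread: a ClassVar annotation keeps the attribute out of fields() and __init__, field(default={}, init=False) is accepted on a ClassVar, and default_factory is rejected on one. On current CPython releases dataclasses.replace also skips ClassVars; the TypeError described above was seen in the author's environment at the time and may not reproduce on newer versions.

```python
import dataclasses
from dataclasses import dataclass, field
from typing import ClassVar, Dict


@dataclass
class Credentials:  # hypothetical stand-in for the class under review
    host: str
    port: int = 5432
    # Plain class-level default: shared by all instances, not an init argument.
    _ALIASES: ClassVar[Dict[str, str]] = {"hostname": "host"}


@dataclass
class CredentialsWithField:  # hypothetical: the form used in the diff
    host: str
    # A mutable default is allowed here because ClassVars are not real fields.
    _ALIASES: ClassVar[Dict[str, str]] = field(default={}, init=False)


# ClassVars are not reported as fields, so they never reach __init__.
field_names = [f.name for f in dataclasses.fields(Credentials)]

creds = Credentials(host="localhost")
moved = dataclasses.replace(creds, host="db.example.com")
moved_with_field = dataclasses.replace(
    CredentialsWithField(host="localhost"), host="db.example.com"
)

# default_factory makes no sense for a class-level attribute and is rejected.
try:
    @dataclass
    class Broken:  # hypothetical
        _ALIASES: ClassVar[Dict[str, str]] = field(default_factory=dict)
except TypeError as exc:
    factory_error = exc
else:
    factory_error = None
```

If replace() misbehaves in a given environment, the field(default=..., init=False) form from the diff is the variant the thread settled on.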
@cmcarthur I'm going to mark it as "ready for review", but I'm still writing a bunch of unit tests around the various fiddly bits of going to/from dicts that I'll add (and probably make minor changes accordingly).
ea2ab72 to 07b63cc
This was a massive undertaking - nice work on replacing like 1.5k LOC with 1.5k lines of tests!
I took an initial look at this today and:
- asked a couple of questions about things i didn't understand on my first pass
- responded to all (?) of your comments - thanks for calling certain things out
You said:
dbt.main.handle_and_check no longer returns the exact same signature. In particular, it no longer returns a dict-like object as its first result (the second part of the result is still a bool). You can see some of the kinds of changes in tests that inspected results. What else will be impacted by this change, and is it ok?
I know some users are running dbt via this API, so we should be really careful to document this change when this code goes live. Additionally, I think some of our own internal tooling relies on this return value, so we should be sure to update that as well. I can follow up on that off-line.
Agreed on documenting! I feel ok about it since we are pretty clear (IMO) that we don't support that interface, and we usually direct people to the mostly-unchanged I checked the internal tooling I know about and it only cares about the
@beckjake the shape of this looks good although i would not be surprised if we catch a few more bugs when we test this code out. this is approved from my end. with that, i am still curious:
32c94f9 to b7783bd
Yes! Well, I've called
No... serializing works great of course (we do that in the code), but deserialization has some type-confusion issues that we will work out in #1602. A modified form that only accepts
@beckjake I gave this a spin locally and found that some of the schema.yml specs in our internal-analytics project had invalid/additional fields in them! While that's really cool and convenient, the warning provided by dbt was pretty noisy:
Is there any way to quiet this output down? The information here is correct (and appropriate for the debug logs) but definitely way too much to show in the stdout IMO!
@drewbanin I made changes to both hologram and this branch that make the message much quieter outside of debug level (where it is perhaps still too noisy, but stack traces are nice to have). Here's what a warning looks like now:
The
@beckjake that message looks a lot better! I think there might be a quirk around the exception handling though -- when i run dbt with an invalid source (containing an additional property) I see an exception raised by
My schema.yml looks like:
I see the same
Ignore this ^ - I was using an outdated version of
Can you just update the changelog to note that we removed the graph from the context? This LGTM - ship it!
Most of the things that previously used manually created jsonschemas
Split tests into their own node type
Change tests to reflect that tables require a freshness block
add a lot more debug-logging on exceptions
Make things that get passed to Var() tell it about their vars
finally make .empty a property
documentation resource type is now a property, not serialized
added a Mergeable helper mixin to perform simple merges
Convert some oneOf checks into if-else chains to get better errors
Add more tests
Use "Any" as value in type defs - accept the warning from hologram for now, PR out to suppress it
set default values for enabled/materialized
Clean up the Parsed/Compiled type hierarchy
Allow generic snapshot definitions
remove the "graph" entry in the context - This improves performance on large projects significantly
Update changelog to reflect removing graph
22a480b to 49f7cf8
That rebase was kind of a monster, but when tests pass I'll merge!
Convert dbt to use dataclasses + hologram for many/most internal types
Something to think about in this PR:
dbt.main.handle_and_check no longer returns the exact same signature. In particular, it no longer returns a dict-like object as its first result (the second part of the result is still a bool). You can see some of the kinds of changes in tests that inspected results. What else will be impacted by this change, and is it ok?
Things to do in future PRs because this is already far too long:
- deps process a bit more sane
- IntermediateSnapshotNode thing I had to do, but that is an issue buried deep inside the fundamentals of how we do parsing, so I really am loath to fix it here
The typing.Union issue is that this assert passes:
This behavior makes complete and total sense for the original subclass-aware behavior of typing.Union! But it messes pretty badly with hologram, since now this won't decode:
This comes up a lot with the relationship between ParsedNode and CompiledNode. I've got a workaround in contracts/graph/compiled.py, but it kind of sucks.