Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CT 1990 add access property to parsed nodes #7007

Merged
merged 17 commits into from
Feb 24, 2023
Merged

Conversation

gshank
Copy link
Contributor

@gshank gshank commented Feb 18, 2023

resolves #6824

Description

Add access property to parsed nodes. (Did not add to metrics.) Update tests and bump schema version.

Checklist

@gshank gshank requested review from a team as code owners February 18, 2023 14:30
@gshank gshank requested review from emmyoop and iknox-fa February 18, 2023 14:30
@cla-bot cla-bot bot added the cla:yes label Feb 18, 2023
@gshank gshank requested review from MichelleArk and peterallenwebb and removed request for iknox-fa February 18, 2023 14:30
@MichelleArk
Copy link
Contributor

MichelleArk commented Feb 21, 2023

Would adding the access property to parsed nodes would enable configuring access on all ManifestNodes?

ManifestSQLNode = Union[
    AnalysisNode,
    SingularTestNode,
    HookNode,
    ModelNode,
    RPCNode,
    SqlNode,
    GenericTestNode,
    SnapshotNode,
]

# All SQL nodes plus SeedNode (csv files)
ManifestNode = Union[
    ManifestSQLNode,
    SeedNode,
]

(From: https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/contracts/graph/nodes.py#L1153-L1168)

I'm not sure it's desirable to provide this configuration beyond ModelNodes (and perhaps SnapshotNode) given we don't yet have plans to act on the setting of the access property beyond restricting references to ref-able nodes, which only applies to some of the ManifestNodes.

@gshank
Copy link
Contributor Author

gshank commented Feb 21, 2023

The code and the tests assume that all of those nodes have the same set of attributes. It's certainly possible to limit it to just ModelNode and SnapshotNode, but it's actually rather more work, which is why I did it this way for commenting. Are we going to limit "group" to the same set of nodes?

@gshank
Copy link
Contributor Author

gshank commented Feb 21, 2023

Also the way that "patches" are processed, all of the manifest nodes are handled with the same code, so unless we split that up we still have to add "access" to ParsedNodePatch. Would we want to throw an error if somebody tried to set "access" (or group?) for non-supported node types?

@MichelleArk
Copy link
Contributor

MichelleArk commented Feb 21, 2023

Are we going to limit "group" to the same set of nodes?

Nodes that should be 'groupable' are nodes that have the ability to ref other nodes. On the other hand, nodes that can have the access property are nodes that are 'refable'. So no - they should be different sets of nodes.

The configured group value will be used to determine whether the ref is permitted or not for private models. If there is no configured group on a model accessing a private model, the ref should be restricted.

Would we want to throw an error if somebody tried to set "access" (or group?) for non-supported node types?

I'd say an error or at least a warning, given that the user may be expecting dbt to actually something given the set access property and we have no plans to act on this property value for other nodes in Phase 1 of Multi-project deployments

It's certainly possible to limit it to just ModelNode and SnapshotNode...

I'd actually advocate to limit this to just ModelNode, since that is the most concrete product story we have around multi-project deployments. Ultimately, a snapshot or seed could be wrapped in a model to use the access property. Would definitely be open to relaxing this given community / user feedback though.

@gshank
Copy link
Contributor Author

gshank commented Feb 24, 2023

Note: we didn't move to version 9 of the manifest schema when we should have, so I updated all the v8 manifests that shouldn't have the new fields in them, and switched to v9.

field_value=patch.access,
field_name="access",
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a warning (which a user can promote to an error), Is it possible to raise a ParsingError here instead? I think that's more consistent with existing product expectations for other entities. For example, configuring type: invalid on an exposure gets:

dbt.exceptions.YamlParseDictError: Parsing Error
Invalid exposures config given in FilePath(searched_path='models', relative_path='schema.yml', modification_time=1677270162.964944, project_root='/Users/michelleark/src/jaffle_shop') @ exposures: {'name': 'jaffle_shop_report', 'type': 'invalid', 'owner': {'name': 'testsss'}} - at path ['type']: 'invalid' is not one of ['dashboard', 'notebook', 'analysis', 'ml', 'application']

Copy link
Contributor

@MichelleArk MichelleArk Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can we define access on the dataclass declaration for the ParsedNode, simialr to the other class attributes set by this method (created_at, description, patch_path). Ideally access is a AccessType if possible. nevermind!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It's an exception, so it aborts everything, which is not my favorite pattern because if you have multiple errors you have to address them one at a time, but we can certainly do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you're suggesting with regard to ParsedNode. It's set in the dataclass declaration for ModelNode, which is where you wanted it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you're suggesting with regard to ParsedNode. It's set in the dataclass declaration for ModelNode, which is where you wanted it.

I see, makes sense. Still getting my head around patches. Sorry for the confusion!

Copy link
Contributor

@MichelleArk MichelleArk Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is not my favorite pattern because if you have multiple errors you have to address them one at a time

Agree - but I'm nervous about gracefully handling this invalid config because, given that the default protected value is not the most restricted access option. For example, a typo to access:priivate would lead to access:protected on the model, which is probably a more open level of access than the user meant to specify. Hopefully they'd catch it in review / via logs but I think it's worth raising an exception for this case.

@gshank gshank merged commit 0ef9931 into main Feb 24, 2023
@gshank gshank deleted the ct-1990-access_property branch February 24, 2023 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-1990] Parse access property on model node
2 participants