-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new internal lookup API #798
Conversation
nexus/src/db/lookup.rs
Outdated
pub fn organization_name<'b, 'c>(self, name: &'b Name) -> Organization<'c> | ||
where | ||
'a: 'c, | ||
'b: 'c, | ||
{ | ||
Organization { key: Key::Name(self, name) } | ||
} | ||
|
||
pub fn organization_id(self, id: Uuid) -> Organization<'a> { | ||
Organization { key: Key::Id(self, id) } | ||
} | ||
|
||
pub fn project_id(self, id: Uuid) -> Project<'a> { | ||
Project { key: Key::Id(self, id) } | ||
} | ||
|
||
pub fn instance_id(self, id: Uuid) -> Instance<'a> { | ||
Instance { key: Key::Id(self, id) } | ||
} | ||
|
||
pub fn disk_id(self, id: Uuid) -> Disk<'a> { | ||
Disk { key: Key::Id(self, id) } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to clarify, if we extend this model to other types, we need to:
- Add functions to
LookupPath
for each resource that can be looked up (generally "by ID") - Add a
define_lookup_with_parent
macro call - Add manual impls to look up names, for the auto-generated structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the new proc macro implementation, the steps are:
- add one function to
LookupPath
--fn ${new_resource_name}_id()
- add a call to the new
lookup_resource
macro to define everything else for the new resource
I think the TL;DR of my comments are:
|
For the record, I was able to make this work with |
I realized that you can do |
Okay, I've made a lot of changes since the first round of reviews, but the API has not changed. I completely replaced the implementation with a proc macro. I think I've addressed all the feedback up to this point. Thanks @smklein @david-crespo and @andrewjstone for the reviews! @smklein in particular would you mind taking a look at nexus/src/db/lookup.rs, since I know you found the previous version kind of icky? The new version is function-like proc macro that does nearly everything. To add a new resource, the only thing you have to do aside from calling the macro is add the |
ancestors = [ "Organization", "Project" ], | ||
children = [], | ||
authz_kind = Generic | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is so great!
// For example: `authz_organization.project()`. For "generic" resources | ||
// (`AuthzKind::Generic`), the parent has a function called `child_generic` | ||
// that's used to construct all child resources, and there's an extra | ||
// `ResourceType` argument to say what resource it is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW in order to understand why there are two patterns, I had to read this comment, the one on enum AuthzKind
on line 42, and then also go look at authz/api_resources.rs
. It sounds like "typed" is resources that you can attach roles to (org and project) and generic
is resources that have no permissions of their own, instead inheriting them from their parent. The reason they're called "typed" is you need to use the constructor specific to that resource to get an Authz
object, whereas for generic children, we don't care what kind of child they are (e.g., disk or instance), we only care that they are children of the parent. (Summarizing to make sure I get it.)
I don't think the comments necessarily need to be changed, but it doesn't seem like there's a single main spot where this is explained. Just giving my experience of where the mental friction is as someone who did not write the code.
omicron/nexus/src/authz/api_resources.rs
Lines 141 to 146 in 52dfddb
/// Aside from Organizations (which you create with | |
/// [`Fleet::organization()`] instead), all instances of all types of Fleet | |
/// children are treated interchangeably by the authz subsystem. That's | |
/// because we do not currently support assigning roles to these resources, | |
/// so all that matters for authz is that they are a child of the Fleet. | |
pub fn child_generic( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's basically right. Part of what's confusing is that I think of this as two different properties that happen to be correlated:
- a resource cannot have roles directly attached to it (like anything under a Project)
- a resource's
authz::
type is constructed by a typed constructor vs. a generic one
Those are the same, in that the typed constructor is used iff the resource can have roles directly attached to it, and that's not really a coincidence. But it seemed like a pretty big abstraction leak to have the macro know anything about roles.
For what it's worth, another change I want to make soon is to have all of these types use a typed constructor. I realized recently that there's nothing to stop you from passing an authz::Instance
where an authz::Disk
is expected because they're both type aliases for authz::ProjectChild
. That's dangerous, so I want to instead use the typed approach (which may mean another macro :-/)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's confusing to use names that are about the code to make a substantive distinction about roles. Typed for everything would make a lot of sense and eliminate a whole thing to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to echo @david-crespo 's sentiments - I bumped into this when reading the lookup_resource
arguments, trying to understand when I'd want to supply one AuthzKind vs another, and had to do a lot of digging to understand when to supply one vs the other.
The terms "typed" and "generic" seem to refer a lot more to "how we have implemented these objects" rather than the properties of the objects themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I get that this is more annoying than we'd like, and as I said, that's why I want to simplify the authz
types to have a more uniform interface. Do you have a suggestion for the meantime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s fine to aim to fix later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
nexus/src/db/db-macros/src/lookup.rs
Outdated
// NOTE: this is only "pub" for the `cargo doc` link on [`lookup_resource!`]. | ||
#[derive(serde::Deserialize)] | ||
pub struct Input { | ||
/// Name of the resource (PascalCase) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "PascalCase" part of this is kinda coincidental - the actual expectation is that:
name
is the exact name of a type innexus::db::model
name
is the exact name of a type innexus::authz
- The resource exists in
db::schema::<snake_case_name>::dsl
I tried declaring lookup_resource
for a resource named MyNewResource
- I wanted to use "compiler-driven development" just to see where the dependencies are, and it seems like there are quite a handful of these that are implied by the implementation details of the macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's absolutely true. Since a bunch of those constraints are useful for developers anyway, it seemed prudent to take advantage of those. An alternative would be adding a bunch of extra configuration properties that you pass into the macro that we expect to be exactly the same. That feels like it would make it easy to get wrong, and also easy to violate the conventions that I do think are useful for developers (e.g., that the type in authz::
has the same name as the one in model::
). Is there something else you'd propose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a599546 I've updated this comment to emphasize the assumptions that you mentioned. Thanks for pointing that out!
Re: the use of "PascalCase" here: I get what you mean by coincidental, but I'm not sure that's true. To be explicit, there are two versions of the name of a resource:
- the Pascal case version is used for Rust type names and enum variant names (including
authz::$Resource
,db::model::$Resource
, theResourceType
variant, and the struct created by this macro) - the snake case version is used for database table names, database column names, Rust module names, and Rust struct fields
As a developer, when I need to put the name of a resource somewhere, which one do I use? It depends on the (arbitrary) convention of the place where you're using it. This macro winds up needing to use it in almost all of the above places. So for the input form we just pick one and document it. As a reader, I find it much clearer to say "the PascalCase version of the name goes here" than "the version of the name that matches the db::model
and authz
types". Maybe that's a personal preference?
Anyway, I agree that it's useful to document the assumptions, so I've done that and led with those. I just wanted to explain why I've left "PascalCase" in the docs, both here and the other places where it appears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I do think documenting the expected behavior is a strong step in the right direction.
I personally find associated types ideal for this sort of association - we're effectively claiming that there exists a resource...
- ... with a model struct
- ... with a schema in the DB
- ... with a corresponding authz resource
but the connection between these pieces of code is implicit.
Like, that's why we're supplying AuthzKind
by type to the macro, but all the other arguments are strings - it's because they aren't types, they must actually correspond to multiple types simultaneously.
To be clear - I don't think this should be blocking - I think the new lookup functions are great, and I love the new API, but I just view this as an area where cleanup could potentially make this more readable by depending on types rather than string-ified tokens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I think I'll land this as it is now. I'm definitely open to cleanup here. Ideally, we wouldn't need a macro at all to generate all these functions (per our earlier conversation).
nexus/src/db/db-macros/src/lookup.rs
Outdated
path_authz_names: Vec<syn::Ident>, | ||
|
||
// Child resources | ||
/// list of names of child resources (PascalCase, raw input to the macro) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't think the "PascalCase" bit is the piece that matters - these are expected to be the names of types / structures which exist in specific modules, and need specific functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above -- in a599546 I've updated this to reference the place where it's explained more fully, and left "PascalCase" there because I find it very helpful as a reader.
// For example: `authz_organization.project()`. For "generic" resources | ||
// (`AuthzKind::Generic`), the parent has a function called `child_generic` | ||
// that's used to construct all child resources, and there's an extra | ||
// `ResourceType` argument to say what resource it is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to echo @david-crespo 's sentiments - I bumped into this when reading the lookup_resource
arguments, trying to understand when I'd want to supply one AuthzKind vs another, and had to do a lot of digging to understand when to supply one vs the other.
The terms "typed" and "generic" seem to refer a lot more to "how we have implemented these objects" rather than the properties of the objects themselves.
This has been a long time coming: this PR prototypes a new API for looking up resources. This would replace the zillion $foo_lookup_by_path, $foo_lookup_by_id, and $foo_fetch functions in datastore.rs.
I got pretty far into an implementation using
macro_rules!
, but I think I've reached the limit and need to dive into a proc macro. Before I do that, I want to get feedback on the API itself. Please take a look at the changes in nexus.rs, where I've changed a handful of cases to use the new API. If folks like this, I will dig into the proc macro approach. (I wouldn't recommend looking too hard at lookup.rs -- it's an undocumented mess that I would almost entirely replace if we like this API.)I'd suggest starting with the summary below, then read through the changes to nexus.rs (it's pretty small, as I only updated a few cases), then take a look at the notes below.
Synopsis: lookups follow a builder pattern. Say you want to fetch an Instance. Previously, you'd use
project_lookup_by_path(org_name, project_name)
followed byinstance_fetch(authz_project, instance_name)
. Now:As a bonus: you get the
authz_org
andauthz_project
, too.Now, say you already have a
project_id
(maybe using theauthz_project.id()
from above). You can look up the Project by id, then look up the Disk in that Project by name:The general pattern is that you start at the root (with
LookupPath::new
), at which point you can feed it (1) any object's id, or (2) the name of a top-level object (currently just organizations, but this will include users and other things). At each level, you'll have functions that let you descend to the next level. At each level, you can call eitherfetch()
(which acts like "fetch" today, which does an authz check and fetches the database row for the object you've selected) or you can calllookup_for(authz::Action)
(which does an authz check for the given action and returns theauthz
object for whatever you've selected, but not the whole database row). More on this below.If you're still reading, at this point I'd recommend looking at the nexus.rs changes to see more examples. Then come back here.
Here's what I like about the new API/possible implementation:
foo_fetch()
, which always did an access check. That's not very different in this change.organization_lookup_by_path
in order to callproject_fetch()
to look up a Project in this Organization After this change, what you'd do instead is directly fetch the Project, without having to look up the Project first. Here's an example.instance_lookup_by_path
anddisk_lookup_by_path
. These aren't transactional, so you could wind up selecting things in different Projects. Instead, existing code chains the lookups: first look up the Project, then do two fetches to get the Instance and Disk. The new code fetches one of the objects, but the new fetch API returns all theauthz
objects along the path. So if you look up the disk, you also get back theauthz::Project
, and you can start a new lookup from there to get to the Instance. Here's an example.lookup_for()
which requires anauthz::Action
. This function does the lookup and an authz check for whatever action you give it. Here's an example. This isn't bulletproof of course, but it should make it harder to accidentally get this wrong.authz::Project
, and then do something you shouldn't be able to do with theauthz::Project
. But this eliminates a pretty common case that'd be very easy to get wrong.OpContext
when doing all lookups.pool_authorized()
instead ofpool()
everywhere, which has the advantage of skipping the database altogether for unauthenticated requests .Obviously there's a lot that I like about this. But why not do this? That's what I want feedback on. It's a different enough API that I think there's some risk we get pretty far into using it before realizing it doesn't handle some important case at all, which would be pretty sad. Two concrete reasons I can think why this might not be good:
(authz::Organization, authz::Project, authz::Vpc, authz::VpcSubnet, db::model::VpcSubnet)
. If you only really wanted the subnet, you'll wind up doing a lot oflet (_, _, _, authz_subnet, db_subnet)
stuff. We could make this a little neater if we reverse the order of these, though I think the reverse order is less intuitive.Anyway, I'd love to know if folks think this is a promising direction, in which case I'll proceed with fixing up the implementation. (note: there's a working version of this that doesn't return all the ancestors in the path. That's just a few commits back. It's returning the ancestors in a flat tuple where I think I've run into the limits of
macro_rules!
.)