-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DISCUSSION: SPDX 3 data model #240
Conversation
Signed-off-by: Keith Zantow <[email protected]>
Signed-off-by: Keith Zantow <[email protected]>
Element | ||
|
||
RootElements []IElement | ||
Elements []IElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the problem with this implementation is that RootElement is not class Element but any class that inherits from Element. In Go, I guess, that translates to "any struct that implements the Element interface".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in this variant, the I*
types are all interfaces
-- so these are "any types implementing the IElement
interface". These could be named differently... naming is hard! there doesn't seem to be a way to avoid interfaces unless we made any reference field an any
/interface{}
type, but then there's no compiler enforcement of elements going in there. (Also I probably modeled this wrong, I was going by the software profile svg which appears to have *
for both of these fields).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think personally I'm leaning towards the embedded_with_interfaces_and_constructors
variant, with setters that return error
, as the example currently shows. This allows generated code to enforce valid values for all fields and for a user to most quickly understand when things failed rather than needing to run a full validation once all values are set. It also eliminates the need to export the data structs (all the embedded variants need exported structs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my bad: the property is of type IElement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to work. It would be easier to generate:
package main
import "fmt"
type ElementP struct {
Name string
}
type Element interface {
AsElement() ElementP
}
type Artifact struct {
ElementP
StandardName []string
}
func (a Artifact) AsElement() ElementP {
return a.ElementP
}
type ElementCollection struct {
ElementP
RootElement Element
}
func main() {
a := Artifact{
ElementP: ElementP{
Name: "AnArtifact",
},
StandardName: []string{"TheArtifact"},
}
ec := ElementCollection{
ElementP: ElementP{
Name: "AnElementCollection",
},
RootElement: a,
}
fmt.Println(ec)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more issue: The jinja renderer I am using from the shacl2code project can only generate one file. So if we were to generate a model and a constructor file, we would need to roll our own jinja renderer...which I would rather do once I can handle the code better. You are welcome to try building one that will generate multiple files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One note: there is no requirement to have separate files -- I just did this to separate things logically, but all the code for each of these approaches could be generated in a single file. In other words: if we were to export all 4 types: interface, data struct, constructor, constructor props; all of these could be one after the other in the same file.
From the "generic shacl2code" perspective, I think in the example there's a bit of an inconsistency with the example you posted: only a single struct type gets renamed (ElementP
) for an Element
interface, but there are no other interfaces created. I think the easiest thing to do in shacl2code
is probably: generate an exported interface and a struct for each type. It also probably makes the most sense to name the interface after the shacl type rather than the struct, as one of my examples has. Abstract types are a question: golang would let you indiscriminately construct any abstract type if these structs are exported, and they would need to be generated and exported to be used like superclasses through embedding.
The interfaces_only
approach has the potential to model inheritance defined in shacl in the most correct way, I believe -- because abstract classes can omit constructor functions and since the types themselves are not exported, there's no way for a user to create these types incorrectly. It has some complexity that at the time you're generating code for a type, it needs to have access to supertypes to be able to inline all the properties -- a quick look makes me think this wouldn't be too hard to accomplish, since it looks like we can get a reference to a parent class and jinja supports recursion.
An aside: naming is hard! Is the P
from ElementP
short for Props
or Properties
? I am sorta partial to Data
for these structs despite using Impl
in one of the variants, but also like Props
, at least for the constructor props types, or other things; happy for any other suggestions but would definitely like to keep it short but maybe more descriptive than a single letter. Interfaces only allows the lowercase name to be used, since it's not exported and we don't have to come up with names, so I guess there's another plus for that one... except for package
, which is a reserved word 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added an "embedded_second" variant, which I think is similar to what you were suggesting (replace Data
with P
or some other suffix TBD and the interfaces are the first-class names of the types); I just wanted to point out specifically what the usage would look like, which could be just fine: https://github.com/spdx/tools-golang/pull/240/files#diff-55bf8e9ab2432ee525685cd1e7ef75be715a7e18c4eef3011711b89af64bc389R14
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with you on going with the interface only approach i.e. Abstract class -> Go Interface with getter and setter for each property. Concrete class -> Go struct which implements the getter and setter methods for the abstract class.
I personally would prefer having corresponding internal structs to hold common properties in the concrete classes, but it's not a deal breaker. We have to embed the interfaces anyway (Artifact is actually an abstract class that inherits from Element).
Signed-off-by: Keith Zantow <[email protected]>
ElementCollection | ||
} | ||
|
||
type spdxDocument struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the "interfaces only" example, but this is a struct. Is a concrete class a struct and an abstract class an interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this struct is hidden though, so users are not able to directly index the struct and will need to rely on interfaces to get/set values.
spdxID ElementID | ||
name string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generating each abstract parents' properties using jinja would be ok to do. You think this is a better option than using a struct called elementProperties
(purposefully not exporting the struct, but making the concrete class implement the interface Element
).
name string | ||
} | ||
|
||
var _ Element = (*ElementImpl)(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a way to write implements <interface>
in go:
var _ Element = (*ElementImpl)(nil)
is a compile-time check that an *ElementImpl
type can be assigned to an Element
interface, it gets optimized out by the compiler so no runtime overhead is incurred. It's not really necessary for any of the code generation, I mostly added it for my own benefit.
A similar compile-time check would happen with constructor functions, like:
func NewPackage() Package {
return &packageImpl{}
}
will similarly fail to compile if packageImpl
doesn't adhere to the Package interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you sooo much @kzantow for this super detailed rundown - I like embedded with interfaces and constructors the best!
@@ -0,0 +1,53 @@ | |||
# SPDX 3 Model Patterns | |||
|
|||
This is a non-exhaustive list of some patterns we are able to use in Golang to work with an SPDX 3 data model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you sooo much for putting this level of detail and even mocking out all the examples! You've made it super understandable and easy to review!
- pro: modifying structs is simple and idiomatic | ||
- pro: minimal surface area: very few additional functions necessary | ||
- con: construction is confusing -- which nesting has field X? | ||
- con: naming is hard, need `Package` and `IPackage` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an interesting dimension to consider also is addition on new fields and compatibility between versions, i think each of them may have pros/cons in this regard as well. (like in interfaces we can just have the v3.1 interface embed the v3.0 interface and make backward compatibility simple.
- pro: interfaces are simple | ||
- pro: modifying structs is simple and idiomatic | ||
- pro: minimal surface area: very few additional functions necessary | ||
- con: construction is confusing -- which nesting has field X? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea this is something i really dislike about embedding as well, the constructors are very non-intuitive.. especially to a user who does not really have to the inheritance relationships to actually use SPDX.
ElementCollection | ||
} | ||
|
||
type spdxDocument struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this struct is hidden though, so users are not able to directly index the struct and will need to rely on interfaces to get/set values.
|
||
- [embedded with interfaces and constructors](embedded_with_interfaces_and_constructors) - takes the approach of both | ||
embedding and interfaces, along with constructor functions. This makes creation reasonably simple and working with | ||
existing documents reasonably simple. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like this one the best! i think there is one aspect which i'm not sure what the solution to is (which may be generic to anything that is implemented with interfaces and not just to this one)... when we have a field like a slice of elements, like To []Element
. Once we get hold of an Element
, how do we get back to the child class of the Element? Like a Package is an Element but when we get a hold of the Package as an Element in a relationship, how do we know its a Package?
- pro: eliminates some code duplication seen with `interfaces_only` | ||
- con: getter/setter still not especially idiomatic | ||
- con: larger surface area: at least 4 exported parts for each type | ||
- con: absolutely no JSON/etc. output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not sure what you mean by this, is it just because the fields are not public thus no serialized JSON output?
- pro: reasonably simple to work with existing documents | ||
- pro: eliminates some code duplication seen with `interfaces_only` | ||
- con: getter/setter still not especially idiomatic | ||
- con: larger surface area: at least 4 exported parts for each type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... yea i wonder if there's anyhting we can do with getters/setters here with reflection...
why do we need to export the internal struct again? can you remind me?
@goneall is there a JSON schema spec for v3? Also would love your thoughts on the proposal since you've been doing this with Java as well (since i imagine having inheritance is nice for this data model :)) |
Yes - https://spdx.github.io/spdx-spec/v3.0/model/schema.json - This is generated from the SHACL file using the code generator Joshua wrote. I haven't reviewed the above discussion yet, but I thought I'd do a quick reply with a pointer to the schema |
Following up: any consensus on what direction we want to take? |
DO NOT MERGE!
This PR is intended to provide a spot for an open discussion about what people would like to see for working with SPDX 3 documents. There is a particularly challenging bit, that SPDX 3 is implemented using a single inheritance model, which Go does not directly support. Due to this, I'd like to offer up a few possibilities to determine which users would most like to see.
In the
spdx/v3/v3_0
directory, there are several subdirectories, each implementing a different variant of a model that I believe would support SPDX 3. I've implemented a very simple example with each: creating a couple packages, a file, and some relationships. 👀 look at the*_test.go
to see the example usage. There, of course, could be certain parts of these different variants used together such as setter errors and first-class embedded structs.Also check the README with some more information and what I see as pros/cons for each of the approaches.
And, of course, if someone has different ideas, please do share them! Perhaps I've missed some obvious solutions that could simplify things here!
There's a high likelihood that I've made some mistakes with the modeling and usage, but this is more to determine the patterns we would like to use, so we can use the SPDX 3 SHACL model to generate the corresponding Go code here.