-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Struct update dsl #83
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #83 +/- ##
=========================================
Coverage ? 88.96%
Complexity ? 459
=========================================
Files ? 31
Lines ? 1015
Branches ? 150
=========================================
Hits ? 903
Misses ? 68
Partials ? 44 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some good ideas for enhancements here.
I think there are actually two distinct features you are proposing—path-based accessors and a convenient way to update ContainerElement
s. I think they are sufficiently distinct that they can be done in separate PRs, and I think it will be easier to make incremental progress if we split them up.
Some of my comments might seem like big changes. Don't assume that it has to be done my way. I'm really just trying to get a discussion going to make sure that we've considered alternatives and their tradeoffs.
public data class Index(val index: Int) : IonPathElement() | ||
|
||
/** | ||
* Fetch an element from a nested IonElement using the [path] to traverse struct fields and [SeqElement] indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, let me start by saying that this is a useful enhancement—based on my prior experience, it's clearly something that people want.
Can you document what the caller should expect for failure cases (i.e. an element is the wrong type, a field name does not exist, or an index is out of bounds)?
What happens if there is more than one element matching the path? E.g.:
// Given a path of ["foo", 0] we could find 1, a, or both 1 and a
{
foo: [1, 2, 3],
bar: true,
foo: (a + b),
}
Would you consider adding a getPathOrNull
function—similar to how the Kotlin Standard Library first()
and firstOrNull()
?
What would you think of adding a type parameter to the function so that you can get a specific type of element from the given path? For example:
val mySymbol = someElement.getPath<SymbolElement>(somePath)
val anyElement = someElement.getPath<AnyElement>(someOtherPath)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getPath
was left over from some other experiments I was doing and I think it makes sense to move to another PR.
What happens if there is more than one element matching the path?
There should be separate functions for getting the first value and getting all values, similar to the existing API for StructElement.
Would you consider adding a getPathOrNull function—similar to how the Kotlin Standard Library first() and firstOrNull()?
Yes, makes sense.
What would you think of adding a type parameter to the function so that you can get a specific type of element from the given path?
Great idea, this seems idiomatic. If the result does not match up I suppose an IonElementConstraintException
should be thrown similar to the as*
functions.
/** | ||
* An [IonPath] represents a traversal into a nested IonElement made of [StructElement]s and [SeqElement]s | ||
*/ | ||
public class IonPath(public val elements: List<IonPathElement>) : Collection<IonPathElement> by elements { | ||
public constructor(vararg elements: IonPathElement) : this(elements.asList()) | ||
|
||
public fun tail(): IonPath { | ||
return IonPath(elements.drop(1)) | ||
} | ||
} | ||
|
||
public sealed class IonPathElement | ||
|
||
/** | ||
* Traverse into a nested [StructElement] by its field name. | ||
*/ | ||
public data class Field(val fieldName: String) : IonPathElement() | ||
|
||
/** | ||
* Traverse into a nested [SeqElement] by its numerical index. | ||
*/ | ||
public data class Index(val index: Int) : IonPathElement() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, let me say that this is a good enhancement idea. I have seen similar things in multiple applications I've worked on in the past, so clearly this is something that Ion users want. Thank you!
Here are some thoughts, in no particular order.
There is already an ion-java-path-extraction
library. It would be a good idea if we could come up with a name other than IonPath
. I don't think there's any particular reason it needs to have Ion
in the name, so something like TreePath
or ElementPath
could work.
There is nothing about your IonPath
that is inherently tied to ion-element-kotlin
—would you consider contributing it to ion-java
instead? The ion-java
repo is base for all of our JVM libraries, so having the path in ion-java
would make it easier to write things in the future that use these paths and are interoperable with each other.
Have you considered any other ways of modeling it? If so, what made you choose this particular way?
For example, I've tried to consider other possible ways of modeling it, and came up with this.
sealed class TreeLocation {
data class Path(public val components: List<TreeLocation>): TreeLocation()
data class Field(public val fieldName: String): TreeLocation()
data class Index(public val index: Int): TreeLocation()
}
I think the benefit of something like this is that it allows users to interchangeably provide a single index, a single field name, or a path. However, it would make the logic to interpret the path a little more complicated. (Just to be clear, I don't think that this is necessarily a better or worse way to model the path. I'm just trying to stimulate some discussion about the design decisions.)
Alternately, we could try to reuse the ion-java-path-extraction
path syntax. That could be a really nice feature because it allows you to select multiple matching things, or match conditionally if a certain annotation is present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, I've tried to consider other possible ways of modeling it, and came up with this.
sealed class TreeLocation { data class Path(public val components: List<TreeLocation>): TreeLocation() data class Field(public val fieldName: String): TreeLocation() data class Index(public val index: Int): TreeLocation() }
I was planning on changing it to this pattern because I don't think there is a need to wrap the list of path elements into another class at the moment. One reason to wrap it would be to encapsulate path manipulation functions, but I haven't thought that far ahead yet and they could always be added as extension functions to List<TreeLocation>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not aware of ion-java-path-extraction
(I had already planned on extending this representation to sexp
s too). So are you suggesting to allow fetching from an IonElement using a PathExtractor
? I think that makes sense if there is already a well known way to represent paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So are you suggesting to allow fetching from an IonElement using a PathExtractor?
Yes—more or less. It would be nice to see if we can have some similar/familiar APIs across multiple libraries.
I think that the path syntax/model defined in that library could be used to fetch child IonElements from a container. That would fulfill the same use case that you are trying to solve with this PR.
It would also be interesting to be able to extract IonElements directly off of an IonReader (as in ion-java-path-extraction
), but I am not suggesting that you should do that. (It would be a big scope increase.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...but just to be clear, I don't think ion-element-kotlin
should take a dependency on ion-java-path-extraction
. If we want to try to have a unified path extractor API, the classes to model the path should probably be implemented in ion-java
.
/** | ||
* Implement [] operator for fetching a nested element | ||
*/ | ||
public operator fun IonElement.get(path: IonPath): IonElement { | ||
return this.getPath(path) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am hesitant to include an overloaded operator for this particular case because I think people will expect it to behave like accessing a List or Map—which is tricky because a Map will return null if a key is not found, but a List will throw an exception if the index is out of bounds. My personal opinion is just to pass on the operator fun rather than implement something that could be surprising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I don't have a strong opinion about this.
// Determines whether new fields are added to the struct or replace existing fields of the same name | ||
public enum class FieldUpdateMode { | ||
REPLACE, | ||
ADD | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like the idea of having a "default mode". Since there's only two modes here, I think we're better off just having separate functions for the two cases to be explicit about them. Maybe set()
and add()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only issue is that the this["foo"] = ...
syntax in the builder body will have to choose to be add
or replace
.
// Gets a field, including fields that may have been added by current builder | ||
public operator fun get(fieldName: String): AnyElement { | ||
return fieldUpdates[fieldName]?.first() | ||
?: constraintError(sourceStruct, "Required struct field '$fieldName' missing") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to throw an exception here, then we must include a function to check whether a field name is present. Also, given that this is a builder (i.e. the thing we're building is expected to be incomplete), I'm not sure whether it's a good idea to throw an exception in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mimics the behavior of the get
operator for structs. Are you saying this should return null, or are you suggesting there should be some sort of lazy evaluation that happens at build time?
Even for lazy evaluation this would still be an error (or null) unless we do not preserve the order of operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's consistent with structs, then we can keep it the way it is. However, we must include a contains
function so that you can check if a field exists before you try to get
it.
Or maybe I'm misunderstanding how this builder is supposed to work?
import kotlin.test.assertEquals | ||
import org.junit.jupiter.api.Test | ||
|
||
class StructUpdateDemo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a Java demo as well so we can make sure the API works for Java?
} | ||
|
||
/** | ||
* A [StructUpdateBuilder] builder implements a typesafe builder that allows updating a [StructElement] by adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to typesafe builder or just the generic concept of a builder that is typesafe in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I was referring to that link, basically the concept of using lambda with receiver to make a builder body. Maybe, I am misusing the terminology.
private val fieldUpdates: MutableMap<String, MutableList<AnyElement>> = | ||
sourceStruct.fields.groupBy({ it.name }, { it.value }) | ||
.mapValues { it.value.toMutableList() } | ||
.toMutableMap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's a multimap behind the scenes here, I have to ask—how much value do you think there is in having a StructUpdateBuilder
class vs. just using a multimap (or multimap-like class)?
Or, instead of a traditional builder, what do you think of something like this?
public interface MutableStructFields: Iterable<StructField> {
public operator fun set(name: String, value: IonElement)
public fun add(name: String, value: IonElement)
public operator fun get(key: String): MutableCollection<AnyElement>
public fun remove(key: String)
public operator fun contains(key: String): Boolean
public fun compute(name: String, operation: (AnyElement?) -> IonElement)
// At a later date, add things like these:
// public fun putIfAbsent(name: String, value: IonElement): Boolean
// public operator fun plusAssign(field: StructField)
// public operator fun plusAssign(field: Pair<String, IonElement>)
// public fun removeIf(predicate: (String, AnyElement) -> Boolean)
// public fun getOrPut(name: String, provider: () -> IonElement): List<AnyElement>
// public fun getOrPut(name: String, default: IonElement): List<AnyElement>
}
Because it implements Iterable<StructField>
, it can be used with the ionStruct
factory function, and it could be used with a copy
function as well. E.g.:
val newStruct = myStruct.copy(myStruct.mutableFields.apply {
compute("documentRevision") { it + 1 }
set("lastUpdatedDate", Instant.now().toTimestampElement())
// ...
})
I think this might also be a little more Java-friendly than having an update DSL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting, I will have to play around with it bit. I would want to make sure that within a group of operations, only single copy is made with all of the changes, instead of a copy for each change, which the current builder ensures.
I was also thinking, that since StructElement
and SeqElement
already use PersistentMap
and PersistentList
we should be able to expose those to leverage copy-on-write semantics, again as long as multiple updates only create a single copy. I just wasn't sure how receptive you would be to changing the internals too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would want to make sure that within a group of operations, only single copy is made with all of the changes, instead of a copy for each change, which the current builder ensures.
I can elaborate a bit more. The implementation of MutableStructFields
could use a MutableMap
or a MutableList
behind the scenes, so we would create one shallow copy when we create the MutableStructFieldsImpl
instance, and then one shallow copy when we pass the instance to the constructor or copy()
function.
This is fundamentally similar to what I suggested for SeqElement
. The mutableStructFields
could return a MutableList<StructField>
, but that would make it more difficult for users to do simple things like update an existing field. It could return a MutableMap<String, MutableList<IonElement>>
, but that's awkward to have to deal with the list of values. The MutableStructFields
is just a wrapper around a mutable collection that makes it easier to perform useful operations on the fields.
I was also thinking, that since StructElement and SeqElement already use PersistentMap and PersistentList we should be able to expose those to leverage copy-on-write semantics, again as long as multiple updates only create a single copy. I just wasn't sure how receptive you would be to changing the internals too much.
I'm open to changing the internals, but I'm hesitant to expose the persistent collections because that has the potential to limit our ability to change implementation details in the future. We would want to be really sure before we commit to changing the persistent collections from being an implementation detail to being part of the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a while, but after coming back, I really like the mutableFields
approach. The interface exposes the data in a writeable way and doesn't leak the internals. Is someone currently working on implementing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, no one is implementing it right now. Since it's a fairly thin layer built on top of ion-java
, generally ion-element-kotlin
does not get feature development unless someone requests/contributes a feature.
Edit—just to be clear, ion-element-kotlin
does benefit from many of the features and improvements in ion-java
, and in no way am I suggesting that ion-element-kotlin
is unmaintained.
@@ -362,6 +362,8 @@ public interface SeqElement : ContainerElement { | |||
override fun withMetas(additionalMetas: MetaContainer): SeqElement | |||
override fun withMeta(key: String, value: Any): SeqElement | |||
override fun withoutMetas(): SeqElement | |||
|
|||
public fun withValueAtIndex(index: Int, value: IonElement): SeqElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are your thoughts on just adding an overload of copy()
instead of this function? It would be slightly more verbose, but it means that you get all of the operations of MutableList
without having to rewrite them for ListElement
.
E.g.:
public fun copy(
values: Iterable<IonElement>,
annotations: List<String> = this.annotations,
metas: MetaContainer = this.metas
)
Usage:
val newList = myList.copy(myList.valuesAsMutable.apply {
set(1, myOtherElement)
removeAt(0)
})
Or in Java:
List<AnyElement> values = myList.getValuesAsMutable();
values.set(1, myOtherElement);
values.removeAt(0);
ListElement newList = myList.copy(values);
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would prefer to move this behavior to copy.
@@ -450,4 +452,7 @@ public interface StructElement : ContainerElement { | |||
override fun withMetas(additionalMetas: MetaContainer): StructElement | |||
override fun withMeta(key: String, value: Any): StructElement | |||
override fun withoutMetas(): StructElement | |||
|
|||
/** Copy the struct, replacing all occurrences of fields with names that match the provided fields **/ | |||
public fun withFields(vararg fields: StructField): StructElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Would you consider overloading the copy()
function?
Closing this since it was mostly superseded by #84. If you still want to contribute path-based accessors, you can re-open this or (probably easier) start a new PR. |
Issue #, if available:
Description of changes: This adds a builder DSL for copying StructElements while changing / adding new fields. Dropping fields is currently not implemented, but can be added. See
test/com/amazon/ionelement/demos/kotlin/StructUpdateDemo.kt
for example usage.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.