-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better support for typed maps #917
Comments
Surely it's not a subtype if it's not substitutable for the supertype, that is if it can't be used as an argument to functions such as map:put()? The optimisation point is an interesting one. Javascript seems to manage OK, it's not noted for being slow. I'm very reluctant to introduce data model changes or extensions to the type system, as experience shows it takes years to work through all the implications. Now, if we could introduce new kinds of value without a change to the type system, that would be a different matter... But making the whole language fully object-oriented is just too big a step. I think that with named item types it should be easy enough to persuade users to declare the type of map/record that's expected at function boundaries, and that's sufficient (a) to catch user errors early, and (b) to allow the system to choose a map representation that's efficient for the type of access being performed. A record constructor as an alternative to a map constructor might well make life a little bit easier both for implementors and for users, and that can be done without a data model or type system change. |
I think of records as maps with frozen entries. I think everyone agrees that records are a big step forward, but it seems pretty erratic and unpredictable that you can destroy a record by a single update operation, without even noticing.
One advantage of JavaScript is that it has very basic data structures (…one byte is needed to represent a byte). Next, billions have been spent to make JS as fast as it is today… If I could, I would focus on getting our languages faster, possibly by the help of lightweight data structures, instead of adding more sugar. The raytracer code is a good example to demonstrate that all existing implementations of XQuery are… slow. Of course, one fundamental property of functional languages that affects performance is that each update is a copy operation; this won’t change as long as we don’t introduce mutable objects (which would lead to a completely new language).
I share your concerns. I wound hope that a record subtype would be much easier to specify than a completely new type for objects (otherwise, the latter would certainly be my favorite). It could also be the naivete of someone who hasn’t disclosed the gory details yet. |
Thinking around this, suppose we scrap "declare item type" in favour of "declare record type" on the grounds that (a) most of the use cases for named item types are actually for named record types, and (b) we already want to special-case record types by allowing them to be recursive. And then, Perhaps, along these lines, we should also have custom syntax for doing map:put in a way that constrains the type of the result. Something like
where
|
I think the fact that you have to use functions in the map namespace to achieve this gives a strong hint. But one of the aims of the "methods" proposal is to provide a more disciplined way of creating modified data in a type-safe way, as the "?resize()" example attempts to demonstrate. It may be worth revisiting issue #220 (Encapsulation) to see whether it can be made to work with records and methods.
Despite strenuous efforts, I haven't found a way to avoid copying node trees, because of the problem of node identity, but for maps and arrays, "immutable/persistent" data structures (I dislike both terms) mean that changes can easily be made without copying the parts of the structure that don't change. |
Yes, I believe that would be a great step in a good direction.
Another lightweight option could be a
Yes, it's impressive how efficient immutable data structure are (sometimes they even outperform mutable concepts). We’ll never beat direct memory updates; but of course, the goal has never been to compete with language like Rust, C or ASM. |
I withdraw Maybe the record constructor could take keyword arguments and interpret an optional single argument without keyword as the record to be updated: declare record geo:coord(
x as xs:double,
y as xs:double,
z as xs:double := 0,
final zero := fn() { $this?x = 0 and $this?y = 0 and $this?z = 0 }
);
let $pos := geo:coord(x := 3, y := 5)
return geo:coord($pos, z := 7)
In addition, a |
(Incidentally, I took a glance at the raytracer query, which I have never studied, and I suspect it could be made to go much faster using maps and arrays in place of node trees. Thats partly because of the cost of node construction and copying, and partly because it's doing a lot of string-to-double conversion.) |
Yes, it would be interesting to see a 3.1/4.0 update of the code. All I did was apply little changes that made it about 20% faster (in two processors that I tested, I can't recollect their names ;-). |
I am discovering this this fascinating discussion only now ... 😢 How can something that is immutable be "destroyed"? I have always regarded records as what in modern programming languages are interfaces. Thus, looking at records as a subtype of maps gives us nothing. It is synonymous of saying that everything is a map (object), which , again gives us nothing new - that we haven't already known about. On the other side, having what essentially is corresponding to interface in other languages gives us huge new capabilities. We can speak of an object implementing an interface (a map containing all fields of a record), and have the same duck typing as in Python. It is only coincidental that a record is defined as a restricted map (as in other languages, everything is a kind of object). |
@dnovatchev If you implement an interface, and if this interface defines a method, your programming language usually won't allow you to remove the implementation of that method from your object. With the current definition of records in our languages, that's easily possible. For example,
I think we should do our best to disallow such cases. By defining records as subtypes of maps, we could reject updates on maps that would violate the definition of the input record. The idea has similarities with the suggestion to define arrays as subtypes of maps (#298 (comment)): |
All items in XPath are immutable. It is not possible to "destroy", add or remove key-value pairs of a map instance. Trying to do so creates another map instance. The original record instance remains unchanged.
I don't see any need to do this, Records are Records, Maps are Maps. If we did really have such a problem in XPath, we would probably have proposed a const construct that marks a particular item as immutable - as many programming languages do. But because everything is immutable in XPath, we don't have this problem at all, this is why we don't need any const keyword.
I don't feel enthusiastic about this either. If Map is a synonym of Object, saying that arrays (specifically) are maps is like saying once again that everything is an object -- we already know this. Finally, even if we are in a typical OOP PL whose instances are not immutable, the whole idea of having a subtype that prohibits a method from the base class is wrong: This would violate the Liskov principle (the "L" in SOLID). See this said by Jon Skeet((here: https://stackoverflow.com/a/2779195/36305)) |
That’s obvious. We do have functions like
The problem as stated is introduced with records, as they are currently defined. It did not exist before 4.0, because it does not apply to maps, arrays or other types.
It seems to me you mix up several things here. No one wants to prohibit methods from a base class. Your SO link says:
My intent is exactly to prevent users from removing a function (in the resulting copy) unless it’s part of the record definition. Next, if we reject a copy if it does not match the definition of a record (as I’ve recommended above) would be a postcondition. Postconditions may be strengthened, but not weakened, according to Liskov (see https://en.wikipedia.org/wiki/Liskov_substitution_principle). With OOL, it’s fairly common to define stricter postconditions in overwritten methods. |
At he same Wikipedia article it is said: "Preconditions cannot be strengthened in the subtype." A precondition for the Map (base type) is that any set of key-values is allowed. But this is strengthened in a Record type, which requires only sets of key-values that include a specific subset of string-keys. Thus, any attempt to make Record to be a a subtype of Map violates the above rule, as it requires strengthening of the preconditions of the base class (Map). |
So what about the XDM type I’m not aware that pre-/post-conditions apply to the type; I associated them with methods/functions. |
I think we are diverging significantly from the main topic here.
Of course, if someone wants this badly enough, they could introduce (yet another!) namespace, and place only record-specific functions there. Say: rec:update(). Or, with the cost of added complexity, we can have a new overload of map:put that accepts a boolean parameter/argument named, say, preserve-record, and default value of false. Once again, as a user I don't feel any need either for creating such a separate namespace with record-specific functions or for creating a new overload of |
Maybe you as a user won’t, but other users will. Check out the examples in this and related GitHub issues, in which exactly this challenge is discussed (e.g. #720/#916: If you think the current definition of records is perfect as it is, that's just fine. If you don't, it would be helpful to hear some constructive feedback on what you would change. |
'record' is just a typename. A Typename is not a new type. It is a shorthand. If we start creating new types for some typenames, we would have a lot of (probably not pressingly necessary) new stuff, that would be inevitably biased (based on subjective preferences). And if we create new types, then why do we need typenames at all? I believe in step-by-step progress, and the first step here are the typenames. Based on sufficient, accumulated objective data of using the typenames after a considerable period from their introduction, then we could probably consider making the most popular typenames into separate types - if we do decide so, |
Exactly, that’s what it currently is. To follow on from the original considerations: Another solution to increase type-safety could be to use record constructors to create typed maps: The result will be a classical map, but with a record definition attached. Whenever an update operation is performed on a typed map (such as declare record geo:coord(
x as xs:double,
y as xs:double
);
let $pos := geo:coord(3, 5) … |
Ok, for a different perspective, which may or may not be helpful (feel free to ignore), I don't completely understand the original motivation but i do broadly understand examples (even if I'm inventing in my head what 'record' means). To me, this is more like a 'state monad' than a subtype. If you don't know what one is, they are a constraint on the type of the function signature 'put', and 'remove'. They are used mostly in pure(ish) languages to simulate imperative style semantics, which I think is sort of what you are getting at? i.e. in an imperative OO language, your 'remove' example breaks the contract because in OO all 'state changes' must be consistent the the contract on the type...i.e. a geo:coord is always a geo:coord, no matter what you do to it. In a pure language this isnt a problem except, sometimes its quite nice and convenient to have this sort of model (especially in a world of OO programmers), i.e. you want the constraint to try to give some structure to your code. So....say...."I want 'get" to be a state monad, and promise to return me a new geo:coord and the answer" i.e. in psuedo fp or in psuedo xpath
"remove" clearly can't promise to return a new geo:coord, so it can't have this signature. similarly "put" also doesnt unless you are overwriting x or y....(actually i dont know if this is a run time error, lets assume not). so
I'm not clear if this is helpful at all...I'm not suggesting making state monads part of XPath etc (though like the LINQ conversaton you could drive them from XQuery), I think most people would cry. but it may be worth applying in some way, it may better fit. |
My interpretation of this would be in terms of generic types: you want a signature of |
@MarkNicholls Thanks for the feedback. I invite you to check out the current XPath 4.0 draft to learn more about records, record types and record tests: https://qt4cg.org/specifications/xquery-40/xpath-40.html#id-record-test If a map
@michaelhkay I think the suggestion from my last comment would be the most promising one, as it’s much more light-weight than an independent object type, or a record subtype: We could introduce a record constructor, which creates a map that has the record type information attached. Whenever a new map is created from that map, we would simply need to ensure that the new map still matches this type. |
No, record is not a type - just a type name. |
It reminds me of C# and 'async'...which does something very very similar (though personally I'm not a fan...thats not the point). They wanted to secretly include an async monad (same pattern different application). https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/async there you annotate the signature of the function to tell the compiler its 'special'
and
the compiler then secretly injects the monadic code to do the magic they didnt want to infect the language with (probably because LINQ doesnt really inherently handle exceptions, and they wanted something bullet proof). how that translates into xpath i dont know, technically something higher order like (completely made up)
and some magic operator..
or explicitly
(i think, its been a while) scala does something 'clever' and overloads 'foreach' to deference monads (as C# does with 'for x in y' and LINQ, they chose to not do so in this case), the trick makes syntactical sense because evaluating sequences is also monadic. in which case
if you want, for example, to add async, maybe, stacks or random etc support, you just hijack the same syntax. |
While I can feel the "convenience" of this, it also poses unwanted restrictions. For example, one may want to construct a record "a piece at a time" and during putting each piece together the result will not be of the required type (yet). Or, one might want to incrementally construct one record structure from a record that has a different record-structure. With the restricted put() such freedom is difficult or prohibited. |
Feel free to fix this in the spec (you'll find multiple occurrences of “record type” in the text). |
Well, it is best that the original author(s) correct their text. |
I have trouble to understand the relationship: Do you refer to the existing record rules, or the suggestion to embed record definitions in a map? |
this reminds me of F# and their record syntax, you create a record
`let newFoo = { foo with name = 'jeremy' } but you arent allowed to 'extend' the type, i.e. this is a type error `let newBar = { foo with height = 155 } ` if you can map a record to a map, and a map to a record though you then CAN then go.
|
A possible way forward might be: A map MAY have a type annotation, which is an item type (typically a record type or map type). If a map M has a type annotation T, then A record constructor (details TBA) always constructs a map having a type annotation with the corresponding record type. In addition, maps with type annotations may be created using the constructs
for example or in XSLT
A further tweak: if a map has a type annotation with key type K, then the keys are effectively upcast to type K -- the map is no longer obliged to retain the actual subtype of each key value. (Retaining the subtype of each key value imposes an overhead which hardly ever has any value for users.) |
Possibly yes, if there should be agreement to do so. In this thread, I’ll tend to use the wording from the spec until we decide to rephrase it. |
Thanks; that's a perfect summary of what I think we should add. |
ah....thats because its not at all obvious....it was really as an example of an existing language extends the syntax of a function to get monadic behaviours.....and if you introduce the monadic behaviour of "state", to constrain the function types of your put/remove functions, then a similar trick would work, i.e. you annotate the function AND you annotate the call to the function to inject the necessary magic to post the (map/record) state through without the programmer seeing it. They could have written LINQ operators to do it...a more general magic, but they chose not to. Its a vague, hand wavy sort of analogy to how someone else has done it. |
what i find awkward about this is map:put for example is just a function, its not a method, so asking the function to fail for some parameters irks me....if it were a method and subservient to the map then that makes sense, it all feels DBC, but it isnt, so it feels weird. isnt it just easier to say, a) you can add remove things from maps then havent you got all you need (again, not fully comprehending the motivation), and it all seems quite neat and tidy. (or are you saying you want map:put etc to be a state monad? but then whats the type of map:put?...or is it secretly a state monad? can I, as a programmer write such a function?) |
Or, to put it clearly and precisely, we are adding type as an object. The details how this is done (using annotations or in another way) are irrelevant. Reminds me of a proposal I made quite some time ago ... Or, shall we say: Everything is a record, and a map is a record with 0 fixed keys... |
@MarkNicholls Note that records are nothing else than maps that match one or more given record types. As a result, functions like |
@michaelhkay I believe we should exclude map types. Otherwise, we would indeed introduce a backward incompatibility. |
It's not a backwards incompatibility if it only applies to maps that have a type annotation, since existing maps don't have one. |
I see. I first assumed that type annotations will also be assigned to a map implicitly, via coercion… let $x as map(xs:string, xs:string) := map { 'a': 'A' } …but if we only do this via the proposed |
and i think this is where i get concerned, I've looked at the spec, its not one of my strength reading specs. for me a map and a record are disjoint (like xs:integer and xs:double) and I personally would specify and implement them independently except for the ability to convert conveniently from one to the other and back. You can then define your functions to have signatures that preserve what you need to preserve in each case and exclude functions that make no sense in each case. Then i dont think any magic is required. I sense this is quite out of step with where the underlying proposal is, so I'll shut up. |
This would also be my preferred option (related: #916 (comment)). As you’ve already noted, it's probably out of scope if we want to finalize 4.0 in any foreseeable timeframe. |
I agree with your comment. |
This was a fascinating discussion, and I think it influenced our thinking in a number of areas, but I think it can be closed. If there are concrete changes coming out of it, they would be better handled in a new, more focussed issue. One thing that this discussion influenced was the Saxon implementation: see https://blog.saxonica.com/mike/2024/08/maps-and-records.html |
The CG agreed to close this issue without any further action at meeting 088. |
Edit (2023-01-04): See #917 (comment) for the most promising suggestion resulting from the discussion in this thread.
Inspired by #720 and concerns regarding usability and performance, it may be a big step, but couldn’t we define records as subtypes of maps?
The text was updated successfully, but these errors were encountered: