Design Proposal: Allow creating arrays with expressions #950

Open
sargunv opened this issue Dec 20, 2024 · 22 comments

sargunv commented Dec 20, 2024

Design Proposal: Allow creating arrays with expressions

Motivation

A number of layer properties take arrays as their values. For example, see text-offset and text-variable-anchor-offset. The only ways to produce an array in the existing expression system are to get it from a feature with get or to create it with literal. Neither of these allows calculating the elements of the array with math expressions, though you can select between different literal arrays or properties with conditional match / case expressions.

For arrays of just numbers (let's call them vectors), you can work around this for certain use cases by interpolating between multiple vectors and doing your math on the interpolation input. MapLibre Compose uses this workaround to simulate multiplying the magnitude of a text-offset (to convert between SP and EM units).
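To make the workaround concrete, here is a rough sketch (not MapLibre Compose's actual code). It builds an interpolate expression whose stops run from the zero vector to the base vector multiplied by an assumed upper bound, so that feeding the scale factor in as the interpolation input yields the scaled vector. The helper names and `MAX_SCALE` are assumptions for illustration, and `lerp_vectors` only imitates what a linear interpolation between two vector stops would compute.

```python
# Toy model of the workaround: scale a constant vector without array math
# by linearly interpolating between two literal vectors.
# MAX_SCALE and both helper names are illustrative assumptions.

MAX_SCALE = 100.0  # assumed upper bound on the scale factor


def scaled_offset_expression(base, scale_expr):
    """Build a style expression intended to evaluate to scale * base."""
    far = [component * MAX_SCALE for component in base]
    return [
        "interpolate", ["linear"], scale_expr,
        0, ["literal", [0.0] * len(base)],
        MAX_SCALE, ["literal", far],
    ]


def lerp_vectors(t, t0, v0, t1, v1):
    """What a linear interpolation between two vector stops computes."""
    ratio = (t - t0) / (t1 - t0)
    return [a + (b - a) * ratio for a, b in zip(v0, v1)]


expr = scaled_offset_expression([1.0, 2.0], ["get", "scale"])
# Simulate evaluation for a feature whose "scale" property is 3.
result = lerp_vectors(3.0, expr[3], expr[4][1], expr[5], expr[6][1])
print(result)
```

The sketch only holds while the input stays within the stop range, and it only works because every element is a number, which is why it does not carry over to mixed-type arrays.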

For arrays containing other types, not even that workaround is available (example: text-variable-anchor-offset). MapLibre Compose was unable to work around this.

Some use cases that could benefit from expressions in array elements:

Proposed Change

I propose a new function in the style spec, [], that takes any number of expression arguments and returns an array of all those arguments. The syntax would look like:

{
  "text-offset": ["[]", ["get", "label_x"], ["get", "label_y"]]
}

This is in contrast to literal, which does not evaluate its arguments as expressions.
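The difference can be sketched with a toy evaluator. This is an illustration of the intended semantics only, not code from any MapLibre implementation; it models just `literal`, `get`, `*`, and the proposed `[]`, and the feature properties are made up.

```python
# Toy evaluator for a tiny subset of style expressions. The "[]" operator
# is the one proposed in this issue; it does not exist in the spec today.

def evaluate(expr, properties):
    """Evaluate expr against one feature's properties."""
    if not isinstance(expr, list):
        return expr  # numbers, strings, booleans evaluate to themselves
    op, *args = expr
    if op == "literal":
        return args[0]  # returned verbatim, nothing inside is evaluated
    if op == "get":
        return properties[args[0]]
    if op == "*":
        product = 1
        for arg in args:
            product *= evaluate(arg, properties)
        return product
    if op == "[]":
        # Proposed: evaluate every argument, then collect the results.
        return [evaluate(arg, properties) for arg in args]
    raise ValueError(f"unknown operator: {op!r}")


feature = {"label_x": 1.5, "label_y": -2}

proposed = evaluate(
    ["[]", ["get", "label_x"], ["*", 2, ["get", "label_y"]]], feature
)
quoted = evaluate(
    ["literal", [["get", "label_x"], ["get", "label_y"]]], feature
)
print(proposed)
print(quoted)
```

With this sketch, `proposed` is an array of two computed numbers, while `quoted` is still the nested arrays of strings exactly as written.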

API Modifications

Just the new function []. Also perhaps a similar function {} for objects.

Migration Plan and Compatibility

No migration needed for the main proposal; this is new functionality.

Some alternatives in the "Rejected Alternatives" section below are backwards incompatible and would need a migration if they're selected.

Rejected Alternatives

  • Update literal to evaluate its elements as expressions: this would not be backwards compatible in cases of nested arrays. The only place in the style spec that uses nested arrays is text-variable-anchor-offset, whose value looks like ["top", [1, 2], "bottom", [3, 4]]. If it weren't for that backwards incompatibility, I'd prefer this alternative.
  • Add an optional boolean parameter to literal to evaluate its elements as expressions: I think this is confusing to the reader, as it's not immediately apparent what true or false mean when reading a style spec. Also, my impression (haven't read the code so perhaps wrong) is that literal is part of the expression parser, and not part of expression evaluation like other functions.
  • Just evaluate elements of any array that doesn't start with a string: If you want the first element of such an array to be a literal string, you'd have to escape it with something like concat. Feels inconsistent.
  • Double brackets syntax like [[ ["get", "x"], ["get", "y"] ]]: Syntax is ambiguous in case of single element arrays where the first element is an array of strings (looks like a function call). Would need some sort of escape like the above alternative.
  • New function as proposed, but alternative names: I considered various alternative names before landing on []:
    • Looking for existing functions in the spec that take some input and return a new object, we have collator that creates a Collator, format that creates a Formatted, and image that creates a ResolvedImage. So, the natural convention might be array to create an Array, but that function name is already taken for a function to assert that the input is an array (similar to string, number, boolean, and object).
    • We have to-color, to-string, etc, which all take a single value and convert it to the target type (some with fallbacks). The name to-array is available, but the convention doesn’t quite match what we want.
    • rgb and rgba are closest in concept to what we want (evaluate some arguments as expressions, construct a result object from it) but of course that naming convention doesn't have an obvious analogue for arbitrary arrays.
    • So, I think a new naming convention is necessary. Alternative: new-array is pretty clear in intent, but could get visually noisy in case of nested arrays. I don't really dislike this, just like [] a bit better as it would take less of the reader's focus away from the content of the array.
    • So, I landed on [].

EDIT: additional alternatives as considered in the issue thread

  • New functions point and padding specifically for those types: Solves the issue for some properties (text-offset) but not others (text-variable-anchor-offset, line-dasharray, etc). I think it's worthwhile to have those but we'd still need [] in addition to that.
Author

sargunv commented Dec 20, 2024

Relevant years-old mapbox-gl-js feature request: mapbox/mapbox-gl-js#6155, with folks still asking for this functionality to this day.

Contributor

1ec5 commented Dec 20, 2024

A number of layer properties take arrays as their values.

rgb and rgba are closest in concept to what we want (evaluate some arguments as expressions, construct a result object from it) but of course that naming convention doesn't have an obvious analogue for arbitrary arrays.

This touches on a more fundamental issue. The open-ended arrays are reasonably idiomatic for these properties in JavaScript and JSON, but when translating the runtime styling API to Objective-C/Swift, it became necessary to represent these same properties as something more strongly typed, typically a type that’s built into the standard library, such as UIEdgeInsets or CGPoint. Similarly, the within expression’s argument is represented by a GeoJSON model object in gl-native instead of an anonymous JSON object.

The JSON format takes this same approach with a more limited set of types, such as colors. Maybe there need to be more type constructing operators like offset with corresponding type converting operators like to-offset, to strengthen what have been weakly typed properties up to now. Only a few properties are unavoidably plain arrays, like text-font and line-dasharray.

More facilities for working with arrays will be useful for all the platforms in these cases regardless, but I suspect the NSExpression representation on iOS would need to be more lenient about arrays versus these more specific types.

Author

sargunv commented Dec 20, 2024

Similarly, MapLibre Compose uses the Compose Color, Offset, PaddingValues.Absolute types to represent those values under the hood. On the API surface, MapLibre Compose has additional strongly typed values (Dp, TextUnit, Duration, DpOffset, etc).

If we had, say, point and padding functions to construct those values, like we have rgb and rgba for colors, that'd solve this issue for some properties (text-offset) but not others (text-variable-anchor-offset). Still, I think that'd be a reasonable addition, especially paired with this proposal.

Contributor

1ec5 commented Dec 21, 2024

Based on the proposed operator names [] and {}, what would the corresponding functions be in the Kotlin and Swift expression DSLs, which can’t contain punctuation? If we come up with a valid identifier for the operators on those platforms, then the JSON format might as well use those same identifiers instead of punctuation.

An observation: there’s only one literal operator to wrap either an array or an object as a literal, so there only needs to be one operator to wrap either an array or an object as a nonliteral. Could we call it nonliteral? (Or figurative? 🤓)

Author

sargunv commented Dec 21, 2024

If you use one function name for both, how would you determine the output format? Objects need keys and values, while an array just needs a sequence of elements. Two signatures implies two functions. Also curious if you're aware of any use cases for objects here; I'm only aware of array use cases.

Literal wraps a single object or array, and doesn't allow function calls inside. That's the reason it can support both input types with one name.

I don't have a strong opinion on the Java/Kotlin/ObjC/Swift name, but there's already precedent for other expression functions with symbols as names. On Java/Kotlin I imagine it would be a function taking vararg Expression, perhaps Expression.newArray reflecting Java conventions. If y'all prefer an identical name between the JSON syntax and the native SDKs, that's fine with me (new-array?).

Considering naming, I think the constraints on the different platforms are different (ability to define reusable functions, ability to add overloads, etc.), so I feel that naming things exactly the same on all platforms is an anti-goal. Happy to defer on that though; I really only care about the functionality.

On the MapLibre Compose side, we already use different, Kotlin-idiomatic names for all the expressions. We have type-safe functions to create various expression types and would probably add overloads accepting expressions of particular types instead of the corresponding primitives we currently accept.

Contributor

1ec5 commented Dec 21, 2024

That all sounds reasonable to me. I was just spitballing about a way to avoid the punctuation if desired, but I agree that there’s no hard constraint about keeping the operator name consistent across platforms. (Otherwise, I would be more guilty than anyone for having violated that constraint over the years.)

Wrapping both arrays and objects with the same operator probably won’t work for the reason you pointed out.

There might be a use case for wrapping an object. For example, if the tiles have a feature property that indicates the language code of the text in another feature property, such as name and name_language, then a collator could be based on that language instead of the default or a hard-coded language. Similarly, a formatted expression’s attributes object could specify the font size of one part of the formatted text based on a feature property. However, I don’t think this has to be within scope for your proposal unless you think it would make the proposal more elegant.

Author

sargunv commented Dec 21, 2024

I think if this proposal is accepted, and if [] is a reasonable name, then {} would be a fine name for the object version.

I'd be happy to make a separate proposal to discuss the parameter scheme for the {} function (but personally I'm not comfortable writing that one until I've dug around in the code enough to implement []).

Collaborator

HarelM commented Dec 21, 2024

Can you better clarify why literal can't be used? How will it break existing styles? I would say that literal was designed to solve problems like what is mentioned in this proposal as far as I understand and creating a new operator will create confusion...

Author

sargunv commented Dec 21, 2024

literal treats nested sub arrays as literal arrays, but to evaluate expressions we'd need to treat them as function calls.

Consider ["literal", [["get", "label_x"], ["get", "label_y"]]]. With existing behavior, that's a literal array of two nested literal arrays of strings. If literal evaluates expressions, that'll become a literal array of two function calls.

Sure, there's probably not a style with nested literal arrays of strings, because nothing in the style spec requires that. But consider "text-variable-anchor-offset": ["literal", ["top", [0,0]]]. Under existing behavior, that's a literal array of a string and a nested literal array (point). Under the new behavior, that'd be a literal array of a string and an invalid function call.
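The incompatibility can be illustrated with two toy readings of literal. Both functions and the `KNOWN_OPERATORS` set are assumptions made for this sketch; neither is taken from a MapLibre parser.

```python
# Two toy readings of "literal". KNOWN_OPERATORS and all three functions
# are illustrative assumptions, not MapLibre's parser.

KNOWN_OPERATORS = {"get", "*"}


def literal_today(value):
    """Current behavior: the argument is returned untouched."""
    return value


def call(expr, properties):
    """Treat an array as a function call on already-evaluated arguments."""
    op, *args = expr
    if op not in KNOWN_OPERATORS:
        raise ValueError(f"not an operator: {op!r}")
    if op == "get":
        return properties[args[0]]
    product = 1  # the only remaining operator is "*"
    for arg in args:
        product *= arg
    return product


def literal_evaluating(value, properties):
    """Hypothetical behavior: every nested array is a function call."""
    if not isinstance(value, list):
        return value
    return [
        call(item, properties) if isinstance(item, list) else item
        for item in value
    ]


anchor_offsets = ["top", [0, 0]]

print(literal_today(anchor_offsets))
try:
    literal_evaluating(anchor_offsets, {})
except ValueError as error:
    print("rejected:", error)
```

Under the first reading the value passes through as a string plus a point. Under the second, the point `[0, 0]` is parsed as a call to an operator named `0` and is rejected.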

Interestingly, I discovered recently MapLibre Native rejects nested arrays inside literal via some code paths but not others. So my impression is perhaps literal was meant to evaluate expressions, it was never actually implemented, but now it evaluates sub arrays as literal arrays, which even though it's not fully implemented, it's implemented enough that it'd break currently valid styles if literal begins to evaluate expressions inside.

Personally, I think it's valid to have both (Lisp-like languages often do), but once [] is added, the docs/examples should be updated to use [] over literal anywhere that nested literal behavior isn't explicitly used, because evaluating expressions inside is more consistent with the rest of the language.

Author

sargunv commented Dec 21, 2024

See: quote and quasiquote in Lisp: https://courses.cs.washington.edu/courses/cse341/04wi/lectures/14-scheme-quote.html

Wrapping both arrays and objects with the same operator probably won’t work for the reason you pointed out.

Actually, after sleeping on it, I think it's possible if the syntax is like literal (one parameter, either object or array) with the only difference being arrays in elements/values are evaluated as expressions. If we take that approach, I'd recommend semiliteral as the name, consistent with Lisp (even though it's confusing to me, without a Lisp background).

Not 100% sure how that'd work with the parser, but I can dig into it. If maintainers agree that something with this functionality is likely desirable, I can start hacking on a PR to experiment with the implementation.
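A minimal sketch of how such a single-argument operator might behave, assuming the container itself is taken literally and only arrays found directly as elements or values are evaluated. The operator name follows the suggestion above; the evaluator, the feature properties, and the `locale` / `case-sensitive` keys in the example are illustrative assumptions, and a real parser integration may look quite different.

```python
# Sketch of the single-argument "semiliteral" idea: the container (array or
# object) is taken literally, but any array found as an element or value is
# evaluated as an expression. Only a few operators are modeled here.

def evaluate(expr, properties):
    if not isinstance(expr, list):
        return expr  # scalars evaluate to themselves
    op, *args = expr
    if op == "get":
        return properties[args[0]]
    if op == "+":
        return sum(evaluate(arg, properties) for arg in args)
    if op == "literal":
        return args[0]
    if op == "semiliteral":
        return semiliteral(args[0], properties)
    raise ValueError(f"unknown operator: {op!r}")


def semiliteral(container, properties):
    if isinstance(container, dict):
        return {key: evaluate(value, properties)
                for key, value in container.items()}
    if isinstance(container, list):
        return [evaluate(item, properties) for item in container]
    raise TypeError("semiliteral expects an array or an object")


feature = {"dx": 2, "lang": "fr"}

offsets = evaluate(
    ["semiliteral", [
        "top", ["semiliteral", [["get", "dx"], ["+", 1, 1]]],
        "bottom", ["literal", [3, 4]],
    ]],
    feature,
)
options = evaluate(
    ["semiliteral", {"locale": ["get", "lang"], "case-sensitive": True}],
    feature,
)
print(offsets)
print(options)
```

One consequence visible in the sketch: a nested data array such as a point has to be wrapped in its own literal or semiliteral, since a bare sub array would be read as a function call.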

Contributor

1ec5 commented Dec 21, 2024

Interestingly, I discovered recently MapLibre Native rejects nested arrays inside literal via some code paths but not others. So my impression is perhaps literal was meant to evaluate expressions, it was never actually implemented, but now it evaluates sub arrays as literal arrays, which even though it's not fully implemented, it's implemented enough that it'd break currently valid styles if literal begins to evaluate expressions inside.

The original reason for literal is that values of some properties like text-font and icon-image needed to be evaluated statically in order to know resource requirements upfront (think offline maps), and text-font specifically needed an array to represent a fontstack: mapbox/mapbox-gl-js#5393.

Author

sargunv commented Dec 21, 2024

That's super interesting context. How did other array properties work back before literal (did they exist)? Was get the only way to get array outputs back then?

But yeah, the code that does static analysis of possible outputs for those properties would need to be updated to support the proposed [] or semiliteral. I don't think it's necessarily incompatible with such static analysis (determine all possible outputs of each element, calculate permutations if order matters) but does make this less trivial.
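As a rough sketch of that idea: if the analysis can determine a finite set of possible outputs for each element, the possible arrays are the Cartesian product of those sets. How the per-element sets would be derived is assumed here and left out of scope, and the font names are only examples.

```python
# Sketch: enumerate every array an element-wise expression could produce,
# given the finite set of values each element might take. Deriving those
# per-element sets from real expressions is assumed, not shown.
from itertools import product


def possible_arrays(element_outputs):
    """Return every combination, one value per element, order preserved."""
    return [list(combo) for combo in product(*element_outputs)]


# Hypothetical text-font built from two elements: the first can be one of
# two fonts (say, chosen by a match), the second is a constant fallback.
element_outputs = [
    ["Noto Sans Regular", "Noto Sans Bold"],
    ["Arial Unicode MS Regular"],
]
fontstacks = possible_arrays(element_outputs)
for stack in fontstacks:
    print(stack)
```

This only works when every element has a bounded set of outputs; an element computed from an arbitrary feature property would not, so such properties would presumably still need restrictions.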

Contributor

1ec5 commented Dec 21, 2024

That's super interesting context. How did other array properties work back before literal (did they exist)? Was get the only way to get array outputs back then?

This would’ve been around the time that expressions were being implemented in the first place. In the previous style function syntax, which didn’t have arbitrary arrays, properties like text-font weren’t data-driven; they could only be hard-coded or possibly tied to a zoom function.

Collaborator

HarelM commented Dec 22, 2024

It's still not clear to me why literal isn't good enough. The following example is using a "reserved word" - get, so there's a need to evaluate the content before assigning it to an array:

Consider ["literal", [["get", "label_x"], ["get", "label_y"]]]. With existing behavior, that's a literal array of two nested literal arrays of strings. If literal evaluates expressions, that'll become a literal array of two function calls.

I'm still confused why there's a need to create a new operator (not to mention the confusion other users will have when they'd need to understand which one to use). As someone who reads these expressions I'm usually able to understand what the intention was, I would expect the parser to be able to do the same...

Author

sargunv commented Dec 22, 2024

Because array syntax is indistinguishable from function syntax, a choice needs to be made on which to assume. And literal already made that choice one way; changing it to the other way would be a breaking change.

In my use case, I want to do some math to calculate the x and y components of an offset. That requires function calls inside an array.

Function calls are themselves represented by arrays in the JSON syntax. So an array with x and y calculated by functions means an array with sub arrays.

The existing behavior of literal is to treat sub arrays themselves as literal arrays, not function calls. We could change that, but existing styles with sub arrays inside literal would be parsed differently.

Or, an example:

["literal", [["*", 1, 2], ["*", 1, 2]]]

Does that output an array of two numbers calculated by multiplication? Or does that output an array of two sub arrays, each with a string and two numbers?

Looking at it as a human, I can tell the intent of the writer is the first. But looking at it as a parser, the existing behavior today is the second.

If maintainers are cool with a breaking change, I can make that PR. But I assume we'd prefer to avoid a breaking change here.

Collaborator

HarelM commented Dec 22, 2024

I think literal behavior is broken and we should fix it.
There are "reserved words" for operators, and it makes sense to respect them, like "*" in the above example.
If there is a literal that doesn't work right now and style authors rely on this behavior, that's a problem, but I'm good with dealing with it like we did for geometry-type - releasing this change as part of a breaking change version.
If there is an example you can show that isn't as clear as the above (where it's not clear what to expect as a human not as the current buggy parser) I'll be happy to consider a new operator.

Contributor

1ec5 commented Dec 22, 2024

If I set text-font to ["literal", ["get", "font"]], should MapLibre fetch the fontstack literally named “get,font”, or should it fetch the fontstack specified by font (an array-typed feature property), treating “get” as a reserved keyword rather than a font name? If the latter, what is the point of the literal operator anyways?

Author

sargunv commented Dec 22, 2024

If we change parsing to treat arrays starting with "reserved words" differently from other arrays, and every function name is a reserved word, then every new function added to the spec becomes a breaking change. If that's acceptable I'm happy to PR it, but I'd advise against it.
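A small sketch of why that would be the case. The `classify` helper, the operator sets, and the `upper` operator are all hypothetical; the point is only that the same style value is read differently depending on which operator names the parser knows.

```python
# Sketch: if "literal" treated nested arrays that start with a known
# operator name as calls, the meaning of an existing style would depend on
# the operator list of the spec version parsing it. The operator sets and
# the "upper" operator are hypothetical.

def classify(literal_argument, operators):
    """Label each nested array as a 'call' or as 'data'."""
    labels = []
    for item in literal_argument:
        is_call = (
            isinstance(item, list)
            and len(item) > 0
            and isinstance(item[0], str)
            and item[0] in operators
        )
        labels.append("call" if is_call else "data")
    return labels


operators_v1 = {"get", "*", "concat"}
operators_v2 = operators_v1 | {"upper"}  # a later spec adds a new operator

style_value = [["upper", "lower"], ["get", "name"]]

before = classify(style_value, operators_v1)
after = classify(style_value, operators_v2)
print(before)
print(after)
```

The first element is plain data under the older operator set and becomes a call once `upper` exists, so a style that was valid before the addition changes meaning without being edited.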

The literal operator is fundamentally the same thing as quote in Lisp languages, so solving it the same way (quasiquote operator) doesn't seem that bad to me. Call it semiliteral, call it []/{}, call it array-of/object-of, or whatever else.

Collaborator

HarelM commented Dec 22, 2024

I guess I lack the relevant knowledge to understand what I'm missing. Maybe someone else can approve this then...

Author

sargunv commented Dec 22, 2024

I filed a PR so we have something more concrete to hack on: #951

Naming is TBD (currently semiliteral), but this PR pretty closely reflects what I'm proposing.

Collaborator

HarelM commented Dec 23, 2024

I would suggest presenting this at the next monthly meeting and discussing the pros and cons of each approach.

Author

sargunv commented Dec 24, 2024

I usually have a conflict at the time of the monthly meeting, so can't usually attend. Will see if I can rearrange things on the day of the next meeting.
