Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

serialization of expr in parameter #2573

Closed
mattijn opened this issue Mar 24, 2022 · 13 comments
Closed

serialization of expr in parameter #2573

mattijn opened this issue Mar 24, 2022 · 13 comments
Labels

Comments

@mattijn
Copy link
Contributor

mattijn commented Mar 24, 2022

As is mentioned here: #2528 (comment). I've the feeling when an expr is defined within a parameter it is serialized into a debatable VL-spec.

Current behavior also leads to not needing the add_parameter, when an expr is defined, as mentioned here: #2528 (comment).

BTW, in VL one can write: "thetaOffset": {"expr": "-PI/length(data('MY_DATA'))"}, but in Altair one can not write alt.expr(-PI/length(data('source_wind'))).

Open the Chart in the Vega Editor

@ChristopherDavisUCI
Copy link
Contributor

Hi @mattijn, in the lines you identified in the other PR, if we add an elif condition so they become

            if self.param.expr is Undefined:
                return {"expr": self.name}
            elif isinstance(self.param.expr, str):
                return {"expr": self.param.expr}
            else:
                return {"expr": repr(self.param.expr)}

that seems to fix the issue. If I understand correctly, the main issue was repr turning ' into \', but I haven't looked super carefully. I ran the tests and didn't get any errors, but I should look around and see if it broke any of the existing parameter functionality.

This change produces

"thetaOffset": {
      "expr": "-PI/length(data('source_wind'))"
    }

in the serialization. Are you confident that the following would be preferred?

"thetaOffset": {
      "expr": "parameter027"
    }

@ChristopherDavisUCI
Copy link
Contributor

I think my question about which "expr" we want is very related to your "we don't need to add the parameter" comment. Of course if we use

"thetaOffset": {
      "expr": "parameter027"
    }

then the add_parameter will be required.

@mattijn
Copy link
Contributor Author

mattijn commented Mar 25, 2022

I think I prefer:

theta_shift_as_par = alt.parameter(expr=(-PI/length(data('source_wind'))))
theta_shift_as_expr = alt.expr(-PI/length(data('source_wind')))  # not yet possible

chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=theta_shift_as_par  # serialize to parameter name
    thetaOffset=theta_shift_as_expr  # serialize directly to expr
).encode(
    theta='winddirection:N',
    color='label:N'
).add_parameter(theta_shift_as_par)
chart#.to_dict()

@mattijn
Copy link
Contributor Author

mattijn commented Mar 25, 2022

The repr issue will solve when entering the expression as string, but I since then found out I should import the relevant expressions from from altair.expr import X, Y, Z and use these to build up the expressions.

@ChristopherDavisUCI
Copy link
Contributor

Could you post an example of how you make the above parameter using from altair.expr import X, Y, Z? My memory is a little fuzzy, but I remember trying to circumvent altair.expr as much as possible.

@mattijn
Copy link
Contributor Author

mattijn commented Mar 25, 2022

import altair as alt
from altair.expr import PI, length, data

data_wind = [
    {'winddirection': 0, 'label': 'North'},
    {'winddirection': 90, 'label': 'East'},
    {'winddirection': 180, 'label': 'South'},
    {'winddirection': 270, 'label': 'West'},
]

source_wind = alt.DataSource(alt.InlineData(values=data_wind, name='source_wind'))

theta_shift_as_par = alt.parameter(expr=-PI/length(data('source_wind')))
#theta_shift_as_expr = alt.expr(-PI/length(data('source_wind')))

chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=theta_shift_as_par  # should serialize to parameter name?
    #thetaOffset=theta_shift_as_expr  # should serialize directly to expr?
).encode(
    theta='winddirection:N',
    color='label:N'
)#.add_parameter(theta_shift_as_par)  # when wanted as parameter name
chart.to_dict()
{'$schema': 'https://vega.github.io/schema/vega-lite/v5.2.0.json',
 'config': {'view': {'continuousHeight': 300, 'continuousWidth': 400}},
 'data': {'name': 'source_wind',
  'values': [{'label': 'North', 'winddirection': 0},
   {'label': 'East', 'winddirection': 90},
   {'label': 'South', 'winddirection': 180},
   {'label': 'West', 'winddirection': 270}]},
 'encoding': {'color': {'field': 'label', 'type': 'nominal'},
  'theta': {'field': 'winddirection', 'type': 'nominal'}},
 'mark': {'thetaOffset': {'expr': "((-PI) / length(data('source_wind')))"},
  'tooltip': True,
  'type': 'arc'}}

https://colab.research.google.com/drive/1-jFoG2BOYVUnMzAhKhaLQLTU6dFpeDWI?usp=sharing

@ChristopherDavisUCI
Copy link
Contributor

@jakevdp do you agree with @mattijn's suggestion, that theta_shift_as_par (as defined below) should serialize using the parameter name, as in "expr": "parameter027", and that theta_shift_as_expr should serialize so that the formula gets spelled out, as in "expr": "-PI/length(data('source_wind'))"? If that sounds good to you, I will see how much of that I can implement and will report where I get stuck.

import altair as alt
from altair.expr import PI, length, data

theta_shift_as_par = alt.parameter(expr=(-PI/length(data('source_wind'))))
theta_shift_as_expr = alt.expr(-PI/length(data('source_wind')))  # not yet possible

chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=theta_shift_as_par  # serialize to parameter name
    thetaOffset=theta_shift_as_expr  # serialize directly to expr
).encode(
    theta='winddirection:N',
    color='label:N'
).add_parameter(theta_shift_as_par)
chart#.to_dict()

@mattijn
Copy link
Contributor Author

mattijn commented Apr 1, 2022

Maybe @domoritz can shed a light on this? What is the proper way to adopt for altair? Will expr continue to co-exist next to param in VL grammar and should it be treated separately?

@domoritz
Copy link
Member

domoritz commented Apr 1, 2022

I'm not following this whole conversation but here is my take (let me know if you want to to elaborate on anything). We plan to support and keep supporting expr in many places like

{
  "params": [
    { "name": "cornerRadius", "value": 0,
      "bind": {"input": "range", "min": 0, "max": 50, "step": 1} }
  ],
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
      {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
      {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
    ]
  },
  "mark": {
    "type": "bar",
    "cornerRadius": {"expr": "cornerRadius"}
  },
  "encoding": {
    "x": {"field": "a", "type": "nominal", "axis": {"labelAngle": 0}},
    "y": {"field": "b", "type": "quantitative"}
  }
}

expr supports expression and they can refer to params but usually you would use a param that is a simple value. A param may be a composed object in the implementation (e.g. selection params) so we do support param in some places where we can automatically resolve how they should be have (e.g. filter a dataset by the interval defined in a selection param).

{
  "data": {"url": "data/cars.json"},
  "vconcat": [{
    "params": [{"name": "brush", "select": "interval"}],
    "mark": "point",
    "encoding": {
      "x": {"field": "Horsepower", "type": "quantitative"},
      "y": {"field": "Miles_per_Gallon", "type": "quantitative"}
    }
  }, {
    "transform": [
      {"filter": {"param": "brush"}}
    ],
    "mark": "point",
    "encoding": {
      "x": {
        "field": "Acceleration", "type": "quantitative",
        "scale": {"domain": [0,25]}
      },
      "y": {
        "field": "Displacement", "type": "quantitative",
        "scale": {"domain": [0, 500]}
      }
    }
  }]
}

@ChristopherDavisUCI
Copy link
Contributor

I'm a little stuck with how to proceed @mattijn @jakevdp @joelostblom

Say we define a parameter like @mattijn did above

theta_shift = alt.parameter(expr=(-PI/length(data('source_wind'))))

There is currently no real difference between that object in Altair and an object like 5+theta_shift (just an extra 5+ in the expr).

The good thing about the current setup is that 5+theta_shift can be referred to without explicitly making a new parameter. But @mattijn points out some unintended consequences, like for example theta_shift can itself be referred to in a chart without explicitly adding that parameter to the chart.

Should we make it so that 5+theta_shift is fundamentally different from theta_shift, so for example 5+theta_shift can only be used in a chart if theta_shift has been explicitly added to the chart? It could either be a new object or there could be some Parameter attribute that would differentiate these cases.

I am also tempted by the two-line temporary solution that I mentioned above #2573 (comment) which would get @mattijn's original chart working (and doesn't seem to break anything that currently works).

I'm happy for any suggestions!

@mattijn
Copy link
Contributor Author

mattijn commented Apr 3, 2022

Thanks @domoritz, its a bit of a semantic discussion. The parts you mentioned are clear. The issue here is about the exprs. These can be defined:

  1. within a scope of a parameter or
  2. outside a parameter.

I think if 1 is the case it always should be referred by the parameter name (not happening at the moment in altair). For 2, if exprs are allowed to be defined outside the scope of a parameter, then altair should introduce something for this.

@ChristopherDavisUCI
Copy link
Contributor

Hi @mattijn I made an attempt to resolve these issues in #2591. Please take a look and let me know any comments!

@mattijn
Copy link
Contributor Author

mattijn commented Apr 16, 2022

I mentioned before that this was not possible:

expr_ref_type = alt.expr(-PI/length(data('source_wind')))

But I just found out that within VL this is called an ExprRef, so this is possible:

expr_ref_type = alt.ExprRef(-PI/length(data('source_wind')))
expr_ref_type
ExprRef({
  expr: ((-PI) / length(data('source_wind')))
})

So one can currently use an expression in-line, without defining an expression within a parameter as such:

import altair as alt
from altair.expr import PI, length, data

data_wind = [
    {'winddirection': 0, 'label': 'North'},
    {'winddirection': 90, 'label': 'East'},
    {'winddirection': 180, 'label': 'South'},
    {'winddirection': 270, 'label': 'West'},
]

source_wind = alt.DataSource(alt.InlineData(values=data_wind, name='source_wind'))

expr_ref_str = alt.ExprRef("-PI/length(data('source_wind'))")
expr_ref_type = alt.ExprRef(-PI/length(data('source_wind')))

chart = alt.Chart(source_wind).mark_arc(
    tooltip=True,
    thetaOffset=expr_ref_type  # <--
).encode(
    theta='winddirection:N',
    color='label:N'
)
chart.to_dict()
{'$schema': 'https://vega.github.io/schema/vega-lite/v5.2.0.json',
 'config': {'view': {'continuousHeight': 300, 'continuousWidth': 400}},
 'data': {'name': 'source_wind',
  'values': [{'label': 'North', 'winddirection': 0},
   {'label': 'East', 'winddirection': 90},
   {'label': 'South', 'winddirection': 180},
   {'label': 'West', 'winddirection': 270}]},
 'encoding': {'color': {'field': 'label', 'type': 'nominal'},
  'theta': {'field': 'winddirection', 'type': 'nominal'}},
 'mark': {'thetaOffset': {'expr': "((-PI) / length(data('source_wind')))"},  # <--
  'tooltip': True,
  'type': 'arc'}}

But alt.ExprRef() does not support operand functions (one cannot do expr_ref_str + expr_ref_str).

Not sure if we can or should introduce a shortcut to create an ExprRef through alt.expr(). I like it, but alt.expr is a module and I am not sure if we can follow a similar strategy as alt.datum() (as defined here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants