fix: correct references to flattened fields in selection signals. #5351

arvind · 2019-08-28T23:58:56Z

Fields are only flattened if they participate in encoding rules. Thus, we continue to use bracket notation to access datum values in signal expressions as we cannot be guaranteed that the flattened field will be present. However, the field name stored as part of a SelectionProjection will always use dot notation as these names are either returned to us flattened by model.vgField or users explicitly specify them in the input spec (in which case, dot notation is their only recourse). Thus, when referencing the field via a selection's top-level signal, we just directly access the flattened field name, rather than unpacking it via bracket notation.

Fixes #5334.

domoritz · 2019-08-29T00:12:00Z

Looks good. Are you adding flattening for fields that are only referenced in selections as well?

arvind · 2019-08-29T00:25:11Z

I'm not planning to primarily due to time constraints (with my limited time, I'd rather push forward on code paths I'm already familiar and this PR felt like a principled fix rather than a hack/workaround).

domoritz · 2019-08-29T00:27:50Z

Okay. Can you explain what you meant by "this PR felt like a principled fix rather than a hack/workaround"?

arvind · 2019-08-29T00:34:06Z

Sure -- I mean that the description/reason I gave when opening this PR feels principled to me in the sense that selection signals do not need to make any assumptions about what is or isn't happening to the datum. Instead, selections can remain agnostic/independent about whether any flattening has occurred and assemble purely based on what the user specifies (which matches what is returned by model.vgField).

domoritz · 2019-08-29T00:56:53Z

Just to clarify, if the user starts with this spec.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v3.json",
  "selection": {
    "grid": {
      "type": "multi", "fields": ["nested.b"]
    }
  },
  "data": {
    "values": [
      {"nested":{ "a": "1", "b": 28}},
      {"nested":{ "a": "2", "b": 55}},
      {"nested":{ "a": "3", "b": 43}},
      {"nested":{ "a": "4", "b": 91}},
      {"nested":{ "a": "5", "b": 81}},
      {"nested":{ "a": "6", "b": 53}},
      {"nested":{ "a": "7", "b": 19}},
      {"nested":{ "a": "8", "b": 87}},     
      {"nested":{ "a": "8", "b": 52}}
    ]
  },
  "mark": "point",
  "encoding": {
    "y": {
      "field": "nested.a",
      "type": "quantitative"
    },
    "x": {
      "field": "nested.a",
      "type": "quantitative"
    },
    "color": {
      "condition": {"selection": "grid", "value": "red"}
    }
  }
}

And now they add a calculate transform to derive fields nested.aa and nested.bb.

In the spec below, the user has to know that they have to write "fields": ["nested\\.bb"] instead of "fields": ["nested.bb"] in the selection. They don't need to change anything in the encoding.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v3.json",
  "selection": {
    "grid": {
      "type": "multi", "fields": ["nested.bb"]
    }
  },
  "data": {
    "values": [
      {"nested":{ "a": "1", "b": 28}},
      {"nested":{ "a": "2", "b": 55}},
      {"nested":{ "a": "3", "b": 43}},
      {"nested":{ "a": "4", "b": 91}},
      {"nested":{ "a": "5", "b": 81}},
      {"nested":{ "a": "6", "b": 53}},
      {"nested":{ "a": "7", "b": 19}},
      {"nested":{ "a": "8", "b": 87}},     
      {"nested":{ "a": "8", "b": 52}}
    ]
  },
  "transform": [{
    "calculate": "datum.nested.a", "as": "nested.aa"
  }, {
    "calculate": "datum.nested.b", "as": "nested.bb"
  }],
  "mark": "point",
  "encoding": {
    "y": {
      "field": "nested.aa",
      "type": "quantitative"
    },
    "x": {
      "field": "nested.aa",
      "type": "quantitative"
    },
    "color": {
      "condition": {"selection": "grid", "value": "red"}
    }
  }
}

arvind · 2019-09-07T17:57:42Z

Thanks for that example, @domoritz! That helped me understand the full scope of flattening. And it turns out updating formatparse wasn't as hard/as much work as I'd initially feared, yay! Updated this PR to more thoroughly address this issue, with corresponding test cases. So it should be ready for re-review.

domoritz

The second example in #5351 (comment) still doesn't work for me. Clicking the circles doesn't make them red.

If I replace

{
      "name": "grid_tuple_fields",
      "value": [{"type": "E", "field": "nested.bb"}]
    },

with

{
      "name": "grid_tuple_fields",
      "value": [{"type": "E", "field": "nested\\.bb"}]
    },

in the generated Vega, it works.

domoritz · 2019-09-07T23:19:08Z

src/compile/data/parse.ts

@@ -322,6 +322,7 @@ export function parseData(model: Model): DataComponent {
  }

  head = ParseNode.makeImplicitFromEncoding(head, model, ancestorParse) || head;


I believe that we may create unnecessarily flattening for fields that are already flattened in makeImplicitFromEncoding. I see three solutions. First, we could extract the code that computes implicit for encodings and selections and then call makeWithAncestors once for a single implicit dict. Second, makeImplicitFromEncoding could return its implicit, which we pass to makeImplicitFromSelection, and there we ignore fields that are already flattened otherwise. Second, you could merge makeImplicitFromEncoding and makeImplicitFromSelection so that they share the same implicit dict. I guess this is like the first approach but encapsulates the logic. I'd prefer the first as it's most explicit.

Thanks for the comment. Yes, I'd thought about this and tested it informally (e.g., using the spec below in which both makeImplicitFromEncoding and makeImplicitFromSelection are called). As far as I could tell, duplicate flattening does not occur. But, I wasn't sure how to formally test this as a unit test (i.e., focused on just parsing/assembling the ParseNode). I'm happy to add a test case that parses the whole model and assembles the full dataflow and test what's in the transform array, but that doesn't feel like the right approach. Open to ideas/suggestions here.

Re: your suggested solutions, option (1) won't work because the existing fromEncoding logic expects TypedFieldDefs -- selections only hold field and channel names as strings to simplify assembly. I'm happy to merge into a single makeImplicitFromEncodingAndSelection such that only a single implicit object is populated and passed to makeWithAncestors if we prefer that. I chose the current approach because I thought that better separated concerns.

Let me know what you'd prefer.

{ "$schema": "https://vega.github.io/schema/vega-lite/v3.json", "selection": { "grid": { "type": "multi", "fields": ["nested.b"] } }, "data": { "values": [ {"nested":{ "a": "1", "b": 28}}, {"nested":{ "a": "2", "b": 55}}, {"nested":{ "a": "3", "b": 43}}, {"nested":{ "a": "4", "b": 91}}, {"nested":{ "a": "5", "b": 81}}, {"nested":{ "a": "6", "b": 53}}, {"nested":{ "a": "7", "b": 19}}, {"nested":{ "a": "8", "b": 87}}, {"nested":{ "a": "8", "b": 52}} ] }, "mark": "point", "encoding": { "y": { "field": "nested.b", "type": "quantitative" }, "x": { "field": "nested.b", "type": "quantitative" }, "color": { "condition": {"selection": "grid", "value": "red"} } } }

does not yield any duplicate flattening (though it should, were it an issue I think?).

Which field would you expect to be flattened twice?

I can take care of avoiding duplicate flattening since I am more familiar with the dataflow stuff.

Sorry, I think we did a simultaneous update. In the updated spec above, the selection is projected over nested.b which is also used in encoding. We would expect to see duplicates, but we don't. I think this was your original concern in #5351 (comment) but perhaps I'm misunderstanding?

arvind · 2019-09-09T18:52:25Z

Thanks, I'd forgotten to send a PR for updating the vlSelectionTest function to no longer use a field accessor function. We should see the expected behavior with vega/vega#2002.

domoritz · 2019-09-09T18:59:10Z

Is it necessary to update Vega or should Vega-Lite generate different spec as I suggested above?

arvind · 2019-09-09T19:03:12Z

We should update Vega to keep our logic consistent. Otherwise, we introduce more complexity of needing to remember when we're referring to flattened fields, and when we're referring to nested fields. I'd rather not need to remember how to coordinate that, and just reference flattened fields always.

arvind · 2019-09-09T19:09:21Z

Oh I just noticed that Vega-Lite outputs nested\\.b for field refs elsewhere. Which function should I be calling to output the escaped version instead?

domoritz · 2019-09-09T19:11:01Z

We have to remember already since when we access a field in Vega and use a ., Vega interprets it as a nested access. But if we use . to write a field with as, the field is flattened and . becomes part of the name of the field.

domoritz · 2019-09-10T04:20:15Z

@arvind Please have a look at afc6337 and merge if you think it looks good.

domoritz

Merge after reviewing afc6337. @arvind

arvind · 2019-09-10T13:25:16Z

That’s a nice solution! Thanks @domoritz!

arvind requested a review from domoritz August 28, 2019 23:58

arvind changed the title ~~Correct references to flattened fields in selection signals.~~ fix: correct references to flattened fields in selection signals. Aug 29, 2019

domoritz force-pushed the as/selectionNestedFields branch from 8837838 to 23849a7 Compare August 29, 2019 00:44

arvind added 2 commits September 7, 2019 13:54

Update formatparse to flatten non-encoding selection projections.

62db907

Correct references to flattened fields in selection signals.

96d9071

arvind force-pushed the as/selectionNestedFields branch from 23849a7 to 96d9071 Compare September 7, 2019 17:56

domoritz requested changes Sep 7, 2019

View reviewed changes

arvind mentioned this pull request Sep 9, 2019

Selection fields are flattened, so should be directly accessed. vega/vega#2002

Closed

arvind and others added 2 commits September 9, 2019 15:30

Escape flattened field names in tuple_fields signal.

ea6f407

Extract how fields should be parsed implicitly.

afc6337

domoritz self-requested a review September 10, 2019 04:20

domoritz approved these changes Sep 10, 2019

View reviewed changes

arvind merged commit 88333ef into master Sep 10, 2019

arvind deleted the as/selectionNestedFields branch September 10, 2019 13:25

arvind mentioned this pull request Jan 6, 2021

fix: correctly flatten nested field references for scale bindings #7159

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: correct references to flattened fields in selection signals. #5351

fix: correct references to flattened fields in selection signals. #5351

arvind commented Aug 28, 2019 •

edited

Loading

domoritz commented Aug 29, 2019

arvind commented Aug 29, 2019 •

edited

Loading

domoritz commented Aug 29, 2019

arvind commented Aug 29, 2019 •

edited

Loading

domoritz commented Aug 29, 2019 •

edited

Loading

arvind commented Sep 7, 2019

domoritz left a comment

domoritz Sep 7, 2019

arvind Sep 9, 2019

arvind Sep 9, 2019 •

edited

Loading

domoritz Sep 9, 2019

arvind Sep 9, 2019

arvind commented Sep 9, 2019

domoritz commented Sep 9, 2019

arvind commented Sep 9, 2019

arvind commented Sep 9, 2019

domoritz commented Sep 9, 2019

domoritz commented Sep 10, 2019

domoritz left a comment

arvind commented Sep 10, 2019

		@@ -322,6 +322,7 @@ export function parseData(model: Model): DataComponent {
		}

		head = ParseNode.makeImplicitFromEncoding(head, model, ancestorParse) \|\| head;

fix: correct references to flattened fields in selection signals. #5351

fix: correct references to flattened fields in selection signals. #5351

Conversation

arvind commented Aug 28, 2019 • edited Loading

domoritz commented Aug 29, 2019

arvind commented Aug 29, 2019 • edited Loading

domoritz commented Aug 29, 2019

arvind commented Aug 29, 2019 • edited Loading

domoritz commented Aug 29, 2019 • edited Loading

arvind commented Sep 7, 2019

domoritz left a comment

Choose a reason for hiding this comment

domoritz Sep 7, 2019

Choose a reason for hiding this comment

arvind Sep 9, 2019

Choose a reason for hiding this comment

arvind Sep 9, 2019 • edited Loading

Choose a reason for hiding this comment

domoritz Sep 9, 2019

Choose a reason for hiding this comment

arvind Sep 9, 2019

Choose a reason for hiding this comment

arvind commented Sep 9, 2019

domoritz commented Sep 9, 2019

arvind commented Sep 9, 2019

arvind commented Sep 9, 2019

domoritz commented Sep 9, 2019

domoritz commented Sep 10, 2019

domoritz left a comment

Choose a reason for hiding this comment

arvind commented Sep 10, 2019

arvind commented Aug 28, 2019 •

edited

Loading

arvind commented Aug 29, 2019 •

edited

Loading

arvind commented Aug 29, 2019 •

edited

Loading

domoritz commented Aug 29, 2019 •

edited

Loading

arvind Sep 9, 2019 •

edited

Loading