Modified recursive dictionary parsing to correctly handle lists in dvc plot templates #134

alexk101 · 2023-05-19T23:13:00Z

Previously, the way in which lists were parsed from dvc templates caused the dict_find_value function to end early without continuing to parse the rest of the template once it encountered a list. Additionally, the structure of vega-lite plots when using lists allows for the case of a list of strings. Because of this, when called recursively as was done in 635eaff , the function may fail when it encounters something like a list of strings, as it tries to access the values of d, which is no longer a dictionary, but a string. I have fixed this by allowing for d to be a string, but this may not be desired/sensible since the function is called dict_find_value. It may make sense to change the name of the function or to call another function to handle the string case.

I include a template which previously would not parse as reference for the changes made.

{
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "repeat": ["quintile"],
    "config": {"view": {"stroke": "transparent"}},
    "spec": {
        "title": {"text": "<DVC_METRIC_TITLE>", "anchor": "middle"},
        "encoding": {
            "column": {"field": "category", "header": {"orient": "bottom"}},
            "row": {"field": "quintile", "title": "Quintile"},
            "y": {"field": "<DVC_METRIC_Y>", "type": "quantitative"},
            "x": {"field": "<DVC_METRIC_X>", "axis": null},
            "color": {"field": "<DVC_METRIC_X>"}
            },
        "mark": "bar",
        "data": {
            "values": "<DVC_METRIC_DATA>"
        }
    }
}

…c plot templates

shcheklein · 2023-05-19T23:49:00Z

src/dvc_render/vega_templates.py

-                return True
-        if isinstance(v, list):
-            return any(dict_find_value(e, value) for e in v)
+def dict_find_value(d: Union[dict, str], value: str) -> bool:


to generalize: can it be a list also? E.g. in case of some nested lists.

shcheklein · 2023-05-19T23:51:14Z

Thanks for the fix @alexk101 ! It looks reasonable to me, we can keep the name also (it's still trying to find a name).

Can we add some tests for this please?

dberenbaum · 2023-05-20T00:40:04Z

@alexk101 Thanks for catching this and the changes look good! Since I was also working on this recently (causing this bug), my feeling was that this felt a bit complicated just to check for the anchor, and it feels even more complicated now.

An alternative approach would be to search the raw text for <DVC_METRIC_DATA> before parsing it as json. It looks like the template initially gets read here:

dvc-render/src/dvc_render/vega_templates.py

Lines 691 to 693 in 635eaff

    
           with _open(template_path, encoding="utf-8") as f: 
        
               content = json.load(f) 
        
           return Template(content, name=template)

You could replace that with something like:

        with _open(template_path, encoding="utf-8") as f:
            raw = f.read()
        content = json.loads(raw)
        template = Template(content, name=template)
        anchor = template.anchor("data")
        if anchor not in raw:
            raise BadTemplateError(
                f"Template '{template.name}' "
                f"is not using '{anchor}' anchor"
            )
        return template

Then we could drop dict_find_value and the validation logic that uses it.

Let me know if that makes sense. The approach you have also works if we have some better tests to avoid a bug like the one I introduced.

Extend tests.

codecov-commenter · 2023-05-20T07:49:08Z

Codecov Report

Patch coverage: 86.36% and project coverage change: -0.24 ⚠️

Comparison is base (635eaff) 96.26% compared to head (fb6631e) 96.03%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #134      +/-   ##
==========================================
- Coverage   96.26%   96.03%   -0.24%     
==========================================
  Files          19       19              
  Lines         777      781       +4     
  Branches      125      129       +4     
==========================================
+ Hits          748      750       +2     
- Misses         16       17       +1     
- Partials       13       14       +1

Impacted Files	Coverage Δ
src/dvc_render/vega_templates.py	`94.19% <81.25%> (-1.14%)`	⬇️
tests/test_templates.py	`100.00% <100.00%> (ø)`
tests/test_vega.py	`100.00% <100.00%> (ø)`

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

daavoo · 2023-05-20T07:57:07Z

Then we could drop dict_find_value and the validation logic that uses it.

Actually, I think that we should be validating more strictly.
We are very opinionated about where the anchor data should be put (inside data->values) but the current validation (and the raw string matching) passes as long as the anchor is somewhere in the template.

Modified recursive dictionary parsing to correctly handle lists in dv…

e31956d

…c plot templates

alexk101 mentioned this pull request May 19, 2023

Add support for arbitrary Vega-Lite plots iterative/dvclive#569

Closed

shcheklein reviewed May 19, 2023

View reviewed changes

Rename dict_find_value -> find_value.

b67958d

Extend tests.

daavoo approved these changes May 20, 2023

View reviewed changes

Update type hint

fb6631e

daavoo merged commit e11b61c into iterative:main May 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modified recursive dictionary parsing to correctly handle lists in dvc plot templates #134

Modified recursive dictionary parsing to correctly handle lists in dvc plot templates #134

alexk101 commented May 19, 2023

shcheklein May 19, 2023

shcheklein commented May 19, 2023

dberenbaum commented May 20, 2023

codecov-commenter commented May 20, 2023 •

edited

Loading

daavoo commented May 20, 2023

Modified recursive dictionary parsing to correctly handle lists in dvc plot templates #134

Modified recursive dictionary parsing to correctly handle lists in dvc plot templates #134

Conversation

alexk101 commented May 19, 2023

shcheklein May 19, 2023

Choose a reason for hiding this comment

shcheklein commented May 19, 2023

dberenbaum commented May 20, 2023

codecov-commenter commented May 20, 2023 • edited Loading

Codecov Report

daavoo commented May 20, 2023

codecov-commenter commented May 20, 2023 •

edited

Loading