Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

serialize state model with index #251

Merged
merged 4 commits into from
Sep 3, 2024

Conversation

robfitzgerald
Copy link
Collaborator

this PR refactors the route.state_model output field in order to make it clearer to the user how it relates to state tuples.

there were two ways to improve readability, with problems each:

  • keep state_model a JSON object, but add an "index" field to denote the tuple index for the feature
    • problem: order is not guaranteed, so, programmatically .zip()-ing the state tuple with the state_model doesn't give you what you would expect
  • change state_model to a JSON array so it structurally matches the state tuple
    • problem: we can no longer index into the state model by name: route.state_model.energy is no longer a key

i opted for an array because i think programmatic access is probably more useful for people toying around in the JSON outputs. but, this means any existing scripting that parses outputs will be broken by this change.

@robfitzgerald
Copy link
Collaborator Author

here's an example output:

        "state_model": [
            {
                "index": 0,
                "initial": 0.0,
                "name": "time",
                "time_unit": "minutes"
            },
            {
                "energy_unit": "kilowatt_hours",
                "index": 1,
                "initial": 0.0,
                "name": "energy_electric"
            },
            {
                "format": {
                    "floating_point": {
                        "initial": 100.0
                    }
                },
                "index": 2,
                "name": "battery_state",
                "type": "soc",
                "unit": "percent"
            },
            {
                "distance_unit": "kilometers",
                "index": 3,
                "initial": 0.0,
                "name": "distance"
            }
        ],

and, here's what i mean about breaking changes to JSON paths:

[2024-08-29T17:20:29Z ERROR routee_compass::app::cli::run] Error: {"csv":{"distance_unit":"could not find object distance in path route.state_model.distance.distance_unit","time_unit":"could not find object time in path route.state_model.time.time_unit"}}

Copy link
Collaborator

@nreinicke nreinicke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I do like having the state model match the result state but it's a bummer that we would lose the direct access to the path by name. It seems like we could get the best of both worlds by just keeping it in a map format by name and then including the explicit index? Is there another reason to have it as an array and have the index included?

@robfitzgerald
Copy link
Collaborator Author

... it's a bummer that we would lose the direct access to the path by name. It seems like we could get the best of both worlds by just keeping it in a map format by name and then including the explicit index? Is there another reason to have it as an array and have the index included?

i just don't love the existing serialization of the map format. directly serializing CompactOrderedHashMap will not guarantee index ordering once it is JSON. i guess the best of both worlds would include a custom serde Serializer for CompactOrderedHashMap that does what i want. and that's the main reason i went to arrays. so, let me look into a custom serializer so we can get both a map with indexing and index-ordering in the serialization order.

@robfitzgerald
Copy link
Collaborator Author

figured it out! there's a "preserve_order" for serde_json which replaces the HashMap-based Value::Object representation with an IndexMap to preserve insertion order, and then also iterating in the correct order, i can create the best of both worlds as requested:

"state_model": {
            "time": {
                "time_unit": "minutes",
                "initial": 0.0,
                "index": 0,
                "name": "time"
            },
            "distance": {
                "distance_unit": "kilometers",
                "initial": 0.0,
                "index": 1,
                "name": "distance"
            },
            "battery_state": {
                "type": "soc",
                "unit": "percent",
                "format": {
                    "floating_point": {
                        "initial": 100.0
                    }
                },
                "index": 2,
                "name": "battery_state"
            },
            "energy_electric": {
                "energy_unit": "kilowatt_hours",
                "initial": 0.0,
                "index": 3,
                "name": "energy_electric"
            }
        },

caveat

this affects serde_json's behavior globally so now nothing is lexicagraphically ordered, it's all insertion-based.

@nreinicke ready for your review again

@nreinicke
Copy link
Collaborator

Beautiful, thanks for modifying that to keep the original format with ordering. Looks like we just need to modify the tests to expect this new ordering

Copy link
Collaborator

@nreinicke nreinicke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌴

@robfitzgerald robfitzgerald merged commit 016ac85 into main Sep 3, 2024
5 checks passed
@robfitzgerald robfitzgerald deleted the rjf/serialize_state_model_with_index branch September 3, 2024 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants