Rjf/csv output #168

Merged
merged 11 commits into from
Mar 29, 2024

Collaborator

@robfitzgerald robfitzgerald commented Mar 28, 2024

this PR introduces a new CSV output format for RouteE Compass. a user can specify a mapping from CSV column names to dot-delimited paths into the response JSON. for example:

[response_output_policy]
type = "file"
filename = "output.csv"
[response_output_policy.format]
type = "csv"
sorted = true
[response_output_policy.format.mapping]
od_id = "request.name"
dlon = "request.destination_x"
dlat = "request.destination_y"
olon = "request.origin_x"
olat = "request.origin_y"
model_name = "request.model_name"
distance = "traversal_summary.distance"
time = "traversal_summary.time"
energy_electric = { optional = "traversal_summary.energy_electric" }
energy_liquid = { optional = "traversal_summary.energy_liquid" }
distance_cost = "cost.distance"
time_cost = "cost.time"
energy_electric_cost = { optional = "cost.energy_electric" }
energy_liquid_cost = { optional = "cost.energy_liquid" }
total_cost = "cost.total_cost"
distance_unit = "state_model.distance.distance_unit"
time_unit = "state_model.time.time_unit"
energy_electic_unit = { optional = "state_model.energy_electric.energy_unit" }
energy_liquid_unit = { optional = "state_model.energy_liquid.energy_unit" }
distance_weight = "request.weights.distance"
time_weight = "request.weights.time"
energy_electric_weight = { optional = "request.weights.energy_electric" }
energy_liquid_weight = { optional = "request.weights.energy_liquid" }

running this with the query in our example notebook produces the following CSV:

distance,distance_cost,distance_unit,distance_weight,dlat,dlon,energy_electic_unit,energy_electric,energy_electric_cost,energy_electric_weight,energy_liquid,energy_liquid_cost,energy_liquid_unit,energy_liquid_weight,model_name,od_id,olat,olon,time,time_cost,time_unit,time_weight,total_cost
6.733433188572938,4.4103987385152745,"miles",0,39.693005,-104.97536,"kilowatt_hours",0.0,0.0,null,0.0,0.7994559767720513,"gallons_gasoline",1,"2016_TOYOTA_Camry_4cyl_2WD","least_energy",39.779021,-104.969307,12.777312852016575,4.21651324116547,"minutes",0,9.426367956452797
9.152878807061885,5.995135618625535,"miles",0,39.693005,-104.97536,"kilowatt_hours",0.0,0.0,null,0.0,0.9603338983361063,"gallons_gasoline",0,"2016_TOYOTA_Camry_4cyl_2WD","least_time",39.779021,-104.969307,10.905464892278664,3.598803414451959,"minutes",1,10.5542729314136
6.470755749159553,4.238345015699507,"miles",1,39.693005,-104.97536,"kilowatt_hours",0.0,0.0,null,0.0,0.8412387893925269,"gallons_gasoline",1,"2016_TOYOTA_Camry_4cyl_2WD","least_cost",39.779021,-104.969307,13.158554513449252,4.342322989438253,"minutes",1,9.421906794530287
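To make the mapping concrete, here is a minimal Python sketch of how a dot-delimited path could be resolved against a response object and written as a CSV row with `sorted = true`. This is an illustration only, not the Rust implementation in this PR: the names `lookup` and `csv_row` are hypothetical, the response object is a made-up fragment, and writing each value as a JSON token (quoted strings, bare numbers) is an inference from the sample output above.

```python
import json


def lookup(doc, dotted_path):
    """Walk a dot-delimited path such as "request.name" through nested dicts."""
    node = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(f"path {dotted_path} not found at {key}")
        node = node[key]
    return node


def csv_row(response, mapping, sort_columns=True):
    """Return (header, row) strings for one response.

    Assumption: values are serialized as JSON tokens and joined with commas.
    A real writer would also need to escape commas inside string values.
    """
    columns = sorted(mapping) if sort_columns else list(mapping)
    cells = [json.dumps(lookup(response, mapping[c])) for c in columns]
    return ",".join(columns), ",".join(cells)


# Hypothetical, heavily trimmed response object.
response = {
    "request": {"name": "least_time", "origin_x": -104.969307},
    "traversal_summary": {"distance": 9.15},
    "state_model": {"distance": {"distance_unit": "miles"}},
}
mapping = {
    "od_id": "request.name",
    "olon": "request.origin_x",
    "distance": "traversal_summary.distance",
    "distance_unit": "state_model.distance.distance_unit",
}

header, row = csv_row(response, mapping)
print(header)  # distance,distance_unit,od_id,olon
print(row)     # 9.15,"miles","least_time",-104.969307
```

Sorting by column name is why `distance` comes first in the header even though `od_id` is listed first in the mapping.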

details

  • if a path is invalid, an error is written to the in-memory response JSON object at $["error"]["csv"] for debugging
  • i couldn't find a way to keep the CSV field ordering the same as the ordering provided in the TOML
    • tried using the OrderedHashMap lib
    • the problem may be that the TOML object read by the Config lib doesn't preserve ordering
    • provided a "sorted" setting, which can at least sort the fields by name
    • we could move to a TOML array of pairs instead of an object?
  • the mapping values can be a Path, Optional, or Sum. see csv_mapping.rs for the enum implementation and serialization, but they look like this:
regular_path = "path.to.value"
optional_path = { optional = "path.to.optional.value" }
sum_path = { sum = [
  "path.to.first",
  "path.to.second"
] }
sum_with_optional = { sum = [
  "path.to.non.optional",
  { optional = "path.to.optional" }
] }
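The three mapping kinds and the error behavior described in the list above can be sketched in Python as follows. This is a hedged illustration, not the enum in csv_mapping.rs: `PathError`, `evaluate`, and `apply_mapping` are hypothetical names, the error message format is invented, and treating a missing optional term inside a sum as contributing nothing is an assumption.

```python
class PathError(Exception):
    """Raised when a dot-delimited path does not resolve (hypothetical)."""


def lookup(doc, dotted_path):
    node = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            raise PathError(f"path '{dotted_path}' not found")
        node = node[key]
    return node


def evaluate(response, spec):
    """Evaluate one mapping value: a Path, an Optional, or a Sum."""
    if isinstance(spec, str):                 # Path: must resolve
        return lookup(response, spec)
    if "optional" in spec:                    # Optional: None when missing
        try:
            return lookup(response, spec["optional"])
        except PathError:
            return None
    if "sum" in spec:                         # Sum: add up every term
        total = 0.0
        for term in spec["sum"]:
            value = evaluate(response, term)
            if value is not None:             # assumption: skip missing optionals
                total += value
        return total
    raise PathError(f"unrecognized mapping value: {spec!r}")


def apply_mapping(response, mapping):
    """Build a row; record failures on the response at ["error"]["csv"]."""
    row, errors = {}, []
    for column, spec in mapping.items():
        try:
            row[column] = evaluate(response, spec)
        except PathError as e:
            errors.append(f"{column}: {e}")
    if errors:
        response.setdefault("error", {})["csv"] = "; ".join(errors)
    return row


response = {"cost": {"distance": 4.0, "time": 3.5}}
mapping = {
    "total": {"sum": ["cost.distance", "cost.time",
                      {"optional": "cost.energy_electric"}]},
    "electric": {"optional": "cost.energy_electric"},
    "bad": "cost.missing",
}
row = apply_mapping(response, mapping)
print(row)                       # {'total': 7.5, 'electric': None}
print(response["error"]["csv"])  # bad: path 'cost.missing' not found
```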

along the way,

Closes #147.
Closes #161.
Closes #164.

@robfitzgerald robfitzgerald requested a review from nreinicke March 28, 2024 16:59
Collaborator

@nreinicke nreinicke left a comment

This is cool, thanks for adding in.

Just a tiny thing but I think the notebook needs to be updated a little bit, I got this error when trying to run it:
[image: error output]

@robfitzgerald
Collaborator Author

k, will fix this. it looks like

  • an autosave overwrote the config filename
  • the cost model field names weren't updated in the TOML files

@robfitzgerald
Collaborator Author

robfitzgerald commented Mar 29, 2024

i realized this morning our typical use case will actually be a combined output policy, allowing a summary in CSV and the complete output as JSON, so i added that. here's an example:

[response_output_policy]
type = "combined"
[[response_output_policy.policies]]
type = "file"
filename = "output_complete.json"
format = { type = "json", newline_delimited = true }

[[response_output_policy.policies]]
type = "file"
filename = "output_summary.csv"
[response_output_policy.policies.format]
type = "csv"
sorted = true
[response_output_policy.policies.format.mapping]
od_id = "request.name"
dlon = "request.destination_x"
dlat = "request.destination_y"
olon = "request.origin_x"
olat = "request.origin_y"
model_name = "request.model_name"
distance = "traversal_summary.distance"
time = "traversal_summary.time"
energy_electric = { optional = "traversal_summary.energy_electric" }
energy_liquid = { optional = "traversal_summary.energy_liquid" }
distance_cost = "cost.distance"
time_cost = "cost.time"
energy_electric_cost = { optional = "cost.energy_electric" }
energy_liquid_cost = { optional = "cost.energy_liquid" }
total_cost = "cost.total_cost"
distance_unit = "state_model.distance.distance_unit"
time_unit = "state_model.time.time_unit"
energy_electic_unit = { optional = "state_model.energy_electric.energy_unit" }
energy_liquid_unit = { optional = "state_model.energy_liquid.energy_unit" }
distance_weight = "request.weights.distance"
time_weight = "request.weights.time"
energy_electric_weight = { optional = "request.weights.energy_electric" }
energy_liquid_weight = { optional = "request.weights.energy_liquid" }
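As a rough picture of what a combined policy does, the Python sketch below fans each response out to two child writers: one appends the complete response as newline-delimited JSON, the other appends a CSV summary row. The class names are hypothetical and the sinks are in-memory buffers instead of files; the actual Rust types in this PR may be organized differently.

```python
import io
import json


class JsonLinesPolicy:
    """Writes each complete response as one JSON line."""

    def __init__(self, sink):
        self.sink = sink

    def write(self, response):
        self.sink.write(json.dumps(response) + "\n")


class CsvSummaryPolicy:
    """Writes a header once, then one summary row per response."""

    def __init__(self, sink, mapping):
        self.sink = sink
        self.mapping = mapping
        self.columns = sorted(mapping)  # mirrors `sorted = true`
        self.wrote_header = False

    def write(self, response):
        if not self.wrote_header:
            self.sink.write(",".join(self.columns) + "\n")
            self.wrote_header = True
        cells = []
        for column in self.columns:
            node = response
            for key in self.mapping[column].split("."):
                node = node[key]  # plain paths only in this sketch
            cells.append(json.dumps(node))
        self.sink.write(",".join(cells) + "\n")


class CombinedPolicy:
    """Forwards every response to each child policy in order."""

    def __init__(self, policies):
        self.policies = policies

    def write(self, response):
        for policy in self.policies:
            policy.write(response)


complete, summary = io.StringIO(), io.StringIO()
policy = CombinedPolicy([
    JsonLinesPolicy(complete),
    CsvSummaryPolicy(summary, {
        "od_id": "request.name",
        "distance": "traversal_summary.distance",
    }),
])

# Made-up responses for illustration.
for name, dist in [("least_time", 9.15), ("least_energy", 6.73)]:
    policy.write({"request": {"name": name},
                  "traversal_summary": {"distance": dist}})

print(summary.getvalue())
```

Each child sees the same response object, so the JSON file keeps the full detail while the CSV stays a flat summary.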

@robfitzgerald
Collaborator Author

@nreinicke this has been updated, does it work now for you? also, see note above about combined output policies.

Collaborator

@nreinicke nreinicke left a comment

Yeah good call on the combined output policy, that makes a lot of sense. Just retested and everything looks good!

@robfitzgerald robfitzgerald merged commit 82bf03e into main Mar 29, 2024
5 checks passed
@robfitzgerald robfitzgerald deleted the rjf/csv-output branch March 29, 2024 15:36