User Story: add summary stats to mart_gtfs.fct_observed_trips #2381

Closed
tiffanychu90 opened this issue Mar 14, 2023 · 2 comments
Labels
warehouse-poc Staging proof-of-concept tables related to analytics-driven warehouse models

Comments

@tiffanychu90
Member

tiffanychu90 commented Mar 14, 2023

User stories

Summary

As an analyst, I want a schedule vs RT trip table that provides additional summary statistics for the observed RT trips.

Table Schema

  • build out the existing mart_gtfs.gtfs.fct_observed_trips table
  • Grain: gtfs_dataset_key-name-trip_id-activity_date
  • Metric columns:
    • tu_num_distinct_message_ids (existing)
    • tu_min_extract_ts and tu_max_extract_ts.
      Analysts could easily calculate the difference (rt_trip_elapsed = tu_max_extract_ts - tu_min_extract_ts).
      Use that calculation to get tu_num_distinct_message_ids / rt_trip_elapsed (in minutes) = number of distinct messages per minute. If we're hitting close to 3, that's close to capturing a message every 20 sec.
    • number_of_distinct_minutes_with_vp as a measure of RT availability for that trip. Analysts would compare it against the scheduled trip's service_hours (converted to service_minutes) to see how much of the trip had RT coverage. This proportion could easily surpass 1.0, especially when a trip takes longer than scheduled and vehicle positions keep coming in.
    • time_of_day: this could be where the heuristic is applied. If the trip_id appears in both schedule and RT, use the schedule to determine time_of_day; otherwise use tu_min_extract_ts.
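
To make the analyst-side calculations concrete, here is a rough Python sketch of how the proposed columns might be used downstream. It is not the warehouse implementation. The `ObservedTrip` class, the function names, and the `scheduled_departure` argument are hypothetical, and the `time_of_day` bins are placeholder assumptions, not an agreed definition. It also assumes timestamps are already in the agency's local time. Messages per minute is computed as the message count divided by elapsed minutes, so a value near 3 corresponds to roughly one message every 20 sec.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ObservedTrip:
    """Hypothetical stand-in for one fct_observed_trips row (metric columns only)."""
    tu_num_distinct_message_ids: int
    tu_min_extract_ts: datetime
    tu_max_extract_ts: datetime
    number_of_distinct_minutes_with_vp: int


def messages_per_minute(trip: ObservedTrip) -> Optional[float]:
    """Distinct trip update messages per minute of observed RT activity."""
    elapsed = trip.tu_max_extract_ts - trip.tu_min_extract_ts
    elapsed_minutes = elapsed.total_seconds() / 60
    if elapsed_minutes <= 0:
        return None  # a single extract leaves no elapsed time to divide by
    return trip.tu_num_distinct_message_ids / elapsed_minutes


def vp_coverage(trip: ObservedTrip, service_hours: float) -> Optional[float]:
    """Share of scheduled service minutes with at least one vehicle position.

    Can exceed 1.0 when the trip runs longer than scheduled.
    """
    service_minutes = service_hours * 60
    if service_minutes <= 0:
        return None
    return trip.number_of_distinct_minutes_with_vp / service_minutes


def _bucket(hour: int) -> str:
    # Placeholder bins, assumed only for illustration.
    if 6 <= hour < 9:
        return "AM Peak"
    if 9 <= hour < 15:
        return "Midday"
    if 15 <= hour < 19:
        return "PM Peak"
    return "Off Peak"


def time_of_day(trip: ObservedTrip, scheduled_departure: Optional[datetime]) -> str:
    """Prefer the schedule; fall back to tu_min_extract_ts for RT-only trips."""
    if scheduled_departure is not None:
        return _bucket(scheduled_departure.hour)
    return _bucket(trip.tu_min_extract_ts.hour)


# Example: 90 messages over 30 minutes, VP seen in 33 distinct minutes,
# against a scheduled trip of 0.5 service hours.
example = ObservedTrip(
    tu_num_distinct_message_ids=90,
    tu_min_extract_ts=datetime(2023, 3, 14, 8, 0),
    tu_max_extract_ts=datetime(2023, 3, 14, 8, 30),
    number_of_distinct_minutes_with_vp=33,
)
print(messages_per_minute(example))  # 3.0
print(vp_coverage(example, service_hours=0.5))
print(time_of_day(example, scheduled_departure=None))  # AM Peak
```

Returning `None` for a zero-length window or a zero-length scheduled trip keeps the divide-by-zero cases visible instead of silently reporting 0. Whether the warehouse should null these out or handle them differently is an open question.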

Tester [Stakeholder]

Notes

WIP @amandaha8 providing exploratory context for @edasmalchi's idea: cal-itp/data-analyses#668

@tiffanychu90 tiffanychu90 added the warehouse-poc Staging proof-of-concept tables related to analytics-driven warehouse models label Mar 14, 2023
@tiffanychu90 tiffanychu90 changed the title User Story: add summary stats for to mart_gtfs.fct_observed_trips User Story: add summary stats for to mart_gtfs . fct_observed_trips Mar 14, 2023
@tiffanychu90 tiffanychu90 changed the title User Story: add summary stats for to mart_gtfs . fct_observed_trips User Story: add summary stats for to mart_gtfs.fct_observed_trips Mar 14, 2023
@lauriemerrell
Contributor

@tiffanychu90
Member Author

Closing - progress based on above mentioned issues

2 participants