tiffanychu90 changed the title from "User Story: add summary stats for to mart_gtfs.fct_observed_trips" to "User Story: add summary stats for to mart_gtfs . fct_observed_trips" on Mar 14, 2023
tiffanychu90 changed the title from "User Story: add summary stats for to mart_gtfs . fct_observed_trips" to "User Story: add summary stats for to mart_gtfs.fct_observed_trips" on Mar 14, 2023
User stories
Summary
As an analyst, I want a schedule vs. RT trip table that provides additional summary statistics for the observed RT trips.
Table Schema
mart_gtfs.gtfs.fct_observed_trips table
- gtfs_dataset_key-name-trip_id-activity_date
- tu_num_distinct_message_ids (existing), tu_min_extract_ts and tu_max_extract_ts. An analyst could easily calculate the difference (rt_trip_elapsed = max_extract_ts - min_extract_ts).
- Use that calculation to get num_distinct_message_ids / rt_trip_elapsed (with rt_trip_elapsed in minutes) = number of distinct messages per minute. If we're hitting close to 3, that's close to capturing a message every 20 sec.
- number_of_distinct_minutes_with_vp as a measure of RT availability for that trip. An analyst would compare it against the scheduled trip's service_hours (converted to service_minutes) to see how much of the trip had RT coverage. This proportion could easily surpass 1.0, especially when a trip takes longer than scheduled and vehicle positions keep coming in.
- time_of_day: this could be where the heuristic is applied. If trip_id appears in both schedule and RT, use schedule to determine time_of_day; otherwise use tu_min_extract_ts.
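To make the proposed statistics concrete, here is a minimal Python sketch of how an analyst might derive them from one row of the table. It is only an illustration under stated assumptions: the function names, the scheduled_start argument, and the time_of_day bins are hypothetical and not part of the warehouse schema, and timestamps are assumed to already be in local time. Messages per minute is computed as the message count divided by elapsed minutes, so a value near 3 corresponds to roughly one message every 20 seconds.

```python
from datetime import datetime
from typing import Optional


def messages_per_minute(
    min_extract_ts: datetime,
    max_extract_ts: datetime,
    num_distinct_message_ids: int,
) -> Optional[float]:
    """Distinct messages per minute over the observed RT span of a trip."""
    # rt_trip_elapsed, expressed in minutes
    elapsed_minutes = (max_extract_ts - min_extract_ts).total_seconds() / 60
    if elapsed_minutes <= 0:
        return None  # a single extract (or bad data) gives no usable rate
    return num_distinct_message_ids / elapsed_minutes


def rt_coverage(minutes_with_vp: int, service_hours: float) -> Optional[float]:
    """Share of scheduled service minutes that had vehicle positions.

    Can exceed 1.0 when the trip runs longer than scheduled.
    """
    service_minutes = service_hours * 60
    if service_minutes <= 0:
        return None
    return minutes_with_vp / service_minutes


def time_of_day(
    scheduled_start: Optional[datetime],
    tu_min_extract_ts: Optional[datetime],
) -> Optional[str]:
    """Prefer the schedule; fall back to the first RT extract timestamp."""
    ts = scheduled_start if scheduled_start is not None else tu_min_extract_ts
    if ts is None:
        return None
    hour = ts.hour
    # Hypothetical bins, chosen only for illustration.
    if 6 <= hour < 9:
        return "AM Peak"
    if 9 <= hour < 15:
        return "Midday"
    if 15 <= hour < 19:
        return "PM Peak"
    return "Off-peak"


# Example row: 90 distinct messages over a 30-minute observed span,
# 36 distinct minutes with vehicle positions, 0.5 scheduled service hours.
first_ts = datetime(2023, 3, 14, 8, 0, 0)
last_ts = datetime(2023, 3, 14, 8, 30, 0)

print(messages_per_minute(first_ts, last_ts, 90))  # 3.0
print(rt_coverage(36, 0.5))                        # 1.2
print(time_of_day(None, first_ts))                 # AM Peak
```

Returning None rather than raising on zero elapsed time or zero scheduled minutes is a design choice for this sketch, so that trips with a single extract or missing schedule data can be filtered out downstream.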
Tester [Stakeholder]
Notes
WIP @amandaha8 providing exploratory context for @edasmalchi's idea: cal-itp/data-analyses#668