Travel distances vary across different areas. For example, we cannot assume that commute distances in London are the same as those in Cambridge.
Matching (#8) is done based on socioeconomic and demographic variables, but individuals/households in different parts of the country that share the same variables may exhibit different travel behaviour due to land use / transport options. If we don't filter, we may end up with travel distances in our study area that are too long, or a mode share that is not representative.
When carrying out matching for any area, we should filter the NTS survey data for that area. Initially I was doing this (see this function), but the sample became too small, and matching at the household level (see notebook) resulted in a low matching rate.
Possible workarounds:
Use filter_by_region() but include all regions that would have similar travel patterns to the study area
Use filter_by_region() to filter the NTS to the study area. Apply propensity score matching on the household level, as described in Improve matching approach #13
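A rough sketch of the first workaround, to show how widening the region list trades specificity for sample size. The signature of filter_by_region() is not shown in this issue, so `filter_to_regions` below is a hypothetical stand-in, and the column names (`region`, `household_id`) and the toy rows are assumptions made up for illustration, not actual NTS fields or data.

```python
import pandas as pd


def filter_to_regions(nts: pd.DataFrame, regions, region_col: str = "region") -> pd.DataFrame:
    """Keep only the survey rows whose region is in `regions` (hypothetical helper)."""
    return nts[nts[region_col].isin(list(regions))].copy()


# Toy rows standing in for NTS records; values are invented for illustration.
nts = pd.DataFrame({
    "household_id": [1, 1, 2, 3, 4, 5],
    "region": [
        "London",
        "London",
        "East of England",
        "Yorkshire and The Humber",
        "North West",
        "Yorkshire and The Humber",
    ],
})

# Option A: study area only (most representative, smallest sample).
study_only = filter_to_regions(nts, ["Yorkshire and The Humber"])

# Option B: study area plus regions assumed to have similar travel patterns.
widened = filter_to_regions(nts, ["Yorkshire and The Humber", "North West"])

# Compare how many distinct households are available for matching in each case.
n_study = study_only["household_id"].nunique()
n_widened = widened["household_id"].nunique()
print(f"households: study area only = {n_study}, widened = {n_widened}")
```

Checking the household count before matching gives an early signal of whether the filtered sample is likely to be large enough for a reasonable matching rate.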
Jupyter notebook showing regional variations in mode share, travel times, etc. This could inform the user's choice of which NTS regions to include in their model. If a user is applying the model to Leeds, they may decide to use Yorkshire and The Humber only, or include other regions that have similar characteristics. The latter is useful for keeping the sample size from becoming too small.
Include more detailed regional filtering: the PSUStatsReg_B01ID column in the NTS has the breakdown below. The metropolitan / non-metropolitan categorisation is useful for getting representative data (but will reduce our sample size).
Value = -10.0 Label = DEAD
Value = -9.0 Label = DNA
Value = -8.0 Label = NA
Value = 1.0 Label = Northern, Metropolitan
Value = 2.0 Label = Northern, Non-metropolitan
Value = 3.0 Label = Yorkshire / Humberside, Metropolitan
Value = 4.0 Label = Yorkshire / Humberside, Non-metropolitan
Value = 5.0 Label = East Midlands
Value = 6.0 Label = East Anglia
Value = 7.0 Label = South East (excluding London Boroughs)
Value = 8.0 Label = London Boroughs
Value = 9.0 Label = South West
Value = 10.0 Label = West Midlands, Metropolitan
Value = 11.0 Label = West Midlands, Non-metropolitan
Value = 12.0 Label = North West, Metropolitan
Value = 13.0 Label = North West, Non-metropolitan
Value = 14.0 Label = Wales
Value = 15.0 Label = Scotland
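A possible way to use this breakdown, as a sketch only: map the codes listed above to their labels, derive the metropolitan codes from the label text, and filter on PSUStatsReg_B01ID. The helper name `filter_by_stats_region` and the toy DataFrame are assumptions for illustration; only the column name and the code/label pairs come from the list above.

```python
import pandas as pd

# Code -> label pairs copied from the PSUStatsReg_B01ID breakdown above
# (the negative codes DEAD / DNA / NA are left out on purpose).
PSU_STATS_REG_LABELS = {
    1: "Northern, Metropolitan",
    2: "Northern, Non-metropolitan",
    3: "Yorkshire / Humberside, Metropolitan",
    4: "Yorkshire / Humberside, Non-metropolitan",
    5: "East Midlands",
    6: "East Anglia",
    7: "South East (excluding London Boroughs)",
    8: "London Boroughs",
    9: "South West",
    10: "West Midlands, Metropolitan",
    11: "West Midlands, Non-metropolitan",
    12: "North West, Metropolitan",
    13: "North West, Non-metropolitan",
    14: "Wales",
    15: "Scotland",
}

# Codes whose label is explicitly marked as metropolitan.
METROPOLITAN_CODES = {
    code for code, label in PSU_STATS_REG_LABELS.items()
    if label.endswith(", Metropolitan")
}


def filter_by_stats_region(nts: pd.DataFrame, codes, col: str = "PSUStatsReg_B01ID") -> pd.DataFrame:
    """Keep rows whose statistical-region code is in `codes` (hypothetical helper)."""
    return nts[nts[col].isin(list(codes))].copy()


# Toy rows; the codes are stored as floats, as in the listing above.
nts = pd.DataFrame({
    "household_id": [1, 2, 3, 4, 5, 6],
    "PSUStatsReg_B01ID": [3.0, 4.0, 8.0, 12.0, -8.0, 3.0],
})

# Narrow filter: metropolitan Yorkshire / Humberside only (e.g. for Leeds).
yorkshire_metro = filter_by_stats_region(nts, [3])

# Wider filter: every region labelled metropolitan, to keep the sample larger.
all_metro = filter_by_stats_region(nts, METROPOLITAN_CODES)

print(len(yorkshire_metro), len(all_metro))
```

Note that London Boroughs (8) carries no metropolitan label in this coding, so it is not in `METROPOLITAN_CODES`; whether to include it is a modelling choice.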