Filter NTS data to study area to avoid unrepresentative travel distances or mode share #16

Hussein-Mahfouz · 2024-04-18T09:08:18Z

Travel distances vary across different areas. For example, we cannot assume that commute distances in London are the same as those in Cambridge.

Matching (#8) is done based on socioeconomic and demographic variables, but individuals/households in different parts of the country that share the same variables may exhibit different travel behaviour due to land use / transport options. If we don't filter, we may end up with travel distances in our study area that are too long, or a mode share that is not representative.

When carrying out matching for any area, we should filter the NTS survey data for that area. Initially I was doing this (see this function), but the sample became too small, and matching at the household level (see the notebook) resulted in a low matching rate.

Possible workarounds:

Use filter_by_region() but include all regions that would have similar travel patterns to the study area
Use filter_by_region() to filter the NTS to the study area. Apply propensity score matching on the household level, as described in Improve matching approach #13
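To make the first workaround concrete, here is a minimal sketch of filtering a survey table to a study area plus regions assumed to have similar travel patterns. The signature of the repository's filter_by_region() is not shown in this issue, so the sketch uses a stand-in, `filter_nts_by_region`. The column name `region`, the region labels, and the toy data are all assumptions for illustration.

```python
import pandas as pd


def filter_nts_by_region(
    nts: pd.DataFrame, regions: list[str], region_col: str = "region"
) -> pd.DataFrame:
    """Keep only the rows whose region is in `regions`.

    Hypothetical stand-in for filter_by_region(); `region_col` is an
    assumed column name, not necessarily the one used in the NTS tables.
    """
    missing = set(regions) - set(nts[region_col].unique())
    if missing:
        # Fail loudly on a misspelt region rather than silently returning less data
        raise ValueError(f"Regions not found in the data: {sorted(missing)}")
    return nts[nts[region_col].isin(regions)].copy()


# Toy household table (made-up rows, not real NTS records)
nts_households = pd.DataFrame(
    {
        "household_id": [1, 2, 3, 4, 5],
        "region": [
            "London",
            "Yorkshire and the Humber",
            "North West",
            "London",
            "Yorkshire and the Humber",
        ],
    }
)

# Study area only: most representative, smallest sample
study_area_only = filter_nts_by_region(nts_households, ["Yorkshire and the Humber"])

# Study area plus a region assumed to be similar: larger sample
with_similar = filter_nts_by_region(
    nts_households, ["Yorkshire and the Humber", "North West"]
)

print(len(study_area_only), len(with_similar))
```

Comparing the two row counts before matching gives a quick indication of whether the study-area-only sample is large enough, or whether similar regions need to be added.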


Hussein-Mahfouz · 2024-11-01T12:40:20Z

@sgreenbury @BZ-BowenZhang and I discussed this today, and we should implement it given regional variations in travel time and mode share.

Steps:

Uncomment filter_by_region() here and here in script 2
Remove regions from here and add to config
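One way the second step could look is sketched below: the hard-coded region list is replaced by values read from a config mapping and validated once. The key names `nts_regions` and `nts_years`, the class `NTSFilterConfig`, and the example values are hypothetical; the project's actual config format and keys are not shown in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NTSFilterConfig:
    """Hypothetical container for the NTS filters moved out of the script."""

    regions: tuple[str, ...]
    years: tuple[int, ...]

    def __post_init__(self) -> None:
        # An empty region list would silently filter out every household
        if not self.regions:
            raise ValueError("At least one NTS region must be configured")


def load_nts_filter_config(raw: dict) -> NTSFilterConfig:
    """Build the config from a parsed config mapping (assumed key names)."""
    return NTSFilterConfig(
        regions=tuple(raw["nts_regions"]),
        years=tuple(int(year) for year in raw["nts_years"]),
    )


# Illustrative values, as they might appear after parsing a config file
raw_config = {
    "nts_regions": ["Yorkshire and the Humber", "North West"],
    "nts_years": [2019, 2021],
}

nts_config = load_nts_filter_config(raw_config)
print(nts_config.regions)
```

Keeping the regions in config means a user applying the model to a different study area changes one file rather than editing the script.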

Nice to have

Jupyter notebook showing regional variations in mode share, travel times, etc. This could inform user choice on which regions from the NTS to include in their model. If a user is applying the model to Leeds, they may decide to use Yorkshire and The Humber only, or include other regions that have similar characteristics. The latter is useful for keeping the sample size from becoming too small.
Include more detailed regional filtering: the PSUStatsReg_B01ID column in the NTS has the breakdown below. The metropolitan / non-metropolitan categorisation is useful for getting representative data (but will reduce our sample size).

	Value = -10.0	Label = DEAD
	Value = -9.0	Label = DNA
	Value = -8.0	Label = NA
	Value = 1.0	Label = Northern, Metropolitan
	Value = 2.0	Label = Northern, Non-metropolitan
	Value = 3.0	Label = Yorkshire / Humberside, Metropolitan
	Value = 4.0	Label = Yorkshire / Humberside, Non-metropolitan
	Value = 5.0	Label = East Midlands
	Value = 6.0	Label = East Anglia
	Value = 7.0	Label = South East (excluding London Boroughs)
	Value = 8.0	Label = London Boroughs
	Value = 9.0	Label = South West
	Value = 10.0	Label = West Midlands, Metropolitan
	Value = 11.0	Label = West Midlands, Non-metropolitan
	Value = 12.0	Label = North West, Metropolitan
	Value = 13.0	Label = North West, Non-metropolitan
	Value = 14.0	Label = Wales
	Value = 15.0	Label = Scotland
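As a rough sketch of how this breakdown could drive more detailed filtering, the codes can be held in a lookup and selected by label. The values and labels are copied from the list above; the helper names and the toy list of PSU values are hypothetical.

```python
from collections import Counter

# Codes and labels copied from the PSUStatsReg_B01ID breakdown above
PSU_STATS_REGION_LABELS = {
    -10: "DEAD",
    -9: "DNA",
    -8: "NA",
    1: "Northern, Metropolitan",
    2: "Northern, Non-metropolitan",
    3: "Yorkshire / Humberside, Metropolitan",
    4: "Yorkshire / Humberside, Non-metropolitan",
    5: "East Midlands",
    6: "East Anglia",
    7: "South East (excluding London Boroughs)",
    8: "London Boroughs",
    9: "South West",
    10: "West Midlands, Metropolitan",
    11: "West Midlands, Non-metropolitan",
    12: "North West, Metropolitan",
    13: "North West, Non-metropolitan",
    14: "Wales",
    15: "Scotland",
}


def metropolitan_codes() -> list[int]:
    """Codes whose label carries the explicit ', Metropolitan' suffix."""
    return sorted(
        code
        for code, label in PSU_STATS_REGION_LABELS.items()
        if label.endswith(", Metropolitan")
    )


def codes_matching(name: str) -> list[int]:
    """Valid (positive) codes whose label contains `name`, ignoring case."""
    return sorted(
        code
        for code, label in PSU_STATS_REGION_LABELS.items()
        if code > 0 and name.lower() in label.lower()
    )


# Toy PSU values, stored as floats as in the listing above (made-up data)
psu_values = [3.0, 3.0, 4.0, 8.0, 12.0]

yorkshire_all = set(codes_matching("Yorkshire"))
yorkshire_metro = yorkshire_all & set(metropolitan_codes())

# Sample size per code after each filter; 3.0 == 3, so float values match int codes
counts_all = Counter(v for v in psu_values if v in yorkshire_all)
counts_metro = Counter(v for v in psu_values if v in yorkshire_metro)
print(sum(counts_all.values()), sum(counts_metro.values()))
```

Note that London Boroughs (8) has no metropolitan suffix in this breakdown, so a selection based on the suffix alone excludes it; it would need to be added explicitly if wanted.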

Hussein-Mahfouz added the enhancement New feature or request label Apr 18, 2024

Hussein-Mahfouz self-assigned this Apr 18, 2024

Hussein-Mahfouz added the Task 1 creating activity chains label Apr 19, 2024

sgreenbury mentioned this issue Aug 7, 2024

Document data sources and script generating the required input data #39

Closed

6 tasks

Hussein-Mahfouz mentioned this issue Nov 1, 2024

Add region and year to config #64

Closed

Hussein-Mahfouz mentioned this issue Nov 14, 2024

add filter_by_region fn and migrate region, years, and travday to config #67

Merged

Hussein-Mahfouz linked a pull request Nov 14, 2024 that will close this issue

add filter_by_region fn and migrate region, years, and travday to config #67

Merged

Hussein-Mahfouz closed this as completed in #67 Nov 19, 2024
