From 914882b025dd532fd53cf645615c138a9a3b736d Mon Sep 17 00:00:00 2001
From: Sam Greenbury
Date: Wed, 31 Jul 2024 15:37:13 +0100
Subject: [PATCH] Apply pre-commit

---
 BACKGROUND.md | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/BACKGROUND.md b/BACKGROUND.md
index a280da1..ccaf678 100644
--- a/BACKGROUND.md
+++ b/BACKGROUND.md
@@ -17,19 +17,19 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil

 ### Adding activity patterns to synthetic population

-#### NTS data
+#### NTS data
 - We are currently using the entire NTS sample, but this could include trips with unrepresentative distances (e.g. commuting distance in London is not the same as liverpool). See https://github.com/Urban-Analytics-Technology-Platform/acbm/issues/16

-#### Household level matching
+#### Household level matching
 - We use categorical matching at the household level (level 1) and then propensity score matching (PSM) at the individual level (level 2)
 - We need to implement PSM from the beginning to ensure that each individual in the SPC is matched to at least one sample from the NTS. See https://github.com/Urban-Analytics-Technology-Platform/acbm/issues/13
 - Matching variables are decided using trial and error (see [2_match_households_and_individuals](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/d2f9e747c3d55148316661b13b1650fac4a5a4ad/notebooks/2_match_households_and_individuals.ipynb). Using PSM would allow us to use all variables
 - For each SPC household, we randomly select one of the matched NTS households
 - Rest of the assumptions are outlined in the [wiki page](https://github.com/Urban-Analytics-Technology-Platform/acbm/wiki/Adding-activity-patterns-to-synthetic-population)

-#### Individual level matching
+#### Individual level matching
 - Done based on age_group and sex only. PSM without replacement
-
+
 ### Assigning activities to geographic locations

 #### Mode and trip purpose mapping
@@ -51,11 +51,11 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
 - For education POIs, I've done the following:

 > "kindergarden": ["education_kg", "work"],
->
+>
 > "school": ["education_school", "work"],
->
+>
 > "university": ["education_university", "work"],
->
+>
 > "college": ["education_college", "work"],

 ##### Selecting feasible zones for each activity
@@ -68,27 +68,27 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
 - If an individual in the NTS has an "education" activity, I map their age to an education type. See the age_group_mapping dictionary in 3_locations_primary:

 > age_group_mapping = {
->
+>
 > 1: "education_kg", # "0-4"
->
+>
 > 2: "education_school", # "5-10"
->
+>
 > 3: "education_school", # "11-16"
->
+>
 > 4: "education_university", # "17-20"
->
+>
 > 5: "education_university", # "21-29"
->
+>
 > 6: "education_university", # "30-39"
->
+>
 > 7: "education_university", # "40-49"
->
+>
 > 8: "education_university", # "50-59"
->
+>
 > 9: "education_university" # "60+"
 > }

-
+
 - When selecting a location for an education activity in [select_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L578), we try to select a zone that has a POI that matches the persons age group. If we can't we choose any other feasible zone with an education POI
 - This logic should be moved upstream to the [get_possible_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L201). For each activity, we should always ensure that our list of feasible zones has a zone with our specific POI category. This should be added in the [filter_by_activity](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L374) logic. The filter_by_activity logic currently looks at activity purpose from the NTS (e.g. "education"). We need to add the extra level of detail from age_group_mapping, and then filter based on that instead
 - We select a zone from the feasible zones probabilistically based on total floor area of the POIs that match the relevant activity. See [select_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L578)
@@ -103,4 +103,3 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
 - (**DONE** [here](https://github.com/Urban-Analytics-Technology-Platform/acbm/commit/6acecb928ea2b9bf26952eb45b86f2918a6dccdf)) migrate logic for age_group_mapping from `select_zone()` to `get_possible_zones()`
 - edit `get_possible_zones()` to ensure it never returns an empty list of zones. See above for how to do this
 - 14/05/2024: I created another function `fill_missing_zones()`. see [this commit](https://github.com/Urban-Analytics-Technology-Platform/acbm/commit/10ae82b3923cdc51474d3721df80e332ea74ba03#diff-48d91584494e303c162dd8c5b8881de35f33976f2f688cd5a56db01b7ff1f233)
-