
Commit

Merge pull request #15 from Urban-Analytics-Technology-Platform/12-task-2b-primary-locations-work

Adds methods, notebooks and script for adding primary locations (#12)
sgreenbury authored Jul 31, 2024
2 parents 54a84c5 + 914882b commit ac834c1
Showing 14 changed files with 7,905 additions and 462 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -160,3 +160,6 @@ Thumbs.db

# Misc
.copier-answers.yml

# Ignore data path
data/
3 changes: 3 additions & 0 deletions .pre-commit-config.yaml
@@ -1,3 +1,6 @@
# Exclude notebooks from pre-commit
exclude: \.ipynb$

ci:
autoupdate_commit_msg: "chore: update pre-commit hooks"
autofix_commit_msg: "style: pre-commit fixes"
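
The top-level `exclude` key is a Python regular expression that pre-commit searches for within each file path. A minimal sketch of what the pattern above matches follows; the paths are hypothetical and used only for illustration.

```python
import re

# Same pattern as the top-level `exclude` key in the config above.
EXCLUDE = re.compile(r"\.ipynb$")

# Hypothetical repository paths, for illustration only.
paths = [
    "notebooks/1_prep_synthpop.ipynb",
    "src/acbm/assigning.py",
    "docs/notes.ipynb.md",
]

# A path is excluded when the pattern is found anywhere in it (re.search).
excluded = [p for p in paths if EXCLUDE.search(p)]
checked = [p for p in paths if not EXCLUDE.search(p)]

print("excluded:", excluded)
print("checked:", checked)
```

The `$` anchor means only paths that end in `.ipynb` are skipped, so a file such as `docs/notes.ipynb.md` would still be checked.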
35 changes: 17 additions & 18 deletions BACKGROUND.md
@@ -17,19 +17,19 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil

### Adding activity patterns to synthetic population

#### NTS data
#### NTS data
- We are currently using the entire NTS sample, but this could include trips with unrepresentative distances (e.g. commuting distances in London are not the same as in Liverpool). See https://github.com/Urban-Analytics-Technology-Platform/acbm/issues/16

#### Household level matching
#### Household level matching
- We use categorical matching at the household level (level 1) and then propensity score matching (PSM) at the individual level (level 2)
- We need to implement PSM from the beginning to ensure that each individual in the SPC is matched to at least one sample from the NTS. See https://github.com/Urban-Analytics-Technology-Platform/acbm/issues/13
- Matching variables are chosen by trial and error (see [2_match_households_and_individuals](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/d2f9e747c3d55148316661b13b1650fac4a5a4ad/notebooks/2_match_households_and_individuals.ipynb)). Using PSM would allow us to use all variables
- For each SPC household, we randomly select one of the matched NTS households
- The remaining assumptions are outlined in the [wiki page](https://github.com/Urban-Analytics-Technology-Platform/acbm/wiki/Adding-activity-patterns-to-synthetic-population)
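
The household-level steps above (exact categorical matching, then a random pick among the matched NTS households) can be sketched roughly as follows. This is an illustrative sketch, not the code in the notebook: the function name `match_households`, the id fields `spc_id` and `nts_id`, and the matching variables `num_adults` and `num_cars` are all assumptions.

```python
import random
from collections import defaultdict


def match_households(spc_households, nts_households, keys, seed=0):
    """Exact categorical match on `keys`, then pick one NTS household at random.

    All field names are hypothetical; households are plain dicts.
    """
    rng = random.Random(seed)

    # Index NTS households by their combination of category values.
    nts_by_category = defaultdict(list)
    for hh in nts_households:
        nts_by_category[tuple(hh[k] for k in keys)].append(hh["nts_id"])

    matches = {}
    for hh in spc_households:
        candidates = nts_by_category.get(tuple(hh[k] for k in keys), [])
        # An SPC household with no exact match is left unmatched (None).
        matches[hh["spc_id"]] = rng.choice(candidates) if candidates else None
    return matches


spc = [
    {"spc_id": "s1", "num_adults": 2, "num_cars": 1},
    {"spc_id": "s2", "num_adults": 1, "num_cars": 0},
]
nts = [
    {"nts_id": "n1", "num_adults": 2, "num_cars": 1},
    {"nts_id": "n2", "num_adults": 2, "num_cars": 1},
    {"nts_id": "n3", "num_adults": 3, "num_cars": 2},
]
result = match_households(spc, nts, keys=("num_adults", "num_cars"))
print(result)
```

In this toy input, household `s2` has no exact match and is returned as `None`, which is the kind of gap that motivates using PSM from the beginning.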

#### Individual level matching
#### Individual level matching
- Matching is based on age_group and sex only, using PSM without replacement
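
As a rough illustration of matching without replacement, the sketch below greedily pairs each SPC individual with the nearest still-unused NTS individual by propensity score. It assumes the scores have already been estimated elsewhere (for example from age_group and sex); the function name and the greedy strategy are assumptions for illustration, not necessarily what the repository does.

```python
def match_without_replacement(spc_scores, nts_scores):
    """Greedy nearest-neighbour matching on precomputed propensity scores.

    spc_scores / nts_scores: {individual_id: score}. Both are hypothetical inputs.
    Each NTS individual can be used at most once.
    """
    available = dict(nts_scores)  # copy, so the caller's dict is not modified
    matches = {}
    for spc_id, score in spc_scores.items():
        if not available:
            # No NTS individuals left: this SPC individual stays unmatched.
            matches[spc_id] = None
            continue
        nearest = min(available, key=lambda nts_id: abs(available[nts_id] - score))
        matches[spc_id] = nearest
        del available[nearest]  # "without replacement"
    return matches


pairs = match_without_replacement(
    {"a": 0.20, "b": 0.25, "c": 0.90},
    {"x": 0.22, "y": 0.80},
)
print(pairs)
```

Because each NTS individual is removed once used, later SPC individuals can end up with a poorer match or none at all, so the result of a greedy pass depends on the processing order.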

### Assigning activities to geographic locations

#### Mode and trip purpose mapping 
@@ -51,11 +51,11 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
- For education POIs, I've done the following:   

> "kindergarden": ["education_kg", "work"],
>
>
> "school": ["education_school", "work"],
>
>
> "university": ["education_university", "work"],
>
>
> "college": ["education_college", "work"],
##### Selecting feasible zones for each activity
@@ -68,27 +68,27 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
- If an individual in the NTS has an "education" activity, I map their age to an education type. See the age_group_mapping dictionary in 3_locations_primary:

> age_group_mapping = {
>
>
> 1: "education_kg", # "0-4"
>
>
> 2: "education_school", # "5-10"
>
>
> 3: "education_school", # "11-16"
>
>
> 4: "education_university", # "17-20"
>
>
> 5: "education_university", # "21-29"
>
>
> 6: "education_university", # "30-39"
>
>
> 7: "education_university", # "40-49"
>
>
> 8: "education_university", # "50-59"
>
>
> 9: "education_university" # "60+"
> }

- When selecting a location for an education activity in [select_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L578), we try to select a zone that has a POI matching the person's age group. If we can't, we choose any other feasible zone with an education POI
- This logic should be moved upstream to the [get_possible_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L201). For each activity, we should always ensure that our list of feasible zones has a zone with our specific POI category. This should be added in the [filter_by_activity](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L374) logic. The filter_by_activity logic currently looks at activity purpose from the NTS (e.g. "education"). We need to add the extra level of detail from age_group_mapping, and then filter based on that instead
- We select a zone from the feasible zones probabilistically based on total floor area of the POIs that match the relevant activity. See [select_zone](https://github.com/Urban-Analytics-Technology-Platform/acbm/blob/c548fa7a6398dd0afde1398f7799e418b6068cd6/src/acbm/assigning.py#L578)
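
The selection logic described in this list can be sketched as below: prefer zones with a POI type matching the person's age group, fall back to any education POI, and sample a zone with probability proportional to floor area. This is a simplified sketch, not the `select_zone` implementation in `assigning.py`; the function name `select_zone_sketch` and the `{zone_id: {poi_type: floor_area}}` input shape are assumptions.

```python
import random

# Copied from the age_group_mapping dictionary quoted above.
AGE_GROUP_MAPPING = {
    1: "education_kg",
    2: "education_school",
    3: "education_school",
    4: "education_university",
    5: "education_university",
    6: "education_university",
    7: "education_university",
    8: "education_university",
    9: "education_university",
}


def select_zone_sketch(age_group, zone_floor_area, rng=None):
    """Pick one feasible zone, weighted by floor area of relevant education POIs.

    zone_floor_area: hypothetical {zone_id: {poi_type: total_floor_area}}.
    """
    rng = rng or random.Random()
    preferred = AGE_GROUP_MAPPING[age_group]

    # First try: only floor area of the POI type matching the age group.
    weights = {z: areas.get(preferred, 0.0) for z, areas in zone_floor_area.items()}

    if not any(weights.values()):
        # Fallback: floor area of any education POI in the zone.
        weights = {
            z: sum(a for poi, a in areas.items() if poi.startswith("education_"))
            for z, areas in zone_floor_area.items()
        }

    zones = [z for z, w in weights.items() if w > 0]
    if not zones:
        return None  # no feasible zone has an education POI
    return rng.choices(zones, weights=[weights[z] for z in zones], k=1)[0]


feasible = {
    "E1": {"education_school": 100.0},
    "E2": {"education_university": 500.0},
}
print(select_zone_sketch(2, feasible, random.Random(1)))
```

For age group 2 only zone `E1` has a matching POI, so it is always chosen. For age group 1 there is no kindergarten POI, so the fallback samples between `E1` and `E2` in proportion to their education floor area.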
@@ -103,4 +103,3 @@ Since the SPC currently uses 2011 OA11CD and MSOA11CD codes, 2011 boundaries wil
- (**DONE** [here](https://github.com/Urban-Analytics-Technology-Platform/acbm/commit/6acecb928ea2b9bf26952eb45b86f2918a6dccdf)) migrate logic for age_group_mapping from `select_zone()` to `get_possible_zones()`
- edit `get_possible_zones()` to ensure it never returns an empty list of zones. See above for how to do this
- 14/05/2024: I created another function `fill_missing_zones()`. See [this commit](https://github.com/Urban-Analytics-Technology-Platform/acbm/commit/10ae82b3923cdc51474d3721df80e332ea74ba03#diff-48d91584494e303c162dd8c5b8881de35f33976f2f688cd5a56db01b7ff1f233)

4 changes: 2 additions & 2 deletions notebooks/1_prep_synthpop.ipynb
@@ -53,7 +53,7 @@
"source": [
"# Pick a region with SPC output saved\n",
"path = \"../data/external/spc_output/raw/\"\n",
-"region = \"west-yorkshire\""
+"region = \"leeds\""
]
},
{
@@ -195,7 +195,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.11.8"
+"version": "3.11.9"
}
},
"nbformat": 4,