Task 2b: Primary Locations (Work) #12

Closed
Hussein-Mahfouz opened this issue Apr 4, 2024 · 4 comments · Fixed by #15 or #29
Assignees
Labels
overview overview of steps for a subtask Task 2 assigning activities to geographic locations

Comments

@Hussein-Mahfouz
Collaborator

How do you add work locations for the SPC after each individual has been mapped to an individual in the NTS? We start with the input data (let's call it spc_activity_chains). Each individual now has:

  • sic1d2007: Standard Industry Classification of economic activities 2007, 1st level (derived from UK TUS 2015) - (from the SPC)
  • an activity chain with TripPurpose | TripPurposeFrom | TripEnd | TripTotalTime | Mode - (from the NTS)

A potential workflow could include:

1. From spc_activity_chains, filter all individuals with TripPurpose = work

2. Identify spatial distribution of different jobs

  1. Use the businessRegistry.csv.gz file produced in the SPC project (see the SPC documentation). It gives a breakdown of business units by sic1d2007

3. Determine feasible locations (zones) of workplace

  1. Calculate travel time matrix by mode at the zone level. This can be done with a routing engine like r5r

  2. For each person in spc_activity_chains, identify zones that are reachable within a buffer time of TripTotalTime (e.g. TripTotalTime +- 15 minutes)

    • The actual time should be based on the Mode used by the person
    • The assumption is that work trips are home-based. In step 1, we could focus only on trips where TripPurpFrom = Home

4. Choose a zone from feasible zones

  1. For each person, the probability of commuting to a zone from the feasible zones is proportional to the number of jobs in the zone that match their sic1d2007 
  2. We loop over each person in spc_activity_chains. Once a person is assigned to zone i, we reduce the number of jobs for their sic1d2007 in zone i by 1. This ensures that we don't overassign people to zones (i.e. we sample without replacement)
  3. The assignment of people to zones needs to be constrained by the census flow data

5. Choose a specific workplace

  1. Same logic as step 4, but we need a spatial dataset of work locations. Options:
    • Workplaces created in the SPC (link). I think this makes the most sense to start with, but I should understand better how it was created
    • osmox: I think all workplaces are the same. No disaggregation by sector
    • ONS POI data. Need to check if they allow free access to academics
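
A minimal sketch of steps 3 and 4 above, using only the Python standard library. The zone names, travel times, job counts and the helper names `feasible_zones` and `assign_zone` are all made up for illustration; a real travel time matrix would come from a routing engine, and the census flow constraint from step 4.3 is not handled here.

```python
import random

# Hypothetical toy inputs; zone names, times and job counts are made up.
# travel_time[mode][(origin, destination)] -> minutes, e.g. from a routing engine
travel_time = {
    "car": {("A", "A"): 5, ("A", "B"): 20, ("A", "C"): 45},
    "pt": {("A", "A"): 10, ("A", "B"): 35, ("A", "C"): 70},
}
# jobs[(zone, sic1d2007)] -> remaining jobs in that zone and sector
jobs = {("A", "C"): 1, ("B", "C"): 3, ("C", "C"): 10, ("B", "P"): 2}


def feasible_zones(home, mode, trip_total_time, buffer=15):
    """Zones whose modelled travel time is within +/- buffer of the reported time."""
    return [
        dest
        for (orig, dest), minutes in travel_time[mode].items()
        if orig == home and abs(minutes - trip_total_time) <= buffer
    ]


def assign_zone(person, rng):
    """Pick a feasible zone with probability proportional to remaining matching jobs."""
    zones = feasible_zones(person["home"], person["mode"], person["TripTotalTime"])
    weights = [jobs.get((z, person["sic1d2007"]), 0) for z in zones]
    if sum(weights) == 0:
        return None  # no feasible zone has a matching job left
    zone = rng.choices(zones, weights=weights, k=1)[0]
    jobs[(zone, person["sic1d2007"])] -= 1  # sampling without replacement
    return zone


rng = random.Random(0)
people = [
    {"id": 1, "home": "A", "mode": "car", "TripTotalTime": 25, "sic1d2007": "C"},
    {"id": 2, "home": "A", "mode": "pt", "TripTotalTime": 30, "sic1d2007": "P"},
    {"id": 3, "home": "A", "mode": "car", "TripTotalTime": 90, "sic1d2007": "C"},
]
assignments = {p["id"]: assign_zone(p, rng) for p in people}
```

People with no feasible zone (person 3 here) come back as `None`, so they can be counted and handled separately, for example by widening the buffer.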

Notes

Step 4 / 5

  • Not sure if step 4 is necessary - I could just jump to Step 5
  • This paper seems to do step 4 / 5 in a clever way, but I don't really understand it. The lit review could be useful

Other

  • how to handle people travelling together? TripTime could be distorted by a household member dropping off another household member on the way to work.
    • If this is not accounted for, some people will be assigned to work locations that are further away than reality
    • How many people does this affect? Negligible?
    • What is the actual detour? Negligible?
@Hussein-Mahfouz Hussein-Mahfouz added the overview overview of steps for a subtask label Apr 5, 2024
@Hussein-Mahfouz
Collaborator Author

notes from meeting with @stuartlynn re handling people travelling together:

  • My assumption above is that all work trips are home -> work. I need to check the NTS data to see the distribution of TripPurposeFrom when TripPurposeTo is work
  • It may make more sense to model trips to school first. It is reasonable to assume that all school trips originate at home. Once school trips are assigned to physical locations, I can then model the school -> work trip for adults who drop off their kids and then go to work.
  • The method would be the same as that outlined above, but feasible locations would be the zones reachable within TripTotalTime from the school location
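
The check in the first bullet could look roughly like the sketch below. The trip records and purpose labels are invented, and the column names follow the ones used earlier in this issue (`TripPurpose`, `TripPurposeFrom`); the real NTS names and codes may differ.

```python
from collections import Counter

# Hypothetical trip records; real NTS column names and purpose codes may differ.
trips = [
    {"TripPurposeFrom": "Home", "TripPurpose": "Work"},
    {"TripPurposeFrom": "Home", "TripPurpose": "Work"},
    {"TripPurposeFrom": "Escort education", "TripPurpose": "Work"},
    {"TripPurposeFrom": "Home", "TripPurpose": "Education"},
    {"TripPurposeFrom": "Shopping", "TripPurpose": "Work"},
]

# Count where work trips start from, then convert the counts to shares.
work_origins = Counter(
    t["TripPurposeFrom"] for t in trips if t["TripPurpose"] == "Work"
)
total = sum(work_origins.values())
origin_shares = {origin: n / total for origin, n in work_origins.items()}
```

If the share for Home is close to 1, the home-based assumption is probably fine; a sizeable share for escort or education origins would support modelling school trips first.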

@sgreenbury
Collaborator

From discussion of options on matching people to workplaces (and other locations):

  • Current matching (code) of people to workplaces in SPC (also included in output)
  • Other options discussed include distance and time constraints to satisfy schedules matched from NTS

@Hussein-Mahfouz Hussein-Mahfouz added the Task 2 assigning activities to geographic locations label Apr 19, 2024
@BZ-BowenZhang BZ-BowenZhang self-assigned this Apr 26, 2024
@Hussein-Mahfouz
Collaborator Author

How to constrain the flows to the census commuting data: see section 2.4.2 of "A dynamic microsimulation model for epidemics" (the dyme paper):

we initially adopt a stylized approach constructing ‘virtual workplaces’ which rely on the 2011 UK Census commuting origin-destination tables at the MSOA level for individuals with a fixed workplace. The UKTUS data includes a Standard Industry Classification (SIC) code for everyone in the dataset. Matching data from the UKTUS to SPENSER baseline data via the PSM process and the UKTUS we were able to assign to each of our synthetic resident workers an employer industry among the 21 divisions from the Standard Industry Classification (SIC) 2007. We assume that all workers have an equal ex ante probability to commute to all destinations independently from the SIC to which they belong. We build the set of possible destinations by multiplying the number of MSOAs in the study area, M = 107, to that of the SIC divisions, S = 21, obtaining 2,247 options. We then populate these virtual workplaces with synthetic workers based on their reference SIC and their Census relative probability to commute from Mi to any Mj, with j = 1...i...J, thus including the MSOA in which the worker resides.
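
My reading of the quoted approach, as a rough sketch: one virtual workplace per (MSOA, SIC division) pair, with the destination MSOA drawn from the census flow shares of the worker's home MSOA. The MSOA codes, flow counts and the `assign_virtual_workplace` helper are made up for illustration and are not taken from the paper's code.

```python
import random
from itertools import product

# Hypothetical toy study area: 3 MSOAs and 2 SIC divisions
# (the paper uses M = 107 and S = 21, giving 2,247 virtual workplaces).
msoas = ["M1", "M2", "M3"]
sic_divisions = ["C", "P"]

# One virtual workplace per (MSOA, SIC division) pair.
virtual_workplaces = list(product(msoas, sic_divisions))

# Made-up census commuter counts: od_flows[origin][destination].
od_flows = {
    "M1": {"M1": 50, "M2": 30, "M3": 20},
    "M2": {"M1": 10, "M2": 80, "M3": 10},
    "M3": {"M1": 0, "M2": 40, "M3": 60},
}


def assign_virtual_workplace(home_msoa, sic, rng):
    """Draw a destination MSOA from the census flow shares of the home MSOA.

    The worker's own SIC division then fixes which virtual workplace they join.
    """
    destinations = list(od_flows[home_msoa])
    weights = [od_flows[home_msoa][d] for d in destinations]
    dest = rng.choices(destinations, weights=weights, k=1)[0]
    return (dest, sic)


rng = random.Random(42)
workers = [("M1", "C"), ("M2", "P"), ("M3", "C")]  # (home MSOA, SIC division)
assigned = [assign_virtual_workplace(home, sic, rng) for home, sic in workers]
```

Because the destination is drawn independently of the SIC division, the aggregate flows follow the census table in expectation, but nothing here uses travel time or the number of jobs per sector.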

@sgreenbury
Collaborator

@BZ-BowenZhang: as discussed Friday, just adding some further detail on ideas for two options for the workplace locations, feel free to let me know if helpful to discuss further at all.

Aim: to assign a workplace location to a given person's schedule after matching NTS to SPC and measure the consistency/validity with observed data sources and modelling.

A. Current SPC approach

  • A workplace is already assigned to each individual in the SPC, but we might prefer an alternative approach (see below) for model consistency/performance. The SPC approach is described in the docs and code. In the output data, the workplace is available for each person and indexes into the venues table. These are probably most easily accessed using the Reader class from the SPC toolkit (uatk-spc), which is available in acbm:
from uatk_spc.reader import Reader
spc = Reader(path, region, backend="pandas", input_type="parquet")
# Has workplace assigned
spc.people
# Has location for a given workplace
spc.venues_per_activity

B. Alternative approach with feasible zones

Validation and comparisons

  • Regarding validation (see "Validation framework for model", #17) and comparing approaches A and B:
    • Given we have the travel time/mode from NTS, we could measure the performance of both approaches in terms of how well the expected travel time for the chosen location from the travel time matrices matches the one recorded in NTS
    • As described in the docs for validation of the current workplaces ($\rho = 0.7$ Pearson correlation coefficient with MSOA flows), we could also measure this for both approaches here.
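
Both metrics could be computed along the lines of the sketch below. All numbers are made up, and the `pearson` helper is written out by hand only to keep the example dependency-free; in practice a library routine would do the same job.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up MSOA-to-MSOA commuter counts, keyed by (origin, destination).
census_flows = {("M1", "M1"): 50, ("M1", "M2"): 30, ("M2", "M1"): 10, ("M2", "M2"): 80}
modelled_flows = {("M1", "M1"): 45, ("M1", "M2"): 35, ("M2", "M1"): 15, ("M2", "M2"): 75}

# Correlation between census and modelled flows over the same OD pairs.
pairs = sorted(census_flows)
rho = pearson(
    [census_flows[p] for p in pairs],
    [modelled_flows.get(p, 0) for p in pairs],
)

# Made-up per-person times in minutes: reported in NTS vs. looked up in the
# travel time matrix for the assigned workplace.
nts_minutes = [25, 40, 15]
matrix_minutes = [20, 45, 15]
mean_abs_error = sum(
    abs(a - b) for a, b in zip(nts_minutes, matrix_minutes)
) / len(nts_minutes)
```

Running the same two calculations on the outputs of approach A and approach B would give directly comparable numbers.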

@Hussein-Mahfouz Hussein-Mahfouz linked a pull request May 22, 2024 that will close this issue
sgreenbury added a commit that referenced this issue Jul 31, 2024
…sk-2b-primary-locations-work

Adds methods, notebooks and script for adding primary locations (#12)