Implementation status

Hana Sevcikova edited this page Jan 29, 2019 · 35 revisions

This page describes the current status of the implementation on the dev branch.

Real estate price model

Residential

  • Estimation works
  • Needs a re-estimation
  • Config files: repmres.yaml (input to estimation), repmrescoef.yaml (output of estimation and input to simulation)

Non-Residential

  • Estimation works
  • Needs a re-estimation
  • Config files: repmnr.yaml (input to estimation), repmnrcoef.yaml (output of estimation and input to simulation)

Household location choice model

  • Currently uses random sampling of alternatives with equal weights.
  • We would like to base the weights on the current location (i.e. sample more alternatives from close by than from further away). Scott Bridwell wrote code for this; it still needs to be incorporated.
  • Estimation (mechanically) works
  • Needs a re-estimation; one coefficient in the in-migrants sub-model is estimated at a value of 3 (which is bad).
  • The current estimation dataset has 2063 residents and 568 in-migrants (is this correct?)
  • When we estimate, the estimation households are added to all households to create the households universe.
  • For simulation, currently there are no capacity restrictions (Scott's code should help).
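
The location-based weighting mentioned above could look roughly like the following. This is a minimal sketch and not Scott Bridwell's code: the function name, the exponential distance decay, and the `decay` parameter are all assumptions made for illustration. With `decay=0` it reduces to the current equal-weight sampling.

```python
import numpy as np

def sample_alternatives(current_xy, alt_xy, n_sample, decay=1.0, rng=None):
    """Sample alternative indices without replacement, favouring nearby ones.

    current_xy : array of shape (2,), the household's current location
    alt_xy     : array of shape (n_alt, 2), candidate locations
    decay      : hypothetical distance-decay parameter (0 = equal weights)
    """
    rng = np.random.default_rng() if rng is None else rng
    alt_xy = np.asarray(alt_xy, dtype=float)
    # Euclidean distance from the current location to each alternative
    dist = np.linalg.norm(alt_xy - np.asarray(current_xy, dtype=float), axis=1)
    # Assumed weighting: closer alternatives get exponentially larger weights
    weights = np.exp(-decay * dist)
    probs = weights / weights.sum()
    return rng.choice(len(alt_xy), size=n_sample, replace=False, p=probs)

alts = [[0.0, 0.0], [1.0, 0.0], [2.0, 2.0], [5.0, 5.0], [9.0, 9.0]]
sampled = sample_alternatives([0.0, 0.0], alts, n_sample=3, decay=0.1,
                              rng=np.random.default_rng(1))
```
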

Developer model

The developer model consists of two parts:

  1. Feasibility model
  2. Developer picker (runs for all projects at once, regardless of whether they are residential or not)

See UDST example.

The following text describes how it works.

Inputs

  1. Standard tables:

    • parcels
    • buildings
    • target_vacancies
    • development_templates, development_template_components
    • development_constraints
    • land_use_types, building_types
    • other dependent tables, e.g. households and jobs, for computing vacancies etc.
  2. Table zoning_heights with the following columns:

    • plan_type_id
    • max_far (maximum FAR, i.e. floor area ratio, allowed by zoning)
    • max_dua (maximum dwelling units per acre allowed by zoning)
    • max_height (maximum height in ft allowed by zoning)
    • max_coverage (between 0 and 1, with missing values coded as -1; it is applied to the height only, since FAR and DUA already incorporate that information)
    • a column for each generic building type (office, commercial, industrial, mixed_use) with values 0 or 1 indicating which type is allowed. Residential building types are merged together into one column called "residential".
  3. Table parcel_zoning is automatically created by disaggregating zoning_heights to parcels.

  4. Configurations:

    See those files for additional constraints and assumptions.

  5. For computing rent, a coefficients file from the Opus expected sales price model, exported to CSV and named "expected_sales_unit_price_component_model_coefficients.csv". It is expected to be in the data directory.
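
To illustrate input 3, the disaggregation of zoning_heights to parcels can be sketched as a simple join. This assumes that each parcel carries a plan_type_id column, which the page does not state explicitly; the values below are made up, and the actual implementation may differ.

```python
import numpy as np
import pandas as pd

# Made-up zoning_heights table with the columns listed above
zoning_heights = pd.DataFrame({
    "plan_type_id": [1, 2],
    "max_far": [2.0, 0.5],
    "max_dua": [40.0, 8.0],
    "max_height": [85.0, 35.0],
    "max_coverage": [0.8, -1.0],   # -1 marks a missing value
    "residential": [1, 1],
    "office": [1, 0],
    "commercial": [1, 0],
    "industrial": [0, 0],
    "mixed_use": [1, 0],
})

# Assumption: parcels are indexed by parcel_id and have a plan_type_id column
parcels = pd.DataFrame(
    {"plan_type_id": [1, 2, 2]},
    index=pd.Index([101, 102, 103], name="parcel_id"),
)

# Disaggregate: every parcel inherits the zoning of its plan type
parcel_zoning = parcels.join(
    zoning_heights.set_index("plan_type_id"), on="plan_type_id"
)
# Turn the -1 missing-value code into NaN so it is not used as a real coverage
parcel_zoning["max_coverage"] = parcel_zoning["max_coverage"].replace(-1, np.nan)
```
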

Feasibility model

It is implemented in the orca step proforma_feasibility and in the run_feasibility function.

  1. Using the parcel_zoning dataset, parcel attributes max_far, max_dua, max_height, and max_coverage are derived. Using the buildings table, attributes ave_unit_size and land_cost are computed.
  2. The parcel dataset is reduced to empty parcels and those that can be redeveloped. This dataset is referred to below as "sites". The redevelopment filter is set in the model; it is currently the variable capacity_opportunity_non_gov, which also includes empty parcels.
  3. Dataset proforma_settings is created using development templates and their components. It contains the universal set of forms that can be developed.
  4. Create a feasibility dataset, i.e. a set of projects that can be developed on parcels (function run_feasibility; it is adapted from here):
    1. Object of class SqFtProForma is created. It holds default settings about forms.
    2. This object is updated, so that the default values are replaced with values coming from templates and template components. It results in 34 forms (from 8 building types).
    3. Price for each use (or building type) is computed on each site. The corresponding function is passed via the parcel_price_callback argument. Currently, we use a regression model estimated on observed CBA data, segmented by building type, where the dependent variable is log(expected price per sqft) and the independent variable is log(land value/sqft). It is implemented in parcel_sales_price_func which converts the value to a total price by exponentiating and multiplying by parcel size. These price arrays (one per building type) are appended to the dataset of sites as new columns with the same names as building types.
    4. For each form it is determined on which parcel it is allowed. The corresponding function is passed via the argument parcel_use_allowed_callback. Currently it is parcel_is_allowed_func. It compares the building type distribution in the form with the allowed zoning.
    5. For each form and parking configuration (surface, deck, underground), profitability is generated. This step is the core function of the model. It is implemented in _lookup_parking_cfg. The price column is weighted by the building type distribution to arrive at "weighted_rent". Various callback functions can be passed here, including ones that modify the dataset of sites, revenues, costs, and profits. The maximum profit for each form (column max_profit) is also calculated.
    6. If more than one proposal per parcel is to be kept, the data frame is sorted by the max_profit column in decreasing order, and the top proposals_to_keep proposals per building type are kept. Otherwise, only the maximum profit proposal over all uses is kept.
    7. Residential cost can be converted to yearly rent by applying a cap rate.
    8. After iterating over all forms, results are concatenated. Multiple proposals per parcel are stored as separate records in a long format, so the dataset index (which is parcel_id) is not necessarily unique. Therefore, an enumerating column feasibility_id is added, which can later be used as a unique index.
    9. If percent_of_max_profit is set (in the yaml file) and is larger than zero, only proposals that are within the given percentage of the maximum profit on each parcel are kept. We want this to be 65%.
  5. The final dataset is stored in orca as a table feasibility and returned.
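
Steps 4.3, 4.8 and 4.9 above can be sketched as follows. This is a simplified illustration, not the project's parcel_sales_price_func or run_feasibility: the coefficient values, the column names land_value and parcel_sqft, and the helper names are hypothetical. The filter reads "within the given percentage" as "at least that percentage of the parcel's best profit"; the wording admits other readings, so treat that as an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical (intercept, slope) per building type for
# log(price per sqft) = a + b * log(land value per sqft)
COEFS = {"residential": (1.0, 0.9), "office": (1.5, 0.8)}

def sales_price_sketch(sites, use):
    """Total expected price per site for one use (illustrative only)."""
    a, b = COEFS[use]
    log_price_sqft = a + b * np.log(sites["land_value"] / sites["parcel_sqft"])
    # Exponentiate to get price per sqft, then scale by parcel size
    return np.exp(log_price_sqft) * sites["parcel_sqft"]

def keep_near_max_profit(feasibility, percent_of_max_profit):
    """Keep proposals with profit >= percent of the best profit on the parcel."""
    if percent_of_max_profit <= 0:
        return feasibility
    # The index is parcel_id and may repeat (long format)
    best = feasibility.groupby(level=0)["max_profit"].transform("max")
    keep = feasibility["max_profit"].to_numpy() >= (
        best.to_numpy() * percent_of_max_profit / 100.0
    )
    return feasibility[keep]

sites = pd.DataFrame({"land_value": [100000.0], "parcel_sqft": [10000.0]},
                     index=pd.Index([1], name="parcel_id"))
price = sales_price_sketch(sites, "residential")

feasibility = pd.DataFrame(
    {"form": ["a", "b", "c", "a"], "max_profit": [100.0, 70.0, 50.0, 10.0]},
    index=pd.Index([1, 1, 1, 2], name="parcel_id"),
)
kept = keep_near_max_profit(feasibility, 65).copy()
# Enumerating column that can serve as a unique index later
kept["feasibility_id"] = np.arange(len(kept))
```
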

Developer Picker

Runs jointly for residential and non-residential proposals. It is implemented as an orca step developer_picker which calls the function run_developer.

  1. Using the target vacancies table and the existing number of agents in each building type, compute the required number of units (job spaces or residential units) to build.
  2. Residential units are computed as residential_sqft / ave_unit_size where ave_unit_size is a zonal average for each residential building type (passed in the argument ave_unit_size). Job spaces are computed as non_residential_sqft / bldg_sqft_per_job where bldg_sqft_per_job is taken from the building_sqft_per_job table.
  3. Current units for each parcel and building type are computed and matched with the feasibility dataset.
  4. Infeasible proposals are removed. These are proposals with negative maximum profit, and proposals on parcels larger than max_parcel_size (currently set to 50,000,000 so that no proposals are eliminated due to parcel size).
  5. Net units for each proposal are calculated (as proposed units minus current units by building type) and only those proposals are kept where either the sum over residential net units or the sum over non-residential net units is positive.
  6. Selection probabilities are computed. A custom function can be passed for this computation (argument profit_to_prob_func). By default, the probability is max_profit per sqft scaled to sum to 1 over all proposals.
  7. Proposals are selected, by default using the probabilities above. If multiple proposals per parcel are selected, only the first one per parcel is kept, and more proposals are sampled until there are either no proposals left or the target vacancy is met. Users can supply their own selection function via the argument custom_selection_func.
  8. The building type of new buildings is added using the callback function form_to_btype_callback. This is currently None, thus no building types are set. Note that there is one building per parcel, which makes it difficult to set building types for mixed-use buildings. The sanfran project, for example, chooses a building type randomly from a set of types per form (see here).
  9. New buildings are added to the old buildings.
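
The core of steps 1, 4, 6 and 7 can be sketched as below. This is a simplified illustration rather than the run_developer implementation: the vacancy formula is one common formulation and is assumed here, the columns parcel_size and net_units are hypothetical, and residential and non-residential units are not treated separately.

```python
import numpy as np
import pandas as pd

def units_to_build(num_agents, num_units, target_vacancy):
    """Units needed so that the vacancy rate reaches the target (assumed formula)."""
    return int(max(num_agents / (1 - target_vacancy) - num_units, 0))

def pick_proposals(proposals, target_units, rng=None):
    """Sample proposals by profit per sqft, keeping at most one per parcel."""
    rng = np.random.default_rng() if rng is None else rng
    # Drop infeasible proposals; zero-profit ones are dropped as well because
    # they would have zero selection probability anyway
    df = proposals[(proposals["max_profit"] > 0) & (proposals["net_units"] > 0)]
    if df.empty or target_units <= 0:
        return df.iloc[0:0]
    profit_per_sqft = (df["max_profit"] / df["parcel_size"]).to_numpy()
    probs = profit_per_sqft / profit_per_sqft.sum()   # scaled to sum to 1
    # Draw a full random ordering weighted by the probabilities
    order = rng.choice(len(df), size=len(df), replace=False, p=probs)
    parcel_ids = df["parcel_id"].to_numpy()
    net_units = df["net_units"].to_numpy()
    chosen, used_parcels, built = [], set(), 0
    for pos in order:
        if parcel_ids[pos] in used_parcels:
            continue                      # only the first proposal per parcel
        chosen.append(pos)
        used_parcels.add(parcel_ids[pos])
        built += net_units[pos]
        if built >= target_units:
            break                         # target met
    return df.iloc[chosen]

proposals = pd.DataFrame({
    "parcel_id": [1, 1, 2, 3],
    "max_profit": [100.0, 50.0, -10.0, 30.0],
    "parcel_size": [1000.0, 1000.0, 500.0, 600.0],
    "net_units": [10, 5, 8, 4],
})
needed = units_to_build(num_agents=950, num_units=1000, target_vacancy=0.1)
picked = pick_proposals(proposals, target_units=12, rng=np.random.default_rng(0))
```
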

TODO: A big issue with this model is that it builds only one building per parcel. Thus, one cannot distinguish between different types of units on mixed-use parcels, e.g. when multiple components are all non-residential or all residential. Can we modify the model to allow multiple buildings per parcel?