Impacts of Uncertainty & Differential Privacy on Title I

Author: Ryan Steed, with help from Terrance Liu

Paper co-authors: Terrance Liu, Steven Wu, Alessandro Acquisti

API documentation: rbsteed.com/dp-policy

For more, see the paper and its supplementary information (SI).

Installation

make dp_policy

Running the CLI

Use the CLI endpoints in dp_policy/api.py.

dp_policy --help
# to run a specific experiment
dp_policy run [experiment]
# to only produce the feather file for regression analysis (using cached results)
dp_policy run --just-join [name]
# to run all experiments
dp_policy run_all
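When only a few experiments need to run in sequence (for example on a server), the commands above can be wrapped in a small script. The following is a minimal sketch, not part of the dp_policy package: the helper names `build_command` and `run_experiments` are hypothetical, and the script assumes the `dp_policy` executable is on the PATH after installation. It uses only the `run` subcommand and the `--just-join` flag shown above, with the experiment names listed in this README.

```python
import shlex
import subprocess

# Experiment names as listed in this README.
EXPERIMENTS = (
    "baseline", "hold_harmless", "thresholds", "moving_average", "budget",
    "post_processing", "epsilon", "sampling", "vary_total_children",
)


def build_command(experiment, just_join=False):
    """Build the argument list for `dp_policy run [experiment]` (hypothetical helper)."""
    if experiment not in EXPERIMENTS:
        raise ValueError(f"unknown experiment: {experiment!r}")
    cmd = ["dp_policy", "run"]
    if just_join:
        # Only produce the feather file for regression analysis (uses cached results).
        cmd.append("--just-join")
    cmd.append(experiment)
    return cmd


def run_experiments(names, just_join=False, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    commands = [build_command(name, just_join) for name in names]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the sequence at the first failing experiment.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_experiments(["baseline", "epsilon"])
```

The dry-run default prints each command without executing it, which makes it easy to check the sequence before starting a long run; pass `dry_run=False` to execute.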

Experiment options (described in detail in the SI and passed to the Experiment.get_experiment factory method) include:

"baseline" - just the baseline settings (no experimental modifications).
Policy changes
- "hold_harmless" - treatments which add one or both of the post-formula provisions (hold harmless and the state minimum).
- "thresholds" - treatments which modify the thresholds for district funding eligibility.
- "moving_average" - treatments which use multiyear averages of varying size.
- "budget" - treatments which vary the overall Title I appropriation.
Robustness checks
- "post_processing" - treatments which modify the post-processing applied after noise injection.
- "epsilon" - treatments which modify the privacy parameter epsilon.
- "sampling" - treatments which vary the variance of simulated data error.
- "vary_total_children" - treatments in which the total number of children is also noised.
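To make the roles of epsilon and post-processing concrete, here is a minimal, self-contained sketch of noise injection followed by post-processing. It is an illustration only and is not the code in dp_policy/titlei/mechanisms.py: the Laplace mechanism, the function names, the `sensitivity` default, and the clip-then-round post-processing are all assumptions chosen for the example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, sensitivity=1.0, clip=True,
                 round_counts=True, seed=None):
    """Add Laplace noise to each count, then optionally post-process.

    Smaller epsilon means a larger noise scale (sensitivity / epsilon)
    and therefore stronger privacy but noisier counts.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    released = []
    for count in counts:
        value = count + laplace_noise(scale, rng)
        if clip:
            # Post-processing: a count cannot be negative.
            value = max(value, 0.0)
        if round_counts:
            # Post-processing: report whole numbers.
            value = float(round(value))
        released.append(value)
    return released


if __name__ == "__main__":
    # Hypothetical district-level counts, for illustration only.
    true_counts = [120, 35, 0, 980]
    for eps in (0.1, 1.0, 10.0):
        print(eps, noisy_counts(true_counts, epsilon=eps, seed=0))
```

Because clipping and rounding operate only on the already-noised values, they do not weaken the privacy guarantee, but they can introduce bias (for example, clipping pushes small counts upward on average).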

Replicating Results

For general statistics, run cells in notebooks/results.ipynb.
Generate all the experimental results by running dp_policy run_all or running chosen experiments individually with dp_policy run [experiment].
Visualize experiment results with notebooks/policy_experiments.ipynb. (For example, Fig. 1 was produced with statistics from the Epsilon Sensitivity section of notebooks/policy_experiments.ipynb.)
Produce disparity plots and GAM smooth plots with R/plot_all.R. (For example, Fig. 2 is the race disparity plot for the "hold_harmless" experiment.)

Contents

data/
- discrimination/ - ACS 5-year data for discrimination analysis
- shapefiles/ - TIGER shapefiles for school districts
- titlei-allocations - official Department of Education figures, from Todd Stephenson
- saipe*, county_saipe* - district- and county-level SAIPE data
- fips_codes.csv - map of FIPS codes to postal codes and state names
- nslp19.csv - National School Lunch program data (exploration only)
- sppe* - state per-pupil expenditure data
dp_policy/ - codebase
- titlei/ - submodule for replicating the Title I allocation procedure, with noise
  - allocators.py - allocation procedures
  - bootstrap.py - exploratory functions for sampling experiments
  - evaluation.py - utility functions for evaluating results
  - mechanisms.py - randomization (noise injection) mechanisms
  - thresholders.py - thresholding mechanisms for formula
  - utils.py - utility functions
- api.py - endpoints for CLI
- config.py - settings
- experiments.py - set of experiment configurations for replicating results
logs/ - logs for recording runs
notebooks/ - Jupyter notebooks for exploration and visualization
- results.ipynb - main notebook for replicating and visualizing auxiliary experiment results
- policy-experiments.ipynb - notebook for visualizing results of policy experiments
- nslp.ipynb - exploring NSLP data as an alternative ground truth
- plot_sampling.ipynb - developing sampling mechanisms
plots/ - output plots
R/ - R scripts for regression and visualization
- exploration.Rmd - exploring results
- plot_all.R - plots/regressions for all experiments
- plot_experiment.R - plots/regressions for one experiment
- plots.R - endpoints for plotting results and running regressions
- regression_tables.R - endpoint for recording regression tables
- regressions.Rmd - exploring regression specifications
- utils.R - utility functions for plotting and regressions
results/ - cached results files
- policy_experiments/ - for experiment runs
- regressions/ - for regressions
scripts/ - miscellaneous bash scripts to make server runs easier

Documentation

Documentation for the dp-policy API is published at rbsteed.com/dp-policy.

To generate the documentation, use pdoc3:

pdoc --html --output-dir docs --force dp_policy --template-dir docs/templates
git subtree push --prefix docs/dp_policy origin gh-pages
