Validation of Tax-Calculator Logic

The Tax-Calculator computes federal income and payroll taxes for a sample of tax filing units in years beginning with 2013. The Python code that performs the tax calculations has been validated in several ways. During development, Tax-Calculator results for a number of filing units have been compared to hand calculations performed using IRS tax forms and instructions. In addition, a more systematic program of cross-model validation is part of the ongoing development effort.

The premise behind cross-model validation work is that independently developed tax-simulation models or tax-preparation software are unlikely to contain the same bug. Looking for differences between the output of two models given the same input is therefore an effective way to locate bugs in tax-calculation logic.

The tools included in this directory support the following validation workflow:

  1. Generate a random sample of tax filing units (INPUT).
  2. Generate OUTPUT from INPUT using Tax-Calculator.
  3. Obtain OUTPUT from INPUT generated by another tax program.
  4. Generate tax differences by comparing the two OUTPUT files.
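
Step 4 can be pictured with a minimal Python sketch. It is not the logic of the tools in this directory: the helper names `parse_output` and `differing_fields` are hypothetical, the three-field rows are made up, and the sketch assumes that each OUTPUT row is whitespace-delimited and begins with a filing-unit identifier.

```python
def parse_output(text):
    """Map each row's first field (assumed to be a unit id) to its numeric fields."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            rows[fields[0]] = [float(value) for value in fields[1:]]
    return rows


def differing_fields(output_a, output_b):
    """Return (unit_id, column_number, value_a, value_b) for each mismatch."""
    rows_a = parse_output(output_a)
    rows_b = parse_output(output_b)
    mismatches = []
    # Compare only the filing units that appear in both OUTPUT files.
    for unit_id in sorted(rows_a.keys() & rows_b.keys()):
        pairs = zip(rows_a[unit_id], rows_b[unit_id])
        # Column numbers are 1-based and the id occupies column 1.
        for column, (a, b) in enumerate(pairs, start=2):
            if a != b:
                mismatches.append((unit_id, column, a, b))
    return mismatches


# Made-up three-field rows, for illustration only.
ours = "1 2013 1500.00\n2 2013 320.50\n"
theirs = "1 2013 1500.00\n2 2013 335.50\n"
print(differing_fields(ours, theirs))
```

Keying rows by identifier rather than by line position means the comparison still works if the two programs write their rows in a different order.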

The working assumption in our cross-model validation work is that tax differences are more likely than not to be caused by bugs in Tax-Calculator. If exploration of specific differences confirms a bug, it is corrected and the four-step validation process is repeated until there are no meaningful differences between the two OUTPUT files.

This four-step validation process can be repeated for INPUT files of different sizes, which vary in the number of input variables used to specify each filing unit's attributes and in the number of filing units included in the INPUT file. A more extensive list of input variables and a larger number of filing units increase the likelihood of finding cross-model differences. In our work, each INPUT file is generated randomly to ensure a wide range of filing unit attributes.
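
As a rough illustration of random INPUT generation, the sketch below draws each attribute independently from a broad range. The function name `random_filing_units`, the attribute names, and the value ranges are all hypothetical; they are not the variables or distributions used by the scripts in this directory.

```python
import random


def random_filing_units(num_units, seed=0):
    """Return a list of randomly generated filing units (illustrative only)."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    units = []
    for unit_id in range(1, num_units + 1):
        units.append({
            "id": unit_id,
            "year": 2013,
            # Hypothetical attributes; a real INPUT file has its own variables.
            "filing_status": rng.choice(["single", "joint", "head_of_household"]),
            "num_dependents": rng.randint(0, 5),
            "wages": rng.randrange(0, 300001, 1000),  # whole thousands of dollars
        })
    return units


sample = random_filing_units(5)
for unit in sample:
    print(unit)
```

Seeding the generator lets both tax programs be rerun on exactly the same sample after a bug fix, while changing the seed or the sample size yields a fresh set of filing units.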

Other Tax Programs Used for Cross-Model Validation

Our goal is to repeat the four-step cross-model validation process described above with more than one other tax program, comparing Tax-Calculator results against each. The details and results of the four-step process are provided in a separate sub-directory for each other model. Here are links to the cross-model validation results that are currently available:

Internet-TAXSIM

...

Details on Using the Validation Tools

The current version of the validation tools in this directory should work on Linux or Mac OS X without any changes and without adding any extra software. Those who want to use these validation tools on Windows will have to do three things: (a) install an AWK interpreter, (b) install a Tcl interpreter, and (c) translate each tests.sh bash script into a Windows batch file (tests.bat). The Free Software Foundation provides a free AWK interpreter for Windows (gawk.exe) and ActiveState provides a free Tcl interpreter for Windows (tclsh.exe).

The taxsim_in.tcl and csv_in.py scripts are used to randomly generate INPUT files, which have increasingly long sets of filing unit attributes and contain as many as 100,000 filing units. Read the source code of the scripts for additional details on how to use them.

The taxdiffs.tcl script calls the taxdiff.awk script to compute the number of large and small tax differences between two OUTPUT files that are formatted like Internet-TAXSIM 28-variable output files. See this link for details on the space-delimited Internet-TAXSIM output file format. All dollar amount differences of one cent or more are reported, but they are divided into small and large differences: a small difference is ten dollars or less in absolute value, and a large difference is greater than ten dollars in absolute value. This small/large borderline is arbitrary and has been specified in an attempt to separate out differences that arise from repeatedly applying IRS-approved rounding-to-the-nearest-dollar rules (which Tax-Calculator does not implement). Read the source code of the taxdiffs.tcl script for additional details on how to use it.
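
The small/large rule described above can be written out as a short Python sketch. This is not the taxdiff.awk logic itself. The function names `classify_difference` and `count_differences` are hypothetical, and converting to whole cents before comparing is an assumption made here to avoid floating-point surprises at the one-cent and ten-dollar boundaries.

```python
def classify_difference(amount_a, amount_b):
    """Classify the gap between two dollar amounts as none, small, or large."""
    # Work in whole cents so the boundary comparisons are exact.
    cents = abs(round(amount_a * 100) - round(amount_b * 100))
    if cents < 1:
        return "none"   # less than one cent: not reported
    if cents <= 1000:
        return "small"  # one cent up to and including ten dollars
    return "large"      # more than ten dollars


def count_differences(amounts_a, amounts_b):
    """Count each class of difference across paired dollar amounts."""
    counts = {"none": 0, "small": 0, "large": 0}
    for a, b in zip(amounts_a, amounts_b):
        counts[classify_difference(a, b)] += 1
    return counts


# Made-up tax amounts for three filing units, for illustration only.
print(count_differences([100.00, 500.00, 2000.00], [100.00, 509.50, 1500.00]))
```

A difference of exactly ten dollars counts as small under this reading of "ten dollars or less".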