diff --git a/docs/guide/assumption_params.md b/docs/guide/assumption_params.md index 6c960264d..0f1a5ebc6 100644 --- a/docs/guide/assumption_params.md +++ b/docs/guide/assumption_params.md @@ -1,9 +1,24 @@ Assumption parameters ===================== -This section contains documentation of several sets of parameters that characterize responses to a tax reform. Consumption parameters are used to compute marginal tax rates and to compute the consumption value of in-kind benefits. Growdiff parameters are used to specify baseline differences and/or reform responses in the annual rate of growth in economic variables. (Note that behavior parameters used to compute changes in input variables caused by a tax reform in a partial-equilibrium setting are not part of Tax-Calculator, but can be used via the Behavioral-Response `behresp` package in a Python program.) - -The assumption parameters control advanced features of Tax-Calculator, so understanding the source code that uses them is essential. Default values of many assumption parameters are zero and are projected into the future at that value, which implies no response to the reform. The benefit value consumption parameters have a default value of one, which implies the consumption value of the in-kind benefits is equal to the government cost of providing the benefits. +This section contains documentation of several sets of parameters that +characterize responses to a tax reform. Consumption parameters are +used to compute marginal tax rates and to compute the consumption +value of in-kind benefits. Growdiff parameters are used to specify +baseline differences and/or reform responses in the annual rate of +growth in economic variables. (Note that behavior parameters used to +compute changes in input variables caused by a tax reform in a +partial-equilibrium setting are not part of Tax-Calculator, but can be +used via the Behavioral-Response `behresp` package in a Python +program.) 
+ +The assumption parameters control advanced features of Tax-Calculator, +so understanding the source code that uses them is essential. Default +values of many assumption parameters are zero and are projected into +the future at that value, which implies no response to the reform. The +benefit value consumption parameters have a default value of one, +which implies the consumption value of the in-kind benefits is equal +to the government cost of providing the benefits. ## Growdiff diff --git a/docs/guide/cli.md b/docs/guide/cli.md index 75143f24e..f13e80b18 100644 --- a/docs/guide/cli.md +++ b/docs/guide/cli.md @@ -1,13 +1,18 @@ Command-line interface ====================== -You can use Tax-Calculator on your own computer via a command-line interface (CLI) called `tc`. -This approach requires the use of a text editor to prepare simple files that are read by `tc`. -Computer programming knowledge is not required, but this approach to using Tax-Calculator assumes you are willing to work at the command line (Terminal on Mac or Anaconda Prompt on Windows) and to use a text editor (for example, TextEdit on Mac or Notepad on Windows). +You can use Tax-Calculator on your own computer via a command-line +interface (CLI) called `tc`. This approach requires the use of a text +editor to prepare simple files that are read by `tc`. Computer +programming knowledge is not required, but this approach to using +Tax-Calculator assumes you are willing to work at the command line +(Terminal on Mac or Anaconda Prompt on Windows) and to use a text +editor (for example, TextEdit on Mac or Notepad on Windows). ## Test `tc` CLI -The `tc` CLI is part of the Tax-Calculator `taxcalc` package you installed on your computer as part of {doc}`../usage/starting`. +The `tc` CLI is part of the Tax-Calculator `taxcalc` package you +installed on your computer as part of {doc}`../usage/starting`. 
To check your installation of `tc`, enter the following command: @@ -15,44 +20,131 @@ To check your installation of `tc`, enter the following command: tc --test ``` -Expected output (after a number of seconds) is `PASSED TEST`. -If you get `FAILED TEST`, something went wrong in the installation process. If the installation test fails, please report your experience by [creating a new issue](https://github.com/PSLmodels/Tax-Calculator/issues). +Expected output (after a number of seconds) is `PASSED TEST`. If you +get `FAILED TEST`, something went wrong in the installation +process. If the installation test fails, please report your experience +by [creating a new +issue](https://github.com/PSLmodels/Tax-Calculator/issues). -If your installation passes the test, you are ready to begin using `tc` to analyze tax reforms. Continue reading this section for information about how to do that. But if you want a quick hint about the range of `tc` capabilities, enter the following: +If your installation passes the test, you are ready to begin using +`tc` to analyze tax reforms. Continue reading this section for +information about how to do that. But if you want a quick hint about +the range of `tc` capabilities, enter the following: ``` tc --help ``` -The basic idea of `tc` tax analysis is that each tax reform is specified in a text file using a simple method to describe the details of the reform. Read the next part of this section to see how policy reform files are formatted. +The basic idea of `tc` tax analysis is that each tax reform is +specified in a text file using a simple method to describe the details +of the reform. Read the next part of this section to see how policy +reform files are formatted. ## Specify tax reform -The details of a tax reform are contained in a text file that you write with a text editor. The reform is expressed by specifying which tax policy parameters are changed from their current-law values by the reform. 
The current-law values of each policy parameter are documented in [this section](#pol) of the guide. The timing and magnitude of these policy parameter changes are written in JSON, a simple and widely-used data-specification language. +The details of a tax reform are contained in a text file that you +write with a text editor. The reform is expressed by specifying which +tax policy parameters are changed from their current-law values by the +reform. The current-law values of each policy parameter are documented +in [this +section](https://taxcalc.pslmodels.org/guide/policy_params.html#policy-parameters) +of the guide. The timing and magnitude of these policy parameter +changes are written in JSON, a simple and widely-used +data-specification language. -For several examples of reform files and the general rules for writing JSON reform files, go to [this page](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/README.md#policy-reform-files). +For several examples of reform files and the general rules for writing +JSON reform files, go to [this +page](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/README.md#policy-reform-files). ## Specify analysis assumptions -This part explains how to specify economic assumption files used in static tax analysis. This is an advanced topic, so if you want to start out using the default assumptions (which are documented in [this section](#params) of the guide), you can skip this part now and come back to read it whenever you want to change the default assumptions. The [next part](#cli-spec-funits) of this section discusses filing-unit input files. - -The details of analysis assumptions are contained in a text file that you write with a text editor. The assumptions are expressed by specifying which parameters are changed from their default values. The timing and magnitude of these parameter changes are written in JSON, a simple and widely-used data-specification language. 
- -For examples of assumption files and the general rules for writing JSON assumption files, go to [this page](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/assumptions/README.md#economic-assumption-files). +This part explains how to specify economic assumption files used in +static tax analysis. This is an advanced topic, so if you want to +start out using the default assumptions (which are documented in [this +section](https://taxcalc.pslmodels.org/guide/policy_params.html#policy-parameters) +of the guide), you can skip this part now and come back to read it +whenever you want to change the default assumptions. The [next +part](https://taxcalc.pslmodels.org/guide/cli.html#specify-filing-units) +of this section discusses filing-unit input files. + +The details of analysis assumptions are contained in a text file that +you write with a text editor. The assumptions are expressed by +specifying which parameters are changed from their default values. The +timing and magnitude of these parameter changes are written in JSON, a +simple and widely-used data-specification language. + +For examples of assumption files and the general rules for writing +JSON assumption files, go to [this +page](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/assumptions/README.md#economic-assumption-files). ## Specify filing units -The `taxcalc` package containing `tc` does not include an IRS-SOI-PUF-derived microsimulation sample. This is because, unlike Census public-use files, the IRS-SOI Public Use File (PUF) is proprietary. If you or your organization has paid IRS to use the PUF version being by Tax-Calculator, then it may be possible for us to share with you our PUF-derived sample, which we call `puf.csv` even though it contains CPS records that represent non-filers. Otherwise, you have two choices. 
- -**First**, you can easily create with a text editor a CSV-formatted file containing several filing units whose experience under your tax reform is of interest to you. Much of the public discussion of tax reforms is of this type: how is this family or that family affected by a reform; how do they fare under different reforms; etc. The test conducted to check the `tc` installation has left one such file. It is called `test.csv` and contains two filing units with only wage and salary income: a lower income family and a higher income family. You can use this `test.csv` file as `tc` input to analyze your tax reforms. Before creating your own input files be sure to read the short set of guidelines that appear after this list of two choices. Some people pursue this approach using a statistical pacakge like R or Stata, in which case the `tc` CLI program can be invoked from within the statistical package. There may be a need (especially on Windows) to [add to the system PATH](https://github.com/PSLmodels/Tax-Calculator/issues/2273#issuecomment-479572287) in order to do this. - -**Second**, the `taxcalc` does include a freely available microsimulation sample containing only filing units derived from several recent March CPS surveys. For several reasons, the results generated by this `cps.csv` file are substantially different from the results generated by the `puf.csv` file. The `cps.csv` file contains a sample of the population while the `puf.csv` file contains mostly a sample of income tax filers in which high-income filing units are over represented. Also, the `cps.csv` file has many income variables that are missing (and assumed to be zero by Tax-Calculator), which causes an understating of total incomes, especially for those with high incomes. 
All these differences mean that the aggregate revenue and distributional results generated when using the `cps.csv` file as input to Tax-Calculator can be substantially different from the results generated when using the `puf.csv` file as input. And this is particularly true when analyzing reforms that change the tax treatment of high-income filers. +The `taxcalc` package containing `tc` does not include an +IRS-SOI-PUF-derived microsimulation sample. This is because, unlike +Census public-use files, the IRS-SOI Public Use File (PUF) is +proprietary. If you or your organization has paid IRS to use the PUF +version being used by Tax-Calculator, then it may be possible for us to +share with you our PUF-derived sample, which we call `puf.csv` even +though it contains CPS records that represent non-filers. Otherwise, +you have two choices. + +**First**, you can easily create with a text editor a CSV-formatted +file containing several filing units whose experience under your tax +reform is of interest to you. Much of the public discussion of tax +reforms is of this type: how is this family or that family affected by +a reform; how do they fare under different reforms; etc. The test +conducted to check the `tc` installation has left one such file. It is +called `test.csv` and contains two filing units with only wage and +salary income: a lower income family and a higher income family. You +can use this `test.csv` file as `tc` input to analyze your tax +reforms. Before creating your own input files be sure to read the +short set of guidelines that appear after this list of two +choices. Some people pursue this approach using a statistical package +like R or Stata, in which case the `tc` CLI program can be invoked +from within the statistical package. There may be a need (especially +on Windows) to [add to the system +PATH](https://github.com/PSLmodels/Tax-Calculator/issues/2273#issuecomment-479572287) +in order to do this. 
+ +**Second**, the `taxcalc` does include a freely available +microsimulation sample containing only filing units derived from +several recent March CPS surveys. For several reasons, the results +generated by this `cps.csv` file are substantially different from the +results generated by the `puf.csv` file. The `cps.csv` file contains a +sample of the population while the `puf.csv` file contains mostly a +sample of income tax filers in which high-income filing units are over +represented. Also, the `cps.csv` file has many income variables that +are missing (and assumed to be zero by Tax-Calculator), which causes +an understating of total incomes, especially for those with high +incomes. All these differences mean that the aggregate revenue and +distributional results generated when using the `cps.csv` file as +input to Tax-Calculator can be substantially different from the +results generated when using the `puf.csv` file as input. And this is +particularly true when analyzing reforms that change the tax treatment +of high-income filers. **Input-File-Preparation Guidelines** -The `tc` CLI to Tax-Calculator is flexible enough to read almost any kind of CSV-formatted input data on filing units as long as the variable names correspond to those expected by Tax-Calculator. The only required input variables are `RECID` (a unique filing-unit record identifier) and `MARS` (a positive-valued filing-status indicator). Other variables in the input file must have variable names that are listed in the [Input Variables](#input) section for them to affect the tax calculations. Any variable listed in Input Variables that is not in an input file is automatically set to zero for every filing unit. Variables in the input file that are not listed in Input Variables are ignored by Tax-Calculator. - -However, there are important data-preparation issues related to the fact that the payroll tax is a tax on individuals, not on income-tax filing units. 
Tax-Calculator expects that the filing-unit total for each of several earnings-related variables is split between the taxpayer and the spouse. It is the responsibility of anyone preparing data for Tax-Calculator input to do this earnings splitting. Here are the relationships between the filing-unit variable and the taxpayer (`p`) and spouse (`s`) variables expected by Tax-Calculator: +The `tc` CLI to Tax-Calculator is flexible enough to read almost any +kind of CSV-formatted input data on filing units as long as the +variable names correspond to those expected by Tax-Calculator. The +only required input variables are `RECID` (a unique filing-unit record +identifier) and `MARS` (a positive-valued filing-status +indicator). Other variables in the input file must have variable names +that are listed in the [Input Variables](#input) section for them to +affect the tax calculations. Any variable listed in Input Variables +that is not in an input file is automatically set to zero for every +filing unit. Variables in the input file that are not listed in Input +Variables are ignored by Tax-Calculator. + +However, there are important data-preparation issues related to the +fact that the payroll tax is a tax on individuals, not on income-tax +filing units. Tax-Calculator expects that the filing-unit total for +each of several earnings-related variables is split between the +taxpayer and the spouse. It is the responsibility of anyone preparing +data for Tax-Calculator input to do this earnings splitting. Here are +the relationships between the filing-unit variable and the taxpayer +(`p`) and spouse (`s`) variables expected by Tax-Calculator: ``` e00200 = e00200p + e00200s @@ -60,11 +152,28 @@ e00900 = e00900p + e00900s e02100 = e02100p + e02100s ``` -Obviously, when `MARS` is not equal to 2 (married filing jointly), the values of the three `s` variables are zero and the value of each `p` variable is equal to the value of its corresponding filing-unit variable. 
Note that the input file can omit any one, or all, of these three sets variables. If the three variables in one of these sets are omitted, the required relationship will be satisfied because zero equals zero plus zero. +Obviously, when `MARS` is not equal to 2 (married filing jointly), the +values of the three `s` variables are zero and the value of each `p` +variable is equal to the value of its corresponding filing-unit +variable. Note that the input file can omit any one, or all, of these +three sets of variables. If the three variables in one of these sets are +omitted, the required relationship will be satisfied because zero +equals zero plus zero. -In addition to this earnings-splitting data-preparation issue, Tax-Calculator expects that the value of ordinary dividends (`e00600`) will be no less than the value of qualified dividends (`e00650`) for each filing unit. And it also expects that the value of total pension and annuity income (`e01500`) will be no less than the value of taxable pension and annuity income (`e01700`) for each filing unit. Tax-Calculator also expects the value of the required MARS variable to be in the range from one to five, and the value of the EIC variable to be in the range from zero to three. Again, it is your responsibility to prepare input data for Tax-Calculator in a way that ensures these relationships are true for each filing unit. +In addition to this earnings-splitting data-preparation issue, +Tax-Calculator expects that the value of ordinary dividends (`e00600`) +will be no less than the value of qualified dividends (`e00650`) for +each filing unit. And it also expects that the value of total pension +and annuity income (`e01500`) will be no less than the value of +taxable pension and annuity income (`e01700`) for each filing +unit. Tax-Calculator also expects the value of the required MARS +variable to be in the range from one to five, and the value of the EIC +variable to be in the range from zero to three. 
Again, it is your +responsibility to prepare input data for Tax-Calculator in a way that +ensures these relationships are true for each filing unit. -Here's an example of how to specify a few stylized filing units with and without young children: +Here's an example of how to specify a few stylized filing units with +and without young children: ``` RECID,MARS,XTOT,EIC,n24,... @@ -73,32 +182,97 @@ RECID,MARS,XTOT,EIC,n24,... 13 , 2 , 4 , 2 , 2 ,... <== married couple with two young kids ``` -Be sure to read the documentation of the `MARS`, `XTOT`, `EIC`, and `n24` input variables. Also, there may be a need to add other child-age input variables if you want to simulate reforms like a child credit bonus for young children. Also, the universal basic income (UBI) reform is implemented using its own set of three age-group-count input variables. +Be sure to read the documentation of the `MARS`, `XTOT`, `EIC`, and +`n24` input variables. Also, there may be a need to add other +child-age input variables if you want to simulate reforms like a child +credit bonus for young children. Also, the universal basic income +(UBI) reform is implemented using its own set of three age-group-count +input variables. -The name of your input data file is also relevant to how `tc` will behave. If your file name ends with "puf.csv" or "cps.csv", `tc` will automatically extrapolate your data from its base year to the year you specify for tax calculations to be calculated using built in growth factors, extrapolated weights, and other adjustment factors. If you are not using the "puf.csv" or "cps.csv" files produced by the TaxData project, it is likely that your data will not be compatible with these extrapolations and you should adopt filenames with alternative endings. +The name of your input data file is also relevant to how `tc` will +behave. 
If your file name ends with "puf.csv" or "cps.csv", `tc` will +automatically extrapolate your data from its base year to the year you +specify for tax calculations to be calculated using built in growth +factors, extrapolated weights, and other adjustment factors. If you +are not using the "puf.csv" or "cps.csv" files produced by the TaxData +project, it is likely that your data will not be compatible with these +extrapolations and you should adopt filenames with alternative +endings. ## Initiate reform analysis -Executing `tc` requires only two command-line arguments: the name of an input file containing one or more filing units and the year for which the tax calculations are done. A baseline policy file is optional; specifying no baseline file implies the baseline policy is current-law policy. A policy reform file is optional; specifying no reform file implies calculations are done for the baseline policy. An economic assumption file is also optional; no assumption file implies you want to use the default values of the assumption parameters. The output files written by `tc` are built-up from the name of the input file, tax year, baseline file, reform file, and assumption file using a `#` character if an option is not specified. - -Here we explain how to conduct tax analysis with `tc` by presenting a series of examples and explaining what output is produced in each example. There are several types of output that `tc` can generate so there will be more than a few examples. The examples are numbered in order to make it easier to refer to different examples. All the examples assume that the input file is `test.csv`, which was mentioned earlier in this guide. +Executing `tc` requires only two command-line arguments: the name of +an input file containing one or more filing units and the year for +which the tax calculations are done. A baseline policy file is +optional; specifying no baseline file implies the baseline policy is +current-law policy. 
A policy reform file is optional; specifying no +reform file implies calculations are done for the baseline policy. An +economic assumption file is also optional; no assumption file implies +you want to use the default values of the assumption parameters. The +output files written by `tc` are built-up from the name of the input +file, tax year, baseline file, reform file, and assumption file using +a `#` character if an option is not specified. + +Here we explain how to conduct tax analysis with `tc` by presenting a +series of examples and explaining what output is produced in each +example. There are several types of output that `tc` can generate so +there will be more than a few examples. The examples are numbered in +order to make it easier to refer to different examples. All the +examples assume that the input file is `test.csv`, which was mentioned +earlier in this guide. ``` tc test.csv 2020 ``` -This produces a minimal output file containing 2020 tax liabilities for each filing unit assuming the income amounts in the input file are amounts for 2020 and assuming current-law tax policy projected to 2020\. The name of the CSV-formatted output file is `test-20-#-#-#.csv`. The first `#` symbol indicates we did not specify a baseline file and the second `#` symbol indicates we did not specify a policy reform file and the third `#` symbol indicates we did not specify an economic assumption file. -The variables included in the minimal output file include: `RECID` (of filing unit in the input file), `YEAR` (specified when executing `tc`), `WEIGHT` (which is same as `s006`), `INCTAX` (which is same as `iitax`), `LSTAX` (which is same as `lumpsum_tax`) and `PAYTAX` (which is same as `payroll_tax`). +This produces a minimal output file containing 2020 tax liabilities +for each filing unit assuming the income amounts in the input file are +amounts for 2020 and assuming current-law tax policy projected to +2020\. 
The name of the CSV-formatted output file is +`test-20-#-#-#.csv`. The first `#` symbol indicates we did not specify +a baseline file and the second `#` symbol indicates we did not specify +a policy reform file and the third `#` symbol indicates we did not +specify an economic assumption file. The variables included in the +minimal output file include: `RECID` (of filing unit in the input +file), `YEAR` (specified when executing `tc`), `WEIGHT` (which is same +as `s006`), `INCTAX` (which is same as `iitax`), `LSTAX` (which is +same as `lumpsum_tax`) and `PAYTAX` (which is same as `payroll_tax`). -Also, documentation of the reform is always written to a text file ending in `-doc.text`, which in this example would be named `test-20-#-#-#-doc.text`. +Also, documentation of the reform is always written to a text file +ending in `-doc.text`, which in this example would be named +`test-20-#-#-#-doc.text`. ``` tc test.csv 2020 --dump ``` -This produces a much more complete output file with the same name `test-20-#-#-#.csv` as the minimal output file produced in example (1). No other output is generated other than the `test-20-#-#-#-doc.text` file. The `--dump` option causes **all** the input variables (including the ones understood by Tax-Calculator but not included in `test.csv`, which are all zero) and **all** the output variables calculated by Tax-Calculator to be included in the output file. For a complete list of input variables, see the [Input Variables](#input) section. For a complete list of output variables, see the [Output Variables](#output) section. Since Tax-Calculator ignores variables in the input file that are not in the Input Variables section, the dump output file in example (2) can be used as an input file and it will produce exactly the same tax liabilities (apart from rounding errors of one or two cents) as in the original dump output. - -This full dump output can be useful for debugging and is small when using just a few filing units as input. 
But when using large samples as input (for example, the `cps.csv` input file), the size of the dump output becomes quite large. There is a way to specify a **partial dump** that includes only variables of interest. To have `tc` do a partial dump, create a text file that lists the names of the variables to be included in the partial dump. You can put the varible names on separate lines and/or put several names on one line separated by spaces. Then point to that file using the `--dvars` option. So, for example, if your list of dump variables is in a file named `mydumpvars`, a partial dump file is created this way: +This produces a much more complete output file with the same name +`test-20-#-#-#.csv` as the minimal output file produced in example +(1). No other output is generated other than the +`test-20-#-#-#-doc.text` file. The `--dump` option causes **all** the +input variables (including the ones understood by Tax-Calculator but +not included in `test.csv`, which are all zero) and **all** the output +variables calculated by Tax-Calculator to be included in the output +file. For a complete list of input variables, see the [Input +Variables](#input) section. For a complete list of output variables, +see the [Output Variables](#output) section. Since Tax-Calculator +ignores variables in the input file that are not in the Input +Variables section, the dump output file in example (2) can be used as +an input file and it will produce exactly the same tax liabilities +(apart from rounding errors of one or two cents) as in the original +dump output. + +This full dump output can be useful for debugging and is small when +using just a few filing units as input. But when using large samples +as input (for example, the `cps.csv` input file), the size of the dump +output becomes quite large. There is a way to specify a **partial +dump** that includes only variables of interest. 
To have `tc` do a +partial dump, create a text file that lists the names of the variables +to be included in the partial dump. You can put the variable names on +separate lines and/or put several names on one line separated by +spaces. Then point to that file using the `--dvars` option. So, for +example, if your list of dump variables is in a file named +`mydumpvars`, a partial dump file is created this way: ``` tc cps.csv 2020 --dump --dvars mydumpvars @@ -110,25 +284,55 @@ If there is no `--dvars` option, the `--dump` option produces a full dump. tc test.csv 2020 --sqldb ``` -This produces the same dump output as example (2) except that the dump output is written not to a CSV-formatted file, but to the dump table in an SQLite3 database file, which is called `test-20-#-#-#.db` in this example. Because the `--dump` option is not used in example (3), minimal output will be written to the `test-20-#-#-#.csv` file. Note that use of the `--dvars` option causes the contents of the database file to be a partial dump. - -Pros and cons of putting dump output in a CSV file or an SQLite3 database table: The CSV file is almost twice as large as the database, but it can be easily imported into a wide range of statistical packages. The main advantage of the SQLite3 database is that the Anaconda Python distribution includes [sqlite3](https://www.sqlite.org/cli.html) (or sqlite3.exe on Windows), a command-line tool that can be used to tabulate dump output using structured query language (SQL). SQL is a language that you use to specify the tabulation you want and the SQL database figures out the procedure for generating your tabulation and then executes that procedure; there is no computer programming involved. We illustrate SQL tabulation of dump output in a [subsequent section](#cli-tab-results). 
+This produces the same dump output as example (2) except that the dump +output is written not to a CSV-formatted file, but to the dump table +in an SQLite3 database file, which is called `test-20-#-#-#.db` in +this example. Because the `--dump` option is not used in example (3), +minimal output will be written to the `test-20-#-#-#.csv` file. Note +that use of the `--dvars` option causes the contents of the database +file to be a partial dump. + +Pros and cons of putting dump output in a CSV file or an SQLite3 +database table: The CSV file is almost twice as large as the database, +but it can be easily imported into a wide range of statistical +packages. The main advantage of the SQLite3 database is that the +Anaconda Python distribution includes +[sqlite3](https://www.sqlite.org/cli.html) (or sqlite3.exe on +Windows), a command-line tool that can be used to tabulate dump output +using structured query language (SQL). SQL is a language that you use +to specify the tabulation you want and the SQL database figures out +the procedure for generating your tabulation and then executes that +procedure; there is no computer programming involved. We illustrate +SQL tabulation of dump output in a [subsequent +section](#cli-tab-results). ``` tc test.csv 2020 --dump --sqldb ``` -This shows that you can get dump output in the two different formats from a single `tc` run. +This shows that you can get dump output in the two different formats +from a single `tc` run. -The remaining examples use neither the `--dump` nor the `--sqldb` option, and thus, produce minimal output for the reform. But either or both of those options could be used in all the subsequent examples to generate more complete output for the reform. +The remaining examples use neither the `--dump` nor the `--sqldb` +option, and thus, produce minimal output for the reform. But either or +both of those options could be used in all the subsequent examples to +generate more complete output for the reform. 
``` tc test.csv 2021 --reform ref3.json ``` -This produces 2021 output for the filing units in the `test.csv` file using the policy reform specified in the `ref3.json` file. The name of the output file in this example is `test-21-#-ref3-#.csv` because no baseline or assumption options were specified. +This produces 2021 output for the filing units in the `test.csv` file +using the policy reform specified in the `ref3.json` file. The name of +the output file in this example is `test-21-#-ref3-#.csv` because no +baseline or assumption options were specified. -If, in addition to `ref3.json`, there was a `ref4.json` reform file and analysis of the **compound reform** (consisting of first implementing the `ref3.json` reform relative to current-law policy and then implementing the `ref4.json` reform relative to the `ref3.json` reform) is desired, both reform files can be mentioned in the `--reform` option as follows: +If, in addition to `ref3.json`, there was a `ref4.json` reform file +and analysis of the **compound reform** (consisting of first +implementing the `ref3.json` reform relative to current-law policy and +then implementing the `ref4.json` reform relative to the `ref3.json` +reform) is desired, both reform files can be mentioned in the +`--reform` option as follows: ``` tc test.csv 2021 --reform ref3.json+ref4.json @@ -140,11 +344,25 @@ The above command generates an output file named `test-21-#-ref3+ref4-#.csv` tc test.csv 2021 --reform ref3.json --assump res1.json ``` -This produces 2021 output for the filing units in the `test.csv` file using the policy reform specified in the `ref3.json` file and the economic assumptions specified in the `eas1.json` file. The output results produced by this analysis are written to the `test-21-#-ref3-eas1.csv` file. +This produces 2021 output for the filing units in the `test.csv` file +using the policy reform specified in the `ref3.json` file and the +economic assumptions specified in the `eas1.json` file. 
The output +results produced by this analysis are written to the +`test-21-#-ref3-eas1.csv` file. -In the preceding examples, all the output files are written in the directory where the `tc` command was executed. If you want the output files to be written in a different directory, use the `--outdir` option. So, for example, if you have created the `myoutput` directory as a subdirectory of the directory from where you are running `tc`, output files will be written there if you use the `--outdir myoutput` option. +In the preceding examples, all the output files are written in the +directory where the `tc` command was executed. If you want the output +files to be written in a different directory, use the `--outdir` +option. So, for example, if you have created the `myoutput` directory +as a subdirectory of the directory from where you are running `tc`, +output files will be written there if you use the `--outdir myoutput` +option. -The following examples illustrate output options that work only if each filing unit in the input file has a positive sampling weight (`s006`). So, we are going to use the `cps.csv` file in these examples along with the policy reform specified in the `ref3.json` file, the content of which is: +The following examples illustrate output options that work only if +each filing unit in the input file has a positive sampling weight +(`s006`). So, we are going to use the `cps.csv` file in these examples +along with the policy reform specified in the `ref3.json` file, the +content of which is: ``` // ref3.json raises personal exemption amount to 8000 in 2022, @@ -154,7 +372,16 @@ The following examples illustrate output options that work only if each filing u } ``` -The output options illustrated in the following examples generate tables of the post-reform level and the reform-induced change in tax liability by income deciles as well as graphs of marginal and average tax rates and percentage change in aftertax income by income percentiles. 
These tables and graphs are meant to provide a quick glance at the impact of a reform. Any serious analysis of a reform will involve generating custom tables and graphs using [partial dump](#partdump) output. One of many examples of this sort of custom analysis is [here](https://www.washingtonpost.com/graphics/2017/business/tax-bill-calculator/?). +The output options illustrated in the following examples generate +tables of the post-reform level and the reform-induced change in tax +liability by income deciles as well as graphs of marginal and average +tax rates and percentage change in aftertax income by income +percentiles. These tables and graphs are meant to provide a quick +glance at the impact of a reform. Any serious analysis of a reform +will involve generating custom tables and graphs using [partial +dump](#partdump) output. One of many examples of this sort of custom +analysis is +[here](https://www.washingtonpost.com/graphics/2017/business/tax-bill-calculator/?). ``` $ tc cps.csv 2022 --reform ref3.json --tables @@ -197,12 +424,19 @@ Weighted Tax Differences by Baseline Expanded-Income Decile A 171.93 17325.2 -282.4 0.0 0.0 -282.4 ``` -This produces 2022 output for the filing units in the `cps.csv` file using the policy reform specified in the `ref3.json` file. Notice that Tax-Calculator knows to extrapolate (or "age") filing unit data in the `cps.csv` file to the specified tax year. -It knows to do that because of the special input file name `cps.csv`. -The tables produced by this analysis are written to the `cps-22-#-ref3-#-tab.text` file. -Note that on Windows you would use `dir` instead of `ls` and `type` instead of `cat`. +This produces 2022 output for the filing units in the `cps.csv` file +using the policy reform specified in the `ref3.json` file. Notice that +Tax-Calculator knows to extrapolate (or "age") filing unit data in the +`cps.csv` file to the specified tax year. It knows to do that because +of the special input file name `cps.csv`. 
The tables produced by this +analysis are written to the `cps-22-#-ref3-#-tab.text` file. Note +that on Windows you would use `dir` instead of `ls` and `type` instead +of `cat`. -Also note that the tables above in example (7) include in the bottom decile some filing units who have negative or zero expanded income in the baseline. If you want tables that somehow exclude those filing units, use the `--dump` option and tabulate your own tables. +Also note that the tables above in example (7) include in the bottom +decile some filing units who have negative or zero expanded income in +the baseline. If you want tables that somehow exclude those filing +units, use the `--dump` option and tabulate your own tables. ``` $ tc cps.csv 2024 --reform ref3.json --graphs @@ -215,24 +449,49 @@ cps-24-#-ref3-#-doc.text cps-24-#-ref3-#.csv cps-24-#-ref3-#-mtr.html ``` -This example is like the previous one, except we ask for 2024 static output and for graphs instead of tables, although we could ask for both. -The HTML files containing the graphs can be viewed in your browser. +This example is like the previous one, except we ask for 2024 static +output and for graphs instead of tables, although we could ask for +both. The HTML files containing the graphs can be viewed in your +browser. -Here is what the average tax rate graph in `cps-24-#-ref3-#-atr.html` looks like. +Here is what the average tax rate graph in `cps-24-#-ref3-#-atr.html` +looks like. 
![atr graph](../_static/atr.png) -Here is what the marginal tax rate graph in `cps-24-#-ref3-#-mtr.html` looks like: +Here is what the marginal tax rate graph in `cps-24-#-ref3-#-mtr.html` +looks like: ![mtr graph](../_static/mtr.png) -Here is what the percentage change in aftertax income graph in `cps-24-#-ref3-#-pch.html` looks like: +Here is what the percentage change in aftertax income graph in +`cps-24-#-ref3-#-pch.html` looks like: ![pch graph](../_static/pch.png) -There is yet another `tc` output option that writes to the screen results from a normative welfare analysis of the specified policy reform. This `--ceeu` option produces experimental results that make sense only with input files that contain representative samples of the population such as the `cps.csv` file. The name of this option stands for certainty-equivalent expected utility. If you want to use this output option, you should read the commented Python source code for the `ce_aftertax_expanded_income` function in the `taxcalc/utils.py` file in the [Tax-Calculator repository](https://github.com/PSLmodels/Tax-Calculator). - -None of the above examples use the `--baseline` option, which means that baseline policy in those examples is current-law policy. The following example shows how to use the `--baseline` option to engage in counter-factual historical analysis. Suppose we want to analyze what would have happened if some alternative to TCJA had been enacted in late 2017\. To do this we need to have pre-TCJA policy be the baseline policy and we need to have the alternative reform be implemented relative to pre-TCJA policy. The following `tc` run does exactly that using a local copy of the [2017_law.json](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/2017_law.json) file and the `alt_reform.json` file containing the alternative reform defined relative to pre-TCJA law. 
+There is yet another `tc` output option that writes to the screen +results from a normative welfare analysis of the specified policy +reform. This `--ceeu` option produces experimental results that make +sense only with input files that contain representative samples of the +population such as the `cps.csv` file. The name of this option stands +for certainty-equivalent expected utility. If you want to use this +output option, you should read the commented Python source code for +the `ce_aftertax_expanded_income` function in the `taxcalc/utils.py` +file in the [Tax-Calculator +repository](https://github.com/PSLmodels/Tax-Calculator). + +None of the above examples use the `--baseline` option, which means +that baseline policy in those examples is current-law policy. The +following example shows how to use the `--baseline` option to engage +in counter-factual historical analysis. Suppose we want to analyze +what would have happened if some alternative to TCJA had been enacted +in late 2017\. To do this we need to have pre-TCJA policy be the +baseline policy and we need to have the alternative reform be +implemented relative to pre-TCJA policy. The following `tc` run does +exactly that using a local copy of the +[2017_law.json](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/2017_law.json) +file and the `alt_reform.json` file containing the alternative +reform defined relative to pre-TCJA law. ``` $ tc cps.csv 2019 --baseline 2017_law.json --reform 2017_law.json+alt_reform.json @@ -240,29 +499,53 @@ You loaded data for 2014. Tax-Calculator startup automatically extrapolated your data to 2019. ``` -In all the examples in this section, we have executed one `tc` run at at time. -But **what if you want to execute many `tc` runs** because you want results for many years and/or for several different reforms. 
-Unless you are asking for full-dump output, a single `tc` run should take no more than one minute on your computer (even if you are using the large `cps.csv` input file). -The easiest way to speed up the execution of many `tc` runs is to split them into groups of runs and execute each group of runs in a different command-prompt window. -On most modern computers that have four CPU cores and a fast disk drive, executing four runs in different windows will take not much more time than executing a single `tc` run. -If you have more than one run in each group, put them in a Unix/Mac bash script or a Windows batch file, and execute one script in each command-prompt window. -If it still takes too long, consider splitting the `tc` runs across more than one computer. +In all the examples in this section, we have executed one `tc` run at +a time. But **what if you want to execute many `tc` runs** because +you want results for many years and/or for several different reforms? +Unless you are asking for full-dump output, a single `tc` run should +take no more than one minute on your computer (even if you are using +the large `cps.csv` input file). The easiest way to speed up the +execution of many `tc` runs is to split them into groups of runs and +execute each group of runs in a different command-prompt window. On +most modern computers that have four CPU cores and a fast disk drive, +executing four runs in different windows will take not much more time +than executing a single `tc` run. If you have more than one run in +each group, put them in a Unix/Mac bash script or a Windows batch +file, and execute one script in each command-prompt window. If it +still takes too long, consider splitting the `tc` runs across more +than one computer. ## Tabulate reform results -Given that `tc` output can be written to either CSV-formatted files or SQLite3 database files, there is an enormous range of software tools that can be used to tabulate the output.
You can use SAS or R, Stata or MATLAB, or even import output into a spreadsheet (but this would seem to be the least useful option). If you just want to compare the contents of two output files, you can use your favorite graphical diff program to view the two files side by side with highlighting of numbers that are different. The main point is to use a software tool that is available to you, that is appropriate for the task, and that you have experience using. - -Here we give some examples of using the `sqlite3` command-line tool that is part of the Anaconda distribution (so it is always available when using Tax-Calculator). The first step, of course, is to use the `--sqldb` option when running `tc`. Then you can use the `sqlite3` tool interactively or use it to execute SQL scripts you have saved in a text file. We'll provide examples of both those approaches. There are many online tutorials on the SQL select command; if you want to learn more, search the Internet. +Given that `tc` output can be written to either CSV-formatted files or +SQLite3 database files, there is an enormous range of software tools +that can be used to tabulate the output. You can use SAS or R, Stata +or MATLAB, or even import output into a spreadsheet (but this would +seem to be the least useful option). If you just want to compare the +contents of two output files, you can use your favorite graphical diff +program to view the two files side by side with highlighting of +numbers that are different. The main point is to use a software tool +that is available to you, that is appropriate for the task, and that +you have experience using. + +Here we give some examples of using the `sqlite3` command-line tool +that is part of the Anaconda distribution (so it is always available +when using Tax-Calculator). The first step, of course, is to use the +`--sqldb` option when running `tc`. Then you can use the `sqlite3` +tool interactively or use it to execute SQL scripts you have saved in +a text file. 
We'll provide examples of both those approaches. There +are many online tutorials on the SQL select command; if you want to +learn more, search the Internet. First, we provide a simple example of using `sqlite3` interactively. -This approach is ideal for exploratory data analysis. -Our example uses the `cps.csv` file as input, but you can do the following with -the output from any input file that has weights (`s006`). -Also, we specify no policy reform file, so the output is for current-law policy. -What you cannot see from the following record of the analysis is that the -`sqlite3` tool keeps a command history, so pressing the up-arrow key will bring -up the prior command for editing. -This feature reduces substantially the amount of typing required to conduct +This approach is ideal for exploratory data analysis. Our example +uses the `cps.csv` file as input, but you can do the following with +the output from any input file that has weights (`s006`). Also, we +specify no policy reform file, so the output is for current-law +policy. What you cannot see from the following record of the analysis +is that the `sqlite3` tool keeps a command history, so pressing the +up-arrow key will bring up the prior command for editing. This +feature reduces substantially the amount of typing required to conduct exploratory data analysis. ``` @@ -280,12 +563,11 @@ sqlite> YOUR FINAL SQL COMMAND GOES HERE sqlite> .quit ``` -Second, we provide a simple example of using `sqlite3` with SQL commands stored -in a text file. -This approach is useful if you want to tabulate many different output files in -the same way. -This second example assumes that the first example has already been done. -Note that on Windows you should replace `cat` with `type`. +Second, we provide a simple example of using `sqlite3` with SQL +commands stored in a text file. This approach is useful if you want +to tabulate many different output files in the same way. 
This second +example assumes that the first example has already been done. Note +that on Windows you should replace `cat` with `type`. ``` $ cat tab.sql @@ -340,21 +622,20 @@ bin number | weighted count | mean NON-NEGATIVE MTR in bin ``` The `cat` command writes the contents of the `tab.sql` file to stdout. -We do nothing but that in the first command in order to show you the file -contents. -The second command pipes the contents of the `tab.sql` file into the `sqlite3` -tool, which executes the SQL statements and writes the tabulation results to -stdout. -(If you're wondering about the validity of those high marginal tax rates, -rest assured that all filing units with marginal income tax rates greater than -sixty percent have been checked by hand and are valid: -most are caught in the rapid phase-out of non-refundable education credits or -in the phase-in of taxation of social security benefits. -The negative marginal tax rates are caused by refundable credits, -primarily the earned income tax credit.) - -If you want to use the `sqlite3` tool to tabulate the changes caused by a -reform, use `tc` to generate two database dump files -(one for current-law policy and the other for your reform) -and then use the SQLite3 ATTACH command to make both database files available -in your SQLite tabulation session. +We do nothing but that in the first command in order to show you the +file contents. The second command pipes the contents of the `tab.sql` +file into the `sqlite3` tool, which executes the SQL statements and +writes the tabulation results to stdout. (If you're wondering about +the validity of those high marginal tax rates, rest assured that all +filing units with marginal income tax rates greater than sixty percent +have been checked by hand and are valid: most are caught in the rapid +phase-out of non-refundable education credits or in the phase-in of +taxation of social security benefits. 
The negative marginal tax rates +are caused by refundable credits, primarily the earned income tax +credit.) + +If you want to use the `sqlite3` tool to tabulate the changes caused +by a reform, use `tc` to generate two database dump files (one for +current-law policy and the other for your reform) and then use the +SQLite3 ATTACH command to make both database files available in your +SQLite tabulation session. diff --git a/taxcalc/assumptions/README.md b/taxcalc/assumptions/README.md index 332f631d7..96aa235e7 100644 --- a/taxcalc/assumptions/README.md +++ b/taxcalc/assumptions/README.md @@ -9,7 +9,7 @@ Such an economic assumption file can then be used by the `tc` command-line interface to Tax-Calculator or be read in a Python program that imports the Tax-Calculator `taxcalc` package, as described in the [user -guide](https://PSLmodels.github.io/Tax-Calculator/uguide.html). +guide](https://taxcalc.pslmodels.org/guide/index.html#user-guide). [This document](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/assumptions/ASSUMPTIONS.md#how-to-specify-economic-assumptions-in-a-json-assumption-file) diff --git a/taxcalc/reforms/README.md b/taxcalc/reforms/README.md index 0bc24cec4..b05ae3a0c 100644 --- a/taxcalc/reforms/README.md +++ b/taxcalc/reforms/README.md @@ -8,15 +8,16 @@ are stored on your local computer. Such policy reform files can be analyzed on your local computer by using the `tc` command-line interface to Tax-Calculator as described in the [user -guide](https://PSLmodels.github.io/Tax-Calculator/uguide.html) or by -writing your own Python programs that import the Tax-Calculator -`taxcalc` package as described in the -[cookbook](https://PSLmodels.github.io/Tax-Calculator/cookbook.html). 
+guide](https://taxcalc.pslmodels.org/guide/cli.html#command-line-interface) +or by writing your own Python programs that import the Tax-Calculator +`taxcalc` package as described in the [cookbook +recipes](https://taxcalc.pslmodels.org/recipes/index.html#recipes). [This document](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/REFORMS.md#how-to-specify-a-tax-reform-in-a-json-policy-reform-file) provides access to several reform files that represent recent tax reform proposals and guidelines about how to prepare your own reform -files. Look in particular at the +files. Look in particular at the older [FAQ](https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/TCJA.md#tcja-faq) -on TCJA-related reforms. +on TCJA-related reforms and the newer section on [TCJA after +2025](https://taxcalc.pslmodels.org/usage/tcja_after_2025.html#tcja-after-2025).
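
Note on the `docs/guide/cli.md` hunk: its closing paragraph suggests tabulating reform-induced changes by generating two database dump files and making both available with the SQLite3 ATTACH command. The following is a minimal sketch of that idea using Python's standard `sqlite3` module instead of the `sqlite3` command-line tool. It assumes each database holds a table named `dump` (as the guide describes) with a weight column `s006`; the `RECID` join key, the `iitax` column, and the toy rows are assumptions made only for illustration, not output from a real `tc` run.

```python
import os
import sqlite3
import tempfile


def weighted_total_change(base_db, reform_db, column="iitax"):
    """Weighted sum of (reform - baseline) for `column` across two dump files.

    Assumes both files contain a `dump` table with RECID, s006 and `column`.
    `column` is interpolated into the SQL text, so pass only trusted names.
    """
    conn = sqlite3.connect(base_db)
    try:
        # ATTACH exposes the second database file under the alias "reform".
        conn.execute("ATTACH DATABASE ? AS reform", (reform_db,))
        query = (
            f'SELECT SUM((r."{column}" - b."{column}") * b.s006) '
            "FROM main.dump AS b JOIN reform.dump AS r USING (RECID)"
        )
        return conn.execute(query).fetchone()[0]
    finally:
        conn.close()


def _make_toy_db(path, rows):
    """Create a tiny stand-in for a tc dump database (hypothetical data)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE dump (RECID INTEGER, s006 REAL, iitax REAL)")
    conn.executemany("INSERT INTO dump VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()


def demo():
    """Build two toy databases and return the weighted change in iitax."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "base.db")
        reform = os.path.join(tmp, "reform.db")
        _make_toy_db(base, [(1, 100.0, 500.0), (2, 200.0, 1000.0)])
        _make_toy_db(reform, [(1, 100.0, 450.0), (2, 200.0, 900.0)])
        return weighted_total_change(base, reform)


if __name__ == "__main__":
    print(demo())
```

With real output, `base_db` and `reform_db` would be the `.db` files written by two `tc ... --sqldb` runs, and `column` would be whichever dump variable is of interest.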