Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different PUF weighted policy changes between API and tc CLI #2099

Closed
ernietedeschi opened this issue Nov 3, 2018 · 12 comments
Closed

Different PUF weighted policy changes between API and tc CLI #2099

ernietedeschi opened this issue Nov 3, 2018 · 12 comments
Labels

Comments

@ernietedeschi
Copy link
Contributor

ernietedeschi commented Nov 3, 2018

My policy reform is a very simple call to _EITC_basic_frac (thanks again for implementing):

// 	EBITC
{
   "policy": {
        "_EITC_basic_frac":
            {"2019": [0.5]}
  }
}

First, my CLI procedure:

python 'tc.py' 'puf.csv' 2019 --dump
python 'tc.py' 'puf.csv' 2019 --dump --reform 'ebitc.json'

And here's the weighted difference in combined that results using Stata. I'm not sure how the API handles non-filers when summing a policy change that involves a UBI, so I'm showing it both without and with nonfilers:

. table FLPDYR [iw=s006] if filer == 1, con(sum dtax) format(%20.0fc)

--------------------------------
   FLPDYR |            sum(dtax)
----------+---------------------
     2019 |       -4,620,714,171
--------------------------------


. table FLPDYR [iw=s006], con(sum dtax) format(%20.0fc)

--------------------------------
   FLPDYR |            sum(dtax)
----------+---------------------
     2019 |       -6,229,368,996
--------------------------------

So either $4.6B or $6.2B in 2019. Now for the API. Here's my code:

import pandas as pd
from taxcalc import *

recs = Records()

pol = Policy(start_year=2019)
calc1 = Calculator(policy=pol, records=recs)

cyr = 2019

calc1.advance_to_year(cyr)
calc1.calc_all()
itax_rev1 = calc1.weighted_total('combined')

reform_filename = 'ebitc.json'

params = Calculator.read_json_param_objects(reform=reform_filename, assump=None)

pol.implement_reform(params['policy'])
calc2 = Calculator(policy=pol, records=recs)

calc2.advance_to_year(cyr)
calc2.calc_all()
itax_rev2 = calc2.weighted_total('combined')
itax_diff = itax_rev2 - itax_rev1

print('{}_CLP_itax_rev($B)= {:.3f}'.format(cyr, itax_rev1 * 1e-9))
print('{}_REF_itax_rev($B)= {:.3f}'.format(cyr, itax_rev2 * 1e-9))
print('{}_DIFF_itax_rev($B)= {:.3f}'.format(cyr, itax_diff * 1e-9))

And here's the output:

2019_CLP_itax_rev($B)= 3031.060
2019_REF_itax_rev($B)= 3025.865
2019_DIFF_itax_rev($B)= -5.195

So the API is giving me a cost of $5.2B, which doesn't match up with the dumped PUF results from the CLI.

Any ideas?

@ernietedeschi
Copy link
Contributor Author

As an update: I figured out that if I remove start_year=2019 from the Policy() call, then I get a result that matches the CLI microdata dump.

So line 4 of my Python code becomes

pol = Policy()

And then the output is

You loaded data for 2011.
Tax-Calculator startup automatically extrapolated your data to 2013.
You loaded data for 2011.
Tax-Calculator startup automatically extrapolated your data to 2013.
2019_CLP_itax_rev($B)= 2807.972
2019_REF_itax_rev($B)= 2801.743
2019_DIFF_itax_rev($B)= -6.229

I thought the start_year option only affected behavioral assumptions (I left it in there because in my more extensive workflow I was toggling behavioral assumptions). Why does it make such a huge difference here, where I'm not making any behaviorial changes?

@martinholmer
Copy link
Collaborator

@evtedeschi3, I was in the midst of replicating the results you first reported in issue #2099, and I had gotten far enough to confirm your CLI results (with and without filtering on the filer variable, about which I'll have more to say below), when I received your second comment.

When I try to replicate your API results, I get this:

iMac:Tax-Calculator mrh$ cat ebitc.py 
import pandas as pd
from taxcalc import *

recs = Records()

pol = Policy(start_year=2019)
calc1 = Calculator(policy=pol, records=recs)

cyr = 2019

calc1.advance_to_year(cyr)
calc1.calc_all()
itax_rev1 = calc1.weighted_total('combined')

reform_filename = 'ebitc.json'

params = Calculator.read_json_param_objects(reform=reform_filename, assump=None)

pol.implement_reform(params['policy'])
calc2 = Calculator(policy=pol, records=recs)

calc2.advance_to_year(cyr)
calc2.calc_all()
itax_rev2 = calc2.weighted_total('combined')
itax_diff = itax_rev2 - itax_rev1

print('{}_CLP_itax_rev($B)= {:.3f}'.format(cyr, itax_rev1 * 1e-9))
print('{}_REF_itax_rev($B)= {:.3f}'.format(cyr, itax_rev2 * 1e-9))
print('{}_DIFF_itax_rev($B)= {:.3f}'.format(cyr, itax_diff * 1e-9))

iMac:Tax-Calculator mrh$ cat ebitc.json 
// 	EBITC
{
   "policy": {
        "_EITC_basic_frac": {"2019": [0.5]}
  }
}

iMac:Tax-Calculator mrh$ python ebitc.py
Traceback (most recent call last):
  File "ebitc.py", line 6, in <module>
    pol = Policy(start_year=2019)
  File "/Users/mrh/work/OSPC/Tax-Calculator/taxcalc/policy.py", line 76, in __init__
    self._inflation_rates = self._gfactors.price_inflation_rates(syr, lyr)
  File "/Users/mrh/work/OSPC/Tax-Calculator/taxcalc/growfactors.py", line 111, in price_inflation_rates
    raise ValueError(msg.format(lastyear, self.last_year))
ValueError: last_year=2033 > GrowFactors.last_year=2027

I'm wondering how you got any results from this ebitc.py script, because on my computer it generates a fatal error (using either the current conda package for release 0.22.2 or the source code at the tip of the master branch). How did you execute your Python script?

@ernietedeschi
Copy link
Contributor Author

@martinholmer wrote:

I'm wondering how you got any results from this ebitc.py script, because on my computer it generates a fatal error (using either the current conda package for release 0.22.2 or the source code at the tip of the master branch). How did you execute your Python script?

It ran fine on Saturday. This morning, I updated to 0.22.2-36-g1f327f0, and then started getting a fatal error too for the first time. That's what motivated me, based on the error log, to play around with the Policy() options.

I found too that if I also specify the num_years option, I avoid the fatal error:

pol = Policy(start_year=2019, num_years=9)

@martinholmer
Copy link
Collaborator

@evtedeschi3 said in issue #2099:

I'm not sure how the API handles non-filers when summing a policy change that involves a UBI, so I'm showing it both without and with nonfilers.

I'm confused by the "involves a UBI" phrase. Your ebita.json reform doesn't involve a UBI.

We probably need to clarify the documentation of the filer variable. What the docs say now is this:

1 if unit files an income tax return; 0 if not (not used in tax-calculation logic); in the puf.csv file a value of 1 indicates record is from IRS/SOI PUF and 0 indicates record is from CPS

The first phrase and the variable's name are misleading. In reality, the filer variable is an INPUT (not an OUTPUT) variable and its value is never used by Tax-Calculator. It's original role was to indicate (for debugging purposes) which records in the puf.csv file come from the IRS-SOI-PUF data set and which records come from the March CPS sample used to represent non-filers. Not only is the value of the filer variable never used in the tax (or UBI) calculations, it is never used in any results-tabulation logic by Tax-Calculator. The filer input variable is completely ignored by Tax-Calculator.

The filer variable also appears in the cps.csv file, but I'm not sure what its meaning is in that data set. Perhaps @andersonfrailey could explain how the filer variable is constructed for the cps.csv file.

Any suggestion about how to clarify the documentation of the filer input variable, would be welcome.

@martinholmer
Copy link
Collaborator

@evtedeschi3 said in issue #2099:

It [ebita.py] ran fine on Saturday. This morning, I updated to 0.22.2-36-g1f327f0, and then started getting a fatal error too for the first time.

That's interesting. Do you remember which version of Tax-Calculator you were using before?

I guess your experience shows that we're improving Tax-Calculator with each new release :)

@ernietedeschi
Copy link
Contributor Author

@martinholmer wrote:

I'm confused by the "involves a UBI" phrase. Your ebita.json reform doesn't involve a UBI.

To be more precise, what I meant was "How the API handles tax changes that affect nonfilers too and might motivate them to file, such as refundable credits claimable even at 0 income." So a true UBI is one example of this, and the EITC_basic_frac policy change is another. But your subsequent comments make clear that filing status isn't a relevant distinction for Tax Calculator output. In other words: for the purposes of aggregating tax changes, tc doesn't make any assumptions about shifts in the likelihood of filing. All tax units are included in the aggregations.

So with that answered, I think this from above is the only open question:

I thought the start_year option only affected behavioral assumptions (I left it in there because in my more extensive workflow I was toggling behavioral assumptions). Why does it make such a huge difference here, where I'm not making any behaviorial changes?

@martinholmer
Copy link
Collaborator

martinholmer commented Nov 5, 2018

@evtedeschi3 said in issue #2099:

I found too that if I also specify the num_years option, I avoid the fatal error:

pol = Policy(start_year=2019, num_years=9)

Yes, you can avoid the error of specifying the last Policy year being beyond the last year in the growfactors.csv file by lowering the value of num_years from its default value of 15 to 9.

The start_year and num_years arguments of the Policy class constructor are to be avoided. They are used only in some of our unit tests. I can't think of any tax-analysis situations that require their use.

In summary, always create a Policy object using Policy().

@martinholmer
Copy link
Collaborator

@evtedeschi3 said in issue #2099:

I thought the [Policy class constructor]start_year [argument] only affected behavioral assumptions (I left it in there because in my more extensive workflow I was toggling behavioral assumptions).

I'm pretty sure that is not true, but maybe I'm missing something.
When I said:

In summary, always create a Policy object using Policy()

that includes all situations; even situations where you're using behavioral assumptions. For example, advanced recipe 2 in the Cookbook of Tested Recipes for Python Programming with Tax-Calculator demonstrates using the Behavior class and that recipe constructs Policy objects using the standard Policy() call.

Maybe this is an opportunity for us to learn about the need to improve Tax-Calculator documentation. Did you read something in the documentation that made you think that when using behavioral assumptions you needed to construct the Policy object in a non-standard way?

@hdoupe
Copy link
Collaborator

hdoupe commented Nov 5, 2018

@martinholmer said:

The start_year and num_years arguments of the Policy class constructor are to be avoided. They are used only in some of our unit tests. I can't think of any tax-analysis situations that require their use.

Can you point me to a test that does this?

@martinholmer
Copy link
Collaborator

@hdoupe asked in issue #2099:

Can you point me to a test that does this?

The only one I could find is in test_calculator.py:

def test_make_calculator(cps_subsample):
    syr = 2014
    pol = Policy(start_year=syr, num_years=9)
    assert pol.current_year == syr
    rec = Records.cps_constructor(data=cps_subsample)
    consump = Consumption()
    consump.update_consumption({syr: {'_MPC_e20400': [0.05]}})
    assert consump.current_year == Consumption.JSON_START_YEAR
    calc = Calculator(policy=pol, records=rec,
                      consumption=consump, behavior=Behavior())
    assert calc.current_year == syr
    assert calc.records_current_year() == syr

So, this reinforces my thought that we should simply remove both start_year and num_years from the Policy class constructor signature. Does that make sense? We've not using these arguments and they seem to cause confusion even among our most advanced users. Seems like their presence is a net loss.

@MattHJensen

@hdoupe
Copy link
Collaborator

hdoupe commented Nov 5, 2018

@martinholmer said:

So, this reinforces my thought that we should simply remove both start_year and num_years from the Policy class constructor signature. Does that make sense? We've not using these arguments and they seem to cause confusion even among our most advanced users. Seems like their presence is a net loss.

That makes sense to me.

@ernietedeschi
Copy link
Contributor Author

@martinholmer wrote:

Maybe this is an opportunity for us to learn about the need to improve Tax-Calculator documentation. Did you read something in the documentation that made you think that when using behavioral assumptions you needed to construct the Policy object in a non-standard way?

Hmm, looking over the API documentation, I may have inadvertently been looking at the wrong class to get that impression.

At any rate, my question has been fully answered so I'm going to close this issue. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants