Add ability to read JSON reform and assumption files on web #2070

martinholmer · 2018-10-04T01:39:58Z

@andersonfrailey has brought to my attention that Tax-Calculator users would like the ability to specify the URL of a JSON reform or assumption file rather than downloading it from some website and specifying its file name on the local computer). This seems like a reasonable enhancement request.

It seems as if the easiest way to implement this enhancement is to generalize the logic of the Calculator.read_json_param_objects(reform, assump) method so that the two arguments can be strings containing the URL of an online JSON file. So, for example, the URL of the Brown-Khanna Grow American Incomes Now (GAIN) Act of 2017 that is on the Tax-Calculator GitHub website is as follows:

https://github.com/open-source-economics/Tax-Calculator/blob/master/taxcalc/reforms/BrownKhanna.json

The objective of the enhancement would be to make reading the JSON at a given URL something the Calculator.read_json_param_objects(reform, assump) method can do. This enhancement would be available both to those who develop Python scripts using the taxcalc package and to those who use the tc command-line interface to the taxcalc package.

Does this approach make sense as a way to provide this enhancement? Are there other easier approaches? The approach described here provides the flexibility to read any URL; is that ability important?

The text was updated successfully, but these errors were encountered:

MaxGhenis · 2018-10-04T02:30:35Z

See #1644 (comment) where I provided code to do this:

import urllib2

REFORM_PATH = 'https://raw.githubusercontent.com/open-source-economics/Tax-Calculator/master/taxcalc/reforms/'
url = REFORM_PATH + 'TCJA_Reconciliation.json'

reform_text = urllib2.urlopen(url).read()

reform = Calculator.read_json_param_objects(reform_text, 'JCT_Behavior.json')

feenberg · 2018-10-04T12:31:39Z

On Wed, 3 Oct 2018, Martin Holmer wrote: @andersonfrailey has brought to my attention that Tax-Calculator users would like the ability to specify the URL of a JSON reform or assumption file rather than downloading it from some website and specifying its file name on the local computer). This seems like a reasonable enhancement request.

This is a nice feature. The only packages I have seen that can do this are SAS and Stata - the input statements can take a file name or http address. I expect there is some "generalized" input procedure in Python that will do this - every language and package should have this ability. Dan

martinholmer · 2018-10-04T13:31:34Z

@MaxGhenis said in issue #2070:

See #1644 (comment) where I provided code to do this [that is, read JSON reform files directly from the Tax-Calculator website]

@MaxGhenis is absolutely correct about a Tax-Calculator enhancement being unnecessary because this can be done (in Python 2.7 using the urllib2 package) with only a couple Python statements.

A new user might not have found closed issue #1644 (which was open during Nov-Dec 2017), but how to read JSON reform files in Python 3.6 from the Tax-Calculator website is illustrated in recipe 1 of the Cookbook of Tested Recipes for Python Programming with Tax-Calculator. For those who have not read that recipe, here are the first few lines:

from urllib.request import urlopen
from taxcalc import *

# read two "old" reform files from Tax-Calculator website
# ("old" means the reform files are defined relative to pre-TCJA policy)
#   For more about the compound-reform technique used in this recipe,
#   read answer to Question 1 of FAQ at the following URL:
#   https://github.com/open-source-economics/Tax-Calculator/issues/1830
BASE_URL = ('https://raw.githubusercontent.com/'
            'open-source-economics/Tax-Calculator/master/taxcalc/reforms/')

baseline_name = '2017_law.json'  # pre-TCJA policy
baseline_text = urlopen(BASE_URL + baseline_name).read().decode()
baseline = Calculator.read_json_param_objects(baseline_text, None)

reform1_name = 'TCJA_Senate.json'  # TCJA as orginally proposed in Senate
reform1_text = urlopen(BASE_URL + reform1_name).read().decode()
reform2_name = 'TCJA_Reconciliation.json'  # TCJA as passed by Congress
reform2_text = urlopen(BASE_URL + reform2_name).read().decode()

# specify Policy object for static analysis of reform1 relative to pre-TCJA
reform1 = Calculator.read_json_param_objects(reform1_text, None)
policy1 = Policy()
policy1.implement_reform(baseline['policy'], print_warnings=False)
policy1.implement_reform(reform1['policy'], print_warnings=False)

# specify Policy object for static analysis of reform2 relative to pre-TCJA
reform2 = Calculator.read_json_param_objects(reform2_text, None)
policy2 = Policy()
policy1.implement_reform(baseline['policy'], print_warnings=False)
policy2.implement_reform(reform2['policy'], print_warnings=False)

So, given all this, what should we make of the fact that (new?) users are asking @andersonfrailey how to read JSON reform files directly from the Tax-Calculator website? Do these users think the approach illustrated in Cookbook recipe 1 is too difficult to do in a Python script? Or is it the case that these users have never read Cookbook recipe 1? If users have read Cookbook recipe 1, but think it is too difficult, then we have one kind of problem. But if users have never read the Cookbook, then we have a very different kink of problem. @andersonfrailey, which is it?

andersonfrailey · 2018-10-04T15:12:50Z

@martinholmer I believe it's both. To my knowledge the users I've talked to haven't read the cookbook, but they're also relatively new to Python and aren't familiar with packages like requests and urllib.

I agree that this enhancement isn't technically necessary because there is a solution as has been stated earlier in this conversation, but I also think providing an easy way for users to access at least the sample reforms we've created using a function of the taxcalc class would be a nice feature especially for inexperienced users. I'm envisioning them having a workflow that looks something like this:

import taxcalc

reform = taxcalc.reforms('2017_law')
params = taxcalc.Calculator.read_json_param_objects(reform, None)
pol = taxcalc.Policy()
pol.implement_reform(params['policy'])
...

The reforms function could work in a couple of ways. First, we could limit it to just accessing the reforms we have stored in taxcalc/reforms. The user would specify by name which one they wanted, then the function would pull the file from GitHub and return JSON text that's ready to be read by Calculator.read_json_param_objects or a dictionary that can be passed directly into Policy.implement_reform. This is what I laid out in the above example.

Second option would be to generalize reforms so that it can take as an argument any URL from which to pull the JSON text whether it's hosted on GitHub or some other site and return either JSON text or a dictionary. This would effectively take this part of the cookbook recipe and put it in taxcalc:

baseline_name = '2017_law.json'  # pre-TCJA policy
baseline_text = urlopen(BASE_URL + baseline_name).read().decode()

By making this an actual feature of taxcalc we could also extend this capability to the command line users by adding an argument that takes either a URL or reform name.

MaxGhenis · 2018-10-04T15:50:32Z

I provided the code to save some time implementing this feature. I think both of @andersonfrailey's solutions would be really nice additions, i.e. supporting both:

taxcalc.Calculator.read_json_param_objects(taxcalc.reforms('2017_law'), None)
taxcalc.Calculator.read_json_param_objects('http://...path.../2017_law.json'), None)

hdoupe · 2018-10-04T16:41:22Z

I think option 1 that @andersonfrailey outlines could be a nice feature. I think the limited scope of the first part of the proposal means that this could be added and maintained with a minimal amount of effort.

Many tutorials using scikit-learn make use of a built in "iris" data set. Users can easily follow along and replicate the analyses just by pip/conda installing the packages.

martinholmer · 2018-10-04T16:54:30Z

@andersonfrailey seems to be saying in issue #2070 that users generally aren't reading the Cookbook, but even if they had read the Cookbook they would want an easier way to read JSON reform/assumption files from a website. And @MaxGhenis agrees that an ease-of-use enhancement would be desirable.

If we are going to add such an ease-of-use enhancement, I don't like the idea of restricting access to only JSON files that are on the Tax-Calculator website. The whole idea of the Tax-Calculator code is that it is highly portable, so adding a Calculator method that is Open-Source-Economics centric is not desirable. We want to encourage users to post their own JSON reform files on their personal websites for others to use in their Python scripts. This would be a nice feature that might encourage the growth of an interactive user group. And it is no more difficult to implement what @andersonfrailey and @MaxGhenis called option 2 (see above comments) than it is to implement option 1.

The way to implement option 2 is along the lines I outlined in the first comment in this issue.

Would somebody like to volunteer to prepare a pull request that implements option 2?

andersonfrailey · 2018-10-04T17:30:50Z

@martinholmer I can take care of it.

MattHJensen · 2018-10-04T17:36:08Z

I agree with @martinholmer that we should encourage the development of JSON files on websites other than Tax-Calculator's.

Establishing the Tax-Calculator repo as a central hub for reform files places a significant review and maintenance burden on Tax-Calculator developers, and it creates a fairly serious reputational risk for the project.

As an aside:

At some point we may even want to move the existing JSON reform files out of the Tax-Calculator code base and into separate unaffiliated or semi-affiliated repositories. Yes, I was an advocate for putting them in Tax-Calculator to begin with, and it was probably the right decision at the time, but I am not so sure whether it is going forward. We should have a discussion about this at some point.

martinholmer · 2018-10-04T18:25:01Z

@andersonfrailey said:

I can take care of it.

Great! Look forward to your PR.

martinholmer · 2018-10-04T18:27:21Z

@MattHJensen said in issue #2070:

we should encourage the development of JSON files on websites other than Tax-Calculator's.

Establishing the Tax-Calculator repo as a central hub for reform files places a significant review and maintenance burden on Tax-Calculator developers, and it creates a fairly serious reputational risk for the project.

As an aside:

At some point we may even want to move the existing JSON reform files out of the Tax-Calculator code base and into separate unaffiliated or semi-affiliated repositories. Yes, I was an advocate for putting them in Tax-Calculator to begin with, and it was probably the right decision at the time, but I am not so sure whether it is going forward. We should have a discussion about this at some point.

@MattHJensen, thanks for making these excellent points. I hadn't thought of this angle, but I agree with your line of thinking.

martinholmer · 2018-10-04T18:34:16Z

@hdoupe said in issue #2070:

Many tutorials using scikit-learn make use of a built in "iris" data set. Users can easily follow along and replicate the analyses just by pip/conda installing the packages.

Yes and Tax-Calculator makes use of a built in CPS data set via a single statement:

recs = Records.cps_constructor()

So, on the data side Tax-Calculator is even easier to use than the "iris" data set in scikit-learn because users don't have to install a data package (the CPS data is built into the taxcalc package).

Making online JSON reform/assumption files useable with the Calculator.read_json_param_objects() method will help users write their Python programs. Does scikit-learn have that kind of capability?

hdoupe · 2018-10-04T19:12:07Z

@martinholmer I think it's great that we already have the cps_constructor method. I lopped the reform files into that same "data" category which already includes the cps file. So, I was trying to say that I wanted more data (the reform files) accessible from the python package.

martinholmer · 2018-10-04T19:15:38Z

@hdoupe said in issue #2070:

I think it's great that we already have the cps_constructor method. I lopped the reform files into that same "data" category which already includes the cps file. So, I was trying to say that I wanted more data (the reform files) accessible from the python package.

OK. That makes sense.

hdoupe · 2018-10-04T19:16:30Z

Yep, sorry for the misunderstanding. I'm excited to see @andersonfrailey's implementation of the reform file fetcher.

martinholmer · 2018-10-25T14:14:56Z

This issue was resolved by the merger of pull request #2079.

martinholmer added enhancement request labels Oct 4, 2018

andersonfrailey mentioned this issue Oct 10, 2018

Allow user to specify a reform file from a URL #2079

Merged

martinholmer closed this as completed Oct 25, 2018

MattHJensen mentioned this issue Aug 14, 2020

Function to get reform json from installed version of taxcalc #2445

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ability to read JSON reform and assumption files on web #2070

Add ability to read JSON reform and assumption files on web #2070

martinholmer commented Oct 4, 2018

MaxGhenis commented Oct 4, 2018

feenberg commented Oct 4, 2018 via email

martinholmer commented Oct 4, 2018

andersonfrailey commented Oct 4, 2018

MaxGhenis commented Oct 4, 2018

hdoupe commented Oct 4, 2018

martinholmer commented Oct 4, 2018 •

edited

Loading

andersonfrailey commented Oct 4, 2018

MattHJensen commented Oct 4, 2018

martinholmer commented Oct 4, 2018

martinholmer commented Oct 4, 2018 •

edited

Loading

martinholmer commented Oct 4, 2018

hdoupe commented Oct 4, 2018

martinholmer commented Oct 4, 2018

hdoupe commented Oct 4, 2018

martinholmer commented Oct 25, 2018

Add ability to read JSON reform and assumption files on web #2070

Add ability to read JSON reform and assumption files on web #2070

Comments

martinholmer commented Oct 4, 2018

MaxGhenis commented Oct 4, 2018

feenberg commented Oct 4, 2018 via email

martinholmer commented Oct 4, 2018

andersonfrailey commented Oct 4, 2018

MaxGhenis commented Oct 4, 2018

hdoupe commented Oct 4, 2018

martinholmer commented Oct 4, 2018 • edited Loading

andersonfrailey commented Oct 4, 2018

MattHJensen commented Oct 4, 2018

martinholmer commented Oct 4, 2018

martinholmer commented Oct 4, 2018 • edited Loading

martinholmer commented Oct 4, 2018

hdoupe commented Oct 4, 2018

martinholmer commented Oct 4, 2018

hdoupe commented Oct 4, 2018

martinholmer commented Oct 25, 2018

martinholmer commented Oct 4, 2018 •

edited

Loading

martinholmer commented Oct 4, 2018 •

edited

Loading