-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why are notebooks in the taxdata repository? #206
Comments
I believe the notebooks were originally used to develop the python scripts associated with whichever directory they're in. I'd say the stage 1 notebook is no longer needed, but the stage 2 notebook is still useful. The CyLP model used for PUF stage 2 and the PuLP model I used for the CPS stage 2 will do a "pre-solve" to see if the LP problem they're solving is feasible with the given tolerance. Unfortunately even if the problem is determined to be infeasible the models will still try and solve it with the only notification of it's infeasibility being a print statement in the terminal window. Unless you actively stop the program when you see the infeasibility notice it will keep running indefinitely. The notebook makes it easier for us to check that the problem will be solvable each year, update the .py files with the tolerances we know will result in a feasible problem, and then run the .py file without having to worry about whether or not the problem is feasible. Otherwise whoever is running the stage 2 scripts will have to keep checking the terminal and restart the entire thing every time a tolerance needs to be updated. |
@andersonfrailey said in #206:
OK. I'll prepare a PR that removes it.
Thanks for the explanation. I see the problem, but I think there is a better solution to this problem. I'll post a comment here that describes a solution that does not require an error-prone duplication of code (in the Python script and in the notebook) and does stop execution of the Python script once an "infeasibility notice" is written. |
@martinholmer it prints to stdout. Here is the message you see in terminal (on a mac) when using CyLP: |
Closing issue #206 following decision to not include any notebooks in the taxdata repository. |
One of the prime benefits of using a version control system (such as
git
) is that it provides a history of changes in a project's source code. But putting source code in notebooks makes it difficult to see the history of changes in source code, undercutting one of the major benefits of using version control.On Friday, May 18, 2018, @MattHJensen discussed the issues related to including notebooks in GitHub project repositories, and concluded that they should not be included in project repositories. Notebooks are a convenient way to share work with other people, and so he was open to creating a notebook-sharing repository if there was demand for such a thing among those working with taxdata, Tax-Calculator, and projects that use Tax-Calculator.
Given that decision, Tax-Calculator pull request 203 removed the only notebook in the Tax-Calculator repository (
read-the-docs/notebooks/10_Minutes_to_Tax-Calculator.ipynb
).Given all this, my question is why are notebooks included in the taxdata repository?
On the tip of the master branch, the following two notebooks exist:
taxdata/stage1/notebook/Stage I.ipynb
, which was last modified by @andersonfrailey more than a year ago on Feb 13, 2017, and is not documented (as far as I can see). What is its relationship totaxdata/stage1/stage1.py
? Why is it under version control?taxdata/puf_stage2/notebook/CLP_solver_16Years.ipynb
, which was last modified by @andersonfrailey about a month ago, and is not documented (as far as I can see). What is its relationship totaxdata/puf_stage2/stage2.py
? Why is it under version control?@MattHJensen @Abraham-Leventhal
The text was updated successfully, but these errors were encountered: