[WIP] Stan backend #347

hyunjimoon · 2022-08-01T14:00:37Z

Description

1. What is this PR doing and how?

The main contribution of this PR is inference; its benefits are estimating parameter values and calibration. Currently pysd is doing a good job in data generation conditional on parameter values but lacks modules that retrieves parameter from the generated data. This pr aims to fill the gap; Stan is a computational Bayesian statistics package, with a large eco-system, providing state of the art inference algorithms including MCMC (HMC-NUTS) and variational inference. Given the following input from users, our output is posterior samples of estimated parameters.

dynamic model in .mdl or .xmile or .stmx format
user's classification of

a. assumed parameter
a-1. time series of parameters (e.g. temperature or demand)
a-2. fixed value of parameter (e.g. size of one franchise branch)

b. estimated parameter (e.g. delay time of shipment)
b-1. prior distribution
b-2. prior parameter

c. observed state (stock variables with observed data)

Under the hood, files under pysd/builders/stan folder, we parse xmile, mdl, stmx files into one Stan code file .stan and pass it to Stan's state of the art MCMC infefinference engine, which serves inverse process of pysd's data generation. Stan has a large ecosystem that supports both calibration and model expansion (e.g. hierarchical regression, generalized linear).

These are tested in test_scripts folder, and main parsing logics are in here and here. (SQ3. could you add some detailes on each file in the last sentence?)

2. Future plans

This is part of SilkRoad project documented here.

SilkRoad, where User and Programs

Step	Program's work (P-rows have `.function(input)`)	User's work	added info (for U-row) or infrastructure (for P-row) -	out format
U1	`Vensim` assists U1.a()	a. Translate mental model to SD model	Relation btw Variables	`.mdl`
U2	`PySD` assists U2.a()	a. Classify parameters `est_param`, `ass_param`, b. Select `obs_state` among stocks	Want (`est_param`) and Have (assume: `ass_param`, observe: `obs_state`)	`.json`
P1	`PySD`, `.read_vensim`(U1.a): Switch lang. from FP to OOP		None	`.py`
U3		a. Supply value or series of `assmed_param`, b. Choose `family`(:= dist. of `msr_err_scale`)		DataFrame `object`, `.json`
U4		a. Choose `prior_family`(`est_param`'s prior dist. type) , b. Set `prior_param` (`est_param`'s prior param)		`.json`
P2	`PySD`, `.run`(U2.ab, U3.ab): Generate synthetic data			DataFrame `object`
P3	`PySD`,`.create_stan_program`(U2.ab, U3.ab): Switch lang. from OOP to PPL			`.stan`
U5		a. Set precision with `iter_sampling` (:= # of samples)		`int`
P4	`Stan`, `.sample`(P2, P3, U5.a)): Sample posterior draws (optimize measure with specified precision)			DataFrame `object`

Our work is casting dynamic model into statistical model structure as part of Bayesian workflow project which provides principled guidelines for model building.
Two works are planned for the near future and it would be tremendously helpful some support could be given in the second.

Translating vensim and/or stella function using python here
Currently integration, random generation, lookup are completed (but need testing) and smooth is in progress. Some parts we are missing: if then else, step (using if then else), pulse(point mass + divide by dt), delay (hard..), material memory (not possible in stan as it does not allow).

PR verification (to be filled by reviewers)

The code follows the PEP 8 style
The new code has been tested properly for all the possible cases
The overall coverage has not dropped and other features have not been broken.
If new features have been added, they have been properly documented
docs/whats_new.rst has been updated
PySD version has been updated

will document on sub-types of input and output

…-backend

pep8speaks · 2022-08-01T14:00:41Z

Hello @hyunjimoon! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file pysd/builders/stan/ast_walker.py:

Line 51:80: E501 line too long (86 > 79 characters)
Line 91:80: E501 line too long (80 > 79 characters)
Line 96:80: E501 line too long (168 > 79 characters)
Line 97:80: E501 line too long (98 > 79 characters)
Line 259:80: E501 line too long (121 > 79 characters)
Line 264:80: E501 line too long (121 > 79 characters)
Line 298:80: E501 line too long (101 > 79 characters)

In the file pysd/builders/stan/stan_block_builder.py:

Line 25:80: E501 line too long (151 > 79 characters)
Line 29:80: E501 line too long (133 > 79 characters)
Line 36:5: E303 too many blank lines (2)
Line 60:80: E501 line too long (102 > 79 characters)
Line 75:80: E501 line too long (164 > 79 characters)
Line 79:80: E501 line too long (109 > 79 characters)
Line 83:80: E501 line too long (114 > 79 characters)
Line 89:80: E501 line too long (154 > 79 characters)
Line 96:80: E501 line too long (85 > 79 characters)
Line 100:80: E501 line too long (107 > 79 characters)
Line 120:80: E501 line too long (86 > 79 characters)
Line 121:80: E501 line too long (103 > 79 characters)
Line 162:80: E501 line too long (80 > 79 characters)
Line 163:80: E501 line too long (122 > 79 characters)
Line 180:80: E501 line too long (146 > 79 characters)
Line 186:1: W293 blank line contains whitespace
Line 196:80: E501 line too long (147 > 79 characters)
Line 204:80: E501 line too long (127 > 79 characters)
Line 210:5: E303 too many blank lines (2)
Line 228:80: E501 line too long (83 > 79 characters)
Line 230:80: E501 line too long (152 > 79 characters)
Line 232:80: E501 line too long (141 > 79 characters)
Line 236:5: E303 too many blank lines (3)
Line 237:80: E501 line too long (132 > 79 characters)
Line 240:80: E501 line too long (99 > 79 characters)
Line 257:80: E501 line too long (134 > 79 characters)
Line 266:80: E501 line too long (80 > 79 characters)
Line 269:80: E501 line too long (95 > 79 characters)
Line 287:80: E501 line too long (83 > 79 characters)
Line 340:80: E501 line too long (82 > 79 characters)
Line 404:80: E501 line too long (107 > 79 characters)
Line 427:80: E501 line too long (93 > 79 characters)
Line 452:80: E501 line too long (103 > 79 characters)
Line 453:1: W391 blank line at end of file

In the file pysd/builders/stan/stan_model.py:

Line 3:11: E401 multiple imports on one line
Line 9:1: E302 expected 2 blank lines, found 1
Line 20:80: E501 line too long (131 > 79 characters)
Line 22:80: E501 line too long (105 > 79 characters)
Line 23:80: E501 line too long (119 > 79 characters)
Line 28:80: E501 line too long (150 > 79 characters)
Line 35:80: E501 line too long (107 > 79 characters)
Line 47:1: E302 expected 2 blank lines, found 1
Line 59:13: E722 do not use bare 'except'
Line 80:80: E501 line too long (82 > 79 characters)
Line 82:80: E501 line too long (99 > 79 characters)
Line 92:80: E501 line too long (98 > 79 characters)
Line 95:80: E501 line too long (151 > 79 characters)
Line 101:80: E501 line too long (90 > 79 characters)
Line 114:80: E501 line too long (96 > 79 characters)
Line 117:80: E501 line too long (106 > 79 characters)
Line 121:80: E501 line too long (90 > 79 characters)
Line 125:80: E501 line too long (128 > 79 characters)
Line 130:80: E501 line too long (83 > 79 characters)
Line 135:80: E501 line too long (85 > 79 characters)
Line 142:80: E501 line too long (109 > 79 characters)
Line 143:80: E501 line too long (82 > 79 characters)
Line 144:80: E501 line too long (86 > 79 characters)
Line 152:80: E501 line too long (134 > 79 characters)
Line 155:80: E501 line too long (83 > 79 characters)
Line 156:80: E501 line too long (90 > 79 characters)
Line 157:80: E501 line too long (179 > 79 characters)
Line 161:80: E501 line too long (98 > 79 characters)
Line 162:80: E501 line too long (84 > 79 characters)
Line 163:80: E501 line too long (126 > 79 characters)
Line 165:80: E501 line too long (139 > 79 characters)
Line 166:80: E501 line too long (98 > 79 characters)
Line 168:80: E501 line too long (145 > 79 characters)
Line 170:80: E501 line too long (169 > 79 characters)
Line 173:5: E303 too many blank lines (2)
Line 175:80: E501 line too long (115 > 79 characters)
Line 176:80: E501 line too long (121 > 79 characters)
Line 177:80: E501 line too long (115 > 79 characters)
Line 183:80: E501 line too long (93 > 79 characters)
Line 184:80: E501 line too long (140 > 79 characters)
Line 187:80: E501 line too long (100 > 79 characters)
Line 189:80: E501 line too long (150 > 79 characters)
Line 192:80: E501 line too long (97 > 79 characters)
Line 202:80: E501 line too long (86 > 79 characters)
Line 205:80: E501 line too long (108 > 79 characters)
Line 208:80: E501 line too long (125 > 79 characters)
Line 213:80: E501 line too long (94 > 79 characters)
Line 215:80: E501 line too long (102 > 79 characters)
Line 216:80: E501 line too long (106 > 79 characters)
Line 217:80: E501 line too long (110 > 79 characters)
Line 218:80: E501 line too long (99 > 79 characters)
Line 229:80: E501 line too long (97 > 79 characters)
Line 238:80: E501 line too long (112 > 79 characters)
Line 240:80: E501 line too long (86 > 79 characters)
Line 249:17: E265 block comment should start with '# '
Line 249:80: E501 line too long (174 > 79 characters)
Line 251:80: E501 line too long (125 > 79 characters)
Line 253:80: E501 line too long (94 > 79 characters)
Line 254:56: E127 continuation line over-indented for visual indent
Line 254:80: E501 line too long (102 > 79 characters)
Line 255:56: E127 continuation line over-indented for visual indent
Line 255:80: E501 line too long (106 > 79 characters)
Line 256:56: E127 continuation line over-indented for visual indent
Line 256:80: E501 line too long (95 > 79 characters)
Line 257:56: E127 continuation line over-indented for visual indent
Line 260:80: E501 line too long (95 > 79 characters)
Line 261:80: E501 line too long (163 > 79 characters)
Line 268:1: W391 blank line at end of file

In the file pysd/builders/stan/utilities.py:

Line 3:1: E302 expected 2 blank lines, found 1
Line 40:5: E303 too many blank lines (2)

In the file test_scripts/print_vensim_components.py:

Line 5:1: E265 block comment should start with '# '
Line 6:1: E265 block comment should start with '# '
Line 19:33: W292 no newline at end of file

In the file test_scripts/stan_vensim_integration.py:

Line 23:80: E501 line too long (106 > 79 characters)
Line 23:88: E251 unexpected spaces around keyword / parameter equals
Line 23:90: E251 unexpected spaces around keyword / parameter equals
Line 37:1: E265 block comment should start with '# '
Line 40:1: E265 block comment should start with '# '
Line 41:1: E265 block comment should start with '# '
Line 42:1: E265 block comment should start with '# '
Line 42:18: W292 no newline at end of file
Line 42:18: W292 no newline at end of file
Line 42:18: W292 no newline at end of file

Comment last updated at 2022-09-11 12:48:26 UTC

hyunjimoon · 2022-08-04T20:18:36Z

Tom Fiddaman (from Vensim) shared his recent talk on calibration which is relevant to our goal.
Fiddaman22_VensimCalib.pdf

Based on PySD team's warm greeting and Vensim team's recent interest, may I go one step further from inference to calibration on this PR? I carefully drafted my opinion of future plans, detailed list of works, and some questions.

Work for Vensim-PySD-Stan Calibration Collaboration

Table for the whole workflow is one the first pr documentation here.

Vensim-PySD-Stan

Revise the above table until we have consensus (feedback welcomed!)
- Time-orderly step from U1 to U5 and P1 to P4 describes interaction interface between programs and users as they evolve. (branched out from Bayesian workflow)
- P.a(), U.a() are function, P.a, U.a are output
- In U2: est_param : estimated parameter, ass_param: assumed parameter, obs_state: observed state
Adding hierarchy by connecting vensim's subscript to PySD object to Stan model block

PySD-Stan

Unify input for P3 and P4 by defining types for U2ab, U3ab (ref: Arviz.InferenceData type)
Define function with the following input (ref: brms brm() ) would enhance process' modularity
- family
- prior
- algorithm
- stan_model_args
Define prior class (ref: brms set_prior, comparison with pymc's prior system is needed)
Connect algorithm to Stan's different algorithm (optimize() for optimization, variational() for variational inference, look for Kalman filtering if needed)
Decision-based calibration using generate_quantities()
Add the following to notebook:
- a. prior predictive check: how users do U3 and U4 (especially U4 requires both stat + domain knowledge)
- b. calibration:
  - 1. revisit goals and resources in U2, assumptions in U3, prior knowledge in U4, system tolerance in U5 based until model passes SBC test
  - 1. sensitivity check on prior distribution and prior parameter
  - 1. compare posterior with prior
- c. posterior predictive: compare with the real observed state data

Question for discussion

U2b: which is better between "b. Select obs_state among stocks" and b. Classify obs_state and nonobs_state? It depends on the average ratio of obs_state among stock variable which was 1 for prey-predator, and .75 for demand-supply model.
Is it ok to view vensim as functional programming (FP) language? Compared to OOP, FP's traits fewer data types and many functions seem related (perhaps why Julia has great ODE lib) e.g. of object vs function orient: array.sort() vs sort(array)
do you agree
- FP is better for U1 elicit mental model? May I express this as translate prior knowledge to prior distribution (or archetype, policy, parameter)?
- OOP is better for P2 data generation?

enekomartinmartinez · 2022-08-12T09:49:28Z

HI @hyunjimoon @Dashadower
I just wanted to let you know that we are planning to move PySD to SDXorg in September.
You should not notice any change as the links will be automatically redirected, however, it is recommendable once the migration is done to update the link location to the new one. The issues and PR will be also migrated so everything should keep as before with the only change of the repository location.
You can follow the migration process in the following issue #348

enekomartinmartinez · 2022-08-12T09:51:44Z

@hyunjimoon @Dashadower also mention that currently the tests are "waiting for approval". I will not approve them until you have included the test for the new builder so we save sending jobs to GitHub Actions.
Please, let me know when you want the CI test to be running for this PR.

…nctions

Stan backend d2d

hyunjimoon · 2022-09-08T16:24:53Z

@Dashadower and I am planning for a test:

prey-predator.mdl's python translation
prey-predator.mdl's stan translation
compare ytilde for 1 and 2 (with alpha = .55, gamma = .8, beta = .028, delta = .024)

I would need @Dashadower's help on completing stan_builder's draws2data(). Minimum Viable Product in that estimated parameter has constant value.

@JamesPHoughton Could you confirm the above would be enough to build trust on stan-interface modeling on which our future collaboration can be based?

…o stan-backend

enekomartinmartinez · 2022-09-19T21:06:37Z

@hyunjimoon @Dashadower I would recommend you to merge the master branch from PySD into your branch. So you do not work with an outdated version of PySD.

We fixed some bugs related to the latest parsimonious and pandas releases. And there are several other bug fixes and new features.

Dashadower and others added 15 commits July 14, 2022 01:34

Initial implementation of stan translator

0ee4580

Update testing.py

73db7e8

change model input and output signature

6246c21

will document on sub-types of input and output

Add LOOKUP implementation. Refactor AST walkers to classes

b651530

Automatically identify stock variables and initial values

64eb140

WIP static RNG function

4f1e37e

Update test models and initial value generation

e912553

Allow non-default outcome variable passage

e7fa259

Add missing semicolon

8c9f767

Add float to auxillary name walker

ce5bc65

Update explanation on outcome signature

f37c720

Readme for PR preparation

2903470

Update readme.md

02971a3

Update readme.md

46df7c6

Merge branch 'master' of https://github.com/Dashadower/pysd into stan…

3569efd

…-backend

JamesPHoughton mentioned this pull request Aug 3, 2022

Add ESLint to github PR checks Watts-Lab/deliberation-empirica#254

Closed

hyunjimoon and others added 4 commits August 7, 2022 03:54

Extend to Hierarchical models

12039a9

Substitute match with isinstance.

1903312

Autoformat with black

af542c4

Minor update to logic

c744e17

hyunjimoon mentioned this pull request Aug 13, 2022

Pass Pysd CI test hyunjimoon/DataInDM#8

Open

hyunjimoon and others added 5 commits August 18, 2022 02:57

Add demand_supply model workflow in notebook file

b6305e5

Relocate notebook file

76eea02

Update return signature of ode function and change deprecated stan fu…

ae761d3

…nctions

Change output signature of ode call

dc961b6

Update notebook to include full flow

8cd38f2

hyunjimoon and others added 19 commits August 18, 2022 10:37

Add tables to workflow

76f5485

Update notebook for relation input (issue SDXorg#16)

c921d21

WIP stan model interface

ad76f2a

Two not compile tested stan files

b3c5a1f

data2draws implementation

1d9545e

Updates to data2draws

a009a47

Check if variables with priors exist in the ODE function declaration

161e55e

Update codegen to match stan 2.30

9d8e1f2

Update lookup codegen logic and create cmdstan model

5584248

Change filename

09c6086

Six manual comments each for draws2data and data2draws

e2d4df8

Add files via upload

3227876

Six comments each for interface upgrade

fd5e4c1

implement draws2data

2ff05de

Merge branch 'stan-backend' into stan-backend-d2d

4ba4e81

Merge pull request #1 from hyunjimoon/stan-backend-d2d

ad4b855

Stan backend d2d

Add constraints to parameters and better input data integration

5f1b79a

Change stan model directory

d57897c

Update data block

c0f88e1

hyunjimoon mentioned this pull request Sep 2, 2022

Vignette for large/complex model workflow hyunjimoon/SBC#41

Open

enekomartinmartinez force-pushed the master branch from 4c56ed6 to daf724c Compare September 5, 2022 15:35

Codegen cleanup and draws2data implementation

53febea

hyunjimoon and others added 5 commits September 11, 2022 08:24

Rearrange vensim file structure

bd75d72

Update draws2data

83440b0

Merge branch 'stan-backend' of https://github.com/hyunjimoon/pysd int…

156b6ac

…o stan-backend

Three checks with draws2data, data2draws function

f638ac5

Bug fix and analysis notebook for preypredator

53a2576

Merge remote-tracking branch 'upstream/master' into stan-backend

c022c72

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Stan backend #347

[WIP] Stan backend #347

hyunjimoon commented Aug 1, 2022 •

edited

Loading

pep8speaks commented Aug 1, 2022 •

edited

Loading

hyunjimoon commented Aug 4, 2022 •

edited

Loading

enekomartinmartinez commented Aug 12, 2022

enekomartinmartinez commented Aug 12, 2022

hyunjimoon commented Sep 8, 2022 •

edited

Loading

enekomartinmartinez commented Sep 19, 2022

[WIP] Stan backend #347

Are you sure you want to change the base?

[WIP] Stan backend #347

Conversation

hyunjimoon commented Aug 1, 2022 • edited Loading

Description

1. What is this PR doing and how?

2. Future plans

SilkRoad, where User and Programs

PR verification (to be filled by reviewers)

pep8speaks commented Aug 1, 2022 • edited Loading

Comment last updated at 2022-09-11 12:48:26 UTC

hyunjimoon commented Aug 4, 2022 • edited Loading

Work for Vensim-PySD-Stan Calibration Collaboration

Vensim-PySD-Stan

PySD-Stan

Question for discussion

enekomartinmartinez commented Aug 12, 2022

enekomartinmartinez commented Aug 12, 2022

hyunjimoon commented Sep 8, 2022 • edited Loading

enekomartinmartinez commented Sep 19, 2022

hyunjimoon commented Aug 1, 2022 •

edited

Loading

pep8speaks commented Aug 1, 2022 •

edited

Loading

hyunjimoon commented Aug 4, 2022 •

edited

Loading

hyunjimoon commented Sep 8, 2022 •

edited

Loading