
Disentangle solution and simulation frameworks in HARK.core and downstream classes #495

Closed
sbenthall opened this issue Feb 8, 2020 · 17 comments

@sbenthall
Contributor

Related to #493

HARK.core.AgentType provides frameworks for both solving models and simulating them.

These two sets of processes are conceptually distinct. This ticket is for disentangling them in the code, decoupling the two kinds of functionality.

Ideally, it would be possible to use Dolo's simulation engine on a HARK model, and HARK's simulation engine on a Dolo model.

@mnwhite
Contributor

mnwhite commented Feb 8, 2020 via email

@sbenthall
Contributor Author

Ah, ok.

If you delete all the solution methods from AgentType, would the simulation methods still work?

How can you tell (besides looking through all the code) which of the methods are simulation methods, and which are necessary for the solver?

@llorracc
Collaborator

llorracc commented Feb 8, 2020 via email

@sbenthall
Contributor Author

I wasn't proposing that we delete anything.
I think @mnwhite and I are discussing to what extent the simulation and solution functionalities are currently coupled.
This is instrumental to the goal of changing the HARK interface to be more modular.

@mnwhite
Contributor

mnwhite commented Feb 8, 2020 via email

@sbenthall
Contributor Author

I see. Thanks, that makes sense.

I think Dolo's representation of policies is a class called DecisionRule:
https://github.com/EconForge/dolo/blob/master/dolo/numeric/decision_rule.py

@sbenthall sbenthall self-assigned this Feb 8, 2020
@sbenthall
Contributor Author

Revisiting this as I now think I understand how the simulation workflow is supposed to go.
This automated test uses the workflow from the PerfForesightConsumerType DemARK.
https://github.com/econ-ark/HARK/blob/master/HARK/ConsumptionSaving/tests/test_PerfForesightConsumerType.py#L31-L50

If I understand correctly, the way this works is that the SimulationParams get passed in as arguments to the callable AgentType and are assigned as member variables on the object, enabling the simulation code to work (which also depends on there being a solution, with solve having been called previously).

The thing that I find worrisome about this structure is that it allows arbitrary variables to be assigned as member variables on the AgentType, without any kind of namespacing. Something could be overwritten, breaking the conceptual coherence of the object, to say nothing of its functionality.

For example, what if somebody passes in a new parameter value for PermShkStd at the same time as they pass in the SimulationParams? Of course, this wouldn't automatically update the solution used in the simulation. But it's not clear how that could be signaled in the API docs. Rather, you have to trust the user not to try anything funny, or to have read the entire manual cover to cover, which is rarely a well-founded assumption. In fact, this architecture results in a system where a number of necessary parameters are not in the API docs at all (see #493).

It would be quite easy to scaffold things differently.
For example, simulation parameters could be passed in as arguments to the simulate method.
If they are persisted on the object, they could be persisted in a different namespace (i.e., in a dictionary that is stored as a member variable).
That would prevent any accidental collisions.
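Here is a minimal sketch of what I mean (the class and method names are mine, not HARK's actual API): simulation parameters go to simulate() and are persisted under a dedicated dictionary, so they can never collide with model parameters or methods on the instance.

```python
# Hypothetical sketch, not HARK's actual API: simulation parameters are
# passed to simulate() and stored under a dedicated namespace, so they
# cannot collide with model parameters or shadow methods on the instance.
class AgentTypeSketch:
    def __init__(self, **model_params):
        self.model_params = dict(model_params)

    def simulate(self, **sim_params):
        # Persist simulation parameters in their own dictionary rather
        # than as top-level attributes on the object.
        self.sim_params = dict(sim_params)
        # ... run the simulation using self.sim_params ...
        return {"T_sim": self.sim_params.get("T_sim")}

agent = AgentTypeSketch(CRRA=2.5)
result = agent.simulate(T_sim=120, AgentCount=1000)
assert "T_sim" not in vars(agent)          # no attribute collision
assert agent.sim_params["AgentCount"] == 1000
```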

scikit-learn has a possibly useful analogous coding pattern for classifiers and other trainable model classes. These take class parameters at initialization, have a fit method for fitting internal model parameters to a data set, and a predict method for applying the model to new data. fit has to be called before predict. The arguments needed for each phase of the class's use are all documented and scoped to the method where they are used.
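A toy illustration of that pattern (this class is my own invention and only mimics the scikit-learn convention; it is not part of scikit-learn):

```python
# Toy estimator mimicking the scikit-learn pattern described above:
# hyperparameters at __init__, learned parameters set by fit() (with the
# trailing-underscore convention), and predict() requiring a prior fit().
class MeanPredictor:
    def __init__(self, shrinkage=0.0):
        # Class parameters are given at initialization, documented here.
        self.shrinkage = shrinkage

    def fit(self, y):
        # Internal model parameters are learned from data, scoped to fit().
        self.mean_ = sum(y) / len(y) * (1.0 - self.shrinkage)
        return self

    def predict(self, n):
        # predict() is only valid after fit() has been called.
        if not hasattr(self, "mean_"):
            raise RuntimeError("call fit() before predict()")
        return [self.mean_] * n

model = MeanPredictor().fit([1.0, 2.0, 3.0])
assert model.predict(2) == [2.0, 2.0]
```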

@mnwhite
Contributor

mnwhite commented Feb 26, 2020 via email

@sbenthall
Contributor Author

The current assignParameters functionality allows users to overwrite core methods, like solve, like so:

>>> from HARK.ConsumptionSaving.ConsIndShockModel import PerfForesightConsumerType
>>> ex = PerfForesightConsumerType()
>>> ex(solve="This breaks things.")
>>> ex.solve()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable

It's totally unnecessary to have this much flexibility.
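One possible guard, sketched here with invented names (this is not HARK's implementation): refuse to assign any "parameter" whose name shadows an existing method.

```python
# Sketch of a guard against clobbering methods via parameter assignment.
# Names here are hypothetical, not HARK's actual classes.
class SafeParameterMixin:
    def assign_parameters(self, **params):
        for name, value in params.items():
            if callable(getattr(self, name, None)):
                raise ValueError(
                    f"refusing to overwrite method {name!r} with a parameter"
                )
            setattr(self, name, value)

class Agent(SafeParameterMixin):
    def solve(self):
        return "solved"

a = Agent()
a.assign_parameters(CRRA=2.5)        # fine: a plain parameter
try:
    a.assign_parameters(solve="This breaks things.")
except ValueError as e:
    print(e)  # the guard fires instead of silently clobbering solve()
assert a.solve() == "solved"         # solve() survives intact
```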

@llorracc
Collaborator

llorracc commented Feb 26, 2020 via email

@sbenthall
Contributor Author

I've coded up this toy example of how a smooth interface to models and simulations might be designed. I present it for your consideration.

https://github.com/sbenthall/sketches/blob/master/economics/MDP%20Interfaces.ipynb

I know it lacks the sophisticated machinery of HARK, but I want to indicate a few things about it:

  • The simulation code is totally general
  • The model variables are tracked in a clean and extensible way
  • Accessing simulated values uses the variable names to namespace the results
  • It's easy to assign a decision rule to a control variable and run a simulation, even if it's not the optimal one.

This notebook doesn't have anything yet to solve models--i.e., find the optimal decision rule for each control variable. But I think it's a clean and lightweight way to define and simulate a model.

It should in principle be possible to adapt a HARK model into this alternative form, and vice versa, no?

@llorracc
Collaborator

llorracc commented Mar 10, 2020

(I originally posted this as an issue at your sbenthall/sketches repo, but now realize it should have gone here)

Seb,

Thanks for getting started on this!

My first response is something that you emphasized to Pablo earlier: it's not possible to distinguish intrinsically between state variables and control variables, because the transition equations mean that today's choice of a control is mathematically identical to the choice of tomorrow's state. Dolo/DolARK seems to require that the list of variables be the same in every period. But a given variable, at different stages, can be either a state or a control.

It IS necessary that, at every stage (= subperiod in a cycle; = point at which a decision is made), every variable be designated as either a state or a control. But that is properly accounted for in the definition and solution algorithm for the stage -- being a "state" or a "control" is not an intrinsic property of a variable.

Again, the simple example is the portfolio problem. As of the end of the period, assets are a state variable (having been determined by the consumption choice at the beginning). But, at the beginning, the problem can be formulated as making the choice of what $a$ to end the period with.

I think it would be useful to step back from trying to instantiate these ideas in code, and to do it more abstractly, as pseudocode.

With a given number of cycles n > 0, each cycle would require something like your structure above, but the natural way to think about it is to identify each variable in the given subcycle as a state or a control. (The realization of exogenous/forcing/random variables would be treated, in this typology, as a state.)

The problem is then defined by:

  1. a payoff, which is a function of states and controls
  2. transition equations, which translate this period's states plus this period's controls into next period's states
  3. a (discounted) value function for the next period

and in the Bellman formulation, the problem is to find, for any configuration of states in the current subperiod, the choice of controls that maximizes the sum of the payoff today and the resulting value tomorrow. You keep reverting to the Markov Decision Problem terminology, but the MDP framework is incomplete without a description of the criteria by which the decision rule is determined. That's why I keep emphasizing that we are solving Bellman problems. The key special sauce that turns an MDP into a Bellman problem is the value/utility maximization step, which determines the decision rule.
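The three ingredients above combine into the familiar Bellman equation; the notation here is generic (mine, not from any particular HARK model), with payoff $u$, transition function $T$, shock $\varepsilon$, and discount factor $\beta$:

```latex
V_t(s) \;=\; \max_{c}\; u(s, c) \;+\; \beta\, \mathbb{E}\!\left[\, V_{t+1}\!\big( T(s, c, \varepsilon) \big) \,\right]
```

The decision rule is then the derived object, the maximizer of the right-hand side:

```latex
c^{*}(s) \;=\; \arg\max_{c}\; u(s, c) \;+\; \beta\, \mathbb{E}\!\left[\, V_{t+1}\!\big( T(s, c, \varepsilon) \big) \,\right]
```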

So, the most fundamental thing that needs to be kept track of is the value function. The decision rule is derived from maximizing the sum of today's value and the expected value from the resulting states. Decision rules are very useful objects DERIVED from the value function, but they are defined by their ability to maximize value. There are many cases where the problem cannot be solved without the value function.

@llorracc
Collaborator

PS. The point is not that I want to insist that nobody should ever consider models in which people are not perfect maximizers. It is that we always want to be able to compare actual behavior to optimal behavior, which means we need a framework to calculate optimal behavior. The requirement that the problem must be formulated in such a way as to permit a Bellman solution provides the crucial extra discipline and coherence that is essential to describing a problem in a way that connects with the vast economics literature.

@sbenthall
Contributor Author

sbenthall commented Mar 11, 2020

I expect it will be good to discuss this on our Thursday meeting to clear up any potential miscommunication, especially as this conversation is now happening in three places. I'm not sure I'm following you here, but let me try to respond.

I believe that in your response you are focusing on the challenge of modeling a multi-stage, and especially a cyclic, problem. Solving such a problem in the economic idiom requires a representation in Bellman form.
I've been looking for a Bellman form for a cyclic problem in HARK; I'm not sure where to locate it. I've made an issue for this: #563

I will take another look at the portfolio problem.

But I think we are talking past each other on a specific point.
In your response, you are talking about the process for solving a model.
Your description is, of course, accurate.

The title of this issue, which I'm trying to address in this work, is "Disentangle solution and simulation frameworks".

You have pointed out that in the "MDP Interfaces" notebook there is no representation of a reward function (yet), or a value function, and that therefore it is inadequate for solving the model as per the economics literature.

I know that. I said as much ("This notebook doesn't have anything yet to solve models--i.e., find the optimal decision rule for each control variable").

In my view, to "disentangle" solution and simulation functionality means being able to run a simulation of a model without having first solved it, or after having attempted to solve it in different ways.

To put it another way: Currently in HARK, the workflow is:

  • Hard-code a model class, with solver and simulation functions
  • Instantiate the model with some parameters
  • Run solve()
  • Run simulate()

As you say earlier in this thread, it's possible to change the parameters after solving the model, to simulate it in a different way (so you can run a simulation where the person is using a misinformed or suboptimal policy), but that is quite confusing.

A different way to structure it, which would be quite mathematically well-founded, would be like this:

  • Define a model in Bellman form. This includes: transition functions, a utility function, and a value function.
  • A policy or decision rule for this model (take your pick on terminology) is a function mapping period state to decisions as choice variables.
  • The simulator works generically given a policy, however generated, and a model. (This is what I demoed in the MDP Interface notebook).
  • The solver takes a model and outputs a policy (hopefully, the best one).

A benefit here is that there shouldn't be custom simulation code for each model. It should be able to run using the model and policy definitions.
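Concretely, a fully generic simulator only needs a model's transition function and a policy, however that policy was produced. Here is a sketch (the function names and the toy savings model are illustrative, not from HARK or the notebook):

```python
# Sketch of a model-agnostic simulator: it consumes a transition
# function and a policy (decision rule), with no model-specific code.
import random

def simulate(transition, policy, initial_state, periods, seed=0):
    rng = random.Random(seed)
    history, state = [], initial_state
    for _ in range(periods):
        control = policy(state)            # decision rule: state -> control
        shock = rng.gauss(0.0, 1.0)        # exogenous shock realization
        history.append({"state": state, "control": control})
        state = transition(state, control, shock)
    return history

# A toy savings model: next wealth = return on savings plus noisy income.
transition = lambda w, c, eps: 1.03 * (w - c) + 1.0 + 0.1 * eps
policy = lambda w: 0.5 * w                 # consume half of wealth; not optimal
hist = simulate(transition, policy, initial_state=10.0, periods=5)
assert len(hist) == 5 and hist[0]["control"] == 5.0
```

Swapping in a different (even deliberately suboptimal) policy requires no change to the simulator, which is the point of the disentanglement.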

@sbenthall
Contributor Author

Making a note here as it's relevant to this issue:

Currently, tolerance and pseudo_terminal are passed to the AgentType in its constructor and saved to the instance. These are used exclusively in the solution of the model.

A use case for a "disentangled" architecture would be to compare various solutions to the same model with different tolerance levels.

This would involve, at the very least, allowing these variables to be reset as arguments to the solver.
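A sketch of what that would look like (the class and the iteration inside are stand-ins, not HARK's solver): tolerance becomes an argument to solve(), so the same agent can be re-solved under different settings and the solutions compared.

```python
# Hypothetical sketch: solver-specific settings such as tolerance are
# arguments to solve() rather than constructor state, so one agent can
# be solved repeatedly under different tolerances.
class AgentSketch:
    def solve(self, tolerance=1e-6):
        # Stand-in for successive approximation: iterate a contraction
        # until the change per step falls below the tolerance.
        v, delta, iters = 0.0, 1.0, 0
        while delta > tolerance:
            v_new = 0.5 * v + 1.0
            delta, v = abs(v_new - v), v_new
            iters += 1
        return {"value": v, "tolerance": tolerance, "iterations": iters}

agent = AgentSketch()
coarse = agent.solve(tolerance=1e-2)
fine = agent.solve(tolerance=1e-8)
assert fine["iterations"] > coarse["iterations"]   # tighter tolerance, more work
```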

@sbenthall
Contributor Author

I'd like to revisit this issue in light of discussion about the Frame architecture in #865

Maybe it is time to refactor the solution code so that it is external to the AgentType.

@sbenthall
Contributor Author

I'll close this now. It has generated some other ideas but is too general in scope to be implemented directly.
We're moving in this direction with the Frame architecture, etc.
