Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for capturing data for any variable at runtime #355

Merged
merged 15 commits into from
Sep 28, 2023

Conversation

chrispcampbell
Copy link
Contributor

Fixes #105

@ToddFincannonEI: I'm finally getting around to submitting a PR for this feature that I started over 2 years ago. We've been using this branch in production (for En-ROADS and C-ROADS) for the past 6 months, so I think it's stable enough to merge into main.

There are currently two use cases at CI where this feature comes in handy:

  • In our automated model checks, we use this to access internal variables that aren't normally accessible to model-check (things like internal/historical data that aren't declared as output variables).
  • For the new online En-ROADS Technical Reference, we use this to generate the hyperlinked Model Equations component in each chapter. This component allows anyone to see a graph of any equation in the model.

I realize this is a large PR, and there's no rush on reviewing it. We could wait to go through it together when we meet in person soon. I probably could've split the JSON listing part into its own PR, but it all kind of goes together so decided to leave it in.

I'll try to succinctly explain how it all fits together:

  1. sde generate --list now produces a JSON file in addition to the existing YAML and TXT outputs. I know we had talked about removing the YAML and TXT variants but I figure it's better to do that as a separate change since this one is already big enough.

  2. In that JSON file, for each variable, there is a varIndex field. This varIndex corresponds to the value that code gen emits in the new storeOutput function in the generated C code. For example, variable X might have a varIndex of 1. At runtime, the code that runs the model can request data for variable X by using that index value of 1.

  3. The code gen outputs a giant switch statement in the generated C code that handles the mapping of varIndex values to the appropriate internal C variable (and handles access to subscripted variables as well). The storeOutput function body is gated by a preprocessor flag that is off by default (it needs to be enabled at build time). Therefore, even though the generated C code is larger, there is no impact on the compiled code size or performance when the switch is off.

  4. Likewise, there's a new outputIndexBuffer argument to the runModelWithBuffers function, but in the common case where this parameter is left as NULL, then that code is completely unused and there is no impact on runtime performance or behavior.

  5. The JS-level runtime and runtime-async packages have been updated to support this new functionality, but I decided to apply @hidden tags for most of the new APIs so that they don't yet appear in the public API documentation. This is because a) I'm not ready to call this new API "done", as I've got some changes in mind to make it better and b) we (CI) are probably going to be the only users of this API in the beginning. Once this initial work gets merged in, I can look at the API again with fresh eyes (and ideally align it with how we incorporate your proposal for changes to how inputs are passed in).

  6. I added an automated E2E test of the new functionality in tests/integration/impl-var-access to make sure it all works together.

@travisfranck
Copy link
Collaborator

Cool new change. I can definitely see how this will be useful for unit tests.

@ToddFincannonEI
Copy link
Collaborator

I haven't reviewed the code, but I will comment on your description. This is not something I would use in my own work, but it's a big step toward duplicating the full functionality of Vensim. Having the giant switch statement off by default is important for me, since our models are so large.

I would like to keep the text listing output. That's something I use a lot. I think the JSON output could replace YAML. I have some model analysis tools that use YAML, but they could use JSON instead. I agree that JSON is a better choice than YAML for model listings.

@chrispcampbell
Copy link
Contributor Author

@ToddFincannonEI:

I haven't reviewed the code

Not a problem. Let me know if you mean "haven't reviewed yet" (in which case I'll await your review) or if you mean "unlikely to review".

Having the giant switch statement off by default is important for me, since our models are so large.

Yes, same here.

FWIW, I think I could get rid of the "giant switch statement" approach entirely if the code generator had support for pluggable/custom allocators (which I alluded to in a comment on the WebGPU feature).

But that's a separate topic. For now, with the switch off, there's no impact on build or runtime performance.

I would like to keep the text listing output. That's something I use a lot. I think the JSON output could replace YAML.

Fair enough, all three are still available. Later (separate from this) we can drop the YAML output in favor of JSON, and we can keep the TXT output.

@ToddFincannonEI
Copy link
Collaborator

Sounds good — I am unlikely to review the code itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for capturing data for any variable at runtime
3 participants