Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generating C code is slow for EPS #167

Closed
ToddFincannon opened this issue Nov 19, 2021 · 4 comments · Fixed by #168 or #190
Closed

Generating C code is slow for EPS #167

ToddFincannon opened this issue Nov 19, 2021 · 4 comments · Fixed by #168 or #190
Assignees
Milestone

Comments

@ToddFincannon
Copy link
Collaborator

Generating C code for the EPS model takes 108 seconds on my Mac, which seems way too slow.

time node src/sde.js generate --genc --spec ~/Projects/eps-sde/co2e_spec.json ~/Projects/eps-sde/eps.mdl
@ToddFincannon
Copy link
Collaborator Author

Following Chris's advice in #62, I tried the Clinic Flame profiler. It crashed when running sde generate, so I tried the 0x project that is is based on.

0x -- node src/sde.js generate --genc --spec ~/Projects/eps-sde/co2e_spec.json ~/Projects/eps-sde/eps.mdl

This worked, although the flame graph is not as nice. However, the problem is clearly the readCsv function.

before

A quick check showed that readCsv is called repeatedly for the same file. This is because multiple equations in EPS use GET DIRECT DATA for one row or column of a CSV file — in one case, resulting in the same rather large CSV file being read and parsed 2725 times.

The solution is to cache the parsed CSV files so they are only read once. This reduced sde generate time from 108 to 20 seconds.

After this, the profiler did not show any standout CPU hogs, so I am satisfied with ending optimization efforts here.

after

@ToddFincannon
Copy link
Collaborator Author

ToddFincannon commented Nov 19, 2021

There is an additional problem that comes up when we generate code for the full output variable list in EPS. The speedup here was only 2.5x instead of 5x for the previous example with only one output variable.

recordRefsOfVariable

Here recordRefsOfVariable goes into a deep recursion that takes a lot of time. We might be able to improve this by turning referencedRefIds into a Set, so we don't do a linear array search.

@chrispcampbell chrispcampbell changed the title generating C code is slow for EPS Generating C code is slow for EPS Nov 19, 2021
@chrispcampbell
Copy link
Contributor

@ToddFincannon I was just about to point you at referencedRefIds and suggest trying a Set (or revamping that code in general) but you beat me to it. The idea behind referencedRefIds was to eliminate further (redundant) processing; I can see now that a Set would've been more appropriate there. I'm curious if that search is where the time is being spent in that graph above, or if there's something else going on. Let me know if you have other ideas to speed that code up; I don't think I spent too much time tuning it.

@ToddFincannon
Copy link
Collaborator Author

Using a Set reduced the genc time from 60 to 19.3 seconds — well worthwhile! Now I think we're done.

recordRefsOfVariable2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment