
Code generation is slow #62

Closed
chrispcampbell opened this issue Dec 24, 2020 · 7 comments · Fixed by #63 or #190

@chrispcampbell
Contributor

Running sde generate --genc with the En-ROADS model currently takes 2m 2s on my laptop (2019 MacBook Pro, 2.4 GHz 8-Core Intel Core i9). On a GitHub Actions runner (Ubuntu), it takes about 2m 43s. This seems suspiciously slow.

@chrispcampbell
Contributor Author

After running the Clinic Flame profiler, it looks like there is a lot of time spent in varsWithName and varWithRefId:

[flame graph screenshot: flame1]

Both of those functions use a linear search to find matching elements in the variables array. A simple fix is to keep a map (with varName as the key, and an array of related vars as the value) that we can use for faster lookups.

With that change in place, the same sde generate --genc command takes only 15s on my laptop.
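
For illustration, here is a rough sketch of the map-based lookup in plain JavaScript. The names and the Variable shape are simplified stand-ins, not the actual sde internals, and the extra refId-keyed map is just one possible way to also speed up varWithRefId:

```js
// Simplified sketch: keep side maps alongside the variables array so that
// name and refId lookups are O(1) instead of linear scans.
const variables = []
const variablesByName = new Map()   // varName -> array of vars with that name
const variablesByRefId = new Map()  // refId -> the single var with that refId

function addVariable(v) {
  variables.push(v)
  const list = variablesByName.get(v.varName) || []
  list.push(v)
  variablesByName.set(v.varName, list)
  variablesByRefId.set(v.refId, v)
}

// Previously these did a linear search over the full variables array on
// every call; with the maps they become constant-time lookups.
function varsWithName(varName) {
  return variablesByName.get(varName) || []
}

function varWithRefId(refId) {
  return variablesByRefId.get(refId)
}

// Example usage with hypothetical variables:
addVariable({ varName: '_population', refId: '_population' })
addVariable({ varName: '_births', refId: '_births[_region1]' })
console.log(varsWithName('_births').length)        // 1
console.log(varWithRefId('_population').varName)   // '_population'
```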

@chrispcampbell
Contributor Author

After that first set of changes, one of the next big hot spots was sortVarsOfType:

[flame graph screenshot: flame2]

This one was slow due to the use of contains to scan for variables that have no dependencies. Using a Set for lookups instead of the linear search brings the time down to under 11s on my laptop.
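
As a minimal before/after sketch of that kind of change, with hypothetical data standing in for the real sortVarsOfType internals (the original code used Ramda's R.contains; Array.prototype.includes plays the same role here):

```js
// Hypothetical data: refIds that appear as dependencies somewhere.
const depRefIds = ['_a', '_b']
const vars = [{ refId: '_a' }, { refId: '_c' }, { refId: '_d' }]

// Before: a linear scan of depRefIds for every variable, O(n * m) overall.
const noDepVarsSlow = vars.filter(v => !depRefIds.includes(v.refId))

// After: build a Set once; each membership test is then O(1).
const depRefIdSet = new Set(depRefIds)
const noDepVarsFast = vars.filter(v => !depRefIdSet.has(v.refId))

console.log(noDepVarsFast.map(v => v.refId)) // ['_c', '_d']
```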

@chrispcampbell
Contributor Author

The next hot spot was in sortInitVars:

[flame graph screenshot: flame3]

This was almost identical to the previous one in sortVarsOfType, and the fix was the same (use a Set for faster lookup when determining the no-dependency vars). With this change, code gen time is down to 8s on my laptop.

@chrispcampbell
Contributor Author

The next hot spot was also in sortInitVars, specifically in the addDepsToMap function:

[flame graph screenshot: flame4]

This was another case where R.contains was used; it can be made faster with a Set for lookups. With this change, code gen time is down to 5.1s on my laptop.
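
A hedged sketch of what a Set-based version of that helper might look like; the real addDepsToMap signature and data structures in sde may differ:

```js
// Hypothetical sketch: record each variable's dependencies, keeping only the
// refIds that belong to the group being sorted. refIdsInGroup is a Set, so
// each membership test is O(1) rather than an R.contains scan over an array.
function addDepsToMap(depMap, v, refIdsInGroup) {
  const deps = (v.references || []).filter(ref => refIdsInGroup.has(ref))
  depMap.set(v.refId, deps)
}

// Example with hypothetical vars:
const vars = [
  { refId: '_a', references: ['_b', '_x'] },
  { refId: '_b', references: [] }
]
const refIdsInGroup = new Set(vars.map(v => v.refId))
const depMap = new Map()
for (const v of vars) {
  addDepsToMap(depMap, v, refIdsInGroup)
}
console.log(depMap.get('_a')) // ['_b'] ('_x' is outside the group)
```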

@chrispcampbell
Contributor Author

The next hot spot was in checkSpecVars:

[flame graph screenshot: flame5]

The code here was using a linear search through the whole array of variables (if (!R.find(R.propEq('refId', varName), variables))). We can change this to use varWithRefId, which was already optimized in an earlier step.

With this change, code gen time is down to 4.7s on my laptop.
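
For illustration, a simplified sketch of the change; the variables, spec names, and varWithRefId helper here are stand-ins for the real sde code:

```js
// Hypothetical stand-ins for sde's internals; names are illustrative.
const variables = [{ refId: '_population' }, { refId: '_birth_rate' }]
const variablesByRefId = new Map(variables.map(v => [v.refId, v]))
const varWithRefId = refId => variablesByRefId.get(refId)

function checkSpecVars(specVarNames) {
  for (const varName of specVarNames) {
    // Before: a linear scan for every name, roughly
    //   if (!R.find(R.propEq('refId', varName), variables)) { ... }
    // After: a constant-time lookup through the already-optimized helper.
    if (!varWithRefId(varName)) {
      console.error(`${varName} from the spec file is not in the model`)
    }
  }
}

checkSpecVars(['_population', '_not_in_model'])
```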

@chrispcampbell
Contributor Author

After all those changes, there isn't too much low-hanging fruit left:

[flame graph screenshot: flame6]

The biggest remaining hot spot is parseModel; most of the time there is spent in antlr4-vensim and is not easy to fix.

The second biggest hot spot is the orange bar on the right, which corresponds to readDat; there are no quick fixes there either.

I'll call this good enough for now: what used to take over 2 minutes now takes under 5 seconds.

chrispcampbell added a commit that referenced this issue Dec 24, 2020
* perf: keep variables in a map for faster lookup

* perf: use a set in sortVarsOfType for faster lookup

* perf: use a set in sortInitVars for faster lookup

* perf: use a set for faster lookup in addDepsToMap

* perf: use varWithRefId instead of linear search in checkSpecVars

Fixes #62
@jrissman

jrissman commented Oct 27, 2021

I just wanted to write to thank you, @chrispcampbell, for this detailed thread about optimization using Clinic Flame profiler.

@ToddFincannon says he's currently seeing SDE take about 15 minutes to convert the EPS to C code, so after we get SDE fully working with arbitrary numbers of subscripts, we might want to do another optimization of SDE's code generation. It seems like operating on the EPS causes SDE to encounter some poorly-optimized routines that it did not encounter when it was working on En-ROADS. One guess is that it might have to do with input data handling, given that the EPS has vastly more input data than En-ROADS.

If Clinic Flame profiler is a good tool for this, we might want to use it to optimize other aspects of model operation. In particular, it could be helpful for optimizing either model execution time or the delays in our front-end user interface. Regarding model runtime, Todd had the idea of moving input data into a separate file. Currently, SDE converts all the input data into C code. The EPS gets converted into 90,000 lines of C code, most of which are raw data. Putting the raw data into a separate file might be more efficient, or at least more organized.

Regarding the front-end user interface, we are going to need to refactor the user interface to remove Rails. That should simplify the code considerably. After Rails is removed, that might be a good time to use Clinic Flame profiler to try to find and fix the inefficiencies in the user interface.

@chrispcampbell, if you happen to have any thoughts on whether Clinic Flame profiler is a good tool for us to use to do these three optimizations (code generation, model runtime, and front-end UI), or any tips or pitfalls we should be aware of, we'd love to hear.
