
Code generation is slow #62

Closed
chrispcampbell opened this issue Dec 24, 2020 · 7 comments · Fixed by #63 or #190

@chrispcampbell
Contributor

Running sde generate --genc with the En-ROADS model currently takes 2m 2s on my laptop (2019 MacBook Pro, 2.4 GHz 8-Core Intel Core i9). On a GitHub Actions runner (Ubuntu), it takes about 2m 43s. This seems suspiciously slow.

@chrispcampbell
Contributor Author

After running the Clinic Flame profiler, it looks like there is a lot of time spent in varsWithName and varWithRefId:

[flame graph screenshot: flame1]

Both of those functions use a linear search to find matching elements in the variables array. A simple fix is to keep a map (with varName as the key, and an array of related vars as the value) that we can use for faster lookups.

With that change in place, the same sde generate --genc command takes only 15s on my laptop.
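
For illustration, here is a rough sketch of the map-based lookup in plain JavaScript. The names and the Variable shape are simplified stand-ins, not the actual sde internals, and the extra refId-keyed map is just one possible way to also speed up varWithRefId:

```js
// Simplified sketch: keep side maps alongside the variables array so that
// name and refId lookups are O(1) instead of linear scans.
const variables = []
const variablesByName = new Map()   // varName -> array of vars with that name
const variablesByRefId = new Map()  // refId -> the single var with that refId

function addVariable(v) {
  variables.push(v)
  const list = variablesByName.get(v.varName) || []
  list.push(v)
  variablesByName.set(v.varName, list)
  variablesByRefId.set(v.refId, v)
}

// Previously these did a linear search over the full variables array on
// every call; with the maps they become constant-time lookups.
function varsWithName(varName) {
  return variablesByName.get(varName) || []
}

function varWithRefId(refId) {
  return variablesByRefId.get(refId)
}

// Example usage with hypothetical variables:
addVariable({ varName: '_population', refId: '_population' })
addVariable({ varName: '_births', refId: '_births[_region1]' })
console.log(varsWithName('_births').length)        // 1
console.log(varWithRefId('_population').varName)   // '_population'
```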

@chrispcampbell
Contributor Author

After that first set of changes, one of the next big hot spots was sortVarsOfType:

[flame graph screenshot: flame2]

This one was slow due to the use of contains to scan for variables that have no dependencies. Using a Set for lookups instead of the linear search brings the time down to under 11s on my laptop.
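
As a minimal before/after sketch of that kind of change, with hypothetical data standing in for the real sortVarsOfType internals (the original code used Ramda's R.contains; Array.prototype.includes plays the same role here):

```js
// Hypothetical data: refIds that appear as dependencies somewhere.
const depRefIds = ['_a', '_b']
const vars = [{ refId: '_a' }, { refId: '_c' }, { refId: '_d' }]

// Before: a linear scan of depRefIds for every variable, O(n * m) overall.
const noDepVarsSlow = vars.filter(v => !depRefIds.includes(v.refId))

// After: build a Set once; each membership test is then O(1).
const depRefIdSet = new Set(depRefIds)
const noDepVarsFast = vars.filter(v => !depRefIdSet.has(v.refId))

console.log(noDepVarsFast.map(v => v.refId)) // ['_c', '_d']
```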

@chrispcampbell
Contributor Author

The next hot spot was in sortInitVars:

[flame graph screenshot: flame3]

This was almost identical to the previous one in sortVarsOfType, and the fix was the same (use a Set for faster lookup when determining the no-dependency vars). With this change, code gen time is down to 8s on my laptop.

@chrispcampbell
Contributor Author

The next hot spot was also in sortInitVars, specifically in the addDepsToMap function:

[flame graph screenshot: flame4]

This was another case where R.contains was used; it can be made faster with a Set for lookups. With this change, code gen time is down to 5.1s on my laptop.
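
A hedged sketch of what a Set-based version of that helper might look like; the real addDepsToMap signature and data structures in sde may differ:

```js
// Hypothetical sketch: record each variable's dependencies, keeping only the
// refIds that belong to the group being sorted. refIdsInGroup is a Set, so
// each membership test is O(1) rather than an R.contains scan over an array.
function addDepsToMap(depMap, v, refIdsInGroup) {
  const deps = (v.references || []).filter(ref => refIdsInGroup.has(ref))
  depMap.set(v.refId, deps)
}

// Example with hypothetical vars:
const vars = [
  { refId: '_a', references: ['_b', '_x'] },
  { refId: '_b', references: [] }
]
const refIdsInGroup = new Set(vars.map(v => v.refId))
const depMap = new Map()
for (const v of vars) {
  addDepsToMap(depMap, v, refIdsInGroup)
}
console.log(depMap.get('_a')) // ['_b'] ('_x' is outside the group)
```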

@chrispcampbell
Contributor Author

The next hot spot was in checkSpecVars:

[flame graph screenshot: flame5]

The code here was using a linear search through the whole array of variables (if (!R.find(R.propEq('refId', varName), variables))). We can change this to use varWithRefId, which was already optimized in an earlier step.

With this change, code gen time is down to 4.7s on my laptop.
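
For illustration, a simplified sketch of the change; the variables, spec names, and varWithRefId helper here are stand-ins for the real sde code:

```js
// Hypothetical stand-ins for sde's internals; names are illustrative.
const variables = [{ refId: '_population' }, { refId: '_birth_rate' }]
const variablesByRefId = new Map(variables.map(v => [v.refId, v]))
const varWithRefId = refId => variablesByRefId.get(refId)

function checkSpecVars(specVarNames) {
  for (const varName of specVarNames) {
    // Before: a linear scan for every name, roughly
    //   if (!R.find(R.propEq('refId', varName), variables)) { ... }
    // After: a constant-time lookup through the already-optimized helper.
    if (!varWithRefId(varName)) {
      console.error(`${varName} from the spec file is not in the model`)
    }
  }
}

checkSpecVars(['_population', '_not_in_model'])
```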

@chrispcampbell
Contributor Author

After all those changes, there isn't too much low-hanging fruit left:

[flame graph screenshot: flame6]

The biggest remaining hot spot is parseModel; most of the time there is spent in antlr4-vensim and is not easy to fix.

The second biggest hot spot is the orange bar on the right, which corresponds to readDat; there are no quick fixes there either.

I'll call this good enough for now: what used to take over 2 minutes now takes under 5 seconds.

chrispcampbell added a commit that referenced this issue Dec 24, 2020
* perf: keep variables in a map for faster lookup

* perf: use a set in sortVarsOfType for faster lookup

* perf: use a set in sortInitVars for faster lookup

* perf: use a set for faster lookup in addDepsToMap

* perf: use varWithRefId instead of linear search in checkSpecVars

Fixes #62
@jrissman

jrissman commented Oct 27, 2021

I just wanted to write to thank you, @chrispcampbell, for this detailed thread about optimization using Clinic Flame profiler.

@ToddFincannon says he's currently seeing SDE take about 15 minutes to convert the EPS to C code, so after we get SDE fully working with arbitrary numbers of subscripts, we might want to do another optimization of SDE's code generation. It seems like operating on the EPS causes SDE to encounter some poorly-optimized routines that it did not encounter when it was working on En-ROADS. One guess is that it might have to do with input data handling, given that the EPS has vastly more input data than En-ROADS.

If Clinic Flame profiler is a good tool for this, we might want to use it to optimize other aspects of model operation. In particular, it could be helpful for optimizing either model execution time or the delays in our front-end user interface. Regarding model runtime, Todd had the idea of moving input data into a separate file. Currently, SDE converts all the input data into C code. The EPS gets converted into 90,000 lines of C code, most of which are raw data. Putting the raw data into a separate file might be more efficient, or at least more organized.

Regarding the front-end user interface, we are going to need to refactor the user interface to remove Rails. That should simplify the code considerably. After Rails is removed, that might be a good time to use Clinic Flame profiler to try to find and fix the inefficiencies in the user interface.

@chrispcampbell, if you happen to have any thoughts on whether Clinic Flame profiler is a good tool for us to use to do these three optimizations (code generation, model runtime, and front-end UI), or any tips or pitfalls we should be aware of, we'd love to hear.
