
Performance regression in v4 #598

Closed
davidmurdoch opened this issue Sep 18, 2019 · 10 comments

@davidmurdoch
Contributor

I initially tracked the majority of the slowdown to this commit: 4678325

My initial findings pointed me to: 4678325#diff-452d330f2397c53c4a10771e89e595b8R254-R256

But that code has since been refactored. The issue here was that the vm._emit function was getting "promisified" individually for every single opcode run. On Node 8 on my machine I saw a 40% performance improvement by memoizing _emit in the constructor via this._emit = promisify(vm.emit.bind(vm));.

The refactored code seems to have moved the promisification to VM itself, but it still does it for every individual emit call (https://github.com/ethereumjs/ethereumjs-vm/blob/195bba7645151f975020bebe9f8ef146cc2a1680/lib/index.ts#L193). There are other callback functions getting the promisify treatment on every call (getBlockHash, applyTransactions, applyBlock, probably more?). We should probably memoize all promisify calls where reasonable and appropriate.
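
To make this concrete, here is a minimal before/after sketch (the class names and the getBlockHash signature are made up for illustration; this is not the VM's actual code):

```ts
import { promisify } from 'util';

// A callback-style dependency standing in for the VM's callback-based functions
// (emit, getBlockHash, ...). The names and types here are illustrative only.
type GetBlockHash = (num: number, cb: (err: Error | null, hash: Buffer) => void) => void;

class BlockHashBefore {
  constructor(private readonly getBlockHash: GetBlockHash) {}

  // Promisifies on every call: a fresh wrapper function (and its closure) is
  // allocated each time, which adds up when this runs once per opcode or tx.
  blockHash(num: number): Promise<Buffer> {
    return promisify(this.getBlockHash)(num);
  }
}

class BlockHashAfter {
  private readonly getBlockHashAsync: (num: number) => Promise<Buffer>;

  constructor(getBlockHash: GetBlockHash) {
    // Promisify exactly once, in the constructor, and reuse the cached function.
    this.getBlockHashAsync = promisify(getBlockHash);
  }

  blockHash(num: number): Promise<Buffer> {
    return this.getBlockHashAsync(num);
  }
}
```

The same pattern applies to _emit: bind once, promisify once, store the result on the instance, and have every emit go through the cached async function.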

Another potential performance penalty (I haven't measured it myself) is that async/await gets transpiled into TypeScript's __awaiter helper, which uses Promise#then internally. Promise#then() has been reported to be slower than native await (https://v8.dev/blog/fast-async and https://mathiasbynens.be/notes/async-stack-traces). async/await is technically an ES2017 feature, so you'll need to update the TS target to use it natively. Updating may have other side effects that cause incompatibilities in some browsers or Node versions; I haven't checked. Maybe you can configure your linter to disallow most ES2017 features except async/await?
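
For reference, the target bump itself is a one-line tsconfig change (an abridged sketch; a real config has more options than shown here):

```jsonc
// tsconfig.json (abridged sketch)
{
  "compilerOptions": {
    // With target ES2017 the compiler emits async/await natively instead of
    // downleveling it through the __awaiter helper and Promise#then chains.
    "target": "ES2017"
  }
}
```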

@s1na
Contributor

s1na commented Sep 19, 2019

Thanks for tracking this down and for the detailed comment, David.

The promisification of the VM is definitely still half-baked. I didn't know promisify carries such a penalty (when called frequently enough). We should generally move away from promisify as more of our code natively supports promises, but you're right that we can at least try to minimize the number of calls to promisify.

Regarding TS's target, maybe @alcuadrado or @krzkaczor can chime in, they have more experience on that front.

@alcuadrado
Member

Hey @davidmurdoch, thanks for taking the time to do this. I know that profiling the VM can be pretty challenging, so I really appreciate it.

I started collaborating on this library right after the TS migration, so I wasn't aware of the performance regression. But what you described matches the performance characteristics I've been observing lately, so I spent some time digging deeper. Here are my findings.

The first thing I did was patch util.promisify to count how many times it was called, and then run a small portion of the OpenZeppelin Contracts tests. The result: 750k calls in ~30 seconds! These tests were run with an ethereumjs-vm-backed test runner I'm working on.

Then, I patched util.promisify again to group those calls by their callers and re-ran the same tests. By far the most common caller was VM#_emit, with the only other major caller being PStateManager.
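
For anyone who wants to reproduce this kind of measurement, the instrumentation can be as small as the following sketch (hand-rolled here for illustration; not necessarily the exact patch behind the numbers above):

```ts
// promisify-count.ts — compile and preload it before the code under test, e.g.
//   node --require ./promisify-count.js ./run-the-tests.js
// A rough profiling hack for illustration, not production code.
import util from 'util';

const counts = new Map<string, number>();
const originalPromisify = util.promisify;

function patchedPromisify(fn: (...args: any[]) => void) {
  // Stack frame [2] is whoever called util.promisify; group the counts by it.
  const caller = new Error().stack?.split('\n')[2]?.trim() ?? '<unknown>';
  counts.set(caller, (counts.get(caller) ?? 0) + 1);
  return originalPromisify(fn);
}

// Keep util.promisify.custom reachable through the patched function.
(patchedPromisify as any).custom = (originalPromisify as any).custom;
(util as any).promisify = patchedPromisify;

process.on('exit', () => {
  const total = Array.from(counts.values()).reduce((a, b) => a + b, 0);
  console.log(`util.promisify was called ${total} times; top callers:`);
  const sorted = Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
  for (const [caller, n] of sorted.slice(0, 10)) {
    console.log(`${String(n).padStart(8)}  ${caller}`);
  }
});
```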

Then, I created a branch of the VM and started optimizing those calls.

To measure the impact of the changes, I set up the OpenZeppelin Contracts tests at commit 52c30edab8e872949a and ran the full test suite after each change.

The initial run, without any modification, took 10min (632.35s user 8.00s system 109% cpu 9:45.94 total).

I cached the promisified VM#_emit function, re-ran the tests, and they took ~8 min (525.76s user 6.80s system 109% cpu 8:08.33 total). Note that it's a huge test suite with lots of layers of abstraction on top of the VM, which is why the improvement was less than 40%; I still consider it a great result.

Then, on top of that, I reduced the number of PStateManagers created to one per VM instance and cached all of its promisified functions in its constructor. This time the performance improvement wasn't as big: the run also took ~8 min (510.75s user 6.79s system 109% cpu 7:54.05 total).

Finally, I recompiled the project with both optimizations and TS's target set to ES2017, and the tests ran in ~5 min 🤯 (314.90s user 4.62s system 105% cpu 5:02.25 total).

I think the three optimizations are worth implementing, and I'll open a PR for the first two right now.

The third one requires a little more consideration about how to implement it. We don't support Node versions < 8, so we are good on that front, but how will browser support be handled? Publishing an ES2017 package may break some builds, since all of the VM's versions so far have been ES5. I suspect that most people are running bundlers that support ES2017 and targeting modern browsers, so nothing will break. @yann300 @Aniket-Engg, how does/can Remix handle this?

If we decide that the possibility of a breaking change is not something we want to deal with, I think that we should publish a package with two builds. dist/ with an ES5 version (like now), and es/ (or another name) with an ES2017 version.
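
For concreteness, the second build could be driven by a second tsconfig that extends the existing one (the file name, outDir, and wiring below are assumptions, not a description of the actual tooling):

```jsonc
// tsconfig.es.json — hypothetical second build config for the ES2017 output
{
  "extends": "./tsconfig.json",
  "compilerOptions": {
    "target": "ES2017",
    "outDir": "es"
  }
}
```

package.json could keep main pointing at dist/index.js so existing ES5 consumers are unaffected, while users who want the faster build import from es/ explicitly.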

@holgerd77
Member

Hi @davidmurdoch, I assume that you haven't released on v4 yet and this is currently a blocker for you? Do you think it would be enough, as a first round, to have the changes @alcuadrado implemented in #600 merged for a release? Then I would do a VM release directly after review and merging. I think the (eventual) ES2017 update will take some more time, to get somewhat comfortable that there are no newly introduced side effects.

@davidmurdoch
Contributor Author

We're actually planning to release a tagged version (istanbul) with v4.1.0 to npm tomorrow, and we'll announce it on all our social channels.

As for releasing to stable, yes, this would be a blocker (especially since we recently shipped a version of ganache-core with its own performance regressions and are working on solving those now -- I don't want to compound these perf regressions :-) ).

After releasing a tagged ganache-core/cli with v4.1.0, I'll test the performance of #600 against ganache's tests to confirm that it solves the majority of the regression; if it does, I don't see any other reason to delay the release to stable any further!

@alcuadrado
Member

> If we decide that the possibility of a breaking change is not something we want to deal with, I think that we should publish a package with two builds. dist/ with an ES5 version (like now), and es/ (or another name) with an ES2017 version.

I forgot to mention that publishing packages with multiple TS/Babel builds is a very standard practice. For example, many projects release their packages with both CJS and ESM builds.

@s1na
Contributor

s1na commented Sep 20, 2019

Amazing work, @alcuadrado! I just merged #600.

> If we decide that the possibility of a breaking change is not something we want to deal with, I think that we should publish a package with two builds. dist/ with an ES5 version (like now), and es/ (or another name) with an ES2017 version.

This sounds like a good plan to me. I don't see any side effects from having multiple builds, and no reason why this should be delayed.

@holgerd77
Member

@s1na Great! 😄 As always, until February I can't do any substantial work myself and can only give my 2 cents, which makes it even easier for me to support your great plans!! 😛 😛 😛

Do you think this can be realized quickly enough to also make it into the next release? Or should we do an interim release now, with #600 merged?

@alcuadrado
Member

I created #603 which prepares the tooling to support releasing both builds. I think it's complete, but I haven't been involved in the release process of any of the libraries, so I may be missing something.

@davidmurdoch
Contributor Author

Amazing work, @alcuadrado!

@davidmurdoch
Contributor Author

@holgerd77 Shipped in https://github.com/trufflesuite/ganache-cli/releases/tag/v6.8.0-istanbul.0 and https://github.com/trufflesuite/ganache-core/releases/tag/v2.9.0-istanbul.0 as tagged releases on npm (istanbul).

Thanks everyone for getting on these fixes so quickly!
