Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the speed of preorder operation on IR #11019

Merged

Conversation

4e6
Copy link
Contributor

@4e6 4e6 commented Sep 9, 2024

Pull Request Description

close #10537

Changelog:

  • add: implement IR.preorder method with callback
  • update: update IR.preorder method usages

Important Notes

Shows ~10% speed improvement during the compilation

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

@4e6 4e6 added the CI: No changelog needed Do not require a changelog entry for this PR. label Sep 9, 2024
@4e6 4e6 self-assigned this Sep 9, 2024
Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to remove the old preorder() method altogether as the new callback approach seems sufficient for all the use-cases.

Is there any effect on performance? Any benchmarks affected?

Shows ~10% speed improvement during the compilation

What methodology is used to measure these 10%?

Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In runtime-benchmarks, there are benchmarks that measure speed of the compilation. This is the list:

[info] org.enso.compiler.benchmarks.exportimport.ExportImportResolutionBenchmark.importsAndExportsResolution
[info] org.enso.compiler.benchmarks.exportimport.ExportImportResolutionBenchmark.importsResolution
[info] org.enso.compiler.benchmarks.inline.InlineCompilerBenchmark.longExpression
[info] org.enso.compiler.benchmarks.inline.InlineCompilerErrorBenchmark.expressionWithErrors
[info] org.enso.compiler.benchmarks.module.ImportStandardLibrariesBenchmark.importStandardLibraries
[info] org.enso.compiler.benchmarks.module.ManyErrorsBenchmark.manyErrors
[info] org.enso.compiler.benchmarks.module.ManyLocalVarsBenchmark.longMethodWithLotOfLocalVars
[info] org.enso.compiler.benchmarks.module.ManyNestedBlocksBenchmark.manyNestedBlocks
[info] org.enso.compiler.benchmarks.module.ManySmallMethodsBenchmark.manySmallMethods

(Get this list via runtime-benchmarks/run -l)

Would you please verify the difference (locally on your machine is fine) to develop on this set of benchmarks? Run them with runtime-benchmarks/run <bench1> <bench2> ...

ir,
{ ir =>
if (ir.getExternalId.contains(externalId)) {
return Some(ir)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those explicit returns are so not Scala, you make bunny cry.

* @param ir the node to traverse
* @param cb the callback to apply
*/
static void preorder(IR ir, Consumer<IR> cb) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I so much wish Consumer was a partial function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scala.PartialFunction[IR, IR] is nice. But implementing that on java side, not so much. You have to explicitly override both isDefinedAt and apply methods. Which is unnecessary duplication.

Copy link
Member

@JaroslavTulach JaroslavTulach Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make a variant for a PartialFunction as well, but Pavel is right, it cannot be used as a lambda from Java.

@4e6
Copy link
Contributor Author

4e6 commented Sep 10, 2024

@Akirathan here are the results

develop

[info] Benchmark                                                                         Mode  Cnt    Score    Error  Units
[info] o.e.c.b.exportimport.ExportImportResolutionBenchmark.importsAndExportsResolution  avgt    4    2.345 ±  0.144  ms/op
[info] o.e.c.b.exportimport.ExportImportResolutionBenchmark.importsResolution            avgt    4    0.279 ±  0.002  ms/op
[info] o.e.c.b.inline.InlineCompilerBenchmark.longExpression                             avgt    4    2.167 ±  0.949  ms/op
[info] o.e.c.b.inline.InlineCompilerErrorBenchmark.expressionWithErrors                  avgt    4    2.584 ±  1.660  ms/op
[info] o.e.c.b.module.ImportStandardLibrariesBenchmark.importStandardLibraries           avgt    4  148.967 ±  5.192  ms/op
[info] o.e.c.b.module.ManyErrorsBenchmark.manyErrors                                     avgt    4   60.608 ±  9.992  ms/op
[info] o.e.c.b.module.ManyLocalVarsBenchmark.longMethodWithLotOfLocalVars                avgt    4   35.928 ±  2.019  ms/op
[info] o.e.c.b.module.ManyNestedBlocksBenchmark.manyNestedBlocks                         avgt    4   69.187 ±  3.778  ms/op
[info] o.e.c.b.module.ManySmallMethodsBenchmark.manySmallMethods                         avgt    4   88.048 ± 42.014  ms/op

wip/db/10537-improve-the-speed-of-preorder-operation-cb

[info] Benchmark                                                                         Mode  Cnt    Score    Error  Units
[info] o.e.c.b.exportimport.ExportImportResolutionBenchmark.importsAndExportsResolution  avgt    4    2.338 ±  0.046  ms/op
[info] o.e.c.b.exportimport.ExportImportResolutionBenchmark.importsResolution            avgt    4    0.276 ±  0.015  ms/op
[info] o.e.c.b.inline.InlineCompilerBenchmark.longExpression                             avgt    4    2.400 ±  1.884  ms/op
[info] o.e.c.b.inline.InlineCompilerErrorBenchmark.expressionWithErrors                  avgt    4    2.471 ±  2.517  ms/op
[info] o.e.c.b.module.ImportStandardLibrariesBenchmark.importStandardLibraries           avgt    4  131.518 ±  4.993  ms/op
[info] o.e.c.b.module.ManyErrorsBenchmark.manyErrors                                     avgt    4   69.835 ± 12.971  ms/op
[info] o.e.c.b.module.ManyLocalVarsBenchmark.longMethodWithLotOfLocalVars                avgt    4   31.602 ± 16.382  ms/op
[info] o.e.c.b.module.ManyNestedBlocksBenchmark.manyNestedBlocks                         avgt    4   67.468 ±  3.875  ms/op
[info] o.e.c.b.module.ManySmallMethodsBenchmark.manySmallMethods                         avgt    4   95.393 ±  4.209  ms/op

I was using ImportStandardLibrariesBenchmark in my measurements. That's where the 10% comes from.

Two benchmarks here look slightly worse:

  • org.enso.compiler.benchmarks.module.ManyErrorsBenchmark.manyErrors
  • org.enso.compiler.benchmarks.module.ManyLocalVarsBenchmark.longMethodWithLotOfLocalVars

However, when executed multiple times, they show a lot of variation in the results. And the ImportStandardLibrariesBenchmark is consistently better.

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Sep 10, 2024

This is a nice result:

o.e.c.b.module.ImportStandardLibrariesBenchmark.importStandardLibraries           avgt    4  148.967
o.e.c.b.module.ImportStandardLibrariesBenchmark.importStandardLibraries           avgt    4  131.518

10% is nice.

@4e6 4e6 added the CI: Ready to merge This PR is eligible for automatic merge label Sep 10, 2024
@mergify mergify bot merged commit ab0a5e0 into develop Sep 10, 2024
42 checks passed
@mergify mergify bot deleted the wip/db/10537-improve-the-speed-of-preorder-operation-cb branch September 10, 2024 14:47
@Akirathan
Copy link
Member

@4e6 Thanks for the measurements. Yes, 10% is very nice improvement. As for the variation - note the Error column in the results. That Error column basically stands for standard deviation. For manyErrors benchmark, this is very high.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the speed of preorder operation on IR
4 participants