Register type definitions lazily #11938

Merged
merged 10 commits into develop from wip/db/10923-register-type-definitions-overhead
Jan 1, 2025

Conversation

4e6
Contributor

@4e6 4e6 commented Dec 24, 2024

Pull Request Description

close #10923

Changelog:

  • update: registerTypeDefinitions registers constructors lazily

Important Notes

The improvement of time spent in the registerTypeDefinitions method (highlighted):

before

2024-12-23-192336_1193x245_scrot

after

2024-12-23-192401_1197x253_scrot

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.
  • No regressions in benchmarks
  • Speedup in Startup benchmarks

@4e6 4e6 added the CI: No changelog needed Do not require a changelog entry for this PR. label Dec 24, 2024
@4e6 4e6 self-assigned this Dec 24, 2024
Member

@JaroslavTulach JaroslavTulach left a comment

  • A few style comments inline; treat them as opinions.
  • Overall this seems like the right change.
  • What's the impact on benchmarks?
  • Explicitly: what's the impact on the test/Benchmarks/src/Startup/*.enso startups?

* @param annotations the list of attached annotations
* @param args the list of argument definitions
*/
public record InitializationParts(
Member

Does this new record have to be public? The record defines not just a public constructor but also public getters, and I do not think we want people around the code base to call the getters.

Could we rather follow this example by @Akirathan where Pavel introduced FunctionSchema.Builder?

* @return the supplier providing the value with the mapping function applied
* @param <R> the result type
*/
public <R> CachingSupplier<R> map(Function<T, R> f) {
Member

The "tendency towards functional style" is clear...

Collaborator

I, on the other hand, really like it :)

Member

There are benchmark regressions and the only change that doesn't seem straightforward to me is this map method. But of course, I may be biased...

this.constructorFunction =
buildConstructorFunction(
language, section, localScope, scopeBuilder, assignments, varReads, annotations, args);
initializationResultSupplier.map(
Member

Interesting use of the new map function. The alternative would be to have a single field Supplier<InitializationResult> stored in the AtomConstructor. I seem to like it a bit more - shows the "fixed" vs. the lazily computed part of the structure.
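
For readers unfamiliar with the pattern discussed in this thread, the following is a minimal sketch of a memoizing supplier with a `map` method. It is not the Enso `CachingSupplier`: the names `MemoSupplier` and `LazyMapDemo` are hypothetical, and the locking strategy is an assumption made only to keep the example short.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical demo class; shows that nothing is computed until get() is called. */
final class LazyMapDemo {
  public static void main(String[] args) {
    int[] calls = {0};
    MemoSupplier<String> parts =
        new MemoSupplier<>(
            () -> {
              calls[0]++; // count how often the expensive computation runs
              return "parts";
            });
    MemoSupplier<Integer> length = parts.map(String::length);

    System.out.println("computed before get(): " + calls[0]);
    System.out.println("mapped value: " + length.get());
    length.get(); // second call is served from the cache
    System.out.println("computed after two get() calls: " + calls[0]);
  }
}

/** Hypothetical memoizing supplier (an illustration, not Enso's implementation). */
final class MemoSupplier<T> implements Supplier<T> {
  private Supplier<T> delegate;
  private T value;
  private boolean computed;

  MemoSupplier(Supplier<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public synchronized T get() {
    if (!computed) {
      value = delegate.get();
      computed = true;
      delegate = null; // the original supplier is no longer needed
    }
    return value;
  }

  /** Returns a new lazy supplier; f runs at most once, on the first get(). */
  <R> MemoSupplier<R> map(Function<T, R> f) {
    return new MemoSupplier<>(() -> f.apply(get()));
  }
}
```

The alternative mentioned in the comment above would keep a single lazily computed field and read its components on demand, rather than deriving one mapped supplier per component.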

new RuntimeAnnotation(annotation.name, closureRootNode)
}

new InitializationParts(
Member

With the builder style, this would become:

AtomConstructor.newBuilder().
  section(...).
  scope(...).
  assignments(...).
  reads(...).
  args(defs);
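
A self-contained sketch of the builder style suggested above might look as follows. The class `AtomCons`, its fields, and the `build(String)` method are hypothetical stand-ins chosen for illustration; they are not the actual `AtomConstructor` API. The builder's fields are private and have no getters, since an enclosing class in Java can read the private members of its nested class directly.

```java
import java.util.List;

/** Hypothetical demo class exercising the builder. */
final class BuilderDemo {
  public static void main(String[] args) {
    AtomCons cons =
        AtomCons.newBuilder()
            .section("Main.enso:1-3")
            .args(List.of("x", "y"))
            .build("Cons");
    System.out.println(cons.describe());
  }
}

/** Hypothetical stand-in for a constructor-like object assembled by a builder. */
final class AtomCons {
  private final String name;
  private final String section;
  private final List<String> args;

  private AtomCons(String name, Builder b) {
    this.name = name;
    // No getters needed: the enclosing class reads the builder's private fields.
    this.section = b.section;
    this.args = b.args;
  }

  static Builder newBuilder() {
    return new Builder();
  }

  String describe() {
    return name + args + " @ " + section;
  }

  /** Fluent builder; fields stay private to the AtomCons/Builder pair. */
  static final class Builder {
    private String section = "";
    private List<String> args = List.of();

    private Builder() {}

    Builder section(String section) {
      this.section = section;
      return this;
    }

    Builder args(List<String> args) {
      this.args = List.copyOf(args);
      return this;
    }

    AtomCons build(String name) {
      return new AtomCons(name, this);
    }
  }
}
```

Compared with a public record, this keeps both the construction parameters and their accessors out of the public surface.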

Member

@Akirathan Akirathan left a comment

LGTM. Agree with @JaroslavTulach on the comments about style. You don't have to schedule the benchmarks and compare them on this PR, but please remember to at least add a comment on this PR about the results.

@4e6
Contributor Author

4e6 commented Dec 28, 2024

Here is the benchmark run, although I'm not sure how to compare it to the reference values: https://github.com/enso-org/enso/actions/runs/12528446812

@JaroslavTulach
Member

> although I'm not sure how to compare it to the reference values

@Akirathan has provided the information here.

bench_download -h for documentation and usage

I've just tried:

enso/tools/performance/engine-benchmarks$ ./bench_download.py -s engine -b wip/db/10923-register-type-definitions-overhead develop

There are many slowdowns. It looks like allocations of atoms are slower than they used to be.

There is another set of benchmarks that includes the startup benchmarks. Run the following CI action and then generate the comparison with:

enso/tools/performance/engine-benchmarks$ ./bench_download.py -s stdlib -b wip/db/10923-register-type-definitions-overhead develop

At least the startup benchmarks should show some improvement.

@JaroslavTulach JaroslavTulach self-requested a review December 30, 2024 06:26
Member

@JaroslavTulach JaroslavTulach left a comment

We need to eliminate the performance regressions before integrating.

/** Builder required for initialization of the atom constructor. */
public static final class InitializationBuilder {

private final SourceSection section;
Member

private fields are good.

this.args = args;
}

private SourceSection getSection() {
Member

Maybe the getters aren't really necessary and one could access the private fields directly.

@@ -278,8 +384,9 @@ public String toString() {
*
* @return the constructor function of this constructor.
*/
@TruffleBoundary
Member

@JaroslavTulach JaroslavTulach Dec 30, 2024

This is slowing execution down.

image

This method needs to be inlined, it cannot be @TruffleBoundary. The IGV graph has been generated by:

sbt:runtime-benchmarks> withDebug --dumpGraphs benchOnly -- AtomBenchmarks.benchGenerateList

* @param varReads the expressions that read field values from local vars
* @param initializationBuilderSupplier the function supplying the parts required for
* initialization
* @param fieldNames the argument names
* @return {@code this}, for convenience
*/
public AtomConstructor initializeFields(
Member

Can initializeFields be private now, when there is the builder?

@4e6
Contributor Author

4e6 commented Dec 30, 2024

On my machine, after removing truffle boundaries b94f81f it is:

[info] Benchmark                                    Mode  Cnt   Score   Error  Units
[info] AtomBenchmarks.benchGenerateList             avgt    5   4.502 ± 1.034  ms/op
[info] AtomBenchmarks.benchGenerateListAutoscoping  avgt    5  10.528 ± 0.146  ms/op
[info] AtomBenchmarks.benchGenerateListQualified    avgt    5   4.560 ± 0.379  ms/op

Before:

[info] Benchmark                                    Mode  Cnt   Score    Error  Units
[info] AtomBenchmarks.benchGenerateList             avgt    5  20.476 ±  1.154  ms/op
[info] AtomBenchmarks.benchGenerateListAutoscoping  avgt    5  32.763 ± 19.428  ms/op
[info] AtomBenchmarks.benchGenerateListQualified    avgt    5  19.932 ±  2.811  ms/op
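
To put the two tables side by side, the ratio of the "Before" score to the score after removing the Truffle boundaries can be computed with a short sketch like the one below. The class name `SpeedupDemo` is hypothetical, only the Score columns quoted above are used, and the Error columns are ignored, so the ratios are rough indications from 5 iterations on one machine rather than precise measurements.

```java
/** Hypothetical helper comparing the avgt scores (ms/op) quoted in this comment. */
final class SpeedupDemo {
  /** Ratio of the old average time to the new one; values above 1 mean faster. */
  static double speedup(double beforeMs, double afterMs) {
    return beforeMs / afterMs;
  }

  public static void main(String[] args) {
    String[] names = {
      "benchGenerateList", "benchGenerateListAutoscoping", "benchGenerateListQualified"
    };
    double[] before = {20.476, 32.763, 19.932}; // "Before" table
    double[] after = {4.502, 10.528, 4.560}; // after removing the boundaries

    for (int i = 0; i < names.length; i++) {
      System.out.printf("%s: %.2fx%n", names[i], speedup(before[i], after[i]));
    }
  }
}
```

With these inputs the ratios come out at roughly 4.5x, 3.1x and 4.4x; the Autoscoping "Before" figure carries a large error (± 19.428), so its ratio is the least reliable of the three.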

Member

@JaroslavTulach JaroslavTulach left a comment

If there is no regression in engine benchmarks and there is some speedup in startup then feel free to integrate in current state.

@4e6
Contributor Author

4e6 commented Jan 1, 2025

Startup benchmark results are about the same:

wip/db/10923-register-type-definitions-overhead

$ ./built-distribution/enso-engine*/enso*/bin/enso --run test/Benchmarks/src/Startup/Startup.enso
Benchmarking 'Startup.empty_startup' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={3 iterations, 5 seconds each}]
Measurement avg time:    2136.23 ms (+-191.653)
Benchmarking 'Startup.hello_world_startup' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={3 iterations, 5 seconds each}]
Measurement avg time:    3559.529 ms (+-47.995)
Benchmarking 'Startup.import_world_startup' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={3 iterations, 5 seconds each}]
Measurement avg time:    4100.263 ms (+-48.712)

develop

$ ./built-distribution/enso-engine*/enso*/bin/enso --run test/Benchmarks/src/Startup/Startup.enso
Benchmarking 'Startup.empty_startup' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={3 iterations, 5 seconds each}]
Measurement avg time:    2150.757 ms (+-168.048)
Benchmarking 'Startup.hello_world_startup' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={3 iterations, 5 seconds each}]
Measurement avg time:    3621.465 ms (+-54.777)
Benchmarking 'Startup.import_world_startup' with configuration: [warmup={2 iterations, 5 seconds each}, measurement={3 iterations, 5 seconds each}]
Measurement avg time:    4185.714 ms (+-49.514)
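
One way to check the "about the same" reading is to compare the two runs as intervals. The sketch below assumes that the `+-` figure can be treated as a symmetric margin around the average; the output above does not say what statistic it is, so that interpretation, like the class name `StartupCompareDemo`, is an assumption. With only 3 measurement iterations per benchmark this is a sanity check, not a statistical test.

```java
/** Hypothetical helper comparing the startup measurements quoted above. */
final class StartupCompareDemo {
  /** True when [avgA - errA, avgA + errA] and [avgB - errB, avgB + errB] intersect. */
  static boolean overlaps(double avgA, double errA, double avgB, double errB) {
    return avgA - errA <= avgB + errB && avgB - errB <= avgA + errA;
  }

  /** Relative change of candidate against base, in percent; negative means faster. */
  static double percentChange(double baseMs, double candidateMs) {
    return (candidateMs - baseMs) / baseMs * 100.0;
  }

  public static void main(String[] args) {
    String[] names = {"empty_startup", "hello_world_startup", "import_world_startup"};
    // {avg, err} in ms, copied from the two runs above.
    double[][] branch = {{2136.23, 191.653}, {3559.529, 47.995}, {4100.263, 48.712}};
    double[][] develop = {{2150.757, 168.048}, {3621.465, 54.777}, {4185.714, 49.514}};

    for (int i = 0; i < names.length; i++) {
      System.out.printf(
          "%s: %.2f%% vs develop, intervals overlap: %b%n",
          names[i],
          percentChange(develop[i][0], branch[i][0]),
          overlaps(branch[i][0], branch[i][1], develop[i][0], develop[i][1]));
    }
  }
}
```

Under that assumption the branch averages are about 0.7%, 1.7% and 2.0% lower than develop, and all three interval pairs overlap, which is consistent with calling the results about the same.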

@4e6 4e6 added the CI: Ready to merge This PR is eligible for automatic merge label Jan 1, 2025
@mergify mergify bot merged commit bc6a8aa into develop Jan 1, 2025
49 checks passed
@mergify mergify bot deleted the wip/db/10923-register-type-definitions-overhead branch January 1, 2025 12:57

Successfully merging this pull request may close these issues.

Remove 519ms startup overhead caused by registerTypeDefinitions
4 participants