Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize import/export resolution #5700

Merged
merged 19 commits into from
Mar 1, 2023

Conversation

hubertp
Copy link
Collaborator

@hubertp hubertp commented Feb 20, 2023

Pull Request Description

This change adds serialization and deserialization of library bindings.
In order to be functional, one needs to first generate IR and
serialize bindings using --compiled <path-to-library> command. The bindings
will be stored under the library with .bindings suffix.
Bindings are being generated during buildEngineDistribution task, thus not
requiring any extra steps.

When resolving import/exports the compiler will first try to load
module's bindings from cache. If successful, it will not schedule its
imports/exports for immediate compilation, as we always did, but use the
bindings info to infer the dependent modules.

The current change does not make any optimizations when it comes to
compiling the modules, yet. It only delays the actual
compilation/loading IR from cache so that it can be done in bulk.
Further optimizations will come from this opportunity such as parallel
loading of caches or lazily inferring only the necessary modules.

Part of #5568 work.

Important Notes

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the
    Scala,
    Java,
    and
    Rust
    style guides.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed: Enso GUI was tested when built using BOTH
      ./run ide build and ./run ide watch.

@hubertp hubertp force-pushed the wip/hubert/optimizing-export-resolution-5568 branch from 4313473 to afdcd0a Compare February 21, 2023 14:52
Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared infrastructure for caching is something other work, including #5068, shall benefit from. Looks good. Few (sometimes unrelated) comments left.


}

object Cache {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a shared access to caches is a great move forward. I would, however, prefer if the new code was written in Java rather than Scala. If this was a new code (as I realized later)...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a big one but I did it just for you :)

@@ -146,7 +259,7 @@ class SerializationManager(compiler: Compiler) {
abort(module)
None
} else {
while (isSerializing(module)) {
while (isSerializingModule(module)) {
Thread.sleep(100)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait(100) and a notifyAll() when serialization is over would be a nicer that unconditional sleep.

@hubertp hubertp force-pushed the wip/hubert/optimizing-export-resolution-5568 branch 2 times, most recently from 4e02c06 to d280271 Compare February 24, 2023 11:21
@hubertp hubertp changed the title wip: Optimize import/export resolution Optimize import/export resolution Feb 24, 2023
@hubertp hubertp marked this pull request as ready for review February 24, 2023 14:12
@hubertp hubertp requested a review from 4e6 as a code owner February 24, 2023 14:12
@hubertp hubertp force-pushed the wip/hubert/optimizing-export-resolution-5568 branch from 1068f9e to 8b76359 Compare February 24, 2023 14:12
@hubertp hubertp force-pushed the wip/hubert/optimizing-export-resolution-5568 branch from 8b76359 to b38db88 Compare February 28, 2023 11:56
This change adds serialization and deserialization of library bindings.
In order to be functional, one can needs to first generate IR and
serialize bindings `--compiled <path-to-library>` command. The bindings
will be stored under the library with `.bindings` suffix.

When resolving import/exports the compiler will first try to load
module's bindings from cache. If successful, it will not schedule its
imports/exports for immediate compilation, as we always did, but use the
bindings info to infer the dependent modules.

The current change does not make any optimizations when it comes to
compiling the modules, yet. It only delays the actual
compilation/loading IR from cache so that it can be done in bulk.
Further optimizations will come from this opportunity such as parallel
loading of caches or lazilly inferring only the necessary modules.

Other important things to note:
 - deserialization has to take into account dependencies of modules,
   otherwise one may not be able to fully restore abstract module
references
 - bindings map must have a stable serialization uid in order to avoid
   spurious and costly failed cache loading attempts
Race-condition could happen (as discovered by the failing test) if we
were serializing sources read from the module rather than the cache
entry. Had to modify the interface for consistency.
Seems like my code managed to trigger a case when comparing an
`AtomConstructor` wrapped in a `Warning` with another
`AtomConstructor`.

Somehow we are already past `equalsWithWarnings` and enter the fallback
specialization.
Not yet sure how to reproduce this in a single test but my weird failing
test in `Integrations_Tests` now succeeds.
Originally designed by Dmitry for generating suggestions during build
time, it also works for generating library caches.
Compiler needs to run ExportResolution's sorting of graph nodes to
ensure the correct order of resolved modules.
Otherwise some exported symbols are resolved before being added to
module's scope and lead to runtime No_Such_Method errors that are hart
to pin down.

Also added an optional way to parse modules in parallel which is turned
off by default. This will allow for easier experimentation on how much
impact that has.
Slowly moving the new code to Java, as per previous discussions.
Added some duplicate API for Java to avoid horrible wrapping with
converters that will quickly get out of hand.

Because of a bug in Frgaal, had to use a simple record-like class rather
than record itself.
@hubertp hubertp force-pushed the wip/hubert/optimizing-export-resolution-5568 branch from 501f16d to bde7304 Compare February 28, 2023 13:57
@hubertp hubertp added the CI: Ready to merge This PR is eligible for automatic merge label Mar 1, 2023
@mergify mergify bot merged commit 941512e into develop Mar 1, 2023
@mergify mergify bot deleted the wip/hubert/optimizing-export-resolution-5568 branch March 1, 2023 08:53
if (
module
.wasLoadedFromCache() && modules.exists(!_.wasLoadedFromCache())
.wasLoadedFromCache() && modules
.exists(m => !m.wasLoadedFromCache() && !m.isSynthetic)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This !m.isSynthetic check probably addresses #5608

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants