-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store whole IR.Module in .bindings cache #8924
Conversation
The enso$ find built-distribution/ -type f | grep bindings$ | xargs wc -c
30416 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Visualization/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Visualization.bindings
1341 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Google_Api/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Google_Api.bindings
52768 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/AWS/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/AWS.bindings
9655 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Searcher/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Searcher.bindings
4310 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Geo/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Geo.bindings
19415 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Image/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Image.bindings
8678 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Examples/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Examples.bindings
29898 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Test_New/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Test_New.bindings
926343 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Base.bindings
449874 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Database/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Database.bindings
377276 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Table.bindings
23754 built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Test/0.0.0-dev/.enso/cache/bindings/0.0.0-dev/Standard/Test.bindings
1933728 total e.g. 2MB, but now the size is 66MB:
|
3f41ecd
to
e570e37
Compare
bb1272d shall fix #8698 as verified by another benchmark run: Down by 3,5% to |
collection.concurrent.TrieMap[QualifiedName, Future[Boolean]]() | ||
|
||
/** The thread pool that handles serialization. */ | ||
private val pool: ThreadPoolExecutor = new ThreadPoolExecutor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR moves all the threading mangling into SerializationPool
to separate the work with threads from the remaining logic inside of SerializationManager
.
import java.util.logging.Level; | ||
import org.enso.pkg.QualifiedName; | ||
|
||
final class SerializationPool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SerializationPool
contains all the threading logic of original SerializationManager
. It replaces Thread.sleep
with Object.wait
and it trades the parallel executor for a newSingleThreadExecutor
. Otherwise the design with isSerializing
and isWaitingForSerialization
remains unchanged.
shutdownPools(); | ||
} | ||
synchronized (threads) { | ||
for (var t : threads.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this fixes the problem where context.close()
complains about dangling threads. Probably shutdownPool()
stops the pools, but the threads are left to finish later. Sometimes the context.close()
is called faster and some of the threads are not yet finished.
Originally with Thread.sleep(100)
this situation wasn't happening (too frequently), but now with wait/notify
it got way more common. This t.join()
seems to fix that problematic situation altogether.
engine/runtime/src/main/scala/org/enso/interpreter/runtime/SerializationManager.scala
Show resolved
Hide resolved
|
||
var meta = new ModuleCache.Metadata("hash", "code", CompilationStage.AFTER_CODEGEN.toString()); | ||
var cachedIr = module.getCache().deserialize(ensoCtx, arr, meta, null); | ||
var cachedIr = mc.deserialize(ensoCtx, ByteBuffer.wrap(arr), meta, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserialization now works with ByteBuffer
, not byte[]
to allow mmapped files.
* @return the read object | ||
* @throws java.io.IOException when an I/O problem happens | ||
*/ | ||
public static <T> Reference<T> read(ByteBuffer buf, Function<Object, Object> readResolve) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New method to allow us to pass in mmapped buffer. The internals were already prepared for working with ByteBuffer
.
public final class ModuleCache extends Cache<ModuleCache.CachedModule, ModuleCache.Metadata> { | ||
|
||
public final class ModuleCache | ||
implements Cache.Spi<ModuleCache.CachedModule, ModuleCache.Metadata> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache
is not final
class. All the cache implementations now implement the Spi
interface and create instance of Cache
with its create
method.
|
||
import com.oracle.truffle.api.TruffleFile; | ||
|
||
final class CacheUtils { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Methods of Cache
that are more helper utilities.
I believe the code is ready to be reviewed. There seems to be some suspicious StackOverflowError in the CI logs/runs. Not sure how frequent/dangerous it can be. I haven't seen the |
engine/runtime-compiler/src/main/scala/org/enso/compiler/Compiler.scala
Outdated
Show resolved
Hide resolved
engine/runtime/src/main/java/org/enso/interpreter/runtime/SerializationPool.java
Show resolved
Hide resolved
// Bound the waiting loop | ||
int maxCount = 60; | ||
int counter = 0; | ||
while (this.hasJobsRemaining() && counter < maxCount) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log when max count was reached?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is relict of the original code. I kept it, but I have no idea what it was trying to achieve. In my opinion one shall either:
- abort the serialization - in such case there is
executor.shutdownNow()
method - wait for all serializations to finish - in such case there is
executor.shutdown()
One calls either first or other method. I have no idea why the shutdown sequence is so complicated.
engine/runtime/src/main/java/org/enso/interpreter/runtime/SerializationPool.java
Show resolved
Hide resolved
pool.abort(toQualifiedName(libraryName)); | ||
return scala.Option.empty(); | ||
} else { | ||
pool.waitWhileSerializing(toQualifiedName(libraryName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this safe? nothing stops context from serializing the library in between waiting and loading the cache? Not that it is likely to happen, but stil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is as "safe" as the original implementation - I haven't changed the logic. True, the original code was probably not really multithreading ready. The fact that it was using Thread.sleep
for coordination of threads speaks for itself.
I haven't attempted to make the code bulletproof. That would require bigger rewrite. Btw. can the code even be really safe when multiple processes access the caches at the same time? That might require some co-ordination with filesystem. What goals do we have for the caching infrastructure?
My current implementation implies:
- caches are written down in a background executor thread
- reading the caches happens in a caller thread and waits until the background thread finishes
If we want more, we should probably sit down and collect other requirements/constraints. Then rewrite the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if the coordination of filesystem access to caches was encapsulated in a single class like SerializationPool
. To replace this waitWhileSerializing
method we would need something like:
public <T> T readCaches(QualifiedName name, Callbable<T> action) {...}
that would prevent serialization of name
key while executing the action. I can try to redesign that.
engine/runtime/src/test/java/org/enso/compiler/SerializationManagerTest.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Hubert Plociniczak <[email protected]>
…nged its return type
The way to reproduce the exception is to enso$ ./built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --ir-caches --no-compile-dependencies --no-global-cache --compile ./built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Google_Api/0.0.0-dev/
[ERROR] [2024-02-06T06:43:12+01:00] [org.enso.runner.Main] Unexpected internal error
org.graalvm.polyglot.PolyglotException: Resource exhausted: Stack overflow
~/NetBeansProjects/enso/enso$ Fixed by 947824e - when deserialization yields |
…n ensureParsed again
There is an error during serialization of suggestions however the exception seems to be swallowed in spite of being logged via TruffleLogger (ccing @hubertp). Trying to reveal the stacktrace with old plain good Update: With the help of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes seem generally good and understandable. I have just left very minor nitpicks.
engine/runtime/src/main/java/org/enso/interpreter/runtime/SerializationPool.java
Outdated
Show resolved
Hide resolved
engine/runtime/src/main/java/org/enso/interpreter/runtime/SerializationPool.java
Show resolved
Hide resolved
@@ -179,7 +179,7 @@ class ImportResolver(compiler: Compiler) { | |||
u.compilationStage(CompilationStage.INITIAL) | |||
} | |||
) | |||
compiler.ensureParsed(mod) | |||
compiler.ensureParsed(mod, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use caches to avoid this NPE.
First CI failures are starting to appear after integration of this change: https://github.com/enso-org/enso/actions/runs/7869109006/job/21467537020?pr=9024#step:8:974
I wish I knew what is the other process?
|
Pull Request Description
Fixes #8639 by storing whole
IR.Module
inside of library.bindings
cache and using them instead of loading caches from individual.ir
files. The--compile
command is modified to not produce.ir
caches - they are already contained in.bindings
files.Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
style guides.
.bindings
fromTable
orBase
- restricted in c061494