-
-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Share ModuleGraphs for all files #3232
Conversation
How do you deal with the cycles issue? If there are cycles in the module graph, then you end up with cycles in the build graph, and This has been discussed before: #1359, #2323 (comment) |
There is no rule cycle because we now construct the |
I have concerns about this. It's not just build churn, but the fact that to load a module first I need to compute the module graph for the entire codebase. How is this going to affect init times for large codebases with >10.000 modules?
By all means, include this in the PR. But it doesn't solve the startup cost issue. |
d286a48
to
aea4709
Compare
aea4709
to
6ec2635
Compare
@pepeiborra I've replace checkForImportCycles with a fullModuleGraph option that tries to restore the older behaviour. |
where is the option? I cannot see it |
Check now, the push was still going when I made the comment. |
2f059f8
to
616d328
Compare
The option looks good, but I kind of want to see a benchmark in a large code base. Both with the full module graph and without it. Do you still want to get this merged before the next release? If it cannot wait, then go ahead and merge it |
fbc1bd5
to
0b64655
Compare
This needs #3391 to work because the eval plugin currently relies on the driver (it calls We shouldn't be using the GHC driver anyway as we want to perform loading and recompilation checking using our own logic. The rework of the eval plugin fixes this. I will land them both after the release. |
@wz1000 I benchmarked this change in Bigcorp codebase and the results were not good. 6X longer startup times with full module graph and 5X without full module graph. This is compared to my O(imports)
Full results: BEFORE
|
@pepeiborra do you mean your hack is separate from what lives in HLS master as of now? Is there any regression in I think we could use something like your hack if we can augment hls-graph cycle detection to produce recoverable errors. |
Actually ignore the hack, I was confused, the BEFORE results are not using the hack. Reason: we are still on ghcide 1.4, while upgrading to ghcide 1.8 I deleted the hack since
Yes, the regression is 2m10s -> 9m46s for startup. More specifically, by startup I mean loading the first module, which is handpicked to have a very large transitive module graph.
I don't think this can be done cleanly. |
299795a
to
63d9ec6
Compare
With the strictness fix, the regression is less dramatic: 50% longer to startup and 32% more total memory in use (17G of instead of 12.8G). Not quite ready yet, but less bad.
|
63d9ec6
to
a8c13f1
Compare
I had another look at this and fixed a couple of leaks and it seems like this patch is a big win for performance in most common cases. There are two options in
Both of these options are enabled by default. In baseline HLS, With the patch,
With the patch, either of these options when enabled result in calls to I will be testing two configurations, I tested this patch with a project generated via the following script:
I focused testing on the two extremes, Here are the results of running
So this branch is much faster in the (default) GRAPH config for However, this is not the full story for the GRAPH config on Testing
Changing the And we see that this branch is equivalent to or outperforms master for most realistic scenarios and configurations. |
7f06097
to
0220797
Compare
Thanks for pushing ahead with this branch and sharing the revised benchmarks! My only concern is that the benchmarks specifically test startup scenarios, and I would like to see bench numbers for the edit-module-headers scenario that is expected to regress:
Did you look into this? |
318d89a
to
75469a4
Compare
@pepeiborra I've added a benchmark to edit the header and also added my script to the benchmarks. After benchmarking, I made a few more rules early cutoff, this dramatically improves I tested three commits, baseline(master), module-graph(this patch without improvements to early cutoff) and module-graph-cutoff(this patch). Here are the results of benchmarking:
|
#endif | ||
(catMaybes mss) | ||
#endif | ||
pure (fingerprintToBS $ Util.fingerprintFingerprints $ map (maybe fingerprint0 msrFingerprint) msrs, processDependencyInformation rawDepInfo bm mg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the fingerprint based on the module summaries? I can picture changed to module summaries that would not alter the module graph
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this may be too course. I can try to implement a finer grained hash, though I'm not sure how much it would improve things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to do this because now the dependency information contains a GHC.ModuleGraph
which is essentially a [ModSummary]
.
We may still want a per module early cutoff right? The new benchmark scenario Edit-header doesn't alter the module graph, so it doesn't tell us what happens when a portion of the module graph is altered. Previously it would only dirty the modules in that portion, whereas now (even with the ModuleGraph early cutoff) it will dirty all the modules. Concretely, I'm suggesting to keep GetDependencyInformation rule, reimplemented as a lookup on GetModuleGraph with early cutoff. I thought this was your original proposal. I might be wrong and this might be unnecessary, only benchmarks can tell |
Do you still think we need |
17a9330
to
366e23e
Compare
(do queueForEvaluation st nfp | ||
setSomethingModified VFSUnmodified st [toKey IsEvaluating nfp] "Eval") | ||
(do unqueueForEvaluation st nfp | ||
setSomethingModified VFSUnmodified st [toKey IsEvaluating nfp] "Eval") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pepeiborra do you think we need this setSomethingModified
call after calling unqueueForEvaluation
?
Computing and storing GetDependencyInformation for each file essentially individually means that we perform downsweep on each file individually, wasting a lot of work and using an excessive amount of memory to store all these duplicated graphs individually. However, we already have the `GetModuleGraph` rule, which we need to compute before compiling files any way due to being depended on by `NeedsCompilation`, which needs to know if any reverse dependencies of the module we are compiling requires TH, which meant that each file already depends on the results of downsweep for the whole project. Instead, we can compute the whole graph once when we execute the `GetModuleGraph` rule and even use this inside `HscEnv.hsc_mod_graph` to avoid reconstructing the `ModuleGraph` on each invocation of `GhcSessionDeps`. There may be concerns about excessive build churn due to any change to the result of `GetModuleGraph` invalidating the result of `GhcSessionDeps` too often, but note that this only happens when something in the header of a module changes, and this could be solved easily be re-introducing a version of `GetDependencyInformation` with early cutoff that essentially returns the result of `GetModuleGraph` but includes the hash of only the `ModSummary`s in the downward dependency closure of the file.
early cutoff for eval plugin
366e23e
to
9aea2f6
Compare
Remove GetDependencyInformation in favour of GetModuleGraph
Computing and storing GetDependencyInformation for each file essentially individually means
that we perform downsweep on each file individually, wasting a lot of work and using an excessive
amount of memory to store all these duplicated graphs individually.
However, we already have the
GetModuleGraph
rule, which we need to compute before compilingfiles any way due to being depended on by
NeedsCompilation
, which needs to know if any reversedependencies of the module we are compiling requires TH, which meant that each file already depends on
the results of downsweep for the whole project.
Instead, we can compute the whole graph once when we execute the
GetModuleGraph
rule and even use this insideHscEnv.hsc_mod_graph
to avoid reconstructing theModuleGraph
on each invocation ofGhcSessionDeps
.There may be concerns about excessive build churn due to any change to the result of
GetModuleGraph
invalidating the result of
GhcSessionDeps
too often, but note that this only happens when somethingin the header of a module changes, and this could be solved easily be re-introducing
a version of
GetDependencyInformation
with early cutoff that essentially returns the result ofGetModuleGraph
but includes the hash of only the
ModSummary
s in the downward dependency closure of the file.