cmd/compile: no closure inlining even in the simple case? #35196
Comments
Our current inline heuristic thinks that
Even if I hack the compiler to inline in this case, it still doesn't inline the anonymous func. But it does do so in simpler cases. Not sure what's going on there, probably a phase ordering issue.
Hmmm... which criterion results in 3 function calls being considered expensive? Wouldn't the assembly for that case be something like 3 CALL instructions with some low-weight scaffolding for the arguments and the stack? Or is it just the middle call, the one that calls the closure function, that drives the cost up, since it's a call via a function pointer?
Re: hacking the compiler - could the "inlining threshold" be a compiler argument?
We have a cost budget of 80, and each call costs 57. That normally allows only one call in an inlineable body (with a few exceptions, like intrinsics). See #19348 (comment)
The closureness of the call isn't used as part of the cost, other than that the compiler must be able to resolve the target function.
We do have a |
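The arithmetic behind "only one call fits" can be sketched with the two numbers quoted above (a budget of 80 and a cost of 57 per call). This is a deliberately simplified model, not the compiler's actual cost accounting: the helper `maxCalls` is hypothetical, and it assumes nothing else in the function body costs anything.

```go
package main

import "fmt"

// Numbers quoted in the comment above. The real accounting in the
// compiler covers every node in the function body, not just calls.
const (
	inlineBudget = 80
	callCost     = 57
)

// maxCalls is a hypothetical helper: how many calls fit in the budget
// if the calls were the only thing being charged for.
func maxCalls(budget, perCall int) int {
	return budget / perCall
}

func main() {
	fmt.Println(maxCalls(inlineBudget, callCost)) // 1
	fmt.Println(3*callCost > inlineBudget)        // true: 171 > 80
}
```

Under this model a body with three calls (for example lock, callback, unlock) is well over budget regardless of how cheap the surrounding scaffolding is.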
Thank you, I appreciate your answers and your expertise. On my system, the
But even if inlining worked, is the compiler smart enough to inline the closure as a whole inside the one and only caller (since the closure is an anonymous function used exactly once)?
FWIW, 2 more cents: I'm guessing the issue is because "go build" also builds required packages, so there could be mixed results from multiple invocations. Doesn't that point to flags like inline thresholds being global settings on a build machine, for all packages? Maybe in a config file, an environment variable, etc.?
This code optimizes to just a call to
Sure, but that doesn't really solve the problem, it just punts it to the user. I import 2 packages, A and B. A says "I am fastest when the inlining threshold is 63". B says "I am fastest when the inlining threshold is 95". What should I set my inlining threshold to? We've basically decided that tuning to this level is not worth the complication it introduces to the ecosystem.
I'm wondering about what I think is this issue, but in reading through this thread a few times I'm not exactly sure what the precise problem under discussion is. @randall77 said
Is that the core problem here to be fixed? In my case, I'm looking at code like this:
The call to
However, if I call
Is that the same as this issue? Is it another issue? Should I open a new issue?
Heavy edit: I thought the issue was something else. How do you know that
Regarding whether it's similar or not - the original issue is about anonymous functions, which I think warrant aggressive inlining, since the majority of them will only be called from a single context (judging by how code translates to inlining budget, I'd say push it to 500; that should cover many logging / formatting handlers, various wrappers and error handlers).
Setting aside the original example and looking at this one (passing function f to method E):
In your second example you've replaced the indirect call with a direct call; the Go compiler has no problems with inlining those as long as the callee is small enough (and various other criteria are met). Hope this helps.
If it were inlined, it would be mentioned in the
(But I also confirmed by looking at the assembly.)
@thanm thanks, that is very helpful. To summarize, it sounds like the key optimization for me would be inlining of indirect calls (in some cases where the target is static), which is difficult because it requires heavyweight analysis.
I hesitate to call this an issue, since I know it's a complex topic, and I've read about half a dozen issue reports touching this topic, which kind-of sort-of fix some inlining cases, but not all of them.
I'm using a simple pattern for avoiding lock leakage:
in practice it's used like this:
Full example here: https://go.godbolt.org/z/mkvMbZ
I'd expect that when the closure passed to WithWlock() is simple enough (i.e. no need for a new stack), inlining would collapse all the scaffolding and compile it to something like:
But looking at the assembly code (https://go.godbolt.org/z/mkvMbZ), the inner function is compiled once, its address taken, passed to the WithWLock() function, i.e. a full closure call is being done.
This is on go 1.13.
It seems that a full closure call is an expensive choice for this particular pattern?
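The code blocks from the report are not reproduced above, so the following is only a sketch of what such a lock-wrapping pattern might look like, not the reporter's actual code. The name `WithWLock` appears in the report; the `Counter` type, its fields, and the `IncClosure` / `IncManual` methods are hypothetical, and the body of `WithWLock` (lock, call, unlock) is a guess consistent with the "3 function calls" mentioned in the comments.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type guarded by a read-write mutex.
type Counter struct {
	mu sync.RWMutex
	n  int
}

// WithWLock runs fn while holding the write lock, so callers cannot
// forget to release it. (Assumed shape; the original may differ.)
func (c *Counter) WithWLock(fn func()) {
	c.mu.Lock()
	fn()
	c.mu.Unlock()
}

// IncClosure uses the pattern: the update is passed in as a closure.
func (c *Counter) IncClosure() {
	c.WithWLock(func() { c.n++ })
}

// IncManual is the hand-written form that full inlining of both
// WithWLock and the closure would ideally reduce to.
func (c *Counter) IncManual() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

func main() {
	c := &Counter{}
	c.IncClosure()
	c.IncManual()
	fmt.Println(c.n) // 2
}
```

The two methods are behaviorally equivalent here; the question raised in the report is whether the compiler turns the first into the second or keeps a real closure call.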