ER: basic TCO #86
Comments
The current TCO implementation in gojq limits the optimization to functions with no local variables. There might be something we can do here. Line 1393 in 244f9f7
|
I fixed this, thanks for reporting. |
Despite this enhancement, gojq still seems to consume far more memory than one would expect. Compare:
Is that what you'd expect? |
Reusing the local variable will reduce the linear growth of memory consumption (which is important in ordinary languages) but it cannot always be applied in jq. For example, |
Oops, 86ee519 breaks the behavior of the query in my comment... |
The optimization in jq simply changes the return address to avoid the quadratic growth of instructions, and does not address stack frame growth. The optimization in gojq, on the other hand, replaces the call instruction with a jump. This is why I limited it to functions with no variables, like |
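The call-to-jump rewrite can be sketched with a toy instruction list. This is a minimal illustration, not gojq's compiler: the opcode names, the `instr` type, and `rewriteTailCalls` are all hypothetical. It replaces a self-recursive call that is immediately followed by a return with a jump, and only when the function declares no local variables.

```go
package main

import "fmt"

// op is a toy opcode. The names are illustrative, not gojq's real opcode set.
type op int

const (
	opScope op = iota // function entry; arg is the number of local variables
	opConst           // push a constant
	opCall            // call the function whose entry is at index arg
	opJump            // jump to index arg
	opRet             // return from the current function
)

type instr struct {
	op  op
	arg int
}

// rewriteTailCalls returns a copy of code in which self-recursive tail calls
// inside the function body [entry, end] are replaced by jumps back to the
// first instruction after the scope instruction. A call counts as a tail call
// here when the instruction right after it is a ret. Functions that declare
// local variables are left untouched, mirroring the restriction described in
// the thread.
func rewriteTailCalls(code []instr, entry, end int) []instr {
	out := append([]instr(nil), code...)
	if out[entry].op != opScope || out[entry].arg != 0 {
		return out // has local variables (or is not a function entry): skip
	}
	for i := entry + 1; i < end; i++ {
		if out[i].op == opCall && out[i].arg == entry && out[i+1].op == opRet {
			out[i] = instr{op: opJump, arg: entry + 1}
		}
	}
	return out
}

func main() {
	// A toy function: scope, const, call self, ret.
	code := []instr{{opScope, 0}, {opConst, 1}, {opCall, 0}, {opRet, 0}}
	fmt.Println(rewriteTailCalls(code, 0, 3)[2].op == opJump) // true

	// Same body, but the function now declares one local variable.
	code[0].arg = 1
	fmt.Println(rewriteTailCalls(code, 0, 3)[2].op == opCall) // true
}
```

Jumping past the scope instruction means no new frame is set up, which is why a function with its own variable slots cannot be handled this naively.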
Since this issue was closed based on an update that has since been reverted, |
I improved the optimization algorithm. |
Yes, a huge improvement, but I was surprised that the (Given that Go is supposed to be as fast as C for comparable programs, the CPU time difference is also a bit surprising, especially as gojq CPU times are often lower than jq's when memory reclamation is not an issue.)
|
The function is already optimized with a jump instruction to avoid stack frame allocation. However, there is a fork instruction, so the stack still grows. I'm not sure how jq handles this situation, but it is worth investigating. |
The fork analysis in the tail call optimization, introduced by 97658f3, is incomplete and produces critical behavior differences in some recursions. This commit once again limits tail calls using the jump instruction to functions with no local variables.
To summarize, one problem was the return address, resolved by the |
Hello, sorry for commenting on an old issue. I just wanted to get some clarification about the current state of TCO in gojq. If I understand correctly, gojq can currently reduce stack use from quadratic to linear in some cases, but constant is not possible? If that is the case, I'm willing to spend some time on making it constant. Is that possible, or would it be very hard? In my case I would like to make |
Tail call using |
Okay, values grows so range does not run in constant memory. |
That is true, but do you think that is the main part of the memory increase? |
Interesting, |
Here's a possible solution to suppress the values growth, but it needs more investigation so that it does not introduce different bugs.
diff --git a/execute.go b/execute.go
index 6bba740..afb0cbe 100644
--- a/execute.go
+++ b/execute.go
@@ -228,7 +228,11 @@ loop:
if backtrack {
break loop
}
+ f := env.scopes.index > env.scopes.limit
s := env.scopes.pop().(scope)
+ if f {
+ env.offset = s.offset
+ }
pc, env.scopes.index = s.pc, s.saveindex
if env.scopes.empty() {
return env.pop(), true |
Thanks, I tried the patch and it seems to have fixed most of the memory increase issue. The allocations done by gojq master with offset patch and pprof (https://gist.github.com/wader/87b034dbc986b4d50c8ad5f41c39d2b7 and a go build -o gojq cmd/gojq/main.go && MEMPROFILE=gojq.mem.prof ./gojq -n 'range(20000000) | empty' && go tool pprof -http :5555 gojq gojq.mem.prof |
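The linked gist is not reproduced here. As a rough idea of how a Go program can write a heap profile when a `MEMPROFILE` environment variable is set, the following sketch uses the standard `runtime/pprof` package. The helper name `writeHeapProfile` is hypothetical, and this is not claimed to match what the gist does.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// writeHeapProfile is a hypothetical helper that writes a heap profile to path.
func writeHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC() // collect first so the profile reflects up-to-date statistics
	return pprof.WriteHeapProfile(f)
}

func main() {
	// ... run the workload to be profiled here ...

	// Only write a profile when MEMPROFILE names an output file.
	if path := os.Getenv("MEMPROFILE"); path != "" {
		if err := writeHeapProfile(path); err != nil {
			fmt.Fprintln(os.Stderr, "memprofile:", err)
			os.Exit(1)
		}
	}
}
```

The resulting file can then be inspected with `go tool pprof`, as in the command above.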
Nice, thanks! The script where I originally noticed this still seems to grow in memory usage. I managed to minimize it down to this: |
How about now? The problem is no longer in tail call optimization. Maybe I need to stop somewhere; the code complexity is growing. I'll cut a new tag soon. |
Not much difference. I think the issue is still |
|
I'm trying to understand the code related to How come that after the first
$ GOJQ_DEBUG=1 go run -tags debug cmd/gojq/main.go -n 'def f($v): 1; f(1) | f(1)'
0 scope [0,3,0]
1 store [0,0] ## $ARGS
2 jump 13
3 scope [1,3,1] ## f
4 store [1,0]
5 store [1,1] ## v
6 load [1,0]
7 load [1,1]
8 callpc null
9 store [1,2] ## $v
10 load [1,0]
11 const 1
12 ret null ## end of f
13 store [0,1]
14 jump 18
15 scope [2,0,0] ## lambda:15
16 const 1
17 ret null ## end of lambda:15
18 pushpc 15
19 load [0,1]
20 call 3 ## call f
21 store [0,2]
22 jump 26
23 scope [3,0,0] ## lambda:23
24 const 1
25 ret null ## end of lambda:23
26 pushpc 23
27 load [0,2]
28 call 3 ## call f
29 ret null
----------------------------------------+
0 scope [0,3,0] | null {"named":{}}
2021/08/31 00:32:54 opscope start: env.offset=0
2021/08/31 00:32:54 opscope end: env.offset=3
1 store [0,0] | null {"named":{}} ## $ARGS
2 jump 13 | null
13 store [0,1] | null
14 jump 18 |
18 pushpc 15 |
19 load [0,1] | [15,0]
20 call 3 | [15,0] null ## call f
3 scope [1,3,1] | [15,0] null ## f
2021/08/31 00:32:54 opscope start: env.offset=3
2021/08/31 00:32:54 opscope end: env.offset=6
4 store [1,0] | [15,0] null
5 store [1,1] | [15,0] ## v
6 load [1,0] |
7 load [1,1] | null
8 callpc null | null [15,0]
15 scope [2,0,0] | null ## lambda:15
2021/08/31 00:32:54 opscope start: env.offset=6
2021/08/31 00:32:54 opscope end: env.offset=6
16 const 1 | null
17 ret null | 1 ## end of lambda:15
2021/08/31 00:32:54 opret start: env.offset=6
2021/08/31 00:32:54 opret end: env.offset=6
9 store [1,2] | 1 ## $v
10 load [1,0] |
11 const 1 | null
12 ret null | 1 ## end of f
2021/08/31 00:32:54 opret start: env.offset=6
2021/08/31 00:32:54 opret end: env.offset=6
21 store [0,2] | 1
22 jump 26 |
26 pushpc 23 |
27 load [0,2] | [23,0]
28 call 3 | [23,0] 1 ## call f
3 scope [1,3,1] | [23,0] 1 ## f
2021/08/31 00:32:54 opscope start: env.offset=6
2021/08/31 00:32:54 opscope end: env.offset=9
4 store [1,0] | [23,0] 1
5 store [1,1] | [23,0] ## v
6 load [1,0] |
7 load [1,1] | 1
8 callpc null | 1 [23,0]
23 scope [3,0,0] | 1 ## lambda:23
2021/08/31 00:32:54 opscope start: env.offset=9
2021/08/31 00:32:54 opscope end: env.offset=9
24 const 1 | 1
25 ret null | 1 ## end of lambda:23
2021/08/31 00:32:54 opret start: env.offset=9
2021/08/31 00:32:54 opret end: env.offset=9
9 store [1,2] | 1 ## $v
10 load [1,0] |
11 const 1 | 1
12 ret null | 1 ## end of f
2021/08/31 00:32:54 opret start: env.offset=9
2021/08/31 00:32:54 opret end: env.offset=9
29 ret null | 1
2021/08/31 00:32:54 opret start: env.offset=9
2021/08/31 00:32:54 opret end: env.offset=9
1
29 ret <backtrack> null | |
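In the trace above, `env.offset` goes from 0 to 3 for the top-level scope, then to 6 and 9 for the two calls of `f`, and no `opret` brings it back down. The toy model below replays that sequence. It is a sketch under stated assumptions, not gojq's `env` type: the `vm` and `frame` types and the `restore` switch are hypothetical, and the real VM could only rewind the offset when no pending backtrack point still needs the frame, a condition this model leaves out.

```go
package main

import "fmt"

// frame records the value of offset at the time a scope was entered.
type frame struct{ saved int }

// vm is a toy model of variable-slot allocation; it is not gojq's real env type.
type vm struct {
	offset  int     // next free variable slot
	frames  []frame // active scopes
	restore bool    // hypothetical switch: rewind offset when a scope returns
}

// enter models a scope instruction that reserves nvars variable slots.
func (m *vm) enter(nvars int) {
	m.frames = append(m.frames, frame{saved: m.offset})
	m.offset += nvars
}

// leave models a ret instruction.
func (m *vm) leave() {
	f := m.frames[len(m.frames)-1]
	m.frames = m.frames[:len(m.frames)-1]
	if m.restore {
		m.offset = f.saved
	}
}

// run replays the scope/ret sequence of the trace, generalized to n calls of f,
// and returns the offset observed after each call has returned.
func run(n int, restore bool) []int {
	m := &vm{restore: restore}
	m.enter(3) // top-level scope, three slots as in scope [0,3,0]
	offsets := make([]int, 0, n)
	for i := 0; i < n; i++ {
		m.enter(3) // scope of f, three slots as in scope [1,3,1]
		m.enter(0) // scope of the lambda argument, no slots
		m.leave()  // ret from the lambda
		m.leave()  // ret from f
		offsets = append(offsets, m.offset)
	}
	return offsets
}

func main() {
	fmt.Println(run(2, false)) // [6 9], the growth seen in the trace
	fmt.Println(run(2, true))  // [3 3], flat when the saved offset is restored
}
```

Without the rewind, the offset grows by three slots per call, which is linear growth in the number of calls even though each call has already returned.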
I understand that the issue lies in calling the filter argument, and it needs to be fixed some day. It's not a tail call issue, and I think there are no critical issues with tail call optimization (the |
👍 I would like to know more about jq's stack model, so while learning more about it I will keep this issue in mind. Does jq's stack implementation differ a lot from gojq's in this case? |
Basic TCO support would substantially improve gojq's speed and memory efficiency for certain jq programs.
This note first considers a simple and direct test of recursion limits (see [*Recursion] below), and then a more practical test involving an optimized form of walk and a well-known large JSON file: https://github.com/ryanwholey/jeopardy_bot/blob/master/JEOPARDY_QUESTIONS1.json
[*Recursion]
In abbreviated form, the output for jq is:
For gojq, the program fails to complete in a reasonable amount of time. In fact, it takes many hours to reach 600,000.
Setting max to 100,000 gives these performance indicators:
user 78.83
sys 0.65
47173632 maximum resident set size
[*walk]
Results: