Estimated size information lost when `.map()` result is chained #900

jpbetz · 2024-02-27T18:23:28Z

Estimated size information lost with .map() when .filter() is called on the result (presumably applies to other chaining operations).

To Reproduce
Check which components this affects:

parser
checker
interpreter

Sample expression and input that reproduces the issue:

[1,2,3,4,5].map(x, x).filter(x, x % 2 == 0)

Test setup:

checker/cost_test.go:

		{
			name:   ".map.filter size",
			expr:   `[1,2,3,4,5].map(x, x).filter(x, x % 2 == 0)`,
			wanted: CostEstimate{Min: 97, Max: 97},
		},

Expected behavior

Expected a bounded cost estimate. got: [97, 18446744073709551615]

Additional context

Both these test cases pass when added:

		{
			name:   ".filter size",
			expr:   `[1,2,3,4,5].filter(x, x % 2 == 0)`,
			wanted: CostEstimate{Min: 41, Max: 101},
		},
		{
			name:   ".map size",
			expr:   `[1,2,3,4,5].map(x, x)`,
			wanted: CostEstimate{Min: 86, Max: 86},
		},

The text was updated successfully, but these errors were encountered:

jpbetz · 2024-02-27T18:24:28Z

cc @cici37 @TristonianJones

TristonianJones · 2024-02-27T18:51:37Z

@jpbetz why do you expected unbounded cost on the upperbound when the input size is fixed?

stevekuznetsov · 2024-02-27T18:59:54Z

@TristonianJones I think it's the opposite - we expect a bounded cost, but the current implementation returns an unbounded one.

jpbetz · 2024-02-27T19:31:25Z

Joe Betz why do you expected unbounded cost on the upperbound when the input size is fixed?

I expected bounded but got unbounded.

stevekuznetsov · 2024-02-27T19:44:05Z

It seems like there's something more pernicious than adding .filter() to a .map() - adding the following two cases:

		{
			name: "list n squared",
			expr: `[1,2,3].all(i, i in [1,2,3])`,
			vars: []*decls.VariableDecl{
				decls.NewVariable("list1", types.NewListType(types.IntType)),
				decls.NewVariable("list2", types.NewListType(types.IntType)),
			},
			wanted: CostEstimate{Min: 5, Max: 5},
		},
		{
			name: "list n squared with map",
			expr: `[1,2,3].all(i, i in [1,2,3].map(j, j + j))`,
			vars: []*decls.VariableDecl{
				decls.NewVariable("list1", types.NewListType(types.IntType)),
				decls.NewVariable("list2", types.NewListType(types.IntType)),
			},
			wanted: CostEstimate{Min: 5, Max: 5},
		},

We get:

$ go test ./checker/... -run TestCost/list_n_squared
?   	github.com/google/cel-go/checker/decls	[no test files]
--- FAIL: TestCost (0.00s)
    --- FAIL: TestCost/list_n_squared (0.00s)
        cost_test.go:566: Got cost interval [20, 62], wanted [5, 5]
    --- FAIL: TestCost/list_n_squared_with_map (0.00s)
        cost_test.go:566: Got cost interval [20, 18446744073709551615], wanted [5, 5]
FAIL
FAIL	github.com/google/cel-go/checker	0.003s

The cost assertions failing aside, the fact that the second evaluates to an unbounded cost is wrong and reproduces something similar to the original report without having to chain anything.

TristonianJones · 2024-02-27T19:51:42Z

@jpbetz @stevekuznetsov, my apologies, I misread the report about the expectations.

My guess is that all comprehensions are looking for fixed size iteration ranges, so any chaining / nesting with more than one comprehension will likely trigger the issue. Happy to have more test cases though.

-Tristan

stevekuznetsov · 2024-02-27T19:55:01Z

@TristonianJones I'm new to the project so I don't know if I understand your comment about "all comprehensions are looking for fixed size iteration ranges" but I believe my second test case has fixed-size inputs for all and still evaluates to an unbounded result:

 `[1,2,3].all(i, i in [1,2,3].map(j, j + j))`

For the original report from @jpbetz with a chained [].map().filter() are you saying that the current approach is not able to determine that .filter() is getting a bounded input; and, if so, is that an issue or something working as expected?

Edit: I guess in both cases it seems like the evaluator is not able to determine that the output of .map() is a fixed size.

Edit 2: we hit this case on the output of the .map() call, which brings our cost to unbounded.

jpbetz mentioned this issue Feb 27, 2024

Track sizes of comprehension results #901

Merged

jpbetz closed this as completed in #901 Feb 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Estimated size information lost when `.map()` result is chained #900

Estimated size information lost when `.map()` result is chained #900

jpbetz commented Feb 27, 2024 •

edited

Loading

jpbetz commented Feb 27, 2024

TristonianJones commented Feb 27, 2024

stevekuznetsov commented Feb 27, 2024

jpbetz commented Feb 27, 2024

stevekuznetsov commented Feb 27, 2024

TristonianJones commented Feb 27, 2024 •

edited

Loading

stevekuznetsov commented Feb 27, 2024 •

edited

Loading

Estimated size information lost when .map() result is chained #900

Estimated size information lost when .map() result is chained #900

Comments

jpbetz commented Feb 27, 2024 • edited Loading

jpbetz commented Feb 27, 2024

TristonianJones commented Feb 27, 2024

stevekuznetsov commented Feb 27, 2024

jpbetz commented Feb 27, 2024

stevekuznetsov commented Feb 27, 2024

TristonianJones commented Feb 27, 2024 • edited Loading

stevekuznetsov commented Feb 27, 2024 • edited Loading

Estimated size information lost when `.map()` result is chained #900

Estimated size information lost when `.map()` result is chained #900

jpbetz commented Feb 27, 2024 •

edited

Loading

TristonianJones commented Feb 27, 2024 •

edited

Loading

stevekuznetsov commented Feb 27, 2024 •

edited

Loading