You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Full Outer Joins results are expected to have unmatched rows from either table with nulls for the other table's columns. There is a bug in Velox's implementation where when x rows (x >= 2) match on the equality clause (ex. t0 = u0) but don't pass the filter (ex. t0 > 4), for some reason x - 1 rows are still considered a match and therefore results in missing rows in the output. See the following example.
SELECT
*
FROM (
VALUES (1), (2), (3)
) t(t0)
),
u AS (
SELECT
*
FROM (
VALUES (2), (3), (4)
) t(u0)
)
SELECT
*
FROM t
FULL OUTER JOIN u
ON t.t0 = u.u0
AND t.t0 > 4;
Nothing should match since nothing passes the filter. This query should produce the table:
However Velox thinks there is a match on 2 and produces the table:
t0 | u0
1 | NULL
2 | 2
3 | NULL
NULL | 3
NULL | 4
2 | 2 is an extra row and 2 | NULL and NULL | 2 are missing rows.
Here is the unit test I made within MergeJoinTest.cpp:
TEST_F(MergeJoinTest, fullOuterJoinFilteredOutMatches) {
auto left = makeRowVector({"t0"}, {makeFlatVector<int64_t>({1, 2, 3})});
auto right = makeRowVector({"u0"}, {makeFlatVector<int64_t>({2, 3, 4})});
createDuckDbTable("t", {left});
createDuckDbTable("u", {right});
// Full outer join.
auto planNodeIdGenerator = std::make_shared<core::PlanNodeIdGenerator>();
auto plan =
PlanBuilder(planNodeIdGenerator)
.values({left})
.mergeJoin(
{"t0"},
{"u0"},
PlanBuilder(planNodeIdGenerator).values({right}).planNode(),
"t0 > 4",
{"t0", "u0"},
core::JoinType::kFull)
.planNode();
AssertQueryBuilder(plan, duckDbQueryRunner_)
.assertResults(
"SELECT * FROM t FULL OUTER JOIN u ON t.t0 = u.u0 AND t.t0 > 4");
}
System information
Wrote unit test on my linux dev server.
Relevant logs
No response
The text was updated successfully, but these errors were encountered:
Bug description
Full Outer Joins results are expected to have unmatched rows from either table with nulls for the other table's columns. There is a bug in Velox's implementation where when x rows (x >= 2) match on the equality clause (ex. t0 = u0) but don't pass the filter (ex. t0 > 4), for some reason x - 1 rows are still considered a match and therefore results in missing rows in the output. See the following example.
Nothing should match since nothing passes the filter. This query should produce the table:
However Velox thinks there is a match on 2 and produces the table:
2 | 2
is an extra row and2 | NULL
andNULL | 2
are missing rows.Here is the unit test I made within MergeJoinTest.cpp:
System information
Wrote unit test on my linux dev server.
Relevant logs
No response
The text was updated successfully, but these errors were encountered: