Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add PhysicalJoinType #572

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 18 additions & 39 deletions proto/substrait/algebra.proto
Original file line number Diff line number Diff line change
Expand Up @@ -558,6 +558,18 @@ message WriteRel {
}
}

enum PhysicalJoinType {
PHYSICAL_JOIN_TYPE_UNSPECIFIED = 0;
PHYSICAL_JOIN_TYPE_INNER = 1;
PHYSICAL_JOIN_TYPE_OUTER = 2;
PHYSICAL_JOIN_TYPE_LEFT = 3;
PHYSICAL_JOIN_TYPE_RIGHT = 4;
PHYSICAL_JOIN_TYPE_LEFT_SEMI = 5;
PHYSICAL_JOIN_TYPE_RIGHT_SEMI = 6;
PHYSICAL_JOIN_TYPE_LEFT_ANTI = 7;
PHYSICAL_JOIN_TYPE_RIGHT_ANTI = 8;
}

// The hash equijoin join operator will build a hash table out of the right input based on a set of join keys.
// It will then probe that hash table for incoming inputs, finding matches.
message HashJoinRel {
Expand All @@ -568,19 +580,8 @@ message HashJoinRel {
repeated Expression.FieldReference right_keys = 5;
Expression post_join_filter = 6;

JoinType type = 7;

enum JoinType {
JOIN_TYPE_UNSPECIFIED = 0;
JOIN_TYPE_INNER = 1;
JOIN_TYPE_OUTER = 2;
JOIN_TYPE_LEFT = 3;
JOIN_TYPE_RIGHT = 4;
JOIN_TYPE_LEFT_SEMI = 5;
JOIN_TYPE_RIGHT_SEMI = 6;
JOIN_TYPE_LEFT_ANTI = 7;
JOIN_TYPE_RIGHT_ANTI = 8;
}
PhysicalJoinType type = 7;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For protobuf compatibility it is better to make this 8 and mark 7 as deprecated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose if we keep all of the values the same as you've done here it would be wire compatible which mean we wouldn't need to renumber the field for the "type" change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it's still technically wire compatible! The Anti Join will have to default to one of the two options though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I change this to be

  PHYSICAL_JOIN_TYPE_LEFT_ANTI = 7;
  PHYSICAL_JOIN_TYPE_RIGHT_ANTI = 8;
  PHYSICAL_JOIN_TYPE_LEFT_ANTI_NULL_AWARE = 9;
  PHYSICAL_JOIN_TYPE_RIGHT_ANTI_NULL_AWARE = 10;

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to this ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated again to use a separate is_null_aware field in the rels

bool is_null_aware = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly does this boolean enable? It would be good to document it.


substrait.extensions.AdvancedExtension advanced_extension = 10;
}
Expand All @@ -595,19 +596,8 @@ message MergeJoinRel {
repeated Expression.FieldReference right_keys = 5;
Expression post_join_filter = 6;

JoinType type = 7;

enum JoinType {
JOIN_TYPE_UNSPECIFIED = 0;
JOIN_TYPE_INNER = 1;
JOIN_TYPE_OUTER = 2;
JOIN_TYPE_LEFT = 3;
JOIN_TYPE_RIGHT = 4;
JOIN_TYPE_LEFT_SEMI = 5;
JOIN_TYPE_RIGHT_SEMI = 6;
JOIN_TYPE_LEFT_ANTI = 7;
JOIN_TYPE_RIGHT_ANTI = 8;
}
PhysicalJoinType type = 7;
bool is_null_aware = 8;

substrait.extensions.AdvancedExtension advanced_extension = 10;
}
Expand All @@ -621,19 +611,8 @@ message NestedLoopJoinRel {
// optional, defaults to true (a cartesian join)
Expression expression = 4;

JoinType type = 5;

enum JoinType {
JOIN_TYPE_UNSPECIFIED = 0;
JOIN_TYPE_INNER = 1;
JOIN_TYPE_OUTER = 2;
JOIN_TYPE_LEFT = 3;
JOIN_TYPE_RIGHT = 4;
JOIN_TYPE_LEFT_SEMI = 5;
JOIN_TYPE_RIGHT_SEMI = 6;
JOIN_TYPE_LEFT_ANTI = 7;
JOIN_TYPE_RIGHT_ANTI = 8;
}
PhysicalJoinType type = 5;
bool is_null_aware = 6;

substrait.extensions.AdvancedExtension advanced_extension = 10;
}
Expand Down
Loading