sql,distsqlrun,gossip: don't plan on incompatible nodes #17747
@bdarnell for gossip stuff. I ran into trouble with #17497, which was supposed to make us resilient to DistSQL version mismatch errors (and other errors) through a runtime fallback mechanism. The fundamental issue is that it's hard to decide when to fall back, because it's hard to conclude that you've connected to the wrong node.
Review status: 0 of 13 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

pkg/sql/distsql_physical_planner.go, line 517 at r1 (raw file):
If we have an older node, then we upgrade it, it will be stuck as non-compatible forever no? Ideally these entries should time out

pkg/sql/executor.go, line 339 at r1 (raw file):
[nit] Would be cleaner for this to return

pkg/sql/distsql_physical_planner.go, line 517 at r1 (raw file):
Previously, RaduBerinde wrote…

pkg/sql/distsql_physical_planner.go, line 517 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Ah, indeed, sorry.

110558c to 8c859ba

pkg/sql/executor.go, line 339 at r1 (raw file):
Previously, RaduBerinde wrote…
done. I had written it like this to resemble the gossip interface, but I think I had already departed from that untyped interface.

8c859ba to 779879f

If we attempt to schedule a flow on a node whose DistSQL version is incompatible, the scheduling is going to return an error. This patch attempts to minimize the occurrence of these errors by not planning flows on incompatible nodes. This is done by having each node gossip its range of accepted versions, and having planning consult gossip before deciding to map key spans onto a node. Spans owned by incompatible nodes are mapped to the gateway.

The planning version check is done in distSQLPlanner.partitionSpans(). This may not sound like the right place for it, but there's currently no better place. That's the layer that is already concerned with node health, because everything planned above TableReaders currently mechanically follows the set of nodes decided in partitionSpans().

An alternative would be to lift the remapping onto the gateway into a step done after the plan has been built, but that would lead to worse plans.
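
To make the remapping concrete, here is a minimal, self-contained sketch of the idea, not the code in this patch. All of the types (NodeID, Span, VersionRange, OwnedSpan) and the standalone partitionSpans function are hypothetical simplifications; in particular, the map standing in for gossip, and the choice to treat a node with no gossiped range as incompatible, are assumptions of the sketch.

```go
package main

import "fmt"

// NodeID and Span are simplified stand-ins for the real planner types.
type NodeID int

type Span struct{ Key, EndKey string }

// VersionRange is a hypothetical gossiped record: the node accepts flows
// whose version lies in [MinAccepted, Version].
type VersionRange struct {
	MinAccepted uint32
	Version     uint32
}

// Accepts reports whether a flow planned at version v can run on the node.
func (r VersionRange) Accepts(v uint32) bool {
	return r.MinAccepted <= v && v <= r.Version
}

// OwnedSpan pairs a key span with the node that would normally read it.
type OwnedSpan struct {
	Span  Span
	Owner NodeID
}

// partitionSpans assigns each span to its owner when the owner's gossiped
// range accepts planVersion. Spans whose owner is incompatible, or (by
// assumption in this sketch) has no gossiped range at all, are assigned to
// the gateway instead. The gateway itself is never checked.
func partitionSpans(
	gateway NodeID,
	planVersion uint32,
	spans []OwnedSpan,
	gossiped map[NodeID]VersionRange,
) map[NodeID][]Span {
	out := make(map[NodeID][]Span)
	for _, s := range spans {
		target := s.Owner
		if target != gateway {
			r, ok := gossiped[target]
			if !ok || !r.Accepts(planVersion) {
				target = gateway
			}
		}
		out[target] = append(out[target], s.Span)
	}
	return out
}

func main() {
	// Stand-in for the version ranges learned through gossip.
	gossiped := map[NodeID]VersionRange{
		2: {MinAccepted: 5, Version: 6}, // accepts version 6
		3: {MinAccepted: 3, Version: 4}, // older node: rejects version 6
	}
	spans := []OwnedSpan{
		{Span{"a", "c"}, 1},
		{Span{"c", "f"}, 2},
		{Span{"f", "k"}, 3},
	}
	parts := partitionSpans(1, 6, spans, gossiped)
	for _, n := range []NodeID{1, 2, 3} {
		fmt.Println(n, parts[n])
	}
}
```

In this example node 1 is the gateway, so the span owned by the older node 3 ends up on node 1 while node 2 keeps its own span. Doing the check while partitioning means that anything planned on top of the readers only ever sees compatible nodes.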

779879f to 02ff564