-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli: new command 'start-single-node' #28495
Conversation
2ca0215
to
08688e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good idea, but I don't think we should merge it for 2.1 at this point because it's too disruptive to the docs. We use the start-without-join mode in some places (including the "start a local cluster" tutorial) and we'd need to audit all the docs and tutorials to make sure we aren't giving people instructions that will lead to a "don't do it this way" warning as their first impression.
Product may also want to weigh in since this is a significant change to the initial user experience.
Should start-single-node
also default to --listen-addr=localhost
?
Reviewed 21 of 21 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained
pkg/cli/start.go, line 77 at r1 (raw file):
storage devices, specified via --store flags. Specify the --join flag to point to any other healthy node ( or list
This needs to be updated to reflect the fact that it is now normal to start nodes (at least the first one, by necessity) with --join
pointing to a node that hasn't been started yet.
pkg/cli/start.go, line 676 at r1 (raw file):
if initialBoot && disableReplication { cmd := []string{ "zone", "set", "." + config.DefaultZoneName,
It's more complicated than this - setting the default zone won't automatically cascade into the other system zones that get default zone configs.
I'm good with keeping this open for a few weeks. I appreciate the pressure it would put on docs. However I do recommend we merge it early in the release cycle so we can start working on the UX improvements early on. |
@bdarnell do you want to revive this? |
That question should probably go to PM/docs/support to see if we have the docs bandwidth to update everything, and if this would be enough of an improvement to be worth going through the deprecation cycle. |
08688e8
to
a8b0fed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased, PTAL.
Should start-single-node also default to --listen-addr=localhost?
I don't know. I don't think so - the windows dev would be starting their single-node cluster in a linux vm and expect it to be reachable from the host.
What do you think?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)
pkg/cli/start.go, line 77 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This needs to be updated to reflect the fact that it is now normal to start nodes (at least the first one, by necessity) with
--join
pointing to a node that hasn't been started yet.
Done.
pkg/cli/start.go, line 676 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
It's more complicated than this - setting the default zone won't automatically cascade into the other system zones that get default zone configs.
I still have to investigate this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 19 of 19 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @bdarnell and @knz)
pkg/cli/start.go, line 77 at r1 (raw file):
Previously, knz (kena) wrote…
Done.
This still isn't saying what I was looking for. We need to say something about how the join nodes don't need to be started yet. Something like:
Specify the --join flag to point to another node or nodes that are part of the same cluster. The other nodes do not need to be started yet, and if the address of the other nodes to be added are not yet known it is legal for the first node to join itself.
pkg/cli/start.go, line 568 at r2 (raw file):
if waitForInit { initialBoot = true
This is the multi-node join path, not start-single-node. This seems to disagree with the previous comment about this variable.
pkg/cli/start.go, line 705 at r2 (raw file):
cmd := []string{ "sql", "-e", "alter range default configure zone using num_replicas = 1",
I guess this works, but I'm surprised to see this go through the full command machinery. Is there not some function in sql_util we could use more directly?
a8b0fed
to
18313b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)
pkg/cli/start.go, line 77 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This still isn't saying what I was looking for. We need to say something about how the join nodes don't need to be started yet. Something like:
Specify the --join flag to point to another node or nodes that are part of the same cluster. The other nodes do not need to be started yet, and if the address of the other nodes to be added are not yet known it is legal for the first node to join itself.
Done.
pkg/cli/start.go, line 676 at r1 (raw file):
Previously, knz (kena) wrote…
I still have to investigate this.
Done.
pkg/cli/start.go, line 568 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This is the multi-node join path, not start-single-node. This seems to disagree with the previous comment about this variable.
This entire approach was incorrect/flawed. I used a simpler/better approach. PTAL.
pkg/cli/start.go, line 705 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I guess this works, but I'm surprised to see this go through the full command machinery. Is there not some function in sql_util we could use more directly?
see above
5852081
to
6c06686
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/cli/context.go, line 111 at r3 (raw file):
serverCfg.SocketFile = "" serverCfg.JoinList = nil serverCfg.DefaultSystemZoneConfig = config.DefaultZoneConfig()
Looks like this first line should be removed (or just remove the word System
?)
6c06686
to
99e7943
Compare
99e7943
to
cf1bbf1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)
pkg/cli/context.go, line 111 at r3 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Looks like this first line should be removed (or just remove the word
System
?)
typo indeed. fixed.
Suggested/requested by @bdarnell: > There are currently two ways to initialize a cluster: Starting the first node without a --join flag, or starting all nodes with --join flags and running a separate cockroach init command. This redundancy is confusing (our docs use both methods without explaining the difference) and it's easy to accidentally initialize a new cluster by restarting your first node with the wrong flags. (This could have catastrophic consequences in 1.1 because data from the two clusters could get mixed. We've added safeguards against this in 2.0 but it's still an easy way to break things.) > I think that we should deprecate this implicit initialization mode and make the --join flag mandatory when starting a node (and therefore the init command would be mandatory too). To avoid adding too much complexity to single-node clusters, I propose a new `cockroach start-single-node` command to perform the start and init combination that is today implicit in the absence of a --join flag. > This would have a side benefit that we would know when the user intends to keep the cluster as a single node instead of adding more later. The `start-single-node` command could therefore set the replication factor to 1 to disable the "underreplicated ranges" warning. This patch implements the new `start-single-node` command as follows: - the new command does not accept `--join`. - it attempts to set the replication factor to 1 upon starting up. Meanwhile `cockroach start` is changed to produce a deprecation warning if `--join` is not specified. Release note (cli change): a new command `cockroach start-single-node` is introduced to start single-node clusters with replication disabled. Release note (cli change): using `cockroach start` without `--join` is now deprecated and this mode of execution will be removed in a later version of CockroachDB. Consider using `cockroach start-single-node` instead or combine `cockroach start` with `cockroach init`.
cf1bbf1
to
47474cb
Compare
TFYR bors r+ |
28495: cli: new command 'start-single-node' r=knz a=knz Fixes #24118. Fixes #21429. Suggested/requested by @bdarnell: > There are currently two ways to initialize a cluster: Starting the first node without a --join flag, or starting all nodes with --join flags and running a separate cockroach init command. This redundancy is confusing (our docs use both methods without explaining the difference) and it's easy to accidentally initialize a new cluster by restarting your first node with the wrong flags. (This could have catastrophic consequences in 1.1 because data from the two clusters could get mixed. We've added safeguards against this in 2.0 but it's still an easy way to break things.) > I think that we should deprecate this implicit initialization mode and make the --join flag mandatory when starting a node (and therefore the init command would be mandatory too). To avoid adding too much complexity to single-node clusters, I propose a new `cockroach start-single-node` command to perform the start and init combination that is today implicit in the absence of a --join flag. > This would have a side benefit that we would know when the user intends to keep the cluster as a single node instead of adding more later. The `start-single-node` command could therefore set the replication factor to 1 to disable the "underreplicated ranges" warning. This patch implements the new `start-single-node` command as follows: - the new command does not accept `--join`. - it attempts to set the replication factor to 1 upon starting up. Meanwhile `cockroach start` is changed to produce a deprecation warning if `--join` is not specified. Release note (cli change): a new command `cockroach start-single-node` is introduced to start single-node clusters with replication disabled. Release note (cli change): using `cockroach start` without `--join` is now deprecated and this mode of execution will be removed in a later version of CockroachDB. Consider using `cockroach start-single-node` instead or combine `cockroach start` with `cockroach init`. Co-authored-by: Raphael 'kena' Poss <[email protected]>
Build succeeded |
Fixes #24118.
Fixes #21429.
Suggested/requested by @bdarnell:
This patch implements the new
start-single-node
command as follows:--join
.Meanwhile
cockroach start
is changed to produce a deprecationwarning if
--join
is not specified.Release note (cli change): a new command
cockroach start-single-node
is introduced to start single-node clusters with replication disabled.
Release note (cli change): using
cockroach start
without--join
isnow deprecated and this mode of execution will be removed in a later
version of CockroachDB. Consider using
cockroach start-single-node
instead or combine
cockroach start
withcockroach init
.