Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sled-agent] Separate CockroachDB "start" from CockroachDB "init" #2954

Merged
merged 62 commits into from
Jun 29, 2023

Conversation

smklein
Copy link
Collaborator

@smklein smklein commented Apr 30, 2023

This is a necessary part of #727

History

  • Previously, CockroachDB was initialized via https://www.cockroachlabs.com/docs/stable/cockroach-start-single-node.html
  • Additionally, the CRDB tables were explicitly initialized via a request to run sql using dbinit.sql
  • This was historically done on CRDB datasets immediately when they were initialized - see:
    // If requested, format the cluster with the initial tables.
    if do_format {
    info!(log, "Formatting CRDB");
    zone.run_cmd(&[
    "/opt/oxide/cockroachdb/bin/cockroach",
    "sql",
    "--insecure",
    "--host",
    &dataset_info.address.to_string(),
    "--file",
    "/opt/oxide/cockroachdb/sql/dbwipe.sql",
    ])?;
    zone.run_cmd(&[
    "/opt/oxide/cockroachdb/bin/cockroach",
    "sql",
    "--insecure",
    "--host",
    &dataset_info.address.to_string(),
    "--file",
    "/opt/oxide/cockroachdb/sql/dbinit.sql",
    ])?;
    info!(log, "Formatting CRDB - Completed");
    }
  • In [sled-agent] Refactor service management out of StorageManager #2946 this was changed to just "unconditionally format during zone initialization", which is wrong, but was a holdover for...

This PR

@smklein smklein changed the title Separate CockroachDB "start" from CockroachDB "init" [sled-agent] Separate CockroachDB "start" from CockroachDB "init" Apr 30, 2023
@smklein smklein added Sled Agent Related to the Per-Sled Configuration and Management storage Related to storage. labels Apr 30, 2023
## Before this PR

Running on rack2 and calling `omicron-package uninstall` would involve a
fatal termination of the connection, as it would delete the `cxgbe0/ll`
and `cxgbe1/ll` IP addresses necessary for contacting the sled.

## After this PR

Those addresses are left alone. This is pretty useful for development,
as it allows us to run `uninstall` to cleanly wipe a Gimlet, preparing
it for future "clean installs".
@smklein smklein changed the base branch from storage-manager-cleanup to rss-explicit April 30, 2023 06:27
@davepacheco davepacheco self-assigned this Jun 27, 2023
@davepacheco davepacheco added this to the FCS milestone Jun 27, 2023
@davepacheco
Copy link
Collaborator

I've taken on this PR. The biggest change I've made so far here is that previously, when provisioning a CockroachDB zone, Sled Agent would look up the Cockroach service in DNS, take the addresses it found, and write those into SMF properties. Then the start method would pull those out and use those as the "join" addresses so that Cockroach can find the other nodes in its cluster. The problem with this is that there's nothing to keep those addresses in sync with what's in DNS (which is the source of truth about who's in the cluster). Instead, I've added some code to configure DNS in the CockroachDB zones (using the dns/install SMF service provided by illumos, which I didn't previously know about!) and then use that in the start method to locate the CockroachDB nodes.

It's worth mentioning explicitly that we've designed internal DNS to be able to start without any external runtime dependencies, including CockroachDB (see RFD 248), which means that using DNS here should not cause a problem for cold boot. As I write this, though, I realize that using dig from the start method like I'm currently doing probably isn't going to cut it. We need something that will keep retrying until it succeeds in case it takes a little while for internal-DNS to come up.

There's another bit of this I'm a little worried about, which is exposing an API in sled agent to initialize the CockroachDB cluster. That seems a little dangerous and also overkill since we only ever intend to do this once, and only before the control plane is initialized. I considered changing this to instead have RSS pass configuration to an SMF service which would do this. The problem is that there's not a great way to propagate success/failure information back to RSS so that it can decide whether to proceed (or, I guess, burn down the world and try again). I'm going to defer fixing this for now because we really need to start playing with multi-node CockroachDB.

I originally thought it made sense to put this into "internal-dns" since
it's a straightforward CLI that goes with the library exposed by
"internal-dns".  But it doesn't work due to a few surprising behaviors:

- To ship "dnswait", we'd need to build its parent Rust package as an
  Omicron package.  omicron-package requires that the Omicron package
  name be "internal-dns" if it's going to come from building the
  "internal-dns" Rust package.
- "omicron-package" does some Cargo operation based on packages in
  Cargo.lock.
- Omicron actually has _two_ different "internal-dns" packages in
  Cargo.lock: one from the local workspace, and one from the copy of
  Propolis that pulls in the latest Omicron "main".

As a result, we cannot package up "dnswait" (or any other Rust binary)
if it's in a package in the workspace that's also depended-on by
something else external to Omicron (like Propolis) in the workspace.
@davepacheco davepacheco marked this pull request as ready for review June 28, 2023 00:17
@davepacheco davepacheco mentioned this pull request Jun 28, 2023
2 tasks
Copy link
Collaborator Author

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't approve this because technically it's my PR, but I do approve! This looks great!

// Note that when we configure the dns/install service, we're
// supplying values for an existing property group on the SMF
// *service*. We're not creating a new property group, nor are
// we configuring a property group on the instance.t by default.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// we configuring a property group on the instance.t by default.
// we configuring a property group on the instance by default.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

Comment on lines +28 to +30
JOIN_ADDRS="$(/opt/oxide/internal-dns-cli/bin/dnswait cockroach \
| head -n 5 \
| tr '\n' ,)"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like this solution. Thanks for fitting everything into the "self-assembling zone" approach -- it's neat that we can now refresh all our service info by simply restarting the CRDB service.

Comment on lines +230 to +242
/// Initializes a CockroachDB cluster
#[endpoint {
method = POST,
path = "/cockroachdb",
}]
async fn cockroachdb_init(
rqctx: RequestContext<SledAgent>,
) -> Result<HttpResponseUpdatedNoContent, HttpError> {
let sa = rqctx.context();
sa.cockroachdb_initialize().await?;
Ok(HttpResponseUpdatedNoContent())
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with your comment that "this is sketchy, and it would be cool to not have it".

It seems like it could also be possible to have the method script auto-initialize the DB if it's empty, but that seems like it could raise other problems.

All that said, this approach is probably good enough to last a while.

@@ -1074,6 +1085,40 @@ impl ServiceManager {
let Some(info) = self.inner.sled_info.get() else {
return Err(Error::SledAgentNotReady);
};

// We want to configure the dns/install SMF service inside the
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pull this out into a separate method? I'd be shocked if this was the only service for which we wanted to populate /etc/resolv.conf.

(I mean, arguably, shouldn't we populate basically all our services with /etc/resolv.conf?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think we should do this, but it's more complicated to use it:
#2122 (comment)

so I'd like to defer it for now.

@@ -0,0 +1,98 @@
// This Source Code Form is subject to the terms of the Mozilla Public
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I'm clear - the main reason for using this instead of dig +short <name> is the timeout?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had dig at first, but I realized it will only try a handful of times and only wait a few seconds, and you can't tell from its exit status that it's failed. I needed something that's going to retry an arbitrary number of times, waits until it finds some results, too, and if it's going to fail, fails clearly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but maybe it makes sense to move this utility into the omicron-brand zone alongside some of the other network setup that might be more general? Seems a bit like a pain-in-the-butt to manually package it into each zone that could need it

@davepacheco davepacheco enabled auto-merge (squash) June 29, 2023 00:29
@davepacheco
Copy link
Collaborator

I've gone ahead and enabled auto-merge: however one accounts this, @smklein both worked on this and both reviewed it.

@davepacheco davepacheco enabled auto-merge (squash) June 29, 2023 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sled Agent Related to the Per-Sled Configuration and Management storage Related to storage.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants