Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sled Agent startup logging is confusing #1215

Open
davepacheco opened this issue Jun 15, 2022 · 1 comment
Open

Sled Agent startup logging is confusing #1215

davepacheco opened this issue Jun 15, 2022 · 1 comment
Labels
Debugging For when you want better data in debugging an issue (log messages, post mortem debugging, and more) development Bugs, paper cuts, feature requests, or other thoughts on making omicron development better Sled Agent Related to the Per-Sled Configuration and Management

Comments

@davepacheco
Copy link
Collaborator

While setting up the stack a few weeks ago, I found the Sled Agent logging pretty confusing about the current state of things.

The first thing to know is that Sled Agent startup makes calls out to Nexus, but the first time it starts up, Nexus isn't running yet. Eventually, it will deploy the first Nexus instance.

I'll include the full log here for reference:

dap@sock ~/omicron $ BUNYAN_NO_PAGER=1 bunyan  $(svcs -L sled-agent) 
[ Jun  1 23:07:29 Enabled. ]
[ Jun  1 23:07:29 Executing start method ("ctrun -l child -o noorphan,regent /opt/oxide/sled-agent/sled-agent run /opt/oxide/sled-agent/pkg/config.toml &"). ]
[ Jun  1 23:07:29 Method "start" exited with status 0. ]
[ Jun  1 23:07:29 Rereading configuration. ]
[ Jun  1 23:07:29 No 'refresh' method defined.  Treating as :true. ]
note: configured to log to "/dev/stdout"
[2022-06-01T23:07:29.498928583Z]  INFO: SledAgent/2909 on sock: setting up bootstrap agent server
[2022-06-01T23:07:29.49930742Z]  INFO: SledAgent/2909 on sock: Ensuring config directory exists (path=/var/oxide)
[2022-06-01T23:07:29.525905774Z]  INFO: SledAgent/dropshot (Bootstrap)/2909 on sock: listening (local_addr=[fdb0:18c0:4d0d:a02a::1]:12346)
[2022-06-01T23:07:29.526285081Z]  INFO: SledAgent/BootstrapAgent/2909 on sock: bootstrap service initializing (server=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.527375441Z]  INFO: SledAgent/RSS/2909 on sock:
    Injecting RSS configuration: SetupServiceConfig {
        rack_subnet: fd00:1122:3344:100::,
        requests: [
            SledRequest {
                datasets: [
                    DatasetEnsureBody {
                        zpool_uuid: d462a7f7b62840fe80ff4e4189e2d62b,
                        dataset_kind: Crucible,
                        address: [fd00:1122:3344:101::6]:32345,
                    },
                    DatasetEnsureBody {
                        zpool_uuid: e4b4dc87ab4649fba4b4d361ae214c03,
                        dataset_kind: Crucible,
                        address: [fd00:1122:3344:101::7]:32345,
                    },
                    DatasetEnsureBody {
                        zpool_uuid: f4b4dc87ab4649fba4b4d361ae214c03,
                        dataset_kind: Crucible,
                        address: [fd00:1122:3344:101::8]:32345,
                    },
                    DatasetEnsureBody {
                        zpool_uuid: d462a7f7b62840fe80ff4e4189e2d62b,
                        dataset_kind: CockroachDb {
                            all_addresses: [
                                [fd00:1122:3344:101::2]:32221,
                            ],
                        },
                        address: [fd00:1122:3344:101::2]:32221,
                    },
                    DatasetEnsureBody {
                        zpool_uuid: d462a7f7b62840fe80ff4e4189e2d62b,
                        dataset_kind: Clickhouse,
                        address: [fd00:1122:3344:101::5]:8123,
                    },
                ],
                services: [
                    ServiceRequest {
                        name: "nexus",
                        addresses: [
                            fd00:1122:3344:101::3,
                        ],
                        gz_addresses: [],
                    },
                    ServiceRequest {
                        name: "oximeter",
                        addresses: [
                            fd00:1122:3344:101::4,
                        ],
                        gz_addresses: [],
                    },
                ],
                dns_services: [],
            },
        ],
    }
[2022-06-01T23:07:29.527725438Z]  INFO: SledAgent/RSS/2909 on sock: RSS configuration has not been fully applied yet
[2022-06-01T23:07:29.527999026Z]  INFO: SledAgent/RSS/2909 on sock: Enough peers exist to enact RSS plan
[2022-06-01T23:07:29.528261924Z]  INFO: SledAgent/RSS/2909 on sock: Creating new allocation plan
[2022-06-01T23:07:29.528615701Z]  INFO: SledAgent/RSS/2909 on sock:
    dns_subnets: [
        DnsSubnet {
            subnet: Ipv6Subnet {
                net: Ipv6Net(
                    Ipv6Network {
                        addr: fd00:1122:3344:1::,
                        prefix: 64,
                    },
                ),
            },
        },
    ]
[2022-06-01T23:07:29.528911088Z]  INFO: SledAgent/RSS/2909 on sock: Serializing plan
[2022-06-01T23:07:29.529183685Z]  INFO: SledAgent/RSS/2909 on sock: Creating plan for the sled at fdb0:18c0:4d0d:a02a::1
[2022-06-01T23:07:29.529955478Z]  INFO: SledAgent/RSS/2909 on sock:
    Plan serialized as: ["[fdb0:18c0:4d0d:a02a::1]:12346".initialization_request.subnet]
    net = "fd00:1122:3344:101::/64"
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]]
    address = "[fd00:1122:3344:101::6]:32345"
    zpool_uuid = "d462a7f7-b628-40fe-80ff-4e4189e2d62b"
    
    ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind]
    type = "crucible"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]]
    address = "[fd00:1122:3344:101::7]:32345"
    zpool_uuid = "e4b4dc87-ab46-49fb-a4b4-d361ae214c03"
    
    ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind]
    type = "crucible"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]]
    address = "[fd00:1122:3344:101::8]:32345"
    zpool_uuid = "f4b4dc87-ab46-49fb-a4b4-d361ae214c03"
    
    ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind]
    type = "crucible"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]]
    address = "[fd00:1122:3344:101::2]:32221"
    zpool_uuid = "d462a7f7-b628-40fe-80ff-4e4189e2d62b"
    
    ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind]
    all_addresses = ["[fd00:1122:3344:101::2]:32221"]
    type = "cockroach_db"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset]]
    address = "[fd00:1122:3344:101::5]:8123"
    zpool_uuid = "d462a7f7-b628-40fe-80ff-4e4189e2d62b"
    
    ["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dataset.dataset_kind]
    type = "clickhouse"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.dns_service]]
    addresses = ["fd00:1122:3344:1::1"]
    gz_addresses = ["fd00:1122:3344:1::2"]
    name = "internal-dns"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.service]]
    addresses = ["fd00:1122:3344:101::3"]
    gz_addresses = []
    name = "nexus"
    
    [["[fdb0:18c0:4d0d:a02a::1]:12346".services_request.service]]
    addresses = ["fd00:1122:3344:101::4"]
    gz_addresses = []
    name = "oximeter"
    
[2022-06-01T23:07:29.530100767Z]  INFO: SledAgent/RSS/2909 on sock: Plan written to storage
[2022-06-01T23:07:29.530252186Z]  INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Received initialization request from RSS (target_sled=[fdb0:18c0:4d0d:a02a::1]:12346)
    request: SledAgentRequest { subnet: Ipv6Subnet { net: Ipv6Net(Ipv6Network { addr: fd00:1122:3344:101::, prefix: 64 }) } }
[2022-06-01T23:07:29.537780049Z]  INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Sending request to peer agent: http://[fdb0:18c0:4d0d:a02a::1]:12346
[2022-06-01T23:07:29.538535593Z]  INFO: SledAgent/dropshot (Bootstrap)/2909 on sock: accepted connection (local_addr=[fdb0:18c0:4d0d:a02a::1]:12346, remote_addr=[fdb0:18c0:4d0d:a02a::1]:60324)
[2022-06-01T23:07:29.539598743Z]  INFO: SledAgent/BootstrapAgent/2909 on sock: Loading Sled Agent: SledAgentRequest { subnet: Ipv6Subnet { net: Ipv6Net(Ipv6Network { addr: fd00:1122:3344:101::, prefix: 64 }) } } (server=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.539764542Z]  INFO: SledAgent/2909 on sock: setting up sled agent server
[2022-06-01T23:07:29.545187864Z]  INFO: SledAgent/SledAgent/2909 on sock: created sled agent (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.584224588Z]  INFO: SledAgent/SledAgent/2909 on sock: using '[PhysicalLink("net0"), PhysicalLink("net1")]' as data links for xde driver (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.724829094Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlService0)
[2022-06-01T23:07:29.738362333Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage0)
[2022-06-01T23:07:29.754063645Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage1)
[2022-06-01T23:07:29.766831082Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage2)
[2022-06-01T23:07:29.779140633Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage3)
[2022-06-01T23:07:29.790549182Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlStorage5)
[2022-06-01T23:07:29.805585209Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlService1)
[2022-06-01T23:07:29.819949122Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlService2)
[2022-06-01T23:07:29.83595521Z]  WARN: SledAgent/SledAgent/2909 on sock: Deleting existing VNIC (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7, vnic_kind=OxideControl, vnic_name=oxControlInstance1)
[2022-06-01T23:07:29.847069752Z]  INFO: SledAgent/SledAgent/2909 on sock: Sled Agent upserting zpool to Storage Manager: oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.85634795Z]  INFO: SledAgent/SledAgent/2909 on sock: Sled Agent upserting zpool to Storage Manager: oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.866062884Z]  INFO: SledAgent/SledAgent/2909 on sock: Sled Agent upserting zpool to Storage Manager: oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 (sled_id=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.878078798Z]  INFO: SledAgent/ServiceManager/2909 on sock: No sled services found at /var/oxide/services.toml
[2022-06-01T23:07:29.878531404Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: listening (local_addr=[fd00:1122:3344:101::1]:12345)
[2022-06-01T23:07:29.878791281Z]  INFO: SledAgent/BootstrapAgent/2909 on sock: Sled Agent loaded; recording configuration (server=fb0f7546-4d46-40ca-9d56-cbb810684ca7)
[2022-06-01T23:07:29.880930132Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:07:29.881547217Z]  INFO: SledAgent/StorageManager/2909 on sock:
    Storage manager processing zpool: ZpoolInfo {
        name: "oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b",
        size: 9663676416,
        allocated: 105984,
        free: 9663570432,
        health: Online,
    }
[2022-06-01T23:07:29.882482039Z]  INFO: SledAgent/dropshot (Bootstrap)/2909 on sock: request completed (req_id=d7cc109d-1193-44c1-ba09-cbc38512f545, uri=/start_sled, method=PUT, remote_addr=[fdb0:18c0:4d0d:a02a::1]:60324, local_addr=[fdb0:18c0:4d0d:a02a::1]:12346, response_code=200)
[2022-06-01T23:07:29.883345721Z]  INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Peer agent at http://[fdb0:18c0:4d0d:a02a::1]:12346 initialized
[2022-06-01T23:07:29.88573007Z]  INFO: SledAgent/BootstrapAgentRssHandler/2909 on sock: Initialized sled agent (target_sled=[fdb0:18c0:4d0d:a02a::1]:12346)
[2022-06-01T23:07:29.890661886Z]  INFO: SledAgent/RSS/2909 on sock: sending service requests...
[2022-06-01T23:07:29.891013443Z]  INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }]
[2022-06-01T23:07:29.893303833Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::1]:65156)
[2022-06-01T23:07:29.893720379Z]  INFO: SledAgent/StorageManager/2909 on sock:
    Storage manager processing zpool: ZpoolInfo {
        name: "oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03",
        size: 9663676416,
        allocated: 104448,
        free: 9663571968,
        health: Online,
    }
[2022-06-01T23:07:29.89470061Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }]
[2022-06-01T23:07:29.899493828Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service internal-dns does not yet exist
[2022-06-01T23:07:29.904128317Z]  INFO: SledAgent/StorageManager/2909 on sock:
    Storage manager processing zpool: ZpoolInfo {
        name: "oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03",
        size: 9663676416,
        allocated: 540672,
        free: 9663135744,
        health: Online,
    }
[2022-06-01T23:07:29.922388205Z]  INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_internal-dns
[2022-06-01T23:07:29.942309849Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_internal-dns
[2022-06-01T23:07:33.369817443Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_internal-dns)
[2022-06-01T23:07:37.563198118Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists
[2022-06-01T23:07:37.563371876Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:1::1, prefix: 64 })) (zone=oxz_internal-dns)
[2022-06-01T23:07:38.147611576Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists - OK
[2022-06-01T23:07:38.14828473Z]  INFO: SledAgent/ServiceManager/2909 on sock:
    GZ addresses: [
        fd00:1122:3344:1::2,
    ]
[2022-06-01T23:07:38.148801645Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring GZ address fd00:1122:3344:1::2 exists
[2022-06-01T23:07:40.491786898Z]  INFO: SledAgent/ServiceManager/2909 on sock: halt_and_remove_logged: Previous zone state: Running (zone=oxz_internal-dns)
[2022-06-01T23:07:40.495268997Z]  INFO: SledAgent/ServiceManager/2909 on sock: Stopped and uninstalled zone (zone=oxz_internal-dns)
[2022-06-01T23:07:40.504126388Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=847eff8e-3986-4940-951a-1daedbc567db, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:65156, local_addr=[fd00:1122:3344:101::1]:12345, error_message_external="Internal Server Error", response_code=500)
    error_message_internal: Failed to do 'importing manifest' by running command in zone: Error running command in zone 'oxz_internal-dns': Command [/usr/sbin/zlogin oxz_internal-dns /usr/sbin/svccfg import /var/svc/manifest/site/internal-dns/manifest.xml] executed and failed with status: exit status: 1. Output: svccfg: Temporary service svc:/TEMP/system/illumos/internal-dns was deleted unexpectedly.
    svccfg: Import of /var/svc/manifest/site/internal-dns/manifest.xml failed.  Progress:
    svccfg:   Service "system/illumos/internal-dns": properties imported.
    svccfg:     Instance "default": not reached.
    svccfg: Import of /var/svc/manifest/site/internal-dns/manifest.xml failed.
    
[2022-06-01T23:07:40.50505965Z]  WARN: SledAgent/RSS/2909 on sock: failed to initialize services (error="Error Response")
[2022-06-01T23:07:40.856870037Z]  INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }]
[2022-06-01T23:07:40.857262473Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }]
[2022-06-01T23:07:40.860653423Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service internal-dns does not yet exist
[2022-06-01T23:07:40.879449847Z]  INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_internal-dns
[2022-06-01T23:07:40.900456921Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_internal-dns
[2022-06-01T23:07:43.579114912Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_internal-dns)
[2022-06-01T23:07:44.880626243Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 248.569652ms
[2022-06-01T23:07:44.892728996Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 251.890345ms
[2022-06-01T23:07:44.9058831Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 236.424914ms
[2022-06-01T23:07:44.920890217Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 363.68247ms
[2022-06-01T23:07:45.131018307Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:07:48.056063908Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists
[2022-06-01T23:07:48.057109509Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:1::1, prefix: 64 })) (zone=oxz_internal-dns)
[2022-06-01T23:07:48.644162193Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:1::1 exists - OK
[2022-06-01T23:07:48.647255215Z]  INFO: SledAgent/ServiceManager/2909 on sock:
    GZ addresses: [
        fd00:1122:3344:1::2,
    ]
[2022-06-01T23:07:48.647533723Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring GZ address fd00:1122:3344:1::2 exists
[2022-06-01T23:07:48.953067619Z]  INFO: SledAgent/ServiceManager/2909 on sock: Setting up internal-dns service
[2022-06-01T23:07:49.134502953Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=33d9884c-dbea-4236-8550-677a531ddfbe, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:65156, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:07:49.142928381Z]  INFO: SledAgent/RSS/2909 on sock: sending dataset requests...
[2022-06-01T23:07:49.14320227Z]  INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Crucible, address: [fd00:1122:3344:101::6]:32345 }
[2022-06-01T23:07:49.143484764Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::1]:63166)
[2022-06-01T23:07:49.146545356Z]  INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Crucible, address: [fd00:1122:3344:101::6]:32345, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
[2022-06-01T23:07:49.146923333Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/crucible exists
[2022-06-01T23:07:49.230967819Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/crucible is running
[2022-06-01T23:07:49.248201686Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/crucible was not found
[2022-06-01T23:07:49.270982834Z]  INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:07:49.298938367Z]  INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:07:52.408185018Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone booting
    zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:07:54.58967506Z]  INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::6, prefix: 64 }))
    zone: oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:07:55.158758284Z]  INFO: SledAgent/StorageManager/2909 on sock: Initializing Crucible
[2022-06-01T23:07:55.945733258Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_crucible_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b with address [fd00:1122:3344:101::6]:32345 is running
[2022-06-01T23:07:55.94666306Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=d78b1247-3d12-4794-ac73-fbbf59d86e4f, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:07:55.950532206Z]  INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: e4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::7]:32345 }
[2022-06-01T23:07:55.951088511Z]  INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: e4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::7]:32345, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
[2022-06-01T23:07:55.951251949Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible exists
[2022-06-01T23:07:56.021671906Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible is running
[2022-06-01T23:07:56.031922155Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible was not found
[2022-06-01T23:07:56.056733536Z]  INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:07:56.094079095Z]  INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:07:59.173441751Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone booting
    zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:08:00.131916877Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 743.622125ms
[2022-06-01T23:08:00.877123811Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:08:03.24196011Z]  INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::7, prefix: 64 }))
    zone: oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:08:03.825864003Z]  INFO: SledAgent/StorageManager/2909 on sock: Initializing Crucible
[2022-06-01T23:08:04.350202462Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_crucible_oxp_e4b4dc87-ab46-49fb-a4b4-d361ae214c03 with address [fd00:1122:3344:101::7]:32345 is running
[2022-06-01T23:08:04.350783057Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 276.531767ms
[2022-06-01T23:08:04.351165643Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 423.249437ms
[2022-06-01T23:08:04.351359371Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 390.258627ms
[2022-06-01T23:08:04.353106606Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=95b80cd2-aaea-481a-a6b5-395ceb097d42, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:08:04.353885469Z]  INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: f4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::8]:32345 }
[2022-06-01T23:08:04.354124177Z]  INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: f4b4dc87-ab46-49fb-a4b4-d361ae214c03, dataset_kind: Crucible, address: [fd00:1122:3344:101::8]:32345, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
[2022-06-01T23:08:04.354384785Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible exists
[2022-06-01T23:08:04.41698162Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible is running
[2022-06-01T23:08:04.425654484Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03/crucible was not found
[2022-06-01T23:08:04.443365947Z]  INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:08:04.466658871Z]  INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:08:07.655240749Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone booting
    zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:08:11.880400614Z]  INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::8, prefix: 64 }))
    zone: oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03
[2022-06-01T23:08:12.460278541Z]  INFO: SledAgent/StorageManager/2909 on sock: Initializing Crucible
[2022-06-01T23:08:13.038084368Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_crucible_oxp_f4b4dc87-ab46-49fb-a4b4-d361ae214c03 with address [fd00:1122:3344:101::8]:32345 is running
[2022-06-01T23:08:13.039354106Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 149.790028ms
[2022-06-01T23:08:13.039546624Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=f679f371-f3ca-48d2-9576-5ef57bf5b05d, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:08:13.039818662Z]  INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [[fd00:1122:3344:101::2]:32221] }, address: [fd00:1122:3344:101::2]:32221 }
[2022-06-01T23:08:13.040217589Z]  INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [[fd00:1122:3344:101::2]:32221] }, address: [fd00:1122:3344:101::2]:32221, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
[2022-06-01T23:08:13.040354397Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb exists
[2022-06-01T23:08:13.104639158Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb is running
[2022-06-01T23:08:13.121924266Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb was not found
[2022-06-01T23:08:13.137782965Z]  INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:13.160533254Z]  INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:16.931955193Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone booting
    zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:17.161805959Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 770.765063ms
[2022-06-01T23:08:17.933208632Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:08:19.277337814Z]  INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::2, prefix: 64 }))
    zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:19.847946155Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: Loading CRDB manifest
[2022-06-01T23:08:20.185842614Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: setting CRDB's config/listen_addr: [fd00:1122:3344:101::2]:32221
[2022-06-01T23:08:20.231946275Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: setting CRDB's config/store
[2022-06-01T23:08:20.269034457Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: setting CRDB's config/join_addrs
[2022-06-01T23:08:20.305499214Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: refreshing manifest
[2022-06-01T23:08:20.337933087Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: enabling CRDB service
[2022-06-01T23:08:20.374348425Z]  INFO: SledAgent/StorageManager/2909 on sock: start_zone: awaiting liveness of CRDB
[2022-06-01T23:08:20.380744728Z]  WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive
[2022-06-01T23:08:20.648632498Z]  WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive
[2022-06-01T23:08:21.277440112Z]  WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive
[2022-06-01T23:08:22.516087249Z]  WARN: SledAgent/StorageManager/2909 on sock: cockroachdb not yet alive
[2022-06-01T23:08:25.273275485Z]  INFO: SledAgent/StorageManager/2909 on sock: CRDB is online
[2022-06-01T23:08:25.273634562Z]  INFO: SledAgent/StorageManager/2909 on sock: Formatting CRDB
[2022-06-01T23:08:32.935085953Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 2.561089402s
[2022-06-01T23:08:33.241793708Z]  INFO: SledAgent/StorageManager/2909 on sock: Formatting CRDB - Completed
[2022-06-01T23:08:33.242270573Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b with address [fd00:1122:3344:101::2]:32221 is running
[2022-06-01T23:08:33.242896058Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 142.776439ms
[2022-06-01T23:08:33.243337664Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 507.632943ms
[2022-06-01T23:08:33.243496923Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 1.182980646s
[2022-06-01T23:08:33.243644801Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 830.830619ms
[2022-06-01T23:08:33.24379535Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 182.536478ms
[2022-06-01T23:08:33.243937299Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=bfcf110c-d574-4862-8d48-85d241efcc5d, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:08:33.248757376Z]  INFO: SledAgent/RSS/2909 on sock: creating new filesystem: DatasetEnsureBody { zpool_uuid: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Clickhouse, address: [fd00:1122:3344:101::5]:8123 }
[2022-06-01T23:08:33.248972164Z]  INFO: SledAgent/StorageManager/2909 on sock: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: Clickhouse, address: [fd00:1122:3344:101::5]:8123, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
[2022-06-01T23:08:33.249200072Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/clickhouse exists
[2022-06-01T23:08:33.342696875Z]  INFO: SledAgent/StorageManager/2909 on sock: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/clickhouse is running
[2022-06-01T23:08:33.359413867Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/clickhouse was not found
[2022-06-01T23:08:33.392184627Z]  INFO: SledAgent/StorageManager/2909 on sock: Configuring new Omicron zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:33.433344742Z]  INFO: SledAgent/StorageManager/2909 on sock: Installing Omicron zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:37.485661716Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone booting
    zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:37.726780562Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:08:41.725179224Z]  INFO: SledAgent/StorageManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::5, prefix: 64 }))
    zone: oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-01T23:08:42.336431363Z]  INFO: SledAgent/StorageManager/2909 on sock: Initialiting Clickhouse
[2022-06-01T23:08:42.668654183Z]  INFO: SledAgent/StorageManager/2909 on sock: Zone oxz_clickhouse_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b with address [fd00:1122:3344:101::5]:8123 is running
[2022-06-01T23:08:42.670020961Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=cd542455-7c9b-40bb-9093-82ba3401a41f, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:63166, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:08:42.671626107Z]  INFO: SledAgent/RSS/2909 on sock: Finished setting up agents and datasets
[2022-06-01T23:08:42.677206657Z]  INFO: SledAgent/RSS/2909 on sock: sending service requests...
[2022-06-01T23:08:42.677447995Z]  INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }, ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }]
[2022-06-01T23:08:42.677640383Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: accepted connection (local_addr=[fd00:1122:3344:101::1]:12345, remote_addr=[fd00:1122:3344:101::1]:34451)
[2022-06-01T23:08:42.677940671Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }]
[2022-06-01T23:08:42.68136024Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service oximeter does not yet exist
[2022-06-01T23:08:42.700278763Z]  INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_oximeter
[2022-06-01T23:08:42.720659942Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_oximeter
[2022-06-01T23:08:45.578458419Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_oximeter)
[2022-06-01T23:08:48.056022681Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::4 exists
[2022-06-01T23:08:48.058984814Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::4, prefix: 64 })) (zone=oxz_oximeter)
[2022-06-01T23:08:48.654487674Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::4 exists - OK
[2022-06-01T23:08:48.655124928Z]  INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: []
[2022-06-01T23:08:49.003460771Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service name oximeter did not match
[2022-06-01T23:08:49.043794428Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service nexus does not yet exist
[2022-06-01T23:08:49.069256893Z]  INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_nexus
[2022-06-01T23:08:49.094927246Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_nexus
[2022-06-01T23:08:52.041879133Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_nexus)
[2022-06-01T23:08:52.23645132Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 652.807777ms
[2022-06-01T23:08:52.236619528Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 139.282019ms
[2022-06-01T23:08:52.728457986Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 3.71501894s
[2022-06-01T23:08:55.955553263Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists
[2022-06-01T23:08:55.9582053Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::3, prefix: 64 })) (zone=oxz_nexus)
[2022-06-01T23:08:56.541255089Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists - OK
[2022-06-01T23:08:56.541894304Z]  INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: []
[2022-06-01T23:08:58.90935488Z]  INFO: SledAgent/ServiceManager/2909 on sock: halt_and_remove_logged: Previous zone state: Running (zone=oxz_nexus)
[2022-06-01T23:08:58.910103653Z]  INFO: SledAgent/ServiceManager/2909 on sock: Stopped and uninstalled zone (zone=oxz_nexus)
[2022-06-01T23:08:58.919284312Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=71c8cbc1-5c48-4d47-817e-d002d8dcb229, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:34451, local_addr=[fd00:1122:3344:101::1]:12345, error_message_external="Internal Server Error", response_code=500)
    error_message_internal: Failed to do 'importing manifest' by running command in zone: Error running command in zone 'oxz_nexus': Command [/usr/sbin/zlogin oxz_nexus /usr/sbin/svccfg import /var/svc/manifest/site/nexus/manifest.xml] executed and failed with status: exit status: 1. Output: svccfg: Temporary service svc:/TEMP/system/illumos/nexus was deleted unexpectedly.
    svccfg: Import of /var/svc/manifest/site/nexus/manifest.xml failed.  Progress:
    svccfg:   Service "system/illumos/nexus": properties imported.
    svccfg:     Instance "default": not reached.
    svccfg: Import of /var/svc/manifest/site/nexus/manifest.xml failed.
    
[2022-06-01T23:08:58.919892947Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:08:58.920041305Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 560.822476ms
[2022-06-01T23:08:58.920193784Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 548.123862ms
[2022-06-01T23:08:58.920328583Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 1.148798026s
[2022-06-01T23:08:58.920460962Z]  WARN: SledAgent/RSS/2909 on sock: failed to initialize services (error="Error Response")
[2022-06-01T23:08:58.92061501Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 1.836863505s
[2022-06-01T23:08:58.920749389Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 132.370862ms
[2022-06-01T23:08:58.920882558Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 2.602310766s
[2022-06-01T23:08:59.241694148Z]  INFO: SledAgent/RSS/2909 on sock: initializing sled services: [ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }, ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "internal-dns", addresses: [fd00:1122:3344:1::1], gz_addresses: [fd00:1122:3344:1::2] }]
[2022-06-01T23:08:59.242233643Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring services are initialized: [ServiceRequest { name: "oximeter", addresses: [fd00:1122:3344:101::4], gz_addresses: [] }, ServiceRequest { name: "nexus", addresses: [fd00:1122:3344:101::3], gz_addresses: [] }]
[2022-06-01T23:08:59.245621623Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service oximeter already exists
[2022-06-01T23:08:59.246092369Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service nexus does not yet exist
[2022-06-01T23:08:59.27534333Z]  INFO: SledAgent/ServiceManager/2909 on sock: Configuring new Omicron zone: oxz_nexus
[2022-06-01T23:08:59.302681598Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_nexus
[2022-06-01T23:09:02.106709081Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_nexus)
[2022-06-01T23:09:05.291114596Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists
[2022-06-01T23:09:05.291311424Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::3, prefix: 64 })) (zone=oxz_nexus)
[2022-06-01T23:09:05.885177668Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists - OK
[2022-06-01T23:09:05.885370776Z]  INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: []
[2022-06-01T23:09:06.234552896Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service name nexus did not match
[2022-06-01T23:09:06.268590555Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=c6019410-9266-47a4-9137-d529b8d52262, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:34451, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:09:06.269025041Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 7.858261115s
[2022-06-01T23:09:06.269248009Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms
[2022-06-01T23:09:06.269491647Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms
[2022-06-01T23:09:06.269713825Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.443517337s
[2022-06-01T23:09:06.269926603Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.126481259s
[2022-06-01T23:09:06.270805385Z]  INFO: SledAgent/RSS/2909 on sock: Finished setting up services
[2022-06-01T23:09:06.5323965Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.434758814s
[2022-06-01T23:09:06.684688782Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.086800326s
[2022-06-01T23:09:06.744591962Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 5.570363997s
[2022-06-01T23:09:06.798946621Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 780.561714ms
[2022-06-01T23:09:07.397460443Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.908596339s
[2022-06-01T23:09:07.434737783Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 2.628760681s
[2022-06-01T23:09:07.58082102Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.415066186s
[2022-06-01T23:09:07.71414967Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.12780999s
[2022-06-01T23:09:07.772264856Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.046736717s
[2022-06-01T23:09:07.998236746Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.076939718s
[2022-06-01T23:09:08.008894782Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 3.75894106s
[2022-06-01T23:09:08.85081895Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 3.7597818s
[2022-06-01T23:09:09.098758456Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.197342419s
[2022-06-01T23:09:09.312828691Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.627305181s
[2022-06-01T23:09:09.84729118Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.506179068s
[2022-06-01T23:09:10.001307847Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 4.059281855s
[2022-06-01T23:09:10.068903139Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.554439345s
[2022-06-01T23:09:11.303407502Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.595760471s
[2022-06-01T23:09:11.77377509Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.420023943s
[2022-06-01T23:09:11.947141075Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.456405535s
[2022-06-01T23:09:12.320869017Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 4.439781817s
[2022-06-01T23:09:12.358692392Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 8.392026948s
[2022-06-01T23:09:12.616784788Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.102997133s
[2022-06-01T23:09:14.066622206Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 7.582159334s
[2022-06-01T23:09:14.128674377Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:09:21.65416898Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 15.71140455s

What I found most confusing are those messages at the end:

[2022-06-01T23:08:59.302681598Z]  INFO: SledAgent/ServiceManager/2909 on sock: Installing Omicron zone: oxz_nexus
[2022-06-01T23:09:02.106709081Z]  INFO: SledAgent/ServiceManager/2909 on sock: Zone booting (zone=oxz_nexus)
[2022-06-01T23:09:05.291114596Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists
[2022-06-01T23:09:05.291311424Z]  INFO: SledAgent/ServiceManager/2909 on sock: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::3, prefix: 64 })) (zone=oxz_nexus)
[2022-06-01T23:09:05.885177668Z]  INFO: SledAgent/ServiceManager/2909 on sock: Ensuring address fd00:1122:3344:101::3 exists - OK
[2022-06-01T23:09:05.885370776Z]  INFO: SledAgent/ServiceManager/2909 on sock: GZ addresses: []
[2022-06-01T23:09:06.234552896Z]  INFO: SledAgent/ServiceManager/2909 on sock: Service name nexus did not match
[2022-06-01T23:09:06.268590555Z]  INFO: SledAgent/dropshot (SledAgent)/2909 on sock: request completed (req_id=c6019410-9266-47a4-9137-d529b8d52262, uri=/services, method=PUT, remote_addr=[fd00:1122:3344:101::1]:34451, local_addr=[fd00:1122:3344:101::1]:12345, response_code=204)
[2022-06-01T23:09:06.269025041Z]  WARN: SledAgent/2909 on sock: failed to contact nexus, will retry in 7.858261115s
[2022-06-01T23:09:06.269248009Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms
[2022-06-01T23:09:06.269491647Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms
[2022-06-01T23:09:06.269713825Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.443517337s
[2022-06-01T23:09:06.269926603Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.126481259s
[2022-06-01T23:09:06.270805385Z]  INFO: SledAgent/RSS/2909 on sock: Finished setting up services
[2022-06-01T23:09:06.5323965Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.434758814s
[2022-06-01T23:09:06.684688782Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.086800326s
[2022-06-01T23:09:06.744591962Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 5.570363997s
[2022-06-01T23:09:06.798946621Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 780.561714ms
[2022-06-01T23:09:07.397460443Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.908596339s
[2022-06-01T23:09:07.434737783Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 2.628760681s
[2022-06-01T23:09:07.58082102Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.415066186s
[2022-06-01T23:09:07.71414967Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.12780999s
[2022-06-01T23:09:07.772264856Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.046736717s
[2022-06-01T23:09:07.998236746Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 1.076939718s
[2022-06-01T23:09:08.008894782Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 3.75894106s
[2022-06-01T23:09:08.85081895Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 3.7597818s
[2022-06-01T23:09:09.098758456Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.197342419s
[2022-06-01T23:09:09.312828691Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.627305181s
[2022-06-01T23:09:09.84729118Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 2.506179068s
[2022-06-01T23:09:10.001307847Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 4.059281855s
[2022-06-01T23:09:10.068903139Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.554439345s
[2022-06-01T23:09:11.303407502Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.595760471s
[2022-06-01T23:09:11.77377509Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 10.420023943s
[2022-06-01T23:09:11.947141075Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.456405535s
[2022-06-01T23:09:12.320869017Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus, will retry in 4.439781817s
[2022-06-01T23:09:12.358692392Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 8.392026948s
[2022-06-01T23:09:12.616784788Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 11.102997133s
[2022-06-01T23:09:14.066622206Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 7.582159334s
[2022-06-01T23:09:14.128674377Z]  INFO: SledAgent/2909 on sock: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-01T23:09:21.65416898Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 15.71140455s

As I looked at this, it was five minutes later:

$ date
June  1, 2022 at 11:14:23 PM UTC

I think there are a few problems here:

  • We keep reporting that we failed to notify Nexus about stuff and that we're going to retry, but then we stopped reporting anything. I think what happened is that Nexus came up and the next attempts to report stuff succeeded. It'd be really valuable to get a log message for the success. Otherwise it looks to a reader like there's something wrong with the backoff because we stopped trying.
  • We have two sets of things we seem to be notifying Nexus about, one is "about datasets" and one is unlabeled. Maybe we could label the other one?
  • I don't understand why the retry intervals keep changing. Why would we ever see this sequence:
[2022-06-01T23:09:06.269248009Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms
[2022-06-01T23:09:06.269491647Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms

This might be a bug in the behavior, not just the log. If not, maybe we could be more clear about what's being retried?

@jgallagher
Copy link
Contributor

I've only been in code adjacent to this recently, but I suspect logs like this:

[2022-06-01T23:09:06.269248009Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 528.435574ms
[2022-06-01T23:09:06.269491647Z]  WARN: SledAgent/StorageManager/2909 on sock: failed to notify nexus about datasets, will retry in 262.229898ms

are two independent dataset retry loops running, not one with a weirdly-decreasing interval. Big 👍 on including more context here to clarify, though!

smklein added a commit that referenced this issue Apr 25, 2023
I noticed myself glossing over many `WARN` lines in the sled-agent, so I
decided to make them less spammy.

Hopefully we can get to the point where each `WARN` is actually a
notable event worthy of attention, without losing useful information
about retries.

Part of #1215
@jordanhendricks jordanhendricks added Sled Agent Related to the Per-Sled Configuration and Management Debugging For when you want better data in debugging an issue (log messages, post mortem debugging, and more) labels Aug 11, 2023
@jordanhendricks jordanhendricks added the development Bugs, paper cuts, feature requests, or other thoughts on making omicron development better label Sep 25, 2023
leftwo pushed a commit that referenced this issue Apr 10, 2024
Propolis changes:
Rework storage for Accessors

Crucible Changes
Update oximeter dependency (#1244)
Dummy downstairs cleanup (#1247)
Some DTrace updates. (#1246)
HEY! LISTEN! HEY! LISTEN! (#1215)
Fix clippy lints (#1245)
leftwo added a commit that referenced this issue Apr 10, 2024
Propolis changes:
Rework storage for Accessors

Crucible Changes
Update oximeter dependency (#1244)
Dummy downstairs cleanup (#1247)
Some DTrace updates. (#1246)
HEY! LISTEN! HEY! LISTEN! (#1215)
Fix clippy lints (#1245)

Co-authored-by: Alan Hanson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Debugging For when you want better data in debugging an issue (log messages, post mortem debugging, and more) development Bugs, paper cuts, feature requests, or other thoughts on making omicron development better Sled Agent Related to the Per-Sled Configuration and Management
Projects
None yet
Development

No branches or pull requests

3 participants