Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switch to stable toolchain #916

Merged
merged 9 commits into from
Feb 17, 2023
Merged

switch to stable toolchain #916

merged 9 commits into from
Feb 17, 2023

Conversation

davepacheco
Copy link
Collaborator

@davepacheco davepacheco commented Apr 14, 2022

This does not build yet. Known blockers:

  • diesel-dtrace needs to drop use of #![feature(asm)]
  • slog-dtrace needs to drop use of #![feature(asm)]
  • use of async_closure (@smklein says he's removing it)

Other changes in this PR:

  • We had some drift in the way our dropshot deps were declared which results in two copies in Cargo.lock. This prevents the use of cargo update -p dropshot --precise. I suspect that's why we haven't seen dependabot updates to dropshot in a couple of weeks. (note: I'm doing this in normalize dropshot dependencies #917 as well)

@smklein
Copy link
Collaborator

smklein commented Apr 14, 2022

@leftwo
Copy link
Contributor

leftwo commented Apr 14, 2022

Omicron, does it need to compile propolis (and therefore Crucible) to build, or does it now use something different? Or I'm just remembering wrong and that was never required?

@bnaecker
Copy link
Collaborator

I think we need to decide on macOS support at this point. If we want to support it, this issue is blocked by the stabilization of the asm_sym feature. There is continuing progress on that, but even if it were to land tomorrow, that means at least 3 months until it's on stable.

At that point, there is still a good bit of work to get the usdt crate and all its consumers in the Omicron build graph up to date. That's not hard work, but it is not trivial. (Part of that is that I believe we should remove the feature flags from the usdt crate entirely. If consumers want to provide their own feature flags to enable/disable the probes, that's fine, but the feature flag in usdt itself was a mistake in retrospect.)

@bnaecker
Copy link
Collaborator

@jmpesp Omicron requires the propolis API crate, but otherwise pulls both the propolis and crucible crates pre-built from their CI artifacts.

@davepacheco
Copy link
Collaborator Author

I think we need to decide on macOS support at this point. If we want to support it, this issue is blocked by the stabilization of the asm_sym feature. There is continuing progress on that, but even if it were to land tomorrow, that means at least 3 months until it's on stable.

Personally, I'd be game to drop it. I'd like input from the folks that use it, of course!

At that point, there is still a good bit of work to get the usdt crate and all its consumers in the Omicron build graph up to date. That's not hard work, but it is not trivial. (Part of that is that I believe we should remove the feature flags from the usdt crate entirely. If consumers want to provide their own feature flags to enable/disable the probes, that's fine, but the feature flag in usdt itself was a mistake in retrospect.)

I think there's a lot of value to be on "stable":

  • being on nightly makes it more costly to update because (1) you have to find a working nightly and (2) as we've learned the hard way, nightly makes no compatibility guarantees, so you can wind up having to make a bunch of source changes
  • so (empirically) we stay further behind. Being out of date causes us to accumulate debt that eventually has to be paid by the poor soul who needs to update for some reason.
  • staying on "nightly" when we could switch makes it easier to accidentally introduce new deps on "nightly"

@bnaecker is it easy to inventory the work you're describing? If folks are on board, maybe we can spread it around.

@bnaecker
Copy link
Collaborator

bnaecker commented Apr 14, 2022

Personally, I'd be game to drop it. I'd like input from the folks that use it, of course!

I'm also on board :)

I think there's a lot of value to be on "stable":

Definitely! I got bit by the stabilization of asm too, and just don't want people to have to worry about it.

@bnaecker is it easy to inventory the work you're describing? If folks are on board, maybe we can spread it around.

I think the work-list would roughly as follows. I'll update this if I think of more.

  • Remove the usdt feature flag "asm". Instead of feature-flags, it should only decide what code is generated inside the probe macros on the basis of the target OS. This means usdt itself would still require nightly on macOS, but if we drop it for Omicron, that's not at issue here.
  • Publish a new version of usdt (0.4.0), as the above change is breaking (change to feature flags).
  • For each consuming crate (currently dropshot, slog-dtrace, diesel-dtrace, propolis, and crucible): Update to the latest version of usdt. If the crate chooses, it can expose its own feature flag to conditionally enable and include the usdt probes. (Dropshot is a good example. We might still want to support building on stable on macOS, meaning you don't get probes.) If a crate chooses to do this, it would need a cfg(my_usdt_feature_flag) directive conditionally compiling the usdt probe macros.
  • Bump each crate's version on crates.io, if applicable.
  • Update the omicron dependency of each crate exposing usdt probes, possibly enabling those probes if the crate now exposes a flag.

I'd of course welcome help, but as long as everyone's on board with dropping macOS support, then I'd be happy to just spend a day and churn through it.

@david-crespo
Copy link
Contributor

david-crespo commented Apr 14, 2022

I'm fine with it, assuming it would be relatively easy to run Nexus + simulated Sled Agent in a Docker container or whatever for local dev. We do most console dev using a mock API server that runs in the browser, but we do definitely sometimes dev against real Nexus.

@bnaecker
Copy link
Collaborator

I talked with @pfmooney after the huddle this morning. He's planning to update usdt to automatically select the no-op probe implementation on macOS platforms, if one is using a stable toolchain. If using nightly, or on another supported platform and using stable, the appropriate implementation for the platform will be chosen. (Real probes on illumos, no-op on Linux, etc.)

In addition, there'll be a feature flag called something like strict-features. This will override the above selection of the implementation, and always use the correct one for the platform. This will cause a compiler error if using a stable toolchain on macOS, for example. Consuming crates can thus ensure that they're built with probes if possible, rather than silently building without them, if they choose.

@davepacheco
Copy link
Collaborator Author

I updated this PR to see how far we are. Right now the problem we run into is that illumos-sys-hdrs (from OPTE) uses the unstable extern_types feature.

@davepacheco
Copy link
Collaborator Author

It looks like opte has already been updated to avoid this without the "kernel" feature (which we don't seem to use in Omicron's build), but omicron is using a fairly old version of opte and related crates (git commit oxidecomputer/opte@23fdf). I'm not sure what's involved in updating it further. Maybe that's covered by #1904?

@luqmana
Copy link
Contributor

luqmana commented Feb 17, 2023

It looks like opte has already been updated to avoid this without the "kernel" feature (which we don't seem to use in Omicron's build), but omicron is using a fairly old version of opte and related crates (git commit oxidecomputer/opte@23fdf). I'm not sure what's involved in updating it further. Maybe that's covered by #1904?

opte has been updated so give this another shot and it should work!

@@ -788,7 +788,7 @@ async fn sp_component_active_slot_set(
let component = component_from_str(&component)?;
let slot = body.into_inner().slot;

sp.set_component_active_slot(component, slot)
sp.set_component_active_slot(component, slot, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkeeter and/or @jgallagher can you confirm that false is what we want here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does maintain the current behavior, yes. Probably we want to extend SpComponentFirmwareSlot to add a boolean to pass here, but that can be done later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect; thank you

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhhhhh

false means that we will not persistently select that boot slot. It is a policy question about what the sp_component_active_slot_set should be doing, and I'm not sure of the answer.

When updating similar APIs, I've added a persist: bool to the function signature, to force callers to make a choice about it; we may want to do the same to sp_component_active_slot_set here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When updating similar APIs, I've added a persist: bool to the function signature, to force callers to make a choice about it; we may want to do the same to sp_component_active_slot_set here.

Yep, this is what I meant by extending SpComponentFirmwareSlot, which is coming from MGS's client here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github strikes again – I had the page open before you commented, so didn't see it until my comment was already submitted!

@davepacheco davepacheco marked this pull request as ready for review February 17, 2023 18:11
Copy link
Contributor

@ahl ahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's get that money

@davepacheco davepacheco enabled auto-merge (squash) February 17, 2023 19:31
@davepacheco davepacheco merged commit e8fd6b4 into main Feb 17, 2023
@davepacheco davepacheco deleted the stable-check branch February 17, 2023 19:31
leftwo pushed a commit that referenced this pull request Oct 4, 2023
Crucible updates
    all Crucible connections should set TCP_NODELAY (#983)
    Use a fixed size for tag and nonce (#957)
    Log crucible opts on start, order crutest options (#974)
    Lock the Downstairs less (#966)
    Cache dirty flag locally, reducing SQLite operations (#970)
    Make stats mutex synchronous (#961)
    Optimize requeue during flow control conditions (#962)
    Update Rust crate base64 to 0.21.4 (#950)
    Do less in control (#949)
    Fix --flush-per-blocks (#959)
    Fast dependency checking (#916)
    Update actions/checkout action to v4 (#960)
    Use `cargo hakari` for better workspace deps (#956)
    Update actions/checkout digest to 8ade135 (#939)
    Cache block size in Guest (#947)
    Update Rust crate ringbuffer to 0.15.0 (#954)
    Update Rust crate toml to 0.8 (#955)
    Update Rust crate reedline to 0.24.0 (#953)
    Update Rust crate libc to 0.2.148 (#952)
    Update Rust crate indicatif to 0.17.7 (#951)
    Remove unused async (#943)
    Use a synchronous mutex for bw/iop_tokens (#946)
    Make flush ID non-locking (#945)
    Use `oneshot` channels instead of `mpsc` for notification (#918)
    Use a strong type for upstairs negotiation (#941)
    Add a "dynamometer" option to crucible-downstairs (#931)
    Get new work and active count in one lock (#938)
    A bunch of misc test cleanup stuff (#937)
    Wait for a snapshot to finish on all downstairs (#920)
    dsc and clippy cleanup. (#935)
    No need to sort ackable_work (#934)
    Use a strong type for repair ID (#928)
    Keep new jobs sorted (#929)
    Remove state_count function on Downstairs (#927)
    Small cleanup to IOStateCount (#932)
    let cmon and IOStateCount use ClientId (#930)
    Fast return for zero length IOs (#926)
    Use a strong type for client ID (#925)
    A few Crucible Agent fixes (#922)
    Use a newtype for `JobId` (#919)
    Don't pass MutexGuard into functions (#917)
    Crutest updates, rename tests, new options (#911)

Propolis updates
    Update tungstenite crates to 0.20
    Use `strum` crate for enum-related utilities
    Wire up bits for CPUID customization
    PHD: improve artifact store (#529)
    Revert abort-on-panic in 'dev' cargo profile
leftwo added a commit that referenced this pull request Oct 5, 2023
Crucible updates
    all Crucible connections should set TCP_NODELAY (#983)
    Use a fixed size for tag and nonce (#957)
    Log crucible opts on start, order crutest options (#974)
    Lock the Downstairs less (#966)
    Cache dirty flag locally, reducing SQLite operations (#970)
    Make stats mutex synchronous (#961)
    Optimize requeue during flow control conditions (#962)
    Update Rust crate base64 to 0.21.4 (#950)
    Do less in control (#949)
    Fix --flush-per-blocks (#959)
    Fast dependency checking (#916)
    Update actions/checkout action to v4 (#960)
    Use `cargo hakari` for better workspace deps (#956)
    Update actions/checkout digest to 8ade135 (#939)
    Cache block size in Guest (#947)
    Update Rust crate ringbuffer to 0.15.0 (#954)
    Update Rust crate toml to 0.8 (#955)
    Update Rust crate reedline to 0.24.0 (#953)
    Update Rust crate libc to 0.2.148 (#952)
    Update Rust crate indicatif to 0.17.7 (#951)
    Remove unused async (#943)
    Use a synchronous mutex for bw/iop_tokens (#946)
    Make flush ID non-locking (#945)
    Use `oneshot` channels instead of `mpsc` for notification (#918)
    Use a strong type for upstairs negotiation (#941)
    Add a "dynamometer" option to crucible-downstairs (#931)
    Get new work and active count in one lock (#938)
    A bunch of misc test cleanup stuff (#937)
    Wait for a snapshot to finish on all downstairs (#920)
    dsc and clippy cleanup. (#935)
    No need to sort ackable_work (#934)
    Use a strong type for repair ID (#928)
    Keep new jobs sorted (#929)
    Remove state_count function on Downstairs (#927)
    Small cleanup to IOStateCount (#932)
    let cmon and IOStateCount use ClientId (#930)
    Fast return for zero length IOs (#926)
    Use a strong type for client ID (#925)
    A few Crucible Agent fixes (#922)
    Use a newtype for `JobId` (#919)
    Don't pass MutexGuard into functions (#917)
    Crutest updates, rename tests, new options (#911)

Propolis updates
    Update tungstenite crates to 0.20
    Use `strum` crate for enum-related utilities
    Wire up bits for CPUID customization
    PHD: improve artifact store (#529)
    Revert abort-on-panic in 'dev' cargo profile

---------

Co-authored-by: Alan Hanson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants