Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build libcontainer's container on Instance::new to be spec compliant #340

Merged
merged 3 commits into from
Oct 3, 2023

Conversation

jprendes
Copy link
Collaborator

@jprendes jprendes commented Sep 27, 2023

Fixes #158 and #326

The issue behind #326 is that we are asking youki to create and start the container on the start call.
We don't do any meaningful work on the create call.

This missmach means that containerd is assuming that extra setup has happened successfully. but youki hasn't done it yet. When youki fails to do it at start, we are breaking those assumptions (e.g., being able to provide a PID for the init process).
The solution is to build (create) the youki container on the create call (i.e., the new method of runwasi's instance).
This also requires that we change the signature of new to return a Result<Self, Error>, which was previously mentioned in #158.

I haven't added any test for this here as testing it is rather involved.
The sideeffect of this issue on the first invocation is that we leave a dangling process, and on a second invocation is a hang.
I'm open to suggestions on how we could test it.
Thanks @jsturtevant for suggesting testing approach.

@jprendes
Copy link
Collaborator Author

@squillace FYI

@jprendes jprendes force-pushed the fix-exit-code-hang branch 2 times, most recently from 9c83cd2 to 76c3bd8 Compare September 27, 2023 16:00
@jsturtevant
Copy link
Contributor

jsturtevant commented Sep 27, 2023

The sideeffect of this issue on the first invocation is that we leave a dangling process, and on a second invocation is a hang.

this is the side affect of this change?

I'm open to suggestions on how we could test it.

Would it be something like:

fn test_custom_entrypoint() -> anyhow::Result<()> {
    let (exit_code, stdout, _) = WasiTest::<WasiInstance>::builder()?
        .with_start_fn("foo")?
        .with_wasm(INVALID_ENTRYPOINT)?
        .build()?
    assert!(error occurred);

    Ok(())
}

@jprendes
Copy link
Collaborator Author

this is the side affect of this change?

No, this is what I observe on linux before this change. With this change I do not observe any dangling process and no hanging.

@jprendes
Copy link
Collaborator Author

Would it be something like:

Yes, that's 100% correct, thanks!. I'll add test points for all shims.

@jprendes
Copy link
Collaborator Author

jprendes commented Sep 27, 2023

I gave the test a go. The right place for it would be in the containerd-shim-wasm crate, as there's where the logic is implemented.

However, that test uses the containerd-shim-wasm-test crate. Adding that crate as a dependency creates a cycle which causes some weird interaction with types being doubly defined (e.g., there's a crate::sandbox::Instance and a containerd_shim_wasm::sandbox::Instance, and they are not the same trait for the compiler)

A solution is moving the testing utility from containerd-shim-wasm-test to containerd-shim-wasm behind a feature flag. Doing that significantly increases the number of files touched by this PR, but shouldn't add much complexity.

@jprendes
Copy link
Collaborator Author

I recommend reviewing the commits separately :-)

@jprendes jprendes force-pushed the fix-exit-code-hang branch 2 times, most recently from 98e17bc to 4f94d6b Compare September 27, 2023 21:46
Copy link
Contributor

@jsturtevant jsturtevant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great find! Creating the container during the new aligns with containerd's "task_create" which makes sense. left a few minor comments

crossbeam = { workspace = true }
env_logger = { workspace = true, optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since these are under dev-dependencies are they needed here?

Copy link
Collaborator Author

@jprendes jprendes Sep 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consumers of the testing feature of the library we need to specify it as a normal dependencies as well.
All 3 dev-dependencies are also specified as optional normal dependencies that get enabled by the testing feature.
I'm open to suggestions 😄

@jprendes jprendes force-pushed the fix-exit-code-hang branch 3 times, most recently from ea8fc1f to 3b03ecd Compare September 30, 2023 06:53
Copy link
Contributor

@jsturtevant jsturtevant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -746,10 +746,10 @@ impl<T: Instance + Send + Sync> Local<T> {
);
InstanceData {
instance: None,
base: Some(Nop::new(id, None)),
base: Some(Nop::new(id, None).unwrap()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is safe since Nop::new() always returns Ok. might be work a comment incase things change in future?

@devigned
Copy link
Contributor

devigned commented Oct 3, 2023

@jprendes this needs a rebase, but lgtm after that.

@jprendes jprendes force-pushed the fix-exit-code-hang branch from 3b03ecd to ab6eaaa Compare October 3, 2023 20:25
@jprendes jprendes force-pushed the fix-exit-code-hang branch from ab6eaaa to acc190a Compare October 3, 2023 21:26
Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@devigned devigned merged commit 06018bd into containerd:main Oct 3, 2023
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Instance::new should return a Result type
3 participants