-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add lifecycle for containers #231
Conversation
|
||
* OCI compliant runtime is invoked by passing the bundle path as argument. | ||
* The container's runtime environment is created. | ||
* The prestart hooks are invoked in the host environment by the runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrunalp: If we split creation
from start
, then why can't users directly invoke the hooks?
I'm thinking something on the lines of
./oci-impl create foo <spec>
./my-pre-start-hook $(oci-impl state foo)
./oci-impl start foo
./my-post-start-hook $(oci-impl state foo)
./oci-impl stop foo # or my container exits / dies
./my-post-stop-hook $(oci-impl state foo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vishh This is just a timeline of container lifecycle. For some runtimes like runc, this would be an internal implementation detail of what happens during start. I think we can tackle whether we want a separate create when we discuss the actions/API/interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrunalp: But isn't that part of the lifecycle? When I think of a container in the OCI context, I assume this PR will describe its lifecycle.
Can you explain why you think this is an implementation detail?
Updated. |
2. The container's runtime environment is created. | ||
3. The container's state is written to the filesystem. | ||
4. The prestart hooks are invoked by the runtime. | ||
If any prestart hook fails, then the container is killed (step 8) and an error code is returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just “the container is killed and the lifecycle continues at step 8”. Actually killing the container would be part of step 7 (where you talk about signaling the container), and the rest of the activity after step 8 would happen in the “failed pre-start hook” case too.
@LK4D4 PTAL |
The container could also error out or crash. | ||
8. The container is destroyed by undoing the steps performed during create phase if required. | ||
9. The poststop hooks are invoked by the runtime and errors, if any, are logged. | ||
10. The state associated with the container is removed and the return code of the container is returned or logged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/of the container/of the container's user specific process/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/state/state file (state.json
)/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're tweaking the text, both of @duglin's suggestions look good to me too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By state
I assume you are only referring to the internal state
. Any changes that the spawned processed introduced to the root filesystem will not be cleaned up right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Tue, Nov 17, 2015 at 05:41:08PM -0800, Vish Kannan wrote:
+10. The state associated with the container is removed and the
return code of the container is returned or logged.By
state
I assume you are only referring to the internal
state
. Any changes that the spawned processed introduced to the
root filesystem will not be cleaned up right?
+1 to not cleaning up bundle-side filesystem changes, and another
reason why @duglin's state.json clarification is useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated state to state.json
In order to avoid things getting out of sync, and duplicating information, I think it would make sense to combine the lifecycle information here with the definition of our ops, from |
In particular, I think it would make sense to pull in the lower-level details mentioned in #225. So each step could have a one-liner summary followed but a more detailed list of the technical details. Seeing it all in one spot keeps it simple, aligned and easier to follow. |
On Wed, Oct 28, 2015 at 07:52:11AM -0700, Doug Davis wrote:
I like the approach with a one-liner overview linking out to more |
LGTM |
Process is killed from the outside. | ||
|
||
This event needs to be captured by runc to run onstop event handlers. | ||
The lifecycle describes the timeline of events that happen from when a container is started to when it is stopped or crashes unexpectedly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: to when it ceases to exist
maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: How about container is created
instead of started
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Tue, Nov 17, 2015 at 04:21:21PM -0800, Vish Kannan wrote:
-This event needs to be captured by runc to run onstop event handlers.
+The lifecycle describes the timeline of events that happen from when a container is started to when it is stopped or crashes unexpectedly.nit: How about
container is created
instead ofstarted
?
Both of Vish's suggestions sound like improvements to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
1. OCI compliant runtime is invoked by passing the bundle path as argument. | ||
2. The container's runtime environment is created. | ||
3. The container's state is written to the filesystem. | ||
4. The prestart hooks are invoked by the runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This step is subject to change based on namespaces pinning POC right? Shall we explicitly mention that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a note at the bottom to this effect.
5d07ec8
to
bd549a2
Compare
Pushed updates to address comments. |
1. OCI compliant runtime is invoked by passing the bundle path as argument. | ||
2. The container's runtime environment is created according to the configuration in config.json. | ||
Any updates to config.json after container is running do not affect the container. | ||
3. The container's state.json file is written to the filesystem under /run/opencontainer/<runtime>/containers/<id>/. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5. The user specified process is executed in the container. | ||
6. The poststart hooks are invoked by the runtime. | ||
If any poststart hook fails, then the container is killed and the lifecycle continues at step 8. | ||
7. Additional actions such as pausing the container, resuming the container or signaling the container may be performed using the runtime interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With ec85489 → bd549a2 you dropped “checkpointing the container”, but left in pause/resume/signalling. @vishh didn't call those out in his original comment, but they also seem like something we'd want to address after landing a simpler start/cleanup lifecycle. As it stands, this is the only reference to pausing in the spec, and it doesn't look detailed enough to be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runtime interface
--> runtime interfaces
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hqhq interface would support multiple operations so I think singular form is fine here.
The lifecycle described is generic and should apply all platforms. It provides leeway for the runtimes to be flexible in how they implement it. Signed-off-by: Mrunal Patel <[email protected]>
The lifecycle described is generic and should apply all platforms. It provides leeway for the runtimes to be flexible in how they implement it. Signed-off-by: Mrunal Patel <[email protected]>
5. The user specified process is executed in the container. | ||
6. The poststart hooks are invoked by the runtime. | ||
If any poststart hook fails, then the container is stopped and the lifecycle continues at step 8. | ||
7. Additional actions such as pausing the container, resuming the container or signaling the container may be performed using the runtime interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we want to drop pause/resume/signal until we have more clarity on how they will work. For more background here, especially around complications with shared state on Linux, see wking/oci-command-line-api#3. Alternatively, if we already have a clear idea of how these will work in the absence of /run/opencontainer, I'd appreciate some comments in the CLI-spec PR explaining what the CLI should look like.
LGTM I think this is a great first step, we can further discuss the details for each individual actions, but the overall flow and structure looks right. |
LGTM |
As discussed earlier [1,2]. I'm in favor of rolling it back into config.json [3], but we aren't there yet [4]. [1]: opencontainers#231 (comment) [2]: https://github.com/opencontainers/specs/pull/231/files#r46735828 [3]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/0QbyJDM9fWY Subject: Single, unified config file (i.e. rolling back specs#88) Date: Wed, 4 Nov 2015 09:53:20 -0800 Message-ID: <[email protected]> [4]: https://github.com/opencontainers/specs/blob/4a63e81a807edec3d67ed2b6bd99a6c2b288676f/bundle.md#container-format Signed-off-by: W. Trevor King <[email protected]>
As discussed earlier [1,2]. I'm in favor of rolling it back into config.json [3], but we aren't there yet [4]. [1]: opencontainers#231 (comment) [2]: https://github.com/opencontainers/specs/pull/231/files#r46735828 [3]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/0QbyJDM9fWY Subject: Single, unified config file (i.e. rolling back specs#88) Date: Wed, 4 Nov 2015 09:53:20 -0800 Message-ID: <[email protected]> [4]: https://github.com/opencontainers/specs/blob/4a63e81a807edec3d67ed2b6bd99a6c2b288676f/bundle.md#container-format Signed-off-by: W. Trevor King <[email protected]>
As discussed earlier [1,2]. I'm in favor of rolling it back into config.json [3], but we aren't there yet [4]. [1]: opencontainers#231 (comment) [2]: https://github.com/opencontainers/specs/pull/231/files#r46735828 [3]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/0QbyJDM9fWY Subject: Single, unified config file (i.e. rolling back specs#88) Date: Wed, 4 Nov 2015 09:53:20 -0800 Message-ID: <[email protected]> [4]: https://github.com/opencontainers/specs/blob/4a63e81a807edec3d67ed2b6bd99a6c2b288676f/bundle.md#container-format Signed-off-by: W. Trevor King <[email protected]>
Initial work landed in [1], but it doesn't get into the order of the setup process or explain how pause/resume/signalling work [2]. If we don't figure these things out ahead of time, I expect we'll have to figure them out in order to land compliance testing. [1]: opencontainers/runtime-spec#231 [2]: https://github.com/opencontainers/specs/pull/231/files#r46735313
I think this is resolved since 7713efc (Add lifecycle for containers, 2015-10-22, opencontainers#231), which was committed on the same day as the ROADMAP entry (4859f6d, Add initial roadmap, 2015-10-22, opencontainers#230). Signed-off-by: W. Trevor King <[email protected]>
# digest/hashing target Most of this has spun off with [1], and I haven't heard of anyone talking about verifying the on-disk filesystem in a while. My personal take is on-disk verification doesn't add much over serialized verification unless you have a local attacker (or unreliable disk), and you'll need some careful threat modeling if you want to do anything productive about the local attacker case. For some more on-disk verification discussion, see the thread starting with [2]. # distributable-format target This spun off with [1]. # lifecycle target I think this is resolved since 7713efc (Add lifecycle for containers, 2015-10-22, opencontainers#231), which was committed on the same day as the ROADMAP entry (4859f6d, Add initial roadmap, 2015-10-22, opencontainers#230). # container-action target Addressed by 7117ede (Expand on the definition of our ops, 2015-10-13, opencontainers#225), although there has been additional discussion in a7a366b (Remove exec from required runtime functionalities, 2016-04-19, opencontainers#388) and 0430aaf (Split create and start, 2016-04-01, opencontainers#384). # validation and testing targets Validation is partly covered by cdcabde (schema: JSON Schema and validator for `config.json`, 2016-01-19, opencontainers#313) and subequent JSON Schema work. The remainder of these targets are handled by ocitools [3]. # printable/compiled-spec target The bulk of this was addressed by 4ee036f (*: printable documents, 2015-12-09, opencontainers#263). Any remaining polishing of that workflow seems like a GitHub-issue thing and not a ROADMAP thing. And publishing these to opencontainers.org certainly seems like it's outside the scope of this repository (although I think that such publishing is a good idea). [1]: https://github.com/opencontainers/image-spec [2]: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/NHpSQ19KCAAJ Subject: OCI Bundle Digests Summary Date: Wed, 14 Oct 2015 17:09:15 +0000 Message-ID: <CAD2oYtN-9yLLhG_STO3F1h58Bn5QovK+u3wOBa=t+7TQi-hP1Q@mail.gmail.com> [3]: https://github.com/opencontainers/ocitools Signed-off-by: W. Trevor King <[email protected]>
As discussed earlier [1,2]. I'm in favor of rolling it back into config.json [3], but we aren't there yet [4]. [1]: opencontainers#231 (comment) [2]: https://github.com/opencontainers/specs/pull/231/files#r46735828 [3]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/0QbyJDM9fWY Subject: Single, unified config file (i.e. rolling back specs#88) Date: Wed, 4 Nov 2015 09:53:20 -0800 Message-ID: <[email protected]> [4]: https://github.com/opencontainers/specs/blob/4a63e81a807edec3d67ed2b6bd99a6c2b288676f/bundle.md#container-format Signed-off-by: W. Trevor King <[email protected]>
# digest/hashing target Most of this has spun off with [1], and I haven't heard of anyone talking about verifying the on-disk filesystem in a while. My personal take is on-disk verification doesn't add much over serialized verification unless you have a local attacker (or unreliable disk), and you'll need some careful threat modeling if you want to do anything productive about the local attacker case. For some more on-disk verification discussion, see the thread starting with [2]. # distributable-format target This spun off with [1]. # lifecycle target I think this is resolved since 7713efc (Add lifecycle for containers, 2015-10-22, opencontainers#231), which was committed on the same day as the ROADMAP entry (4859f6d, Add initial roadmap, 2015-10-22, opencontainers#230). # container-action target Addressed by 7117ede (Expand on the definition of our ops, 2015-10-13, opencontainers#225), although there has been additional discussion in a7a366b (Remove exec from required runtime functionalities, 2016-04-19, opencontainers#388) and 0430aaf (Split create and start, 2016-04-01, opencontainers#384). # validation and testing targets Validation is partly covered by cdcabde (schema: JSON Schema and validator for `config.json`, 2016-01-19, opencontainers#313) and subequent JSON Schema work. The remainder of these targets are handled by ocitools [3]. # printable/compiled-spec target The bulk of this was addressed by 4ee036f (*: printable documents, 2015-12-09, opencontainers#263). Any remaining polishing of that workflow seems like a GitHub-issue thing and not a ROADMAP thing. And publishing these to opencontainers.org certainly seems like it's outside the scope of this repository (although I think that such publishing is a good idea). [1]: https://github.com/opencontainers/image-spec [2]: https://groups.google.com/a/opencontainers.org/d/msg/dev/xo4SQ92aWJ8/NHpSQ19KCAAJ Subject: OCI Bundle Digests Summary Date: Wed, 14 Oct 2015 17:09:15 +0000 Message-ID: <CAD2oYtN-9yLLhG_STO3F1h58Bn5QovK+u3wOBa=t+7TQi-hP1Q@mail.gmail.com> [3]: https://github.com/opencontainers/ocitools Signed-off-by: W. Trevor King <[email protected]>
My pull request was rejected on 2015-10-12 [1], but Mrunal filed an alternative which landed on 2015-12-09 [2]. [1]: opencontainers/runtime-spec#207 (comment) [2]: opencontainers/runtime-spec#231 (comment)
The lifecycle described is generic and should apply to all platforms.
It provides leeway for the runtimes to be flexible in how they
implement it.
Signed-off-by: Mrunal Patel [email protected]
Thanks to @crosbymichael for the feedback!