Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make state detection precise #930

Merged
merged 1 commit into from
Jul 6, 2016

Conversation

hqhq
Copy link
Contributor

@hqhq hqhq commented Jun 30, 2016

Fixes: #871

Signed-off-by: Qiang Huang [email protected]

return Stopped, newSystemErrorWithCausef(err, "getting init process %d start time", pid)
}
if c.initProcessStartTime != startTime {
return Stopped, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my suggestion, but @cyphar suggested stat'ing /proc/{pid} (and using the inode?). In either case, it may be better to abstract this extra identifier out into a GetProcessUniquifier helper (silly name, but whatever ;), since we're only using it for an equality comparison, and don't really care what the value actually means.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into it, and doing a stat is not guaranteed to give you a unique inode number (as far as I can tell, inode numbers in procfs are taken from a pool of unused inode numbers). So I think start time (while ugly) is better. But I agree that we should abstract the actual value.

@hqhq hqhq changed the title Make state detection procise Make state detection precise Jul 4, 2016
@hqhq hqhq force-pushed the fix_state_detection branch from ef15ae4 to 1d009af Compare July 4, 2016 12:02
@hqhq
Copy link
Contributor Author

hqhq commented Jul 4, 2016

@wking @cyphar @mrunalp Updated.

@@ -1109,6 +1118,17 @@ func (c *linuxContainer) refreshState() error {
return c.state.transition(&stoppedState{c: c})
}

func (c *linuxContainer) isProcessUnique(pid int) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesProcessExist is probably a better name for this function. Processes are unique, and with this function we're just doing a better (but still not perfect) job of identifying whether a given /proc entry is the container process we're interested in.

@hqhq hqhq force-pushed the fix_state_detection branch from 1d009af to 14e95b2 Compare July 5, 2016 01:28
@hqhq
Copy link
Contributor Author

hqhq commented Jul 5, 2016

@wking Updated, I've changed the function name to doesInitProcessExist, I think it's more accurate about what this function does.

@wking
Copy link
Contributor

wking commented Jul 5, 2016

On Mon, Jul 04, 2016 at 06:30:14PM -0700, Qiang Huang wrote:

… I've changed the function name to doesInitProcessExist

14e95b2 looks good to me.

@dqminh
Copy link
Contributor

dqminh commented Jul 6, 2016

LGTM

Approved with PullApprove

1 similar comment
@mrunalp
Copy link
Contributor

mrunalp commented Jul 6, 2016

LGTM

Approved with PullApprove

@mrunalp mrunalp merged commit 9d7831e into opencontainers:master Jul 6, 2016
@hqhq hqhq deleted the fix_state_detection branch July 7, 2016 01:18
wking added a commit to wking/runc that referenced this pull request Jun 14, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 14, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 14, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 14, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 15, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 15, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 15, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

The Running -> Stopped change in checkpoint_test.go is because the
post-checkpoint process is a zombie, and with this commit zombie
processes are Stopped (and no longer Running).

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 15, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

The Running -> Stopped change in checkpoint_test.go is because the
post-checkpoint process is a zombie, and with this commit zombie
processes are Stopped (and no longer Running).

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/runc that referenced this pull request Jun 20, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

The Running -> Stopped change in checkpoint_test.go is because the
post-checkpoint process is a zombie, and with this commit zombie
processes are Stopped (and no longer Running).

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
adrianreber pushed a commit to adrianreber/runc that referenced this pull request Jul 27, 2017
And Stat_t.PID and Stat_t.Name while we're at it.  Then use the new
.State property in runType to distinguish between running and
zombie/dead processes, since kill(2) does not [1].  With this change
we no longer claim Running status for zombie/dead processes.

I've also removed the kill(2) call from runType.  It was originally
added in 13841ef (new-api: return the Running state only if the init
process is alive, 2014-12-23), but we've been accessing
/proc/[pid]/stat since 14e95b2 (Make state detection precise,
2016-07-05, opencontainers#930), and with the /stat access the kill(2) check is
redundant.

I also don't see much point to the previously-separate
doesInitProcessExist, so I've inlined that logic in runType.

It would be nice to distinguish between "/proc/[pid]/stat doesn't
exist" and errors parsing its contents, but I've skipped that for the
moment.

The Running -> Stopped change in checkpoint_test.go is because the
post-checkpoint process is a zombie, and with this commit zombie
processes are Stopped (and no longer Running).

[1]: opencontainers#1483 (comment)

Signed-off-by: W. Trevor King <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants