Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent dangling command calls #19448

Closed
wants to merge 16 commits into from

Conversation

zeripath
Copy link
Contributor

@zeripath zeripath commented Apr 20, 2022

If an os/exec.Command is passed non *os.File as an input/output, go
will create os.Pipes and wait for their closure in cmd.Wait().
If the code following this is responsible for closing io.Pipes or
other handlers then on process death from context cancellation the
Wait can hang.

There are two possible solutions:

  1. use os.Pipe as the input/output as cmd.Wait does not wait for these.
  2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR changes the inputs to the CatFileBatch calls to use os.Pipes preventing
dangling cat-file calls.

iterate on #19454

Signed-off-by: Andrew Thornton [email protected]

@zeripath
Copy link
Contributor Author

there are other places that need similar changes but this is the most important of these

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Apr 20, 2022
@wxiaoguang
Copy link
Contributor

Then there is only one nio usage left, should it be refactored together? Then maybe the nio vendor package can be removed.

// LogNameStatusRepo opens git log --raw in the provided repo and returns a stdin pipe, a stdout reader and cancel function
func LogNameStatusRepo(ctx context.Context, repository, head, treepath string, paths ...string) (*bufio.Reader, func()) {
	// We often want to feed the commits in order into cat-file --batch, followed by their trees and sub trees as necessary.
	// so let's create a batch stdin and stdout
	stdoutReader, stdoutWriter := nio.Pipe(buffer.New(32 * 1024))

@zeripath
Copy link
Contributor Author

zeripath commented Apr 21, 2022

Then there is only one nio usage left, should it be refactored together? Then maybe the nio vendor package can be removed.

Yes this was a planned follow-up, however:

There were (sometimes quite significant) performance improvements when the nio.Pipe was used over the io.Pipe implementation - (albeit the io.Pipe was itself a wrapper around an os.Pipe.)

We're going to need to double check with some large files and large repositories to see whether a nio.Pipe wrapper around the os.Pipe is necessary. (The git or bsd ports repos are good test cases.)

There are a few options:

  1. Just remove the nio.Pipe here - and in its backport - merge and see if there are apparent perf issues in real life usage. In a follow-up PR we could add back in the nio.Pipe if necessary. Meanwhile I continue working through the rest of the commands and remove the wrapping io.Pipes and nio.Pipes simplifying to os.Pipe
  2. We attempt to get everything sorted in one PR.

Unfortunately, I don't know how Windows and BSD are going to respond to the loss of the buffering nio.Pipe. I suspect it will actually turn out to be an improvement but I'm not certain as it depends on the context switching of the kernel vs go's context switching.

There's another approach to solving the dangling file problem which makes fewer changes and you may prefer that to be merged and backported first. See #19454

zeripath added a commit to zeripath/gitea that referenced this pull request Apr 21, 2022
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`.  If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.

There are two possible solutions:

1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR provides the second option - which is a simpler change that can
be more easily backported.

Closes go-gitea#19448

Signed-off-by: Andrew Thornton <[email protected]>
@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 21, 2022

The nio was introduced in #16059 , but I think it's not necessary.

Pipe is a very general mechanism provided by OS, in most cases it could be treated as a socket (indeed they have the similar performance). Since we never add a buffer to a socket when copying streams, I don't think it's necessary to add a buffer to a pipe, either.

When downloading a large file, if io.Copy is used, then there is already a buffer inside io.Copy. And for other cases, I believe there are already various (implicit) buffers, we do not need to add a global buffer on os.Pipe again.

I prefer to drop the nio package.

Codeberg-org pushed a commit to Codeberg-org/gitea that referenced this pull request Apr 21, 2022
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`.  If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.

There are two possible solutions:

1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR provides the second option - which is a simpler change that can
be more easily backported.

Closes go-gitea#19448

Signed-off-by: Andrew Thornton <[email protected]>
zeripath and others added 5 commits April 21, 2022 22:33
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`.
If the code following this is responsible for closing `io.Pipe`s or
other handlers then on process death from context cancellation the
`Wait` can hang.

There are two possible solutions:

1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR changes the inputs to the `CatFileBatch` calls to use `os.Pipe`s preventing
dangling cat-file calls.

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath zeripath force-pushed the fix-dangling-cat-file-again branch from 9e051f4 to 7f8174b Compare April 21, 2022 22:01
return err
}
defer fi.Close()
_, _, err = NewCommand(ctx, "bundle", "create", "-", "bundle", "HEAD").RunStdString(&RunOpts{Dir: tmp, Env: env, Stdout: out})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am uncertain why this code was written to use a temporary file in the first place. I suspect it may have been affected by this dangling problem and a temporary file was used to get around this?

@zeripath
Copy link
Contributor Author

zeripath commented Apr 22, 2022

OK, I've made substantial changes and created a common git.Pipe command with an implementation of PipeWriter and PipeReader that is based on the os.Pipe files.

If a git.PipeWriter or git.PipeReader (as appropriate) is passed in to a git.Command RunContext then the File() will be passed to the underlying os/exec process transparently. I'm not sure if this will/could have effects on error propagation but I think it will be fine. (Although I will check.)

I've removed the nio.Pipe stuff as we may not need it and we could always just add it back in - however, we'll only know about it's need once 1.17 is released because I do not think we can really backport this.

Most of the cases switching to using a git.Pipe won't have been affected by the dangling file case - certainly the pipeline functions I think were safe. However, it's still likely to be a performance improvement switching to use these.

There are also a few cases where there was unnecessary usage of pipes and one case where I've dropped using a temporary file in preference to feeding the output down straight down the writer.

As an aside we need to come up with a succinct way of expressing when this kind of deadlock can occur and how to ensure it does not happen. Although we could simply change the Command RunContexts to only take git.PipeWriter and git.PipeReader as appropriate it might be better to express when exactly you need to have the additional:

go func() {
   <-ctx.Done()
   cancel()
   _ = stdin.Close()
   _ = stdout.Close()
}()

Perhaps actually Command should do this itself?


EDIT: The Command now does this.

@wxiaoguang

This comment was marked as resolved.

@6543 6543 closed this in 0dcc74a Apr 22, 2022
@6543 6543 reopened this Apr 22, 2022
@6543
Copy link
Member

6543 commented Apr 22, 2022

as per #19454 (review)

6543 pushed a commit to 6543-forks/gitea that referenced this pull request Apr 22, 2022
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`.  If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.

There are two possible solutions:

1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR provides the second option - which is a simpler change that can
be more easily backported.

Closes go-gitea#19448

Signed-off-by: Andrew Thornton <[email protected]>
zeripath added a commit that referenced this pull request Apr 22, 2022
)

If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`.  If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.

There are two possible solutions:

1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR provides the second option - which is a simpler change that can
be more easily backported.

Closes #19448

Signed-off-by: Andrew Thornton <[email protected]>

Co-authored-by: zeripath <[email protected]>
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Apr 25, 2022
Signed-off-by: Andrew Thornton <[email protected]>
Signed-off-by: Andrew Thornton <[email protected]>
Signed-off-by: Andrew Thornton <[email protected]>
@zeripath zeripath force-pushed the fix-dangling-cat-file-again branch from a33ada4 to 441c932 Compare April 25, 2022 21:12
zeripath added a commit to zeripath/gitea that referenced this pull request Apr 25, 2022
…osed

It may be prudent to add runtime finalizers to the git.Repository and
git.blobReader objects to absolutely ensure that these are both properly
cancelled, cleaned and closed out.

This commit is an extract from go-gitea#19448

Signed-off-by: Andrew Thornton <[email protected]>
zeripath added a commit to zeripath/gitea that referenced this pull request Apr 25, 2022
…osed

It may be prudent to add runtime finalizers to the git.Repository and
git.blobReader objects to absolutely ensure that these are both properly
cancelled, cleaned and closed out.

This commit is a backport of an extract from go-gitea#19448

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath zeripath force-pushed the fix-dangling-cat-file-again branch from 5075b10 to 051fe40 Compare April 26, 2022 21:15
@zeripath
Copy link
Contributor Author

Looks like the race detector doesn't like the PipePair.Close() function because it has to close both ends of the pipes in different go-routines to the ones that read them.

This is frustrating.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 28, 2022

Use PipeReader/PipeWriter.p.lock to protect?

@zeripath
Copy link
Contributor Author

We'd need to lock the reads as well.

wxiaoguang pushed a commit that referenced this pull request May 2, 2022
…osed (#19495) (#19496)

It may be prudent to add runtime finalizers to the git.Repository and
git.blobReader objects to absolutely ensure that these are both properly
cancelled, cleaned and closed out.

This commit is a backport of an extract from #19448

Signed-off-by: Andrew Thornton <[email protected]>
@lunny lunny modified the milestones: 1.17.0, 1.18.0 Jun 16, 2022
AbdulrhmnGhanem pushed a commit to kitspace/gitea that referenced this pull request Aug 24, 2022
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`.  If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.

There are two possible solutions:

1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.

This PR provides the second option - which is a simpler change that can
be more easily backported.

Closes go-gitea#19448

Signed-off-by: Andrew Thornton <[email protected]>
@lunny lunny modified the milestones: 1.18.0, 1.19.0 Oct 17, 2022
@zeripath zeripath closed this Jan 31, 2023
@zeripath zeripath deleted the fix-dangling-cat-file-again branch January 31, 2023 20:26
@lunny lunny removed this from the 1.19.0 milestone Feb 1, 2023
@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 1 This PR needs approval from one additional maintainer to be merged. type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants