Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic from hashSearch #21

Open
bcho opened this issue May 8, 2022 · 5 comments
Open

panic from hashSearch #21

bcho opened this issue May 8, 2022 · 5 comments

Comments

@bcho
Copy link
Contributor

bcho commented May 8, 2022

Hey,

Background

I am building an application that makes use of rsync as a file server to sync files to remote. In my implementation, it would potentially that the file contents changed during the file sync.

Issue & Symptom

I am seeing some random panics from the rsync call. Stack trace looks like this:

panic: runtime error: index out of range [0] with length 0

goroutine 733 [running]:
github.com/gokrazy/rsync/rsyncd.(*sendTransfer).hashSearch(0x140005cfbe0, {0x140027f5788, 0x1, 0x28?}, 0x64?, {0x1, 0x2bc, 0x2, 0x2, {0x140030a9c20, ...}}, ...)
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/match.go:172 +0x9d0
github.com/gokrazy/rsync/rsyncd.(*sendTransfer).sendFiles(0x140005cfbe0, 0x140033c5140)
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/sender.go:85 +0x320
github.com/gokrazy/rsync/rsyncd.(*Server).HandleConn(0x1400209afc0, {{0x1036e4a34, 0x9}, {0x140000421e0, 0x2a}, {0x0, 0x0, 0x0}}, {0x103d7e2b8?, 0x140003ac0c0}, ...)
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:388 +0x3c8
github.com/gokrazy/rsync/rsyncd.(*Server).HandleDaemonConn(0x1400209afc0, {0x14001000960?, 0x140010009c0?}, {0x12c50db18?, 0x14000011c18}, {0x103d886e0?, 0x1400212a660})
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:309 +0xb28
github.com/gokrazy/rsync/rsyncd.(*Server).Serve.func2()
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:441 +0xb4
    created by github.com/gokrazy/rsync/rsyncd.(*Server).Serve
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:439 +0xc8

I can reproduce this issue from the commit: b567d9d . But I am not yet reproduce this issue in commits after this.

Unfortunately, I don't have a stable way to reproduce this issue (I am trying to automate the process, but no luck yet).

My guess

My guess is, this issue is due to in-flight file change + wrong read size settings (which fixed in this commit: f19f793). Could you confirm the fix commit f19f793 is for this behavior?

Further Asks

I checked the code, the fileio.go is doing some log trackings (which are commented out), and it also exit for unexpected data state:

rsync/rsyncd/fileio.go

Lines 108 to 111 in f3bbd08

// TODO: zero the buffer, file has changed mid-transfer
log.Printf("file has changed mid-transfer")
os.Exit(1)
break

Is it possible to:

  1. record the state for the mapStruct during the ptr call to capture the value of offset, last readSize / readOffset etc
  2. panic instead of os.Exit(1) for these unexpected cases
  3. setup panic handler in the rsyncd server, then we can protect the server from panics like the one from this issue or from the mapStruct.ptr call

What do you think?

@stapelberg
Copy link
Contributor

Thanks for your report. I haven’t used this project with changing files, so some rough edges are to be expected.

My guess is, this issue is due to in-flight file change + wrong read size settings (which fixed in this commit: f19f793). Could you confirm the fix commit f19f793 is for this behavior?

That commit has nothing to do with changing files, it was just a wrong calculation that manifested itself as files with certain file sizes never transferring correctly.

  • record the state for the mapStruct during the ptr call to capture the value of offset, last readSize / readOffset etc

Not quite sure what you mean by this?

mapStruct.ptr only exits when its caller violates the invariants (that’s why the lines are all prefixed with BUG:). This shouldn’t be possible to trigger from user input. Failing loudly is my preferred way of dealing with these sorts of issues in the current stage of development in this project.

  • panic instead of os.Exit(1) for these unexpected cases
  • setup panic handler in the rsyncd server, then we can protect the server from panics like the one from this issue or from the mapStruct.ptr call

No, for the case of changing files, the buffer should be zeroed and the os.Exit should be removed. We don’t need to work around this issue, we need to fix the issue itself.

In general, I’m not a fan of panic handlers in server programs. Instead of recovering panics, the only safe move upon a panic is to terminate the program.

@bcho
Copy link
Contributor Author

bcho commented May 10, 2022

Hi @stapelberg thanks for your reply, fully understood. I am ok for not adding a panic handler. But how about changing the os.Exit(1) (which introduced by my previous change to the logger, and used by std log's fatal call) to panic? At least we can enable unit tests by using a panic.

I will maintain a fork for my application for now. I will report back if I can see new reproduce for the issue I mentioned in the issue.

@stapelberg
Copy link
Contributor

But how about changing the os.Exit(1) (which introduced by my previous change to the logger, and used by std log's fatal call) to panic? At least we can enable unit tests by using a panic.

We could write unit tests by using panic/recover, but again, I would prefer to return errors (which also enables testing, and in a more convenient way), or — in this particular case — just fix the issue itself, in which case there is no need to panic/recover :)

@bcho
Copy link
Contributor Author

bcho commented May 17, 2022

Yep, but I am still finding it's hard to reproduce it yet. I managed to capture a reproduce last week. In my case, I can see:

rsync/rsyncd/fileio.go

Lines 105 to 106 in f3bbd08

n, err := ms.f.Read(ms.window[readOffset : readOffset+readSize])
if err != nil {

is returning io.EOF. Anything else we should inspect / check?

@stapelberg
Copy link
Contributor

It’s possible that you run into io.EOF if the file shrunk during transfer, but we don’t need to check for io.EOF explicitly — the existing conditional handles all error types, as it should. Should be relatively easy to reproduce using truncate(1), I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants