panic from hashSearch #21

bcho · 2022-05-08T10:59:02Z

Hey,

Background

I am building an application that makes use of rsync as a file server to sync files to remote. In my implementation, it would potentially that the file contents changed during the file sync.

Issue & Symptom

I am seeing some random panics from the rsync call. Stack trace looks like this:

panic: runtime error: index out of range [0] with length 0

goroutine 733 [running]:
github.com/gokrazy/rsync/rsyncd.(*sendTransfer).hashSearch(0x140005cfbe0, {0x140027f5788, 0x1, 0x28?}, 0x64?, {0x1, 0x2bc, 0x2, 0x2, {0x140030a9c20, ...}}, ...)
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/match.go:172 +0x9d0
github.com/gokrazy/rsync/rsyncd.(*sendTransfer).sendFiles(0x140005cfbe0, 0x140033c5140)
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/sender.go:85 +0x320
github.com/gokrazy/rsync/rsyncd.(*Server).HandleConn(0x1400209afc0, {{0x1036e4a34, 0x9}, {0x140000421e0, 0x2a}, {0x0, 0x0, 0x0}}, {0x103d7e2b8?, 0x140003ac0c0}, ...)
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:388 +0x3c8
github.com/gokrazy/rsync/rsyncd.(*Server).HandleDaemonConn(0x1400209afc0, {0x14001000960?, 0x140010009c0?}, {0x12c50db18?, 0x14000011c18}, {0x103d886e0?, 0x1400212a660})
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:309 +0xb28
github.com/gokrazy/rsync/rsyncd.(*Server).Serve.func2()
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:441 +0xb4
    created by github.com/gokrazy/rsync/rsyncd.(*Server).Serve
<path>/go/pkg/mod/github.com/gokrazy/[email protected]/rsyncd/rsyncd.go:439 +0xc8

I can reproduce this issue from the commit: b567d9d . But I am not yet reproduce this issue in commits after this.

Unfortunately, I don't have a stable way to reproduce this issue (I am trying to automate the process, but no luck yet).

My guess

My guess is, this issue is due to in-flight file change + wrong read size settings (which fixed in this commit: f19f793). Could you confirm the fix commit f19f793 is for this behavior?

Further Asks

I checked the code, the fileio.go is doing some log trackings (which are commented out), and it also exit for unexpected data state:

rsync/rsyncd/fileio.go

Lines 108 to 111 in f3bbd08

    
           // TODO: zero the buffer, file has changed mid-transfer 
        
           log.Printf("file has changed mid-transfer") 
        
           os.Exit(1) 
        
           break

Is it possible to:

record the state for the mapStruct during the ptr call to capture the value of offset, last readSize / readOffset etc
panic instead of os.Exit(1) for these unexpected cases
setup panic handler in the rsyncd server, then we can protect the server from panics like the one from this issue or from the mapStruct.ptr call

What do you think?

The text was updated successfully, but these errors were encountered:

stapelberg · 2022-05-09T22:17:04Z

Thanks for your report. I haven’t used this project with changing files, so some rough edges are to be expected.

My guess is, this issue is due to in-flight file change + wrong read size settings (which fixed in this commit: f19f793). Could you confirm the fix commit f19f793 is for this behavior?

That commit has nothing to do with changing files, it was just a wrong calculation that manifested itself as files with certain file sizes never transferring correctly.

record the state for the mapStruct during the ptr call to capture the value of offset, last readSize / readOffset etc

Not quite sure what you mean by this?

mapStruct.ptr only exits when its caller violates the invariants (that’s why the lines are all prefixed with BUG:). This shouldn’t be possible to trigger from user input. Failing loudly is my preferred way of dealing with these sorts of issues in the current stage of development in this project.

panic instead of os.Exit(1) for these unexpected cases

setup panic handler in the rsyncd server, then we can protect the server from panics like the one from this issue or from the mapStruct.ptr call

No, for the case of changing files, the buffer should be zeroed and the os.Exit should be removed. We don’t need to work around this issue, we need to fix the issue itself.

In general, I’m not a fan of panic handlers in server programs. Instead of recovering panics, the only safe move upon a panic is to terminate the program.

bcho · 2022-05-10T03:21:19Z

Hi @stapelberg thanks for your reply, fully understood. I am ok for not adding a panic handler. But how about changing the os.Exit(1) (which introduced by my previous change to the logger, and used by std log's fatal call) to panic? At least we can enable unit tests by using a panic.

I will maintain a fork for my application for now. I will report back if I can see new reproduce for the issue I mentioned in the issue.

stapelberg · 2022-05-16T21:12:06Z

But how about changing the os.Exit(1) (which introduced by my previous change to the logger, and used by std log's fatal call) to panic? At least we can enable unit tests by using a panic.

We could write unit tests by using panic/recover, but again, I would prefer to return errors (which also enables testing, and in a more convenient way), or — in this particular case — just fix the issue itself, in which case there is no need to panic/recover :)

bcho · 2022-05-17T02:27:17Z

Yep, but I am still finding it's hard to reproduce it yet. I managed to capture a reproduce last week. In my case, I can see:

rsync/rsyncd/fileio.go

Lines 105 to 106 in f3bbd08

    
           n, err := ms.f.Read(ms.window[readOffset : readOffset+readSize]) 
        
           if err != nil {

is returning io.EOF. Anything else we should inspect / check?

stapelberg · 2022-05-17T06:32:14Z

It’s possible that you run into io.EOF if the file shrunk during transfer, but we don’t need to check for io.EOF explicitly — the existing conditional handles all error types, as it should. Should be relatively easy to reproduce using truncate(1), I think.

bcho mentioned this issue May 8, 2022

feat: add panic handler for rsyncd server #22

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

panic from hashSearch #21

panic from hashSearch #21

bcho commented May 8, 2022 •

edited

Loading

stapelberg commented May 9, 2022

bcho commented May 10, 2022

stapelberg commented May 16, 2022

bcho commented May 17, 2022

stapelberg commented May 17, 2022

panic from hashSearch #21

panic from hashSearch #21

Comments

bcho commented May 8, 2022 • edited Loading

Background

Issue & Symptom

My guess

Further Asks

stapelberg commented May 9, 2022

bcho commented May 10, 2022

stapelberg commented May 16, 2022

bcho commented May 17, 2022

stapelberg commented May 17, 2022

bcho commented May 8, 2022 •

edited

Loading