
Daemon 0.4.3-rc2 crash #3045

Closed · NeoTeo opened this issue Aug 6, 2016 · 47 comments

NeoTeo commented Aug 6, 2016

Version/Platform/Processor information (from ipfs version --all):
go-ipfs version: 0.4.3-rc2-
Repo version: 4
System version: amd64/darwin
Golang version: go1.6.2

Type (bug, feature, meta, test failure, question):
bug

Area (api, commands, daemon, fuse, etc):
daemon

Priority (from P0: functioning, to P4: operations on fire):
P3: frequent crashing

Description:
I can reproduce the following crash by starting the daemon and then calling
ipfs cat QmPZ9gcCEpqKTo6aq61g2nXGUhM4iCL3ewB6LDXZCtioEB

The daemon crashes out with the output included below.
Subsequent runs of the daemon crash out immediately.

Crash log:

panic: sync: inconsistent mutex state

goroutine 58 [running]:
panic(0x8be940, 0xc822333c60)
    /home/whyrusleeping/go/src/runtime/panic.go:481 +0x3e6
sync.(*Mutex).Lock(0xc82012b96c)
    /home/whyrusleeping/go/src/sync/mutex.go:75 +0x16d
gx/ipfs/QmeYJHEk8UjVVZ4XCRTZe6dFQrb8pGWD81LYCgeLp8CvMB/go-metrics.(*StandardEWMA).Rate(0xc82012b950, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmeYJHEk8UjVVZ4XCRTZe6dFQrb8pGWD81LYCgeLp8CvMB/go-metrics/ewma.go:94 +0x3a
gx/ipfs/QmeYJHEk8UjVVZ4XCRTZe6dFQrb8pGWD81LYCgeLp8CvMB/go-metrics.(*StandardMeter).updateSnapshot(0xc8201c5a80)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmeYJHEk8UjVVZ4XCRTZe6dFQrb8pGWD81LYCgeLp8CvMB/go-metrics/meter.go:216 +0xf2
gx/ipfs/QmeYJHEk8UjVVZ4XCRTZe6dFQrb8pGWD81LYCgeLp8CvMB/go-metrics.(*StandardMeter).Mark(0xc8201c5a80, 0x22)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmeYJHEk8UjVVZ4XCRTZe6dFQrb8pGWD81LYCgeLp8CvMB/go-metrics/meter.go:159 +0x144
gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/metrics.(*BandwidthCounter).LogRecvMessage(0xc8201918c0, 0x22)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/metrics/bw_stats.go:39 +0x44
gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/metrics.(Reporter).LogRecvMessage-fm(0x22)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/metrics/conn/conn.go:16 +0x3c
gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/metrics/conn.(*MeteredConn).Read(0xc821c9de20, 0xc821e58000, 0x22, 0x1000, 0x0, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/metrics/conn/conn.go:30 +0xa5
gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/net/conn.(*singleConn).Read(0xc821cced80, 0xc821e58000, 0x22, 0x1000, 0x4, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/net/conn/conn.go:123 +0x6b
io.ReadAtLeast(0x14e6818, 0xc821cced80, 0xc821e58000, 0x22, 0x1000, 0x22, 0x0, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:297 +0xe6
io.ReadFull(0x14e6818, 0xc821cced80, 0xc821e58000, 0x22, 0x1000, 0x14eadb8, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:315 +0x62
gx/ipfs/QmRQhVisS8dmPbjBUthVkenn81pBxrx1GxE281csJhm2vL/go-msgio.(*reader).Read(0xc821e4c230, 0xc821e58000, 0x22, 0x1000, 0x0, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmRQhVisS8dmPbjBUthVkenn81pBxrx1GxE281csJhm2vL/go-msgio/msgio.go:186 +0x1af
io.ReadAtLeast(0x14eadb8, 0xc821e4c230, 0xc821e58000, 0x22, 0x1000, 0x22, 0x0, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:297 +0xe6
io.ReadFull(0x14eadb8, 0xc821e4c230, 0xc821e58000, 0x22, 0x1000, 0x165ab8, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:315 +0x62
gx/ipfs/QmVjz1uf6U3sVQ5DbWWj7ktTtDd4GgsptYc7FBp33nWE53/go-libp2p-secio.(*etmReader).Read(0xc82029f200, 0xc821e58000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmVjz1uf6U3sVQ5DbWWj7ktTtDd4GgsptYc7FBp33nWE53/go-libp2p-secio/rw.go:182 +0x2e0
gx/ipfs/QmRQhVisS8dmPbjBUthVkenn81pBxrx1GxE281csJhm2vL/go-msgio.(*readWriter).Read(0xc821e54180, 0xc821e58000, 0x1000, 0x1000, 0x40, 0x0, 0x0)
    <autogenerated>:16 +0x80
gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/net/conn.(*secureConn).Read(0xc821cbdd00, 0xc821e58000, 0x1000, 0x1000, 0x40, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmVCe3SNMjkcPgnpFhZs719dheq6xE7gJwjzV7aWcUM4Ms/go-libp2p/p2p/net/conn/secure_conn.go:113 +0x90
bufio.(*Reader).fill(0xc8200d14a0)
    /home/whyrusleeping/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Read(0xc8200d14a0, 0xc821e7a000, 0x2, 0x3e20, 0xf60c07643132818a, 0x0, 0x0)
    /home/whyrusleeping/go/src/bufio/bufio.go:207 +0x260
io.(*LimitedReader).Read(0xc8223b4720, 0xc821e7a000, 0x2, 0x3e20, 0x5733a1, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:426 +0xbd
bytes.(*Buffer).ReadFrom(0xc821cc88c0, 0x16805b0, 0xc8223b4720, 0x0, 0x0, 0x0)
    /home/whyrusleeping/go/src/bytes/buffer.go:176 +0x23f
io.copyBuffer(0x14a12a8, 0xc821cc88c0, 0x16805b0, 0xc8223b4720, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:374 +0x180
io.Copy(0x14a12a8, 0xc821cc88c0, 0x16805b0, 0xc8223b4720, 0xc821e58000, 0x0, 0x0)
    /home/whyrusleeping/go/src/io/io.go:350 +0x64
gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.(*Stream).readData(0xc821e00c30, 0xc821cd8ca0, 0xc, 0xc, 0xc821e50000, 0x16805b0, 0xc8223b4720, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux/stream.go:410 +0x5de
gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.(*Session).handleStreamMessage(0xc821e6e000, 0xc821cd8ca0, 0xc, 0xc, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux/session.go:488 +0xb5e
gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.(*Session).(gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.handleStreamMessage)-fm(0xc821cd8ca0, 0xc, 0xc, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux/session.go:427 +0x4c
gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.(*Session).recvLoop(0xc821e6e000, 0x0, 0x0)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux/session.go:439 +0x5a8
gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.(*Session).recv(0xc821e6e000)
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux/session.go:399 +0x21
created by gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux.newSession
    /builds/distributions/dists/go-ipfs/gopath/src/gx/ipfs/QmZ8MMKFwA95viWULoSYFZpA4kdFa8idmFSrP12YJwjjaL/yamux/session.go:104 +0x4b4

whyrusleeping (Member) commented Aug 6, 2016

@NeoTeo this happens every time you restart the daemon?

Did you build from source? Or use a downloaded binary from dists?

NeoTeo commented Aug 6, 2016

Hey @whyrusleeping :)
It happens every time I restart. The binary is downloaded from dists.

whyrusleeping (Member):

@NeoTeo what version of OSX are you running? I can't reproduce this issue...

Did you start with a new repo, or is this an older repo that has a bunch of stuff in it still?

NeoTeo commented Aug 7, 2016

It first happened on El Capitan 10.11.6 (15G31) so I tried it on a different machine which runs Sierra 10.12 beta (16A270f). On the Sierra I also deleted the .ipfs and did an ipfs init. Just tried the El Cap version (which has an existing v4 repo) again and now it doesn't crash but alternates between hanging and returning Error: merkledag: not found. I now worry it's just my machine having a fit, but it worked fine on 0.4.2 :/

whyrusleeping (Member):

@NeoTeo alright, I'll look into it more. Do you have a repeatable way of reproducing the issue? Or is it just more or less random?

NeoTeo commented Aug 7, 2016

On Sierra the crash is repeatable, on El Cap the alternating behaviour is consistent (and never actually cats the hash). I have just tried doing a
http://localhost:8080/ipfs/QmPZ9gcCEpqKTo6aq61g2nXGUhM4iCL3ewB6LDXZCtioEB from Safari on the El Cap machine and it comes up immediately so the node's able to get the hash ok.

NeoTeo commented Aug 7, 2016

Let me know if you need me to try something.

whyrusleeping (Member):

@NeoTeo This is really bizarre, it feels like a bug in go to me.

whyrusleeping (Member):

@NeoTeo If it's not too much trouble, on the system where the crash is reproducible, could you try installing the latest release candidate of go (1.7) and building ipfs from source? I'm thinking this might be an issue caused by changes in OSX syscalls

NeoTeo commented Aug 7, 2016

ok gonna try that.

NeoTeo commented Aug 7, 2016

Good hunch! Not crashing on Sierra now. Gonna try on El Cap to see if it solves the hanging issue there as well.

whyrusleeping (Member):

If that fixes it, then I'll just have to make sure the release binaries for 0.4.3 get built with go1.7 (which I was hoping to do anyways)

NeoTeo commented Aug 7, 2016

Alas, El Cap still gives the error: merkledag: not found.

NeoTeo commented Aug 7, 2016

The Sierra binary wasn't rebuilt. Upgrading go was enough.

NeoTeo commented Aug 7, 2016

Aaand, it crashed after successfully cat'ing the file. I'll try a build from source.

NeoTeo commented Aug 7, 2016

Trouble with "make install". I overrode check_go_version to accept the "1.7rc5" version number, but gx doesn't like it either, as its strconv.ParseInt can't handle the rc part...

whyrusleeping (Member) commented Aug 8, 2016

@NeoTeo just do:

go get -u github.com/whyrusleeping/gx
go get -u github.com/whyrusleeping/gx-go
gx install --global
go install ./cmd/ipfs

I need to fix the go version parsing code there; it's been biting me too
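For reference, the parsing failure above happens because `strconv.ParseInt` rejects a minor-version component like `"7rc5"`. A tolerant parser (this is a hypothetical sketch with made-up names, not gx's actual code) just keeps the leading digit run before converting:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGoMinor extracts the major/minor numbers from version strings
// like "1.7rc5" or "1.6.2", ignoring any pre-release suffix that would
// trip strconv if fed in whole.
func parseGoMinor(v string) (major, minor int, err error) {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected version string: %q", v)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	// Keep only the leading run of digits: "7rc5" -> "7".
	i := 0
	for i < len(parts[1]) && parts[1][i] >= '0' && parts[1][i] <= '9' {
		i++
	}
	if minor, err = strconv.Atoi(parts[1][:i]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

func main() {
	for _, v := range []string{"1.7rc5", "1.6.2"} {
		major, minor, _ := parseGoMinor(v)
		fmt.Printf("%s -> %d.%d\n", v, major, minor)
	}
	// prints:
	// 1.7rc5 -> 1.7
	// 1.6.2 -> 1.6
}
```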

NeoTeo commented Aug 8, 2016

Hm...odd. The first time it gave me this:

teo@Vger-6 ~> ipfs cat QmPZ9gcCEpqKTo6aq61g2nXGUhM4iCL3ewB6LDXZCtioEB
fish: 'ipfs cat QmPZ9gcCEpqKTo6aq61g2n…' terminated by signal SIGSEGV (Address boundary error)

Ran it again straight after and it worked just fine. The node's been rock solid since first run. And any subsequent cat'ing works as expected \o/

Is it worth trying this on the El Cap version too?

NeoTeo commented Aug 9, 2016

Just built 0.4.3-rc2 from source on El Cap with go 1.7rc5 but unfortunately it still hangs on cat. Different problem? At least the Sierra problem seems fixed :)

NeoTeo commented Aug 9, 2016

Hm, spoke a bit too soon. An ipfs add of a tiny text file causes the following crash:

https://gist.github.com/NeoTeo/fbaf0357ef5cf7ae1576f928d438d3e0

whyrusleeping (Member):

Anytime you see 'unexpected fault address', it's 99% of the time a go compiler bug.

You've rebuilt go-ipfs with go1.7-rc5 (there's an rc6 now with a bunch of OSX Sierra changes) and have restarted the daemon?

NeoTeo commented Aug 9, 2016

Yep. I'm updating Sierra right now to b5, so I'll try go1.7-rc6 just after.

NeoTeo commented Aug 9, 2016

Looking good so far. Using rc6 the daemon has survived both an add and a cat.

NeoTeo commented Aug 9, 2016

Same setup on El Capitan still gives
Error: merkledag: not found
/me gently sobs

whyrusleeping (Member):

@NeoTeo on El Capitan, can you try with a freshly inited repo? That's very odd that it's still failing

NeoTeo commented Aug 9, 2016

I have run an ipfs repo verify (there's a lot there I'd be sad to lose). Can I safely squirrel away the .ipfs directory, init and then hope to somehow restore the repo after?

whyrusleeping (Member):

@NeoTeo yeah, you can just move the directory somewhere safe and move it back once we figure out the problem

NeoTeo commented Aug 9, 2016

:/ Seems it was the repo somehow. Weird it'd pass the repo verify though.

whyrusleeping (Member):

@NeoTeo that's no good! Is it just catting the one object that fails? Or do other things break?

Can you add a new file and cat it?

NeoTeo commented Aug 9, 2016

@whyrusleeping with the fresh init it adds and cats as it should.
Moving the old repo back it hangs (or returns with the merkledag error) when cat'ing anything.

whyrusleeping (Member):

@NeoTeo on the old repo, when it hangs, can you give me the output of pressing Ctrl + \ ?

NeoTeo commented Aug 9, 2016

@whyrusleeping

^\SIGQUIT: quit
PC=0x6cc43 m=2

goroutine 0 [idle]:
runtime.mach_semaphore_timedwait(0x3c00002303, 0x0, 0x700000080cf4, 0x3c, 0x0, 0xc4200009c0, 0x700000080d28, 0x62cc3, 0xdf8475800, 0x0, ...)
/usr/local/go/src/runtime/sys_darwin_amd64.s:428 +0x13
runtime.semasleep1(0xdf8475800, 0x0)
/usr/local/go/src/runtime/os_darwin.go:424 +0xda
runtime.semasleep.func1()
/usr/local/go/src/runtime/os_darwin.go:451 +0x33
runtime.systemstack(0x700000080d50)
/usr/local/go/src/runtime/asm_amd64.s:314 +0xab
runtime.semasleep(0xdf8475800, 0x0)
/usr/local/go/src/runtime/os_darwin.go:452 +0x44
runtime.notetsleep_internal(0xdbf598, 0xdf8475800, 0xc4200009c0, 0x146943a38ecb61c7, 0xc4200009c0)
/usr/local/go/src/runtime/lock_sema.go:198 +0x92
runtime.notetsleep(0xdbf598, 0xdf8475800, 0x146943946c6fea01)
/usr/local/go/src/runtime/lock_sema.go:246 +0x75
runtime.sysmon()
/usr/local/go/src/runtime/proc.go:3618 +0x14c
runtime.mstart1()
/usr/local/go/src/runtime/proc.go:1126 +0x11e
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1096 +0x64

goroutine 1 [select]:
net/http.(*persistConn).roundTrip(0xc420073900, 0xc4202931c0, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/transport.go:1835 +0x93b
net/http.(*Transport).RoundTrip(0xc4200160f0, 0xc4201542d0, 0xc4200160f0, 0x0, 0xc400000000)
/usr/local/go/src/net/http/transport.go:380 +0x4ee
net/http.send(0xc4201542d0, 0xcdb120, 0xc4200160f0, 0x0, 0x0, 0x0, 0x8, 0xc4201659e0, 0xc420128150)
/usr/local/go/src/net/http/client.go:256 +0x15f
net/http.(*Client).send(0xdbe4c0, 0xc4201542d0, 0x0, 0x0, 0x0, 0xc420128150, 0x0, 0x1)
/usr/local/go/src/net/http/client.go:146 +0x102
net/http.(*Client).doFollowingRedirects(0xdbe4c0, 0xc4201542d0, 0x93fb20, 0x4, 0xfb601, 0xc420273f80)
/usr/local/go/src/net/http/client.go:528 +0x5e5
net/http.(*Client).Do(0xdbe4c0, 0xc4201542d0, 0xc42026f740, 0x8c9e6d, 0x13)
/usr/local/go/src/net/http/client.go:187 +0x11d
github.com/ipfs/go-ipfs/commands/http.(*client).Send(0xc420293080, 0xcebc60, 0xc42011e180, 0x1, 0xc420233f40, 0xdbf3c0, 0x0)
/Users/teo/source/go/src/github.com/ipfs/go-ipfs/commands/http/client.go:100 +0x639
main.callCommand(0xce5be0, 0xc420019000, 0xcebc60, 0xc42011e180, 0xdbf3c0, 0xd9c520, 0x0, 0x0, 0xc420165e60, 0x8206)
/Users/teo/source/go/src/github.com/ipfs/go-ipfs/cmd/ipfs/main.go:329 +0x666
main.(*cmdInvocation).Run(0xc420018f40, 0xce5be0, 0xc420019000, 0xcdac60, 0xc4200d8e00, 0xce5be0, 0xc420019000)
/Users/teo/source/go/src/github.com/ipfs/go-ipfs/cmd/ipfs/main.go:189 +0x116
main.main()
/Users/teo/source/go/src/github.com/ipfs/go-ipfs/cmd/ipfs/main.go:154 +0x349

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1

goroutine 18 [chan receive]:
gx/ipfs/QmV3NSS3A1kX5s28r7yLczhDsXzkgo65cqRgKFXYunWZmD/metrics.init.1.func2()
/Users/teo/source/go/src/gx/ipfs/QmV3NSS3A1kX5s28r7yLczhDsXzkgo65cqRgKFXYunWZmD/metrics/metrics.go:321 +0x88
created by gx/ipfs/QmV3NSS3A1kX5s28r7yLczhDsXzkgo65cqRgKFXYunWZmD/metrics.init.1
/Users/teo/source/go/src/gx/ipfs/QmV3NSS3A1kX5s28r7yLczhDsXzkgo65cqRgKFXYunWZmD/metrics/metrics.go:328 +0x66

goroutine 19 [syscall]:
os/signal.signal_recv(0x0)
/usr/local/go/src/runtime/sigqueue.go:116 +0x157
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.1
/usr/local/go/src/os/signal/signal_unix.go:28 +0x41

goroutine 20 [select]:
gx/ipfs/QmNQynaz7qfriSUJkiEZUrm2Wen1u3Kj9goZzWtrPyu7XR/go-log.(*MirrorWriter).logRoutine(0xc42012c340)
/Users/teo/source/go/src/gx/ipfs/QmNQynaz7qfriSUJkiEZUrm2Wen1u3Kj9goZzWtrPyu7XR/go-log/writer.go:71 +0x32e
created by gx/ipfs/QmNQynaz7qfriSUJkiEZUrm2Wen1u3Kj9goZzWtrPyu7XR/go-log.NewMirrorWriter
/Users/teo/source/go/src/gx/ipfs/QmNQynaz7qfriSUJkiEZUrm2Wen1u3Kj9goZzWtrPyu7XR/go-log/writer.go:38 +0xf1

goroutine 10 [select, locked to thread]:
runtime.gopark(0x93ffe0, 0x0, 0x8bce81, 0x6, 0x18, 0x2)
/usr/local/go/src/runtime/proc.go:259 +0x13a
runtime.selectgoImpl(0xc42002b730, 0x0, 0x18)
/usr/local/go/src/runtime/select.go:423 +0x11d9
runtime.selectgo(0xc42002b730)
/usr/local/go/src/runtime/select.go:238 +0x1c
runtime.ensureSigM.func1()
/usr/local/go/src/runtime/signal1_unix.go:304 +0x2d1
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1

goroutine 23 [chan receive]:
main.(*IntrHandler).Handle.func1(0xc4200d8e00, 0xc4200138a0)
/Users/teo/source/go/src/github.com/ipfs/go-ipfs/cmd/ipfs/main.go:542 +0x96
created by main.(*IntrHandler).Handle
/Users/teo/source/go/src/github.com/ipfs/go-ipfs/cmd/ipfs/main.go:547 +0x97

goroutine 27 [IO wait]:
net.runtime_pollWait(0x1369488, 0x72, 0x6)
/usr/local/go/src/runtime/netpoll.go:160 +0x59
net.(*pollDesc).wait(0xc42027b720, 0x72, 0xc4203629d0, 0xc4200121d0)
/usr/local/go/src/net/fd_poll_runtime.go:73 +0x38
net.(*pollDesc).waitRead(0xc42027b720, 0xcdd9a0, 0xc4200121d0)
/usr/local/go/src/net/fd_poll_runtime.go:78 +0x34
net.(*netFD).Read(0xc42027b6c0, 0xc42017f000, 0x1000, 0x1000, 0x0, 0xcdd9a0, 0xc4200121d0)
/usr/local/go/src/net/fd_unix.go:243 +0x1a1
net.(*conn).Read(0xc420128158, 0xc42017f000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:173 +0x70
net/http.(*persistConn).Read(0xc420073900, 0xc42017f000, 0x1000, 0x1000, 0x11d1a3, 0xc42002d378, 0xc42002d388)
/usr/local/go/src/net/http/transport.go:1256 +0x154
bufio.(*Reader).fill(0xc420131c80)
/usr/local/go/src/bufio/bufio.go:97 +0x10c
bufio.(*Reader).Peek(0xc420131c80, 0x1, 0xc420340060, 0xc42002d460, 0xc42002d3c8, 0x1efd8, 0x50)
/usr/local/go/src/bufio/bufio.go:129 +0x62
net/http.(*persistConn).readLoop(0xc420073900)
/usr/local/go/src/net/http/transport.go:1413 +0x1a1
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1057 +0x4e9

goroutine 28 [select]:
net/http.(*persistConn).writeLoop(0xc420073900)
/usr/local/go/src/net/http/transport.go:1641 +0x3bd
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1058 +0x50e

rax 0xe
rbx 0x0
rcx 0x700000080cc8
rdx 0x0
rdi 0x2303
rsi 0x3c
rbp 0x700000080d00
rsp 0x700000080cc8
r8 0xffffffffffffffff
r9 0x34
r10 0x75731cbc
r11 0x202
r12 0x1443f2c804aaa
r13 0x144c2f4554db0
r14 0x1469439556e1bc00
r15 0x711b10
rip 0x6cc43
rflags 0x202
cs 0x7
fs 0x0
gs 0x0

whyrusleeping (Member):

@NeoTeo could I get the output from the daemon while that command is hanging? (press Ctrl + \ on the daemon while the client command is hung)

NeoTeo commented Aug 9, 2016

@whyrusleeping That's what that was. I did a
ipfs add acme.dump then hit ctrl + \

NeoTeo commented Aug 9, 2016

No, wait. Maybe I misunderstood. On the daemon. Coming up...

NeoTeo commented Aug 9, 2016

whyrusleeping (Member):

@NeoTeo that is really strange... can you try deleting the following from line 45 of core/commands/cat.go and recompiling?

        if err := corerepo.ConditionalGC(req.Context(), node, length); err != nil {
            res.SetError(err, cmds.ErrNormal)
            return
        }

For some reason your stack is telling me it's hanging while running a garbage collection... but not actually getting to the GC part, just trying to take the lock?
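The quoted snippet calls corerepo.ConditionalGC before serving the read. The idea, roughly (this is an illustrative sketch with made-up names, not go-ipfs's actual implementation), is that only when the repo is over its StorageMax does the call go on to run a collection and take the GC lock, which is why a repo over its limit can stall on every cat:

```go
package main

import "fmt"

// conditionalGC sketches the check: skip GC entirely while under the
// storage limit; otherwise run one. The parameters (repoSize,
// storageMax, incoming, gc) are hypothetical names for this example.
func conditionalGC(repoSize, storageMax, incoming uint64, gc func() error) error {
	if repoSize+incoming <= storageMax {
		return nil // under the limit: no GC, no lock taken
	}
	// Over the limit: run a GC. In go-ipfs this path takes the repo's
	// GC lock, which is where the reported hang showed up.
	return gc()
}

func main() {
	ran := false
	gc := func() error { ran = true; return nil }
	// A ~12 GB repo against a 10 GB StorageMax trips the GC branch on
	// every read.
	_ = conditionalGC(12<<30, 10<<30, 1<<20, gc)
	fmt.Println("gc ran:", ran) // prints "gc ran: true"
}
```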

NeoTeo commented Aug 11, 2016

Yep, that works. But presumably you put that code there for a reason... some other part locking the GC?

NeoTeo commented Aug 11, 2016

...and, since a freshly inited repo doesn't hang, it's somehow related to dealing with a repo that was migrated from the pre-0.4 days and yet passes the repo verify.

whyrusleeping (Member):

@NeoTeo how large is this repo? (the one that you're having troubles with)

I think this might be related to another issue where checking if we need a GC is too expensive

NeoTeo commented Aug 11, 2016

@whyrusleeping ~12 GB

whyrusleeping (Member):

@NeoTeo in your .ipfs config, can you change Datastore.StorageMax to something like 1000GB, and then try catting something with the normal binary (not the one with the GC code removed)?
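For reference, the field lives under Datastore in the repo's config file (by default ~/.ipfs/config); raising it looks like:

```json
{
  "Datastore": {
    "StorageMax": "1000GB"
  }
}
```

It can also be set from the command line with `ipfs config Datastore.StorageMax 1000GB`.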

NeoTeo commented Aug 11, 2016

Yep, that's it! So a warning or a friendly failure message to tell you you've exceeded your max would do it. Nice investigative work though @whyrusleeping 👍 :)

Kubuxu (Member) commented Aug 11, 2016

@whyrusleeping it might be the same issue. I had the default GC setting in my config; we might want to be more explicit about running GC because the size limit was hit.

Also, the softGC shouldn't lock if there is a real GC already running.
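Kubuxu's non-blocking suggestion could look something like this sketch (hypothetical names, not the actual go-ipfs code): when a collection is already in flight, skip the soft trigger instead of queueing up behind its lock:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// gcRunning flags an in-flight collection: 0 = idle, 1 = running.
var gcRunning int32

// softGC runs gc only if no other collection is in progress; rather
// than blocking on a lock, it reports false and lets the caller
// carry on serving the request.
func softGC(gc func()) bool {
	if !atomic.CompareAndSwapInt32(&gcRunning, 0, 1) {
		return false // a GC is already running: skip, don't block
	}
	defer atomic.StoreInt32(&gcRunning, 0)
	gc()
	return true
}

func main() {
	first := softGC(func() {
		// A trigger that arrives while a GC is running is skipped.
		fmt.Println("nested ran:", softGC(func() {}))
	})
	fmt.Println("first ran:", first)
	// prints:
	// nested ran: false
	// first ran: true
}
```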

whyrusleeping (Member):

We should not have that 'feature' enabled by default. If you want to set a size limit, you should have to do it manually

whyrusleeping (Member):

@NeoTeo gonna go ahead and close this, as I think we solved all the problems :)

Please open a new issue if anything else comes up
