kfs leveldbs never closed #666
|
Reopen: after 30 minutes the open file count is still increasing, not decreasing.
|
Based on the last activity I would expect something between 4 and 12 open files.
|
One of our community members (@andyjc) has this issue:
|
Wrong user
{info} [Mon Mar 20 2017 06:19:06 GMT+0900 (JST)] received valid message from {"userAgent":"6.3.2","protocol":"1.1.0","address":"91.92.111.163","port":11634,"nodeID":"090d3ff382ad9be05a94914fd00d7b3c4a23a546","lastSeen":1489957848478}
{info} [Mon Mar 20 2017 06:19:06 GMT+0900 (JST)] sending PUBLISH message to {"userAgent":"6.3.2","protocol":"1.1.0","address":"209.93.13.215","port":4107,"nodeID":"095bf027715564b6989cf8fa60fc74dd33404aea","lastSeen":1489958333754}
{info} [Mon Mar 20 2017 06:19:06 GMT+0900 (JST)] sending PUBLISH message to {"userAgent":"6.3.0","protocol":"1.1.0","address":"203.97.196.252","port":60547,"nodeID":"08a37eacb12d9eb248059b6be2535a06c7791a9a","lastSeen":1489958315389}
{info} [Mon Mar 20 2017 06:19:06 GMT+0900 (JST)] sending PUBLISH message to {"userAgent":"6.3.2","protocol":"1.1.0","address":"client022.storj.dk","port":15023,"nodeID":"0918fd0b1ac23a6e23ba29ae5d2becc3d1d9e1d8","lastSeen":1489958183119}
{error} [Mon Mar 20 2017 06:19:06 GMT+0900 (JST)] Could not get usedSpace: IO error: /media/andrew/b7d12396-25e3-4878-a1bf-f135fcfecf43/Storj1/storjshare-d5a7b9/sharddata.kfs/090.s: Too many open files
{info} [Mon Mar 20 2017 06:19:06 GMT+0900 (JST)] received valid message from {"userAgent":"6.3.2","protocol":"1.1.0","address":"209.93.13.215","port":4107,"nodeID":"095bf027715564b6989cf8fa60fc74dd33404aea","lastSeen":1489702078035}
{info} [Mon Mar 20 2017 06:19:07 GMT+0900 (JST)] received valid message from {"userAgent":"6.3.2","protocol":"1.1.0","address":"client022.storj.dk","port":15023,"nodeID":"0918fd0b1ac23a6e23ba29ae5d2becc3d1d9e1d8","lastSeen":1489706641140}
{info} [Mon Mar 20 2017 06:19:07 GMT+0900 (JST)] received valid message from {"userAgent":"6.3.0","protocol":"1.1.0","address":"203.97.196.252","port":60547,"nodeID":"08a37eacb12d9eb248059b6be2535a06c7791a9a","lastSeen":1489683361530}
{info} [Mon Mar 20 2017 06:19:07 GMT+0900 (JST)] replying to message to c50df964efaaddea8eb590a41f0e1d5a6865c40d
{warn} [Mon Mar 20 2017 06:19:08 GMT+0900 (JST)] rpc call d58ad5a6a21bbae10dc13c6627b3f1e8d67100c1 timed out
{info} [Mon Mar 20 2017 06:19:08 GMT+0900 (JST)] replying to message to 8bb59c92429fc8c0728c5ab417738edf8bc29d75
{warn} [Mon Mar 20 2017 06:19:08 GMT+0900 (JST)] rpc call 58df090420249a28ee01e60a458689f7b9099c3f timed out
|
I am having this issue on the latest version of OS X with the GUI. It's worse than the previous release from last week. Basically my peer count drops to zero, and when I click on the GUI it says too many files are open. It recovers better than before (I can just restart the app), but it typically dies out after a few hours. On Windows this is fine, but on OS X it's a constant issue. |
|
I'm having the same issue, which is causing my nodes to use more memory than they should and eventually causing crashes. The majority of the open files are LOG and LOCK files. I am storing my drives over the network. |
Same here:
|
Same thing here:
Error: IO error: /opt/storj/sharddata.kfs/049.s/001893.ldb: Too many open files |
Confirmed on Linux:
{"level":"error","message":"failed to read from mirror node: connect ETIMEDOUT 158.69.248.73:5015","timestamp":"2017-05-13T21:33:05.997Z"}
Error: IO error: /drivepath/sharddata.kfs/241.s/000588.ldb: Too many open files
Additional info: |
I got this error too:
|
Fix it ffs. Not that you've paid me any SJCX in 2 months anyway.
|
This "Too many open files" error is really annoying for a lot of users, I can't run any of my nodes because of this storj-gui crashes at startup. This error seems to be common https://hexo.io/docs/troubleshooting.html#EMFILE-Error I'm no expert on the matter but it seems that using graceful-fs solved the issue for many https://github.com/isaacs/node-graceful-fs#improvements-over-fs-module |
Did I get this right: this is a bug found nearly a year ago, and it is attached to a milestone that has no due date?
Crash happened after an uptime of a bit more than 5 days.

Machine info
Storj is running on server grade hardware (WD Reds, RAID-Z2, ECC etc.)

$ uname -a
FreeBSD storjjail 11.1-STABLE FreeBSD 11.1-STABLE #0 r321665+4bd3ee42941(freenas/11.1-stable): Thu Jan 18 15:45:01 UTC 2018

$ sysctl kern.maxfiles kern.maxfilesperproc kern.openfiles
kern.maxfiles: 1038717
kern.maxfilesperproc: 934839
kern.openfiles: 4838

$ ulimit -a
number of pseudoterminals (-P) unlimited
socket buffer size (bytes, -b) unlimited
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) 33554432
file size (blocks, -f) unlimited
max kqueues (-k) unlimited
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 934839
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 524288
cpu time (seconds, -t) unlimited
max user processes (-u) 34059
virtual memory (kbytes, -v) unlimited
swap size (kbytes, -w) unlimited

Storjshare info
$ storjshare --version
daemon: 5.3.0, core: 8.5.0, protocol: 1.2.0

$ npm list -g storjshare-daemon
/usr/home/storjroot/.nvm/versions/node/v8.9.4/lib
`-- storjshare-daemon@5.3.0

$ node --version
v8.9.4 |
It's probably your mount not being able to handle it. |
@ne0ark, sure, it may also be a reason. Do you have ideas on how to test which part (if any) is the bottleneck?

Meanwhile, what are the requirements to run storjshare? I'm confused. I thought the premise was that Storj Share should be able to run on a decent desktop-grade machine with spare gigs on the HDD. At least https://storj.io/share.html gives that impression; the marketing emphasis is on the desktop GUI version of Storj Share.

Btw, I know for a fact that at the time of the crash there was no other load on the machine (raidz scrubs, replication), as I know by heart when those are scheduled. And here are the graphical load logs. Knowing the loads when I actually do stuff with this machine, the load at 1:14 AM when the crash happened is dead idle (and what you see after 1:14 is the machine's state when everybody is sleeping and the only considerable activity is the just-restarted Storj Share; the only other activity I can think of in that time span is Nextcloud cronjobs).

Edit: missed the 6th disk in the screenshot, but you get the picture 😄 |
Isn't the issue that Storj needs more than 1024 open files? Isn't that huge? I can see that for every shard there are 4 open files:
Perhaps we can close some? Or open them only when needed? The solution for now is to raise the soft limit (something like https://serverfault.com/a/610135/114520), but I don't really like that either... |
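One way to act on "close some / open them only when needed" is to keep the open sBucket handles in a small LRU and close the least recently used one whenever a cap is reached. The sketch below is purely illustrative: `openBucket` and the handle's `close(callback)` stand in for whatever kfs actually does internally, and 24 is an arbitrary cap, not a tested value.

```js
// Hypothetical sketch, not the real kfs API: bound the number of sBucket
// leveldbs that are open at the same time.
class BucketPool {
  constructor(openBucket, maxOpen = 24) {
    this.openBucket = openBucket; // function(path) -> levelup-style handle
    this.maxOpen = maxOpen;
    this.buckets = new Map();     // Map keeps insertion order; used as LRU order
  }

  get(path) {
    let db = this.buckets.get(path);
    if (db) {
      // Re-insert so this bucket becomes the most recently used entry.
      this.buckets.delete(path);
      this.buckets.set(path, db);
      return db;
    }
    if (this.buckets.size >= this.maxOpen) {
      // Evict the least recently used bucket and release its file descriptors.
      const [oldestPath, oldestDb] = this.buckets.entries().next().value;
      this.buckets.delete(oldestPath);
      oldestDb.close((err) => {
        if (err) console.error('failed to close', oldestPath, err.message);
      });
    }
    db = this.openBucket(path);
    this.buckets.set(path, db);
    return db;
  }
}
```

Since each open leveldb holds several descriptors (LOG, LOCK, MANIFEST, plus table files), capping the pool bounds the total descriptor count no matter how many sBuckets a node touches over time.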
👋 Hey! Thanks for this contribution. Apologies for the delay in responding! We've decided to rearchitect Storj, so that we can scale better. You can read more about this decision here. This means that we are entirely focused on v3 at the moment, in the storj/storj repository. Our white paper for v3 is coming very, very soon - follow along on the blog and in our Rocketchat. As this repository is part of the v2 network, we're no longer maintaining this repository. I am going to close this for now. If you have any questions, I encourage you to jump on Rocketchat and ask them there. Thanks! |
Package Versions
Expected Behavior
I have 2 unused download requests and one completed download. The unused downloads are expired (30 minutes TOKEN_EXPIRE). 1 minute later (SBUCKET_IDLE) these leveldbs should be closed. I would expect only 4 open files (the contract db).
Note: this is my worst-case expectation and I could live with it. Another option is a 1 minute timeout after the data channel is authorized: close the leveldb and open it again as soon as the download is actually used.
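The expected behaviour above boils down to an idle timer per sBucket: reset it on every read or write, and close the leveldb once the bucket has been quiet for SBUCKET_IDLE. A minimal sketch of that pattern follows; the names and the shape of the db handle are assumptions, not the actual kfs code.

```js
// Sketch of the idle-close behaviour described above. SBUCKET_IDLE and the
// db handle shape are assumptions, not the actual kfs internals.
const SBUCKET_IDLE = 60 * 1000; // 1 minute of inactivity

function trackIdle(db, onClosed) {
  let timer = null;

  function touch() {
    clearTimeout(timer);
    timer = setTimeout(() => {
      // No reads or writes for SBUCKET_IDLE: close the leveldb and free its
      // open files. Reopen lazily on the next access.
      db.close((err) => onClosed(err));
    }, SBUCKET_IDLE);
    if (timer.unref) timer.unref(); // don't keep the process alive for this
  }

  touch();          // start counting from "now"
  return { touch }; // call touch() on every read/write against this bucket
}
```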
Actual Behavior
Kfs leveldbs are never closed.
Steps to Reproduce
ls -l /proc/418/fd | grep 'storjshare'
grep 'download\|upload\|contract offer' .storjshare/storjshare/logs/188071ba7cfd974a9e47b59e24b0737ebf845db3.log
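To watch the leak while reproducing, a small Node script can poll the daemon's descriptor count once a minute. This is a sketch that assumes Linux (it relies on /proc), that it runs as the daemon's user or as root, and that 418 is only the example pid from the ls command above.

```js
// Poll the open-descriptor count of a running storjshare process on Linux.
// Pass the actual daemon pid as the first argument; 418 is just the example
// pid from the reproduction step above.
const fs = require('fs');

const pid = process.argv[2] || '418';

setInterval(() => {
  try {
    const count = fs.readdirSync(`/proc/${pid}/fd`).length;
    console.log(new Date().toISOString(), `open fds for pid ${pid}:`, count);
  } catch (err) {
    console.error(`could not read /proc/${pid}/fd:`, err.message);
  }
}, 60 * 1000); // once a minute; with the expected behaviour the count should drop after idle
```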