Recursive add of large directory fails at 100% (with nocopy and fscache) #5815
This is probably due to provider records (i.e., the process of telling the network that you have the content), unfortunately. We currently need to make one DHT request per block, which means ~1e6 DHT requests. Tracked by: #5774. For now, you should be able to use the … option. Unfortunately, that does mean you won't tell the network that you have the data.
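For reference (an aside, not stated in the thread): in go-ipfs, reproviding is controlled by the `Reprovider` section of the config. Setting `Interval` to `"0"` disables periodic reprovides entirely, which avoids the per-block DHT announcements at the cost of not advertising the data to the network. A sketch of the relevant config fragment:

```json
{
  "Reprovider": {
    "Interval": "0",
    "Strategy": "all"
  }
}
```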
Actually, it may not be that. Can you post a heap and goroutine profile when this gets stuck? That is, run:
wget http://localhost:5001/debug/pprof/heap
wget http://localhost:5001/debug/pprof/goroutine?debug=2
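As an aside (not from the thread): the `goroutine?debug=2` dump is plain text with one `goroutine N [state]:` header per goroutine, so a small script can summarize where goroutines pile up while the add hangs. A sketch, using a hypothetical two-goroutine sample:

```python
# Sketch: count goroutines by state in a `debug/pprof/goroutine?debug=2` dump.
# The `sample` text below is a hypothetical, abbreviated dump for illustration.
import re
from collections import Counter

def goroutine_states(dump: str) -> Counter:
    """Count goroutines by state, e.g. 'running', 'chan receive', 'IO wait'."""
    # Headers look like: "goroutine 42 [chan receive, 3 minutes]:"
    states = re.findall(r"^goroutine \d+ \[([^\],]+)", dump, flags=re.M)
    return Counter(s.strip() for s in states)

sample = """goroutine 1 [running]:
main.main()
\t/go/src/main.go:10 +0x20

goroutine 42 [chan receive, 3 minutes]:
github.com/ipfs/go-ipfs/core.(*IpfsNode).loop()
\t/go/src/node.go:55 +0x11
"""

print(goroutine_states(sample))  # e.g. Counter({'running': 1, 'chan receive': 1})
```

A large cluster of goroutines stuck in the same blocking state (e.g. `chan receive` or `semacquire`) usually points at the component that is wedged.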
@Stebalien Here you go. https://gateway.ipfs.io/ipfs/QmS79kLK2sxYVCtYNAJqwH1pZNePS8AQBNfr9AhRKkrq9a Note that the adding progresses fine until exactly 100% is reached. I will try again with …
@Stebalien Most surprising result with …
So, it does look like provider records are backing up, however...
That's not good. Can you run …? Also, could you try …? It looks like you're missing a block that you should have.
I've cleaned the filestore and all pins and run both verify commands, to no avail. :/ However, very much to my surprise, the resource does seem to be pinned! (After another stall at 100% - note that this one was without …)
Note lastly that, weirdly enough, the file sizes reported by the gateway are only a fraction of the real size of the data (18 GB reported vs. 400 GB original). I have not yet tried to download the resource as, with current IPFS performance, that would take several days. But you're very much invited to try.
(I'm currently giving it another run with …)
:/
Ok, this is definitely a bug in filestore. What's the shape of the data? That is: small directory of large files or a large directory of small files?
It's an Elasticsearch snapshot: a couple of levels of depth (~4), lots of smaller files (bytes) and larger files (megabytes). Example data: Qmc3RxfyZTPf7omWN1XxDkaZhp93ukfLSY14CTC8n1v5Hv (created using ipfs-pack, which somehow does seem to work)
I've just tested a large directory of small files with filestore so I'm pretty sure it's not that. I've also tested filestore on a 200MiB file so it's not that either. @dokterbob have you tried running this without nocopy? I'm wondering if you have a filesystem corruption.
Haven’t tried without nocopy, yet.
I’ve had this problem on two different machines, one of which runs ZFS - so very little chance of filesystem corruption (but running fsck on one of them anyways).
Also check the permissions; make sure the daemon can read all the files. Are you running the daemon as a different user? (But it's probably a bug.)
Just successfully created a tar as the ipfs user, so that's ruled out.
Also testing without --nocopy, so we can focus on that.
Love to hear what I can do to provide you with more info to debug this. Also happy to share the actual dataset (although the given hash should suffice to replicate the problem).
Those incorrect sizes are also pretty worrying.
Yep. Without …
Do you have garbage collection enabled?
Yep. It's kind of necessary as we pull about 1TB a day through our server.
I could test it later this week on my home server with GC disabled (if it is enabled at all).
Thanks!
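For context (an aside, not from the thread): in go-ipfs, automatic garbage collection only runs when the daemon is started with the `--enable-gc` flag, so starting the daemon without it is one way to rule out GC interference. GC behavior is tuned via the `Datastore` section of the config; the values below are illustrative, not from this issue:

```json
{
  "Datastore": {
    "StorageMax": "10GB",
    "StorageGCWatermark": 90,
    "GCPeriod": "1h"
  }
}
```

Here GC is triggered periodically (`GCPeriod`) or when usage crosses `StorageGCWatermark` percent of `StorageMax`.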
Oops, it seems we needed more information for this issue; please comment with more details or this issue will be closed in 7 days.
This issue was closed because it is missing author input. |
Version information:
go-ipfs version: 0.4.18-
Repo version: 7
System version: amd64/linux
Golang version: go1.11.1
Type:
Bug
Description:
Adding a large resource (specifically, the 390 GB ipfs-search.com index) fails at 100%: it simply blocks and never outputs the root hash for the resource. Reaching 100% takes an acceptable amount of time, after which nothing happens for at least 12 hours.
Example output: