Resource Constraints + Limits #1482
Comments
The most pressing are:
|
@rht would this be an issue you could work on? it's needed sooner than later. particularly @whyrusleeping your help will be needed no matter who implements this. |
@jbenet yeap. My concern is that before we even think about configurable limits and such, we need to determine how the system behaves when it runs out of a certain resource, whether that's open connections, disk space, or memory. Once we determine how a limit will manifest in the application, we can start setting those limits. |
We already know how some of those would behave, for example, disk. Trigger gc after a threshold, and stop accepting blocks after the limit.
|
okay, when we stop accepting blocks, how does that affect the user? Do we just start returning 'error disk full' up the stack everywhere? (probably) |
yeah, it's a write error. same would happen if the OS's disk got full.
|
👍 the daemon keeps consuming my meager ADSL upload bandwidth |
These are a big deal, we should get back on these. |
My VPS runs out of RAM pretty quickly with IPFS consuming 80% of it (this is not adding, just idling). Other daemons start to shut down due to out of memory. Granted, my VPS has only 128 or 256 MB (can't remember which), but still, I would think it's possible to seed some content with minimal resources. |
agreed. we should start adding memory constraints as tests for long running nodes to ipfs |
Update here:
|
Thanks for update @rht Re limits, i think people will mostly want to set hard BW caps in explicit |
I just randomly found this discussion while trying to limit the overall output traffic (per day / month). I think limiting output traffic could be an interesting thing (especially with respect to Filecoin one day), as egress traffic is typically limited in cloud settings like AWS or Azure. There I am fine with temporary spikes of high bandwidth as long as my output traffic stays within some bounds per unit of time. Setting a limit per hour / day / month might make sense to avoid blowing a month's volume in a day / hour. |
Hi, thanks very much for IPFS. I did not carefully read the above, so some of the following may be duplicates. These are all long-term things to think about, nothing that is a headache for me right now. The following are some usage models that may suggest features for controlling resources:
|
For VPN users, being able to limit the maximum number of connections is a very important feature, since many VPNs automatically disconnect you if you have too many open connections (it's probably some sort of protection against spammers and DDoSers). IPFS by default creates hundreds of connections, so it's barely usable unless you don't care about getting disconnected regularly. |
It means that it is directly connected to 214 peers; those are live nodes in the network. We might want to start limiting that. Deluge (torrent client) by default allows 200 connections with only 50 active at a time, but it uses uTP, which we were unable to use successfully because the uTP lib for Go hangs. @davidak is that a netdata collector for IPFS? Looks nice, have you published it somewhere? |
@Kubuxu the IPFS netdata plugin just got merged some minutes ago ;) |
I've had some luck using the Linux "tc" command to throttle IPFS down to about 10 KB/s outbound. This has the side effect of dropping incoming down to about 15-20 KB/s. I can see IPFS is using 100% of its allocated 10 KB/s all day every day, but at least I can calculate how much bandwidth that is per month to ensure I don't go over my quotas. A nice bonus is that it significantly reduces memory usage, which is now hovering around 50-100 MB. |
@slothbag does it work in that condition?
|
Where applicable, different bw limits for pinned items would be a nice feature to have. Users might be more inclined to providing bandwidth for files they find important enough to pin. |
For my own use case this would be quite valuable. Not all workers I add to
the network should serve all files equally. Files they create should be
served with much higher priority than files they need and mirror.
|
+1 for Although @jbenet suggests we can have this done on a higher level, a long-running actively used IPFS daemon will currently eat all memory available on a system which basically means that, without memory constraints it will not be stable. Obviously, the memory footprint (#3318) could be reduced but given that the project moves forward very fast feature wise, there will be new kinds of memory waste popping up. |
ipfs for me has several hundreds of open connections, which triggers a number of warning mechanisms including TCP resets/s (many dozens) and makes it look like a network scan. Connecting to this many peers seems insane for a p2p network. Being able to limit this would be a high priority for me. |
This is resolved in the next release, try out the release candidate for
0.4.12
|
I also need a limit on the maximum number of open files! (causes: #4589) |
@whyrusleeping: go-ipfs v0.4.13 still maintains several hundreds of open connections. |
@KrzysiekJ Yeah, DHTs need to maintain a decent number of open connections for proper functioning. You can tweak it lower in your configuration file, Look for |
Does the DHT actually need to maintain large numbers of connections to work? It seems like you need to know the locations of a good number of DHT peers, but why actually connect to them? Can't we just keep a list of a few thousand peers, and figure out if they're still up if/when they're needed? Connectionless DHT queries should only take 1 UDP round trip per hop if you don't use a handshake or encryption, and it's not like you can't monitor someone pretty easily as is (connect to them and watch their wantlist broadcasts). Congestion doesn't seem like it should be that much of an issue, especially if you limit retries: if they aren't there after 3 or 4 attempts, you just assume they aren't online anymore and try a different path. An advantage of connectionless is that you can potentially store the last known IP of millions of nodes, meaning most of the network can be within 2 or 3 hops. That has the issue of concentrating traffic on a few nodes for popular content, but I suspect there are ways of managing that. |
Correct. Unfortunately, we don't have any working UDP-based protocols at the moment anyway. However, we're working on supporting QUIC. While this wouldn't be a connectionless protocol, connections won't take up file descriptors and we can save memory/bandwidth by "suspending" unused connections (remember the connection's session information but otherwise go silent). In the future, we'd like a real packet transport system but we aren't there yet. The tricky part will be getting the abstractions right, which will take a bit of work because we try to make all parts of the IPFS/libp2p stack pluggable.
The encryption isn't just about monitoring, it also prevents middle boxes from being "smart". However, as we generally don't care about replay or perfect forward secrecy for DHT messages, we may be able to encrypt these requests without creating a connection (although that gets expensive if we send more than one message). Again, the tricky part will be getting the abstractions correct (and, in this case, not creating a security footgun).
Unfortunately, IPFS nodes tend to go offline/online all the time. Having connections open helps us keep track of which ones are online. However, the solution here is to just not have flaky nodes act as DHT nodes. |
FWIW: Many operating systems provide facilities for limiting all of those things e.g. consider using linux containers and separate disk partitions. It is then up to ipfs to just handle error conditions returned by the OS properly. |
Your suggestion goes strongly against the long-standing common practice of Unix daemon design, where daemons should manage their own footprints and the OS should interfere only in error conditions.
For example, most forking servers allow the number of processes to be limited (e.g. Apache, PHP-FPM, Postfix, etc.). Many also allow limits on the memory used (e.g. Elasticsearch, MySQL). In addition, for disk caches it's normal to have hard and soft limits set.
Most system administrators consider a daemon that, when unrestrained, just eats up all the resources in a system to be badly designed. Only recently have these types of behaviours become somewhat tolerated, but really only amongst users of stuff like Docker.
Mind you, many operating systems do not support such newfangled tools, and whether or not it is actually safe to rely on them will have to be proven (consider the large number of security issues in the early Xen days).
|
If you make the OS / docker limit the memory that ipfs uses, then will ipfs be careful to use less than that amount? If not, ipfs might just keep charging headfirst into the limit and get regularly killed/restarted by the system. |
That’s the exact behaviour I’ve been observing for a system with high load.
|
We would hard limit the amount of used memory if Golang allowed for it but it does not. |
I don't want limits in order to limit the impact of bugs; I'm worried about limiting the amount of memory that ipfs uses under arbitrarily high load. I want to do things like set ipfs to refuse or queue new connections if it's processing too many right now, etc. |
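The "refuse or queue" behaviour asked for above can be sketched with a counting semaphore built on a buffered channel. This is an illustration only: `ConnLimiter`, `TryAcquire`, `Acquire`, `Release`, `InFlight` and `ErrBusy` are hypothetical names, not anything from go-ipfs or libp2p.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBusy is returned when the limiter refuses a new connection.
var ErrBusy = errors.New("too many connections in flight")

// ConnLimiter is a hypothetical counting semaphore: the channel's capacity is
// the maximum number of connections processed at once.
type ConnLimiter struct {
	slots chan struct{}
}

func NewConnLimiter(max int) *ConnLimiter {
	return &ConnLimiter{slots: make(chan struct{}, max)}
}

// TryAcquire implements the "refuse" policy: it never blocks and returns
// ErrBusy when every slot is taken.
func (l *ConnLimiter) TryAcquire() error {
	select {
	case l.slots <- struct{}{}:
		return nil
	default:
		return ErrBusy
	}
}

// Acquire implements the "queue" policy: it blocks until a slot frees up.
func (l *ConnLimiter) Acquire() { l.slots <- struct{}{} }

// Release frees one slot. It must be called once per successful acquire.
func (l *ConnLimiter) Release() { <-l.slots }

// InFlight returns the number of slots currently held.
func (l *ConnLimiter) InFlight() int { return len(l.slots) }

func main() {
	limiter := NewConnLimiter(2)

	fmt.Println(limiter.TryAcquire()) // first connection accepted
	fmt.Println(limiter.TryAcquire()) // second connection accepted
	fmt.Println(limiter.TryAcquire()) // third refused with ErrBusy

	limiter.Release()                 // one connection finishes
	fmt.Println(limiter.TryAcquire()) // accepted again
}
```

Under this scheme load beyond the limit turns into an explicit error or a wait instead of unbounded memory growth, which is the distinction being drawn from limits that only contain bugs.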
@agentme this isn't a problem right now. Currently, AFAIK, most memory issues are due to bugs. |
Bugs happen, and no one should rely on the assumption that no problems will occur once the known ones are fixed. I think most of the limits mentioned by @jbenet are necessary, like a seatbelt while driving. Golang can't have resource consumption limits set, but a "breathing sleep" of some milliseconds could be coded in so that an end user doesn't lose control of their device, for example. And/or the number of effective TCP connections and the bandwidth used could also be limited, as those are part of the software design. My personal understanding is that one of the numerous goals of IPFS is efficiency, so consuming a lot of resources (CPU, memory, bandwidth) at the edges while idle is not an option, as it could be seen as "uncontrolled" software. Would you want a computer that can't be used whenever it's connected to the internet because it's busy ensuring everything is working well? It reminds me of antivirus software running on Windows years ago. I'm far from an IPFS/libp2p expert, but maybe each node could implement a pub/sub-like scheme to open only one connection to listen for heartbeats sent from other nodes referencing it. When a node's heartbeat is missing for too long, it could trigger the DHT routing table to be renewed the regular TCP way. That would be a compromise between UDP and TCP as discussed by @loadletter and @whyrusleeping earlier. This could also be used to optimise/adapt routing, as it could offer pseudo-latency or workload/availability monitoring shared between nodes, even if I think libp2p already implements many similar things, such as nodes auto-discovering each other on a common network, or IPFS intending to work even if part of the network gets split into subnetworks, etc. I really hope this will get improved, as I think it is currently an adoption barrier. IPFS is a really great and promising thing, and I really thank every designer/contributor for all the work done, but I would also really love to see it spreading to the whole universe ;) |
As a note of reference, I had problems with ipfs-daemon consistently killing my WiFi connection after a few minutes. I had to disconnect and reconnect manually. (OS: Arch Linux + NetworkManager). After limiting the maximum connections to 300 (with It works fine now, but this is really bad for the average user where they might just not understand why their internet is suddenly so slow or not working correctly. The default setup should be very conservative with resources used. |
Any new update on this? |
Libp2p has recently added a "resource manager" which we are working to integrate with More info: https://github.com/libp2p/go-libp2p-resource-manager |
We need a number of configurable resource limits. This issue will serve as a meta-issue to track them all and discuss a consistent way to configure/handle them.
I'm going to use a notation like `thingA.subthingB.subthingC`. We don't have to keep this at all, it just helps us bind scoped names to things. (Using `.` instead of `/` as the `.` could reflect the JSON hierarchy in the config, but it may not have to, e.g. `repo.storage_max` and `repo.datastore.storage_gc_watermark` could be in config as `Repo.StorageMax` and `Repo.StorageGC`, or something.)
Possible Limits
This is a list of possible limits. I don't think we need all of them, as other tools could limit this more, particularly in server scenarios. But please keep in mind that some users/use cases of ipfs demand that we have some limits in place ourselves, as many end users cannot be expected to even know what a terminal is (e.g. if they run ipfs as an Electron app or as a browser extension).
- `node.repo.storage_max`: this affects the physical storage that a repo takes up. This must include all the storage, datastore + config file size (ok to pre-allocate more if needed), so that people can set a maximum. (MUST be user configurable) (Repo Size Constraints #972)
- `node.repo.datastore.storage_max`: hard limit on datastore storage size. Could be computed as `repo.storage_max - configsize`, where `configsize` could be live, or could be a reasonable bound. (Repo Size Constraints #972)
- `node.repo.datastore.storage_gc_watermark`: soft limit on datastore storage size. After passing this threshold, automatically run gc. Could be computed as `node.repo.datastore.storage_max - 1MB` or something. (Repo Size Constraints #972)
- `node.network_bandwidth_max`: limit on network bandwidth used.
- `node.gateway.bandwidth_max`: limit on bandwidth allocated to running the gateway. This could be calculated from `node.network_bandwidth_max - all other bandwidth use`. (gateway limitations #1070)
- `node.swarm.bandwidth_max`: limit on network bandwidth allocated to running the ipfs protocol. This could be calculated from `node.network_bandwidth_max - all other bandwidth use`.
- `node.dht.bandwidth_max`: limit on network bandwidth allocated to running the dht protocol. This could be calculated from `node.network_bandwidth_max - all other bandwidth use`.
- `node.bitswap.bandwidth_max`: limit on network bandwidth allocated to running the bitswap protocol. This could be calculated from `node.network_bandwidth_max - all other bandwidth use`.
- `node.swarm.connections`: soft limit on ipfs protocol network connections to make. The reason for this limit is that there is overhead to every connection kept alive. The node could try to stay within this limit.
- `node.gateway.ratelimit`: a number of requests per second. With this limit, the user could reduce the accept load on the gateway. (gateway limitations #1070)
- `node.memlimit`: a limit on the memory allocated to ipfs. Could try to use smaller buffers if under different constraints. This is hard to do, prob won't be used end-user-side, and is likely easier to do with tools around it sysadmin-side (docker, etc).

Note on config: the above keys need not be the config keys, but we should figure out some keys that make sense hierarchically.
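To make the relationship between the three storage limits concrete, here is a rough Go sketch. It assumes the derivation suggested above (datastore max = repo max minus config size) and the behaviour discussed in the comments (run gc past the watermark, fail the write past the hard limit). `StorageLimits`, `StorageAction` and `Decide` are hypothetical names, not go-ipfs code.

```go
package main

import "fmt"

// StorageAction is what the node should do with an incoming block.
type StorageAction int

const (
	Accept      StorageAction = iota // store the block
	AcceptAndGC                      // store the block and trigger gc
	Reject                           // refuse the write, as with a full disk
)

func (a StorageAction) String() string {
	return [...]string{"accept", "accept+gc", "reject"}[a]
}

// StorageLimits holds hypothetical values for the keys in the list above.
type StorageLimits struct {
	RepoMax     uint64 // node.repo.storage_max, in bytes
	ConfigSize  uint64 // size (or a reasonable bound) of the config file
	GCWatermark uint64 // node.repo.datastore.storage_gc_watermark
}

// DatastoreMax derives node.repo.datastore.storage_max as
// repo.storage_max - configsize, clamped at zero.
func (l StorageLimits) DatastoreMax() uint64 {
	if l.ConfigSize >= l.RepoMax {
		return 0
	}
	return l.RepoMax - l.ConfigSize
}

// Decide applies the soft and hard limits to a write of incoming bytes when
// used bytes are already stored. Integer overflow is ignored in this sketch.
func (l StorageLimits) Decide(used, incoming uint64) StorageAction {
	total := used + incoming
	switch {
	case total > l.DatastoreMax():
		return Reject
	case total > l.GCWatermark:
		return AcceptAndGC
	default:
		return Accept
	}
}

func main() {
	limits := StorageLimits{RepoMax: 1000, ConfigSize: 100, GCWatermark: 800}
	fmt.Println(limits.DatastoreMax())    // 900
	fmt.Println(limits.Decide(500, 100))  // accept
	fmt.Println(limits.Decide(700, 150))  // accept+gc
	fmt.Println(limits.Decide(850, 100))  // reject
}
```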
What other things are we interested in limiting?