Feature: stream blockchain #1210
Conversation
I've found a bug in tar-fs - a normal exit from curl can trigger it. This is a pretty edge case, and I have an issue open on the provider's repo - will see what they come back with.
I've fixed the issue in the upstream repo, just waiting to see if the author wants a pull or will just update it himself.
Looks like my pull will get merged upstream. Will wait for the next release. If it takes too long, I can just inline it.
Since express version 4.16, you no longer need to import body-parser (it's part of express), so I have updated and removed it. Added the tar-fs requirement to stream the tar file. Added the streamChain method, with restrictions on when the chain can be streamed - usually requires fluxd to NOT be running on the subject host.
Force-pushed from 821241c to 9b62d84
My fix just got merged upstream for tar-fs. I can resolve the merge conflicts here and we could review and merge? (If you think this is a reasonable feature)
Yes please.
I think this should require node admin privileges - so it can be streamed externally, but only the admin of the node can use this feature?
Interesting. Yeah, it makes sense to put some auth on there and open it up. I'll take a look over the weekend and see what I can come up with. Cheers
I'll close this and reopen on the flux repo - need to write some tests.
Allows any Fluxnode to stream the blockchain at breakneck speeds, with some heavy caveats.
Designed for UPnP nodes.
Background:
The uncompressed Flux blockchain stands at around 36GB; compressed with gzip, around 26GB (approx. a 30% reduction). It is not uncommon to see blockchain downloads take in excess of 1 hour via CDN, dependent on an operator's internet connection.
Copying the chain file from one node to another is cumbersome, and the chain goes stale.
Use case:
Your average node owner at home, who has a node or two already running via UPnP and wants to fire up another. They may or may not have good internet; either way, this will save them time and won't cost them any data usage with their ISP. They can just stream the chain off one of their existing nodes, and the chain is already up to date.
Authentication thoughts:
I have been in two minds about node owner authentication on this endpoint. I decided against it for the following reasons: the chain data is public anyway, and only the `blocks`, `chainstate`, and `determ_zelnodes` directories are streamed.
The Feature:
Of note - since we are on `express` version > 4.16, `body-parser` is no longer required (it's part of express now), so I have updated this and removed the package. I have added the extra requirement `tar-fs`.
The pull introduces a new endpoint: `/flux/streamchain`. Yes, this is related to the blockchain, which would be the daemon, but it didn't make sense to put it under the daemon endpoint, as we are dealing with files here, not fluxd RPCs. When called via POST, this endpoint will stream the chain to you live, and it can be called via cURL. POST is used to discourage mistaken calls from a browser.
Example:
curl -X POST http://<Node IP>:<Node port>/flux/streamchain -o flux_explorer_bootstrap.tar
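For illustration, a minimal sketch of what a handler behind this endpoint could look like, using `express` and `tar-fs` (the data directory path and port are assumptions of mine, not the pull's actual implementation):

```js
const express = require('express');
const tar = require('tar-fs');

const app = express();

// Assumed location of the fluxd data directory - adjust for your setup.
const FLUX_DATA_DIR = process.env.FLUX_DATA_DIR || '/home/flux/.flux';

app.post('/flux/streamchain', (req, res) => {
  res.setHeader('Content-Type', 'application/x-tar');
  // Pack only the chain directories into a single tar stream and pipe
  // it straight out over the HTTP response.
  tar
    .pack(FLUX_DATA_DIR, { entries: ['blocks', 'chainstate', 'determ_zelnodes'] })
    .pipe(res);
});

// Port is illustrative only.
app.listen(16127);
```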
Leverages the fact that a lot of nodes run on the same hypervisor, where they share a bridge or v-switch. In this case, the transfer is as fast as your SSD. Real-life testing showed speeds of 3.2Gbps on an Evo 980+ SSD - able to download the entire chain UNCOMPRESSED in 90 seconds.
Even on a traditional LAN, most consumer-grade hardware is 1Gbps, so this should be easily achievable on your average home network.
During normal operation, the flux daemon (fluxd) must NOT be running; this is so the database is in a consistent state when it is read. However, during testing, and WITHOUT compression, as long as the chain transfer is reasonably fast, there is minimal risk of a db compaction happening and corrupting the new data.
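As a rough illustration only, the handler could refuse to stream (outside unsafe mode) while fluxd is up, for example by probing the daemon's local RPC port - the helper and port number below are assumptions, not the pull's code:

```js
const net = require('net');

// Sketch: detect a running fluxd by probing its local RPC port.
// The default port here is an assumption - use the daemon's configured rpcport.
function fluxdRunning(port = 16124) {
  return new Promise((resolve) => {
    const sock = net.connect({ port, host: '127.0.0.1' });
    sock.once('connect', () => { sock.destroy(); resolve(true); });
    sock.once('error', () => resolve(false));
  });
}
```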
This method can transfer the data compressed (using gzip) or uncompressed. It is recommended to only stream the data uncompressed: compressing on the fly uses a lot of CPU and will slow the transfer down by 10-20 times, while only saving ~30% on file size. If the daemon is still running during this time, IT WILL CORRUPT THE NEW DATA (tested).
Due to this, if compression is used, the daemon MUST NOT be running.
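To illustrate the compression path, a hedged sketch of how gzip could be wired in with Node's built-in `zlib` (the helper and its `compress` flag are hypothetical, not the pull's code):

```js
const zlib = require('zlib');
const { pipeline } = require('stream');
const tar = require('tar-fs');

// Sketch: optionally gzip the tar stream on the fly. Per the caveats
// above, this is only safe when fluxd is stopped - compression slows
// the transfer enough for a db compaction to corrupt the copied data.
function streamChain(res, dataDir, compress) {
  const pack = tar.pack(dataDir, { entries: ['blocks', 'chainstate', 'determ_zelnodes'] });
  const streams = compress ? [pack, zlib.createGzip(), res] : [pack, res];
  pipeline(...streams, (err) => {
    if (err) res.destroy(err);
  });
}
```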
There is an unsafe mode, where a user can transfer the chain while the daemon is still running. Of note, the data on the source node will not be corrupted, only the newly copied chain might be. I have used this over a dozen times without any issue, but use at your own risk.
Only allows one stream at a time - will return a 503 if a stream is in progress. If passing in options, the `Content-Type` header must be set to `application/json`.
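The single-stream restriction can be enforced with a simple in-flight flag; a sketch of the idea (variable and field names are mine, not the pull's):

```js
// Sketch: enforce the single-stream rule with an in-flight flag.
let streamInProgress = false;

app.post('/flux/streamchain', express.json(), (req, res) => {
  if (streamInProgress) {
    return res.status(503).json({ error: 'Chain stream already in progress' });
  }
  streamInProgress = true;
  // Clear the flag whether the stream completes or the client disconnects.
  res.on('close', () => { streamInProgress = false; });
  // A hypothetical option parsed from the JSON body:
  const compress = Boolean(req.body && req.body.compress);
  // ... kick off the tar stream as sketched earlier, honouring `compress` ...
});
```

An options call would then look like the plain curl example above, with `-H "Content-Type: application/json" -d '{"compress": true}'` added (again, the `compress` field name is an assumption).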
In a future pull, I'll look to tidy up some of the express POST handlers - they don't need listeners anymore.