Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storing bitswap ledgers on disk #215

Open
btc opened this issue Oct 27, 2014 · 10 comments
Open

Storing bitswap ledgers on disk #215

btc opened this issue Oct 27, 2014 · 10 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature need/analysis Needs further analysis before proceeding topic/bitswap Topic bitswap

Comments

@btc
Copy link
Contributor

btc commented Oct 27, 2014

Presently, ledgers are only held in memory. Let this issue track implementation of persisted ledgers.

Ledgers could be stored to datastore using a unique key:

func LedgerKey(partner peer.ID) ds.Key {
    k := b58.Encode(partner)
    return ds.Key(fmt.Sprintf("/bitswap/ledger/%s", k)
}

and data could be stored as a []byte using the proto described here:

message Ledger {
    enum Strategy {
        NICE = 0;
        STANDARD = 1; // as described in IPFS Draft 3 (name DRAFT_3?)
    }
    optional uint64 bytes_sent = 1 [default = 0];
    optional uint64 bytes_received = 2 [default = 0];
    optional uint64 first_exchange_unix_time = 3; // default?
    optional uint64 last_exchange_unix_time = 4; // default?
    optional uint64 exchange_count = 5 [default = 0];
    optional Strategy strategy = 6 [default = STANDARD];
}

Two options

a) keep ledgers in memory (introduces book-keeping/consistency)

Can be implemented using an async actor that selects on a timer and iterates over map of in-memory ledgers, writing entries to the datastore.

On the upside, bitswap transactions are faster. Downside: data loss in the event of a crash. Slightly more complexity in the implementation.

b) fetch ledgers from disk

On the upside, very simple Get and Put operations.

Downside: more costly.

With leveldb sync disabled, pay the cost of getting the data to the operating system buffer cache. I don't have figures off the top of my head, do we pay the price of a syscall?

With sync option enabled, mean latency could be in 5-20ms range. This prices feels a bit high for a potentially-hot code path.

NB: cost may be small compared to long-distance RTTs, but may be noticeable for communication within a single datacenter.

NB2: Many bitswap operations will already require multiple datastore Get operations. In those cases, this constitutes a mere incremental increase.

Thoughts? @whyrusleeping @jbenet

References:

leveldb sync
https://github.com/google/leveldb/blob/master/include/leveldb/options.h#L170

@btc btc added the kind/enhancement A net-new feature or improvement to an existing feature label Oct 27, 2014
@whyrusleeping
Copy link
Member

Im not really comfortable introducing a 20ms latency... as thats a significant portion of the time we take for a given block transfer.

@jbenet
Copy link
Member

jbenet commented Oct 27, 2014

Ledgers could be stored to datastore using a unique key:

👍

and data could be stored as a []byte using the proto described here

👍

  • fields LGTM. i like the idea of storing strategy per peer.
  • formatting nit: our proto files right now use camelCase (e.g. closerPeers in dht.proto)

Two options

One of the upsides of fetching ledgers from disk is total consistency. this is really important. however, reading latency like that could be really bad. I think we should issue async flushes and otherwise read once from disk and keep in memory.

Note that even if it's kept on disk, it is possible to get into inconsistent ledger states if nodes crash. (e.g. first node updating its ledger state on disk crashes immediately after updating it and before externalizing this to the peer) -- NB: a way around this is with a write-ahead log that nodes can inspect to calculate the proper state of ledgers.

Not sure yet what the decision should be. In the paper, I noted the importance of ensuring the ledgers are the same. To give you a sense of the trade-offs: in long-lived connections, a tiny discrepancy could be eaten up by one of the parties, but this opens the door to gaming up to that discrepancy threshold (which in equilibrium nodes will do). (refs BitTyrant, BitThief, and PropShare).

Conversely, in trusted, single datacenter settings, nodes can benefit significantly from not caring at all. This reinforces the "bitswap-as-barter" idea, meaning that bitswap protocol definition should target defining basic rules of transactions / exchanges, leaving it up to nodes' Strategies to provide further guarantees (like consistency or even network consensus).

Bitswap is a big protocol and we won't get it exactly right without being able to test the different use cases.

in the interest of time...

For now, let's keep it in memory and async-flush an update to disk. On comparing ledgers, we can log ERROR when bitswap ledgers mismatch to see if it's a big problem early on.

@btc
Copy link
Contributor Author

btc commented Oct 28, 2014

Acknowledged. Moving forward with:

  • proto fields to camelCase
  • log error on mismatched incoming ledger
  • keep in memory, perform async flush to disk

Additionally, would like to get thoughts on these other points.

@jbenet @whyrusleeping

  • include ledger in the bitswap message
message Message {
    repeated string wantlist = 1;
    repeated bytes blocks = 2;
        optional Ledger partnerLedger = 3;
}
  • when flushing to disk, also flush a list of partners so a cold bitswap instance can rediscover relationships.
message Partners {
    repeated string peerIds = 1; // b58 encoded
}

@btc btc self-assigned this Oct 28, 2014
@jbenet
Copy link
Member

jbenet commented Oct 28, 2014

include ledger in the bitswap message

sgtm

when flushing to disk, also flush a list of partners so a cold bitswap instance can rediscover relationships.

sgtm

@btc btc added the ready label Oct 28, 2014
@btc btc added help wanted Seeking public contribution on this issue and removed ready labels Nov 22, 2014
@btc btc removed their assignment Nov 27, 2014
@whyrusleeping
Copy link
Member

This is still on the TODO list.

@daviddias daviddias removed the icebox label Jan 2, 2016
@RichardLitt RichardLitt added exp/novice Someone with a little familiarity can pick up and removed difficulty: easy labels Feb 2, 2016
@Stebalien Stebalien added needs refinement and removed exp/novice Someone with a little familiarity can pick up help wanted Seeking public contribution on this issue labels Apr 25, 2019
@Stebalien
Copy link
Member

Switched to "needs refinement" as we've talked about having a more centralized ledger for other services. This isn't something someone should jump in and help on until we've worked through this a bit.

@hannahhoward
Copy link
Contributor

@Stebalien is there a place where the discussion about centralized ledgers is happening?

Also, it seems to me that given now bitswap is a separate service, that could be used by other clients other than IPFS, I think writing to disc should ideally be optional -- so that datastore is not an required dependency in order to use go-bitswap.

@Stebalien
Copy link
Member

@Stebalien is there a place where the discussion about centralized ledgers is happening?

Not yet. (back channel chatter...)

Also, it seems to me that given now bitswap is a separate service, that could be used by other clients other than IPFS, I think writing to disc should ideally be optional -- so that datastore is not an required dependency in order to use go-bitswap.

Agreed. We usually do this by using an in-memory datastore by default.

@hannahhoward
Copy link
Contributor

Ok in that case I'd suggest the next action item is open an issue for discussions of a centralized ledger. I don't have context for this so I don't feel qualified write a decent issue, but I could just create a blank one if someone else can fill in -- or someone else can volunteer to write it.

@Stebalien
Copy link
Member

Actually, I think I'm being too hasty here. We do want a central service but this feature is probably useful in the mean-time. Really, I'm getting ahead of myself with the central service as we don't even know what it's going to look like.

However, this issue, while valid, still depends on using the ledger for something. We currently ignore it entirely.

@hsanjuan hsanjuan added need/analysis Needs further analysis before proceeding and removed needs refinement labels Apr 16, 2020
hacdias pushed a commit that referenced this issue Nov 29, 2023
Include rename from:
  github.com/ipfs/go-libipfs => github.com/ipfs/boxo

This migration was reverted:
  ./blocks => github.com/ipfs/go-block-format

Migrated repos:
- github.com/ipfs/interface-go-ipfs-core         => ./coreiface
- github.com/ipfs/go-pinning-service-http-client => ./pinning/remote/client
- github.com/ipfs/go-path                        => ./path
- github.com/ipfs/go-namesys                     => ./namesys
- github.com/ipfs/go-mfs                         => ./mfs
- github.com/ipfs/go-ipfs-provider               => ./provider
- github.com/ipfs/go-ipfs-pinner                 => ./pinning/pinner
- github.com/ipfs/go-ipfs-keystore               => ./keystore
- github.com/ipfs/go-filestore                   => ./filestore
- github.com/ipfs/go-ipns                        => ./ipns
- github.com/ipfs/go-blockservice                => ./blockservice
- github.com/ipfs/go-ipfs-chunker                => ./chunker
- github.com/ipfs/go-fetcher                     => ./fetcher
- github.com/ipfs/go-ipfs-blockstore             => ./blockstore
- github.com/ipfs/go-ipfs-posinfo                => ./filestore/posinfo
- github.com/ipfs/go-ipfs-util                   => ./util
- github.com/ipfs/go-ipfs-ds-help                => ./datastore/dshelp
- github.com/ipfs/go-verifcid                    => ./verifcid
- github.com/ipfs/go-ipfs-exchange-offline       => ./exchange/offline
- github.com/ipfs/go-ipfs-routing                => ./routing
- github.com/ipfs/go-ipfs-exchange-interface     => ./exchange
- github.com/ipfs/go-unixfs                      => ./ipld/unixfs
- github.com/ipfs/go-merkledag                   => ./ipld/merkledag
- github.com/ipld/go-car                         => ./ipld/car

Fixes #215
Updates #202


This commit was moved from ipfs/boxo@038bdd2
@Jorropo Jorropo reopened this Nov 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A net-new feature or improvement to an existing feature need/analysis Needs further analysis before proceeding topic/bitswap Topic bitswap
Projects
None yet
Development

No branches or pull requests

10 participants