Taken here from Discord to not lose initial ideas:
@Frando: I'm writing a bigger test for multi-party sync and we have a problem: sync does not propagate properly, I think, in all cases. This situation:
P1 creates a doc, many peers join
peer P2 goes offline and adds stuff while offline. so no gossip, because offline.
now they come back online. they will run sync with their swarm neighbors, let's say P3,P4,P5
others in the swarm will not be informed about the news from P2 at all!
while P3,P4,P5 receive the new entries from P2, they do not gossip them further, because we only gossip entries that we originally publish - otherwise we'd gossip far too much by default.
however this means that we have a serious issue about content propagation. interesting that we didn't think of this yet. @dignifiedquire, has there been previous thought about that? I'm wondering if/where this came up in our design discussions, but I can't remember.
@Frando: what we do have is range fingerprints. we could gossip those on start. however, this will only tell a receiving peer "we are at the same state" or "we are not". we can trigger sync in case we are not (if the broadcaster included its peer info). it could lead to many many sync requests to that poor peer coming online though.
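For illustration, a minimal sketch of that "gossip a fingerprint on start" idea, assuming placeholder types (these are not iroh-sync's actual fingerprint or peer types):

```rust
// Placeholder types for illustration only; iroh-sync's real fingerprint
// and peer-id types look different.
type Fingerprint = [u8; 32];
type PeerId = [u8; 32];

/// Hypothetical broadcast a peer could send when (re)joining the swarm.
struct StateAnnounce {
    /// Fingerprint over all entries the announcing peer holds for the doc.
    fingerprint: Fingerprint,
    /// Dialing info so receivers can start a sync if fingerprints differ.
    peer: PeerId,
}

/// Receiver side: a mismatch only tells us that *something* differs, not
/// who is ahead, so every mismatching receiver would dial the announcer.
fn on_announce(local_fingerprint: Fingerprint, msg: &StateAnnounce) -> Option<PeerId> {
    (msg.fingerprint != local_fingerprint).then_some(msg.peer)
}
```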
@ramfox: is there a way to track which entries were added locally while offline, or the timestamp of when we went offline
and then when the peer comes back online, after or before it does a sync, it then gossips about the entries it's added
@Frando: yeah, we could think about that: keeping a list of things that weren't gossiped while offline, and then gossiping those when coming online. however this might have scaling problems if you add many entries while offline, which might not be uncommon.
@Frando: another idea: do it similarly to content propagation:
P2 comes online, syncs with P3, P4, P5. when done, P3, P4, P5 gossip to their neighbors only: "hey, I got new stuff from P2; if you haven't synced with them in a while, you should sync with me now, because I got their new stuff"
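A hedged sketch of what such a neighbor-only hint could look like (the name and fields are assumptions for illustration, not an existing iroh gossip message):

```rust
// Placeholder types; not iroh's actual identifiers.
type PeerId = [u8; 32];
type NamespaceId = [u8; 32];

/// Hypothetical hint P3/P4/P5 would broadcast to their direct neighbors
/// after a sync with P2 finished and brought in new entries.
struct SyncedWithHint {
    /// The document this hint is about.
    namespace: NamespaceId,
    /// The peer whose new entries were just received (P2 in the example).
    source: PeerId,
    /// When that sync finished, so receivers can judge whether they have
    /// synced with `source` recently enough themselves.
    finished_at: u64,
}
```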
@dignifiedquire: so the issue with publishing changes we observe is that it is easy to create a flood. so if we implement something like that, we need to deduplicate before publishing. these complications were one of the reasons I wanted to just run a full sync on all connects
sorry, I meant on all gossip messages
@Frando: I'm back to thinking about this. I think it is our biggest issue with sync atm. It does not even have to include peers switching between offline / online much at all. It is enough to just come online with new data, and it will not be propagated between peers that are online.
I started to write a test scenario here: #1514
it works with 3 nodes (because everyone is neighbor to everyone) but starts to fail with 4 or more.
(might be due to other issues too, not too sure, but let's make it work 😉)
I started to think a bit more about gossiping "resync requests" to neighbors when receiving stuff from other peers, but ... it's difficult. without a "latest pointer" of some kind for a peer, I'm fearing uncontrollable amplification / never-ending resync loops.
we could use the "fingerprint over everything plus timestamp" as a latest pointer maybe, and gossip that to neighbors when we synced with someone else and added entries. however, as it will also change for each gossip entry received from the swarm, I'm not very confident this would be stable enough.
basically I think the question to tackle can be reformulated like this:
P1 comes online and syncs with neighbor P2, sends them new stuff
How can P2 inform their neighbors P4 and P5 that they might want to sync with P2, to get the new stuff from P1 - without this leading to a loop, because once P4 synced with P2, it would inform its neighbors (including P2 and P5) to, again, resync with them, and so on...
in other words, what information could P2 send to P4 that says "if you don't have stuff from P1 until X, please sync with me"?
gossiping `Vec<(AuthorId, LatestTimestamp)>` to neighbors after finishing syncs could work? so, cheapo vector clocks. gossiping this to neighbors, then they compare. if something is older on their instance, run sync. maybe ignore entries within a few seconds of `now()` to not rerun sync for gossip messages arriving with a little delay.
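A minimal sketch of that comparison, assuming placeholder types and a configurable grace period (none of this is existing iroh-sync API):

```rust
// Placeholder types for illustration; timestamps are assumed to be in the
// same unit as entry timestamps (e.g. microseconds since the Unix epoch).
type AuthorId = [u8; 32];
type Timestamp = u64;

/// "Heads" gossiped to neighbors after a sync finished: the latest entry
/// timestamp we hold per author.
type Heads = Vec<(AuthorId, Timestamp)>;

/// Decide whether to start a sync with the peer that sent `remote`.
/// `grace` skips entries very close to `now`, which may still be on their
/// way as regular gossip messages.
fn should_sync(local: &Heads, remote: &Heads, now: Timestamp, grace: Timestamp) -> bool {
    remote.iter().any(|(author, remote_ts)| {
        if now.saturating_sub(*remote_ts) < grace {
            return false; // too fresh, likely still propagating via gossip
        }
        match local.iter().find(|(a, _)| a == author) {
            // The sender has a newer entry from this author than we do.
            Some((_, local_ts)) => local_ts < remote_ts,
            // Unknown author: we are definitely missing entries.
            None => true,
        }
    })
}
```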
(we don't have that info available atm though (apart from a full table scan), because the timestamp is part of the value, not of the key, in our records table. would need to add a table to redb to track that and extend the store trait to get it (e.g. `Store::latest_timestamps() -> Self::LatestTimestampIterator` with `Store::LatestTimestampIterator: Iterator<Item = (AuthorId, Timestamp)>`).)
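A sketch of how that store extension could look; only the proposed addition is shown, with placeholder types (the actual store trait in iroh-sync has many more methods and its own types):

```rust
// Placeholder types for illustration.
type AuthorId = [u8; 32];
type Timestamp = u64;
type NamespaceId = [u8; 32];

/// Only the proposed addition is sketched here; the real store trait
/// carries many more methods.
trait Store {
    /// Iterator over the latest entry timestamp per author in a document.
    type LatestTimestampIterator: Iterator<Item = (AuthorId, Timestamp)>;

    /// Served from a dedicated redb table that is updated on every insert,
    /// instead of a full scan over the records table.
    fn latest_timestamps(&self, namespace: NamespaceId) -> Self::LatestTimestampIterator;
}
```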
40 bytes (32 + 8) per author, so we could fit ~25 authors in a gossip message (*). if more are in the doc, only include those with latest changes and a hash of the rest.
(* we don't really have a size limit because it runs on quic streams, however I think it makes sense to constrain gossip messages to quic MTU)
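Back-of-the-envelope check of that estimate; the per-message budget and envelope overhead below are assumptions, not measured values:

```rust
fn main() {
    let author_id = 32; // bytes
    let timestamp = 8; // bytes
    let per_author = author_id + timestamp; // 40 bytes per (AuthorId, LatestTimestamp)
    let budget = 1200; // assumed budget: roughly one QUIC packet worth of payload
    let overhead = 200; // assumed envelope: namespace id, signature, framing, ...
    println!("authors per message: {}", (budget - overhead) / per_author); // 25
}
```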
this would assume linear inserts per author. not guaranteed, but will hold well enough in practice.