Make set reconciliation propagate #1515

Closed
Frando opened this issue Sep 22, 2023 · 2 comments
Comments

Frando (Member) commented Sep 22, 2023

Moved here from Discord so we don't lose the initial ideas:

@Frando: I'm writing a bigger test for multi-party sync and we have a problem: sync does not propagate properly, I think, in all cases. This situation:

  • P1 creates a doc, many peers join
  • peer P2 goes offline and adds stuff while offline. so no gossip, because offline.
  • now they come back online. they will run sync with their swarm neighbors, let's say P3,P4,P5
  • others in the swarm will not be informed about the news from P2 at all! While P3, P4, P5 receive the new entries from P2, they do not gossip them further, because we only gossip entries that we originally publish - otherwise we'd gossip far too much by default.

however this means that we have a serious issue with content propagation. interesting that we didn't think of this yet.
@dignifiedquire has there been previous thought about that? I'm wondering if/where this came up in our design discussions but can't remember.

@Frando: what we do have is range fingerprints. we could gossip those on start. however, this will only tell a receiving peer "we are at the same state" or "we are not". we can trigger sync in case we are not (if the broadcaster included its peer info). it could lead to many many sync requests to that poor peer coming online though.
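The "gossip a fingerprint, sync only on mismatch" idea sketched above could look roughly like this. This is a minimal illustration, not iroh's actual range fingerprint: the fingerprint here is just an XOR of entry hashes (order-independent, so two peers with the same set of entries agree regardless of insertion order).

```rust
/// Stand-in fingerprint over a whole document: XOR of all entry hashes.
/// Illustrative only; iroh's real set-reconciliation fingerprints differ.
fn fingerprint(entry_hashes: &[[u8; 32]]) -> [u8; 32] {
    let mut fp = [0u8; 32];
    for h in entry_hashes {
        for (a, b) in fp.iter_mut().zip(h) {
            *a ^= b;
        }
    }
    fp
}

/// On receiving a gossiped fingerprint: trigger sync only when the
/// announced state differs from our own.
fn should_sync(local_entries: &[[u8; 32]], remote_fp: &[u8; 32]) -> bool {
    fingerprint(local_entries) != *remote_fp
}

fn main() {
    let h1 = [1u8; 32];
    let h2 = [2u8; 32];
    // Same set, different order: fingerprints match, no sync needed.
    println!("{}", should_sync(&[h1, h2], &fingerprint(&[h2, h1])));
    // We are missing h2: fingerprints differ, sync needed.
    println!("{}", should_sync(&[h1], &fingerprint(&[h1, h2])));
}
```

As noted in the quote, this only yields a binary "same state or not" signal, so every mismatching peer may pile onto the newly online broadcaster.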

@ramfox: is there a way to track which entries were added locally while offline, or the timestamp of when we went offline
and then when the peer comes back online, after or before it does a sync, it then gossips about the entries it's added

@Frando: yeah, we could think about that. keeping a list of things that weren't gossiped while offline, and then gossiping those when coming online. however this might have scaling problems if you add many entries while offline, which might not be uncommon.

@Frando: another idea, do it similarly to content propagation:
P2 comes online, syncs with P3, P4, P5. when done P3,P4,P5 gossip, to their neighbors only "hey I got new stuff from P2, if you haven't synced with them in a while, you should sync with me now, because I got their new stuff"
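This "I got new stuff from X, sync with me" hint could be sketched as a small gossip message. All names here are hypothetical, not iroh's wire format; the point is that the receiver consults its own record of when it last synced with the origin, which is what keeps the hint from triggering unconditional resyncs.

```rust
type PeerId = [u8; 32];
type NamespaceId = [u8; 32];

/// Hypothetical gossip message: after a sync that produced new entries,
/// announce to direct neighbors where the news came from.
#[derive(Debug, Clone)]
struct SyncedHint {
    /// The document this is about.
    namespace: NamespaceId,
    /// The peer the new entries originally came from (P2 above).
    origin: PeerId,
    /// The announcing peer, which now also holds the new entries (P3/P4/P5).
    holder: PeerId,
}

impl SyncedHint {
    /// A neighbor decides whether to act on the hint: if it synced with
    /// `origin` recently enough it ignores the hint, otherwise it syncs
    /// with `holder`. Timestamps are opaque u64s here for illustration.
    fn should_resync(&self, last_synced_with_origin: Option<u64>, now: u64, max_age: u64) -> bool {
        match last_synced_with_origin {
            None => true,
            Some(ts) => now.saturating_sub(ts) > max_age,
        }
    }
}

fn main() {
    let hint = SyncedHint {
        namespace: [0u8; 32],
        origin: [1u8; 32],
        holder: [2u8; 32],
    };
    // Never synced with the origin: act on the hint.
    println!("{}", hint.should_resync(None, 100, 10));
    // Synced with the origin 5 time units ago, within max_age: ignore.
    println!("{}", hint.should_resync(Some(95), 100, 10));
}
```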

@dignifiedquire: so the issue with publishing changes we observe is that it is easy to create a flood. so if we implement something like that, we need to deduplicate before publishing. these complications were one of the reasons I wanted to just run a full sync on all connects
sorry, I meant on all gossip messages

@Frando: I'm back to thinking about this. I think it is our biggest issue with sync atm. It does not even have to include peers switching between offline / online much at all. It is enough to just come online with new data, and it will not be propagated between peers that are online.

I started to write a test scenario here: #1514
it works with 3 nodes (because everyone is neighbor to everyone) but starts to fail with 4 or more.
(might be for other issues or so too, not too sure, but let's make it work 😉)
I started to think a bit more about gossiping "resync requests" to neighbors when receiving stuff from other peers, but ... it's difficult. without a "latest pointer" of some kind for a peer I'm fearing uncontrollable amplification / never-ending resync loops.
we could maybe use the "fingerprint over everything plus timestamp" as the latest pointer, and gossip that to neighbors when we synced with someone else and added entries. however, as it will also change for each gossip entry received from the swarm, I'm not very positive this will be stable enough

basically I think the question to tackle can be reformulated like this:

  • P1 comes online and syncs with neighbor P2, sends them new stuff
  • How can P2 inform their neighbors P4 and P5 that they might want to sync with P2, to get the new stuff from P1, without this leading to a loop? because once P4 synced with P2, it would inform its neighbors (including P2 and P5) to, again, resync with them, and so on...
  • in other words, what information could P2 send to P4 that says "if you don't have stuff from P1 until X, please sync with me"?

gossiping Vec<(AuthorId, LatestTimestamp)> to neighbors after finishing syncs could work? so, cheapo vector clocks. gossiping this to neighbors, then they compare. if something is older on their instance, they run sync. maybe ignore entries within a few seconds of now() so we don't rerun sync for gossip messages that arrive with a little delay.
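A minimal sketch of that comparison, assuming hypothetical `AuthorId`/`Timestamp` aliases and a hypothetical grace window (neither is iroh's actual API):

```rust
use std::collections::HashMap;

type AuthorId = [u8; 32];
type Timestamp = u64; // e.g. microseconds since epoch; an assumption here

/// Grace window: ignore announcements this close to "now", since the
/// corresponding entries are likely still in flight via regular gossip.
/// The value is illustrative.
const GRACE_MICROS: Timestamp = 2_000_000; // 2 seconds

/// Given our local per-author latest timestamps and a gossiped
/// announcement, decide whether we should sync with the announcer.
fn needs_sync(
    local: &HashMap<AuthorId, Timestamp>,
    announced: &[(AuthorId, Timestamp)],
    now: Timestamp,
) -> bool {
    announced.iter().any(|(author, remote_ts)| {
        // Skip very fresh entries: they may just be delayed gossip.
        if now.saturating_sub(*remote_ts) < GRACE_MICROS {
            return false;
        }
        match local.get(author) {
            // Unknown author: we are definitely missing entries.
            None => true,
            // Known author, but our latest entry is older: sync.
            Some(local_ts) => *local_ts < *remote_ts,
        }
    })
}

fn main() {
    let a1 = [1u8; 32];
    let a2 = [2u8; 32];
    let now: Timestamp = 100_000_000;
    let local = HashMap::from([(a1, 50_000_000), (a2, 60_000_000)]);
    // The announcer has a newer (and old enough) timestamp for a2.
    let announced = vec![(a1, 50_000_000), (a2, 70_000_000)];
    println!("{}", needs_sync(&local, &announced, now));
}
```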

(we don't have that info available atm though (apart from a full table scan), because the timestamp is part of the value, not of the key, in our records table. would need to add a table to redb to track that and extend the store trait to get it (e.g. Store::latest_timestamps() -> Self::LatestTimestampIterator with Store::LatestTimestampIterator: Iterator<Item = (AuthorId, Timestamp)>).)
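The proposed store extension could look roughly like this. Trait and type names follow the suggestion above but are illustrative, with a toy in-memory implementation standing in for the redb-backed table:

```rust
use std::collections::HashMap;

type AuthorId = [u8; 32];
type Timestamp = u64;

/// Illustrative store extension: expose the latest entry timestamp per
/// author without a full scan of the records table. Assumes a separate
/// author -> latest-timestamp table is maintained on every insert.
trait LatestTimestamps {
    type LatestTimestampIterator: Iterator<Item = (AuthorId, Timestamp)>;

    /// Iterate over (author, latest timestamp) pairs for one document.
    fn latest_timestamps(&self) -> Self::LatestTimestampIterator;
}

/// Toy in-memory stand-in for the redb-backed store.
struct MemStore {
    latest: HashMap<AuthorId, Timestamp>,
}

impl MemStore {
    /// Keep the per-author latest timestamp up to date on insert.
    fn record_insert(&mut self, author: AuthorId, ts: Timestamp) {
        let entry = self.latest.entry(author).or_insert(0);
        if ts > *entry {
            *entry = ts;
        }
    }
}

impl LatestTimestamps for MemStore {
    type LatestTimestampIterator = std::collections::hash_map::IntoIter<AuthorId, Timestamp>;

    fn latest_timestamps(&self) -> Self::LatestTimestampIterator {
        self.latest.clone().into_iter()
    }
}

fn main() {
    let mut store = MemStore { latest: HashMap::new() };
    let author = [7u8; 32];
    store.record_insert(author, 10);
    store.record_insert(author, 5); // older insert must not regress the latest
    for (a, ts) in store.latest_timestamps() {
        println!("author {:02x?}: latest ts {}", &a[..2], ts);
    }
}
```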

40 bytes (32 + 8) per author, so we could fit ~25 authors in a gossip message (*). if more are in the doc, only include those with latest changes and a hash of the rest.
(* we don't really have a size limit because it runs on quic streams, however I think it makes sense to constrain gossip messages to quic MTU)
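The sizing estimate as a back-of-the-envelope computation. The ~1000 byte payload budget (a rough QUIC MTU of ~1200 bytes minus framing overhead) is an assumption made here to match the ~25-author figure above:

```rust
const AUTHOR_ID_LEN: usize = 32; // AuthorId is 32 bytes
const TIMESTAMP_LEN: usize = 8; // Timestamp is a u64
const ENTRY_LEN: usize = AUTHOR_ID_LEN + TIMESTAMP_LEN; // 40 bytes per author

/// Assumed payload budget: ~1200 byte MTU minus framing overhead.
const MESSAGE_BUDGET: usize = 1000;

/// How many (AuthorId, Timestamp) pairs fit in one gossip message.
fn max_authors_per_message() -> usize {
    MESSAGE_BUDGET / ENTRY_LEN
}

fn main() {
    println!("{} authors per message", max_authors_per_message());
}
```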

this would assume linear inserts per author. not guaranteed, but will hold well enough in practice.

@github-project-automation github-project-automation bot moved this to 📋 Backlog - unassigned issues in iroh Sep 22, 2023
@Frando Frando mentioned this issue Sep 26, 2023
29 tasks
@b5 b5 added this to the v0.7.0 milestone Sep 26, 2023
@b5 b5 assigned Frando Oct 2, 2023
@b5 b5 moved this from 📋 Backlog - unassigned issues to 🔖 Ready - assigned issues in iroh Oct 2, 2023
@dignifiedquire dignifiedquire moved this from 🔖 Ready to 🏗 In progress in iroh Oct 4, 2023
@b5 b5 modified the milestones: v0.7.0, v0.8.0 Oct 10, 2023
dignifiedquire (Contributor) commented:

@Frando can we close this?

Frando (Member, Author) commented Oct 26, 2023

Fixed in #1613

@Frando Frando closed this as completed Oct 26, 2023
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in iroh Oct 26, 2023