Ignoring vs. undoing/correcting vs. applying updates for security and application domain logic? #432

Closed
jjlee opened this issue Sep 5, 2021 · 7 comments

Comments

@jjlee

jjlee commented Sep 5, 2021

In your talk at "Emerging Technologies for the Enterprise", you mentioned that a peer could choose to ignore updates if it decides that another peer is not allowed to make a change for security reasons. That leaves the peer that originated the changes out of sync with all other peers. Also, further changes cannot be accepted from the originating peer.

If that's not acceptable for an application, I'm trying to understand what other options there are.

  1. Might it be possible for automerge to communicate back what the application chose to ignore, allowing the originating peer to discard its failed update so that syncing can continue? I imagine the answer is "no" because it would presumably involve reaching consensus about what to discard.

  2. Perhaps it might be useful instead in some applications for some security/permissions issues to be handled by accepting the changes and then immediately making an automatic undo update, perhaps with some metadata to record what happened?

    Similarly -- and more interesting to me -- I wonder if it's feasible for applications to use automatic updates to loosely enforce consistency at the application domain rule level. For example, if an application has a calendar and after a sync finishes a peer sees that two different events occur on one day, there might be an application rule for how to automatically combine the two calendar events (to be applied by the application just as if it were an update from a user).

    Though automerge provides strong guarantees about consistency, I imagine there are still some hazards here when multiple peers try to make the same or similar automatic correcting updates concurrently, perhaps leading to confusing results?

  3. Perhaps in the future if something like your "suggest changes" proposal happens, an application could choose to make some entire classes of changes always start as automerge suggested changes, leaving the application free to choose whether to automatically apply the changes? Again I have little idea what the implications are for update anomalies/other confusion when multiple peers try to apply the same or similar changes concurrently.

I'm sure I should experiment, but I wonder if you have anything to say in this area?

@jjlee
Author

jjlee commented Sep 5, 2021

I probably should have omitted security from this: I've not thought much about that.

Really I'm mostly interested in how to impose domain logic, as in the example above about application-level merging of calendar entries. So please read my three guesses above as ways of achieving that.

@ept
Member

ept commented Sep 6, 2021

You're right that if one node accepts a change and another node rejects it, they will become inconsistent. My thinking was that change rejection should only be used in circumstances where it is unambiguous whether a change is allowed or not. Moreover, client software should stop users from making changes that would be rejected by other nodes, so that an honest mistake by a user will not trigger rejection. That way, changes would only get rejected if there is genuine malicious or buggy behaviour happening.

However, there are situations where this can very easily go wrong. Say you have a rule that a user may only make a certain change if they have permission to do so. Now a user makes a change for which they do not currently have permission, but concurrently another user grants that user permission to do that thing. Now, depending on the order in which the nodes process the change and the permission grant, some may determine that the change should be rejected, and others will accept it.
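
The divergence described above can be reproduced with a toy model. This is only a sketch: `Node`, `receive`, and the tuple-shaped updates are hypothetical names invented for illustration, not part of Automerge. It assumes each node validates an edit against whatever permissions it knows about at the moment the edit arrives.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A peer that validates each update at the moment it arrives (hypothetical)."""
    permitted: set = field(default_factory=set)   # users allowed to edit
    accepted: list = field(default_factory=list)  # edits this node applied

    def receive(self, update):
        kind, user = update
        if kind == "grant":
            self.permitted.add(user)
        elif kind == "edit" and user in self.permitted:
            self.accepted.append(update)
        # An edit from a user without permission is silently dropped.


grant = ("grant", "bob")  # another user grants Bob permission
edit = ("edit", "bob")    # Bob's concurrent edit

node_a, node_b = Node(), Node()
for update in (grant, edit):  # node A happens to see the grant first
    node_a.receive(update)
for update in (edit, grant):  # node B happens to see the edit first
    node_b.receive(update)

print(node_a.accepted)  # [('edit', 'bob')]
print(node_b.accepted)  # []
```

Both nodes end up agreeing that Bob has permission, yet only one of them kept his edit, which is exactly the inconsistency at issue.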

A consensus mechanism is one option; in that case the consensus quorum would serve as the authority that decides what to accept or reject. That would work, but it runs somewhat counter to the principles of CRDTs and local-first.

I don't really have a good answer to this, and I suspect it depends on the application. I believe @alexjg has been looking into a permissions model recently. Maybe he has something useful to say?

@alexjg
Contributor

alexjg commented Sep 6, 2021

I am planning to start work on a permissions model in the next month or so, but right now I just have thoughts and nothing concrete. The concurrency problem is significant, but my feeling is that a lot of applications can be served as long as you make it possible to express reasonably granular security logic. The issue with concurrency is that you can't revoke access to something, but this is less of a problem if you only grant very specific permissions - "actor1 may modify this comment because they have a signed message from actor3, who created the comment" - being unable to revoke this permission is a lot less troublesome than being unable to revoke "actor1 may create and modify comments".
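
To make the idea of a narrowly scoped grant concrete, here is a rough sketch. Everything in it is an assumption for illustration: `make_grant`, `is_allowed`, the `owner|grantee|action|target` claim format, and the demo key are all invented, and HMAC is used only as a standard-library stand-in for a real public-key signature (with HMAC the verifier must hold the signing key, which a real design would avoid).

```python
import hashlib
import hmac


def sign(key: bytes, message: str) -> str:
    """Stand-in for a signature: an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()


def make_grant(owner_key, owner, grantee, action, target):
    """The owner authorises one grantee to do one action on one object."""
    claim = f"{owner}|{grantee}|{action}|{target}"
    return {"claim": claim, "tag": sign(owner_key, claim)}


def is_allowed(grant, owner_key, owner, actor, action, target):
    """Accept only if the grant names exactly this actor, action and object."""
    expected_claim = f"{owner}|{actor}|{action}|{target}"
    return grant["claim"] == expected_claim and hmac.compare_digest(
        grant["tag"], sign(owner_key, grant["claim"])
    )


ACTOR3_KEY = b"actor3-demo-key"  # hypothetical key material
grant = make_grant(ACTOR3_KEY, "actor3", "actor1", "modify", "comment-42")

same_comment = is_allowed(grant, ACTOR3_KEY, "actor3", "actor1", "modify", "comment-42")
other_comment = is_allowed(grant, ACTOR3_KEY, "actor3", "actor1", "modify", "comment-99")

# Rewriting the claim without re-signing it does not widen the grant.
forged = dict(grant, claim="actor3|actor1|modify|comment-99")
forged_ok = is_allowed(forged, ACTOR3_KEY, "actor3", "actor1", "modify", "comment-99")

print(same_comment, other_comment, forged_ok)  # True False False
```

Because the grant covers a single comment, the damage from never being able to revoke it is bounded by that one comment.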

@jjlee
Author

jjlee commented Sep 9, 2021

@ept Thanks.

Your answer as I understand it is about the implications of "ignore updates", and about my option 1. This is along the lines I expected: except in the restricted case you explain, it's not within the framework of CRDT because it requires consensus (which could be bolted on).

However, do you have any observations about the wisdom of my option 2. above: i.e. application code making CRDT updates just as a user might, in an attempt to impose domain rules? (Apologies again for opening one issue that is really about at least two separate questions.)

I have in mind application code making these updates in response to user CRDT updates, such as two users concurrently adding a calendar entry on the same day, where the two calendar entries can be automatically combined into one using application-specific logic. I do also mostly have in mind just a few peers having access to each document, but of course there's no hard limit on that in principle.

I imagine many times the outcome would be benign, with this resulting either in trivial automerge merges or ones that don't confuse users. But in some cases where application code on different peers concurrently makes different changes, even though automerge will do its job, perhaps the history gets quite complicated and surprising and user confusion results (also developer confusion ;-). I suspect the likelihood of complicated history and confused users would depend on what application rules are chosen, the number of peers, sync latency, etc.

Is this option 2. kind of thing just a bad idea, or might it be workable?

Edit: Just to be clear: I do understand that I can nominate a special peer to act as server (or again bolt on a consensus mechanism). But I'm trying to understand how things might work without that.

@HerbCaudill
Contributor

I see this as two separate questions:

  • Permissions: You can avoid the need for consensus mechanisms by having authority flow from the creator of the group. I've been tinkering for a while on a library called localfirst/auth that implements this idea. This (draft) blog post explains the thinking behind this library in more detail.

  • Domain-specific conflict resolution: When I started working on localfirst/auth, I assumed I'd use Automerge to record and sync group membership state. But in this domain, "conflict" means something more than "modified the same property." For example, if Bob adds Charlie and concurrently Alice removes Bob, that's a conflict. And the way that conflict is resolved is very specific (Bob's addition of Charlie needs to be discarded). And so on: What if two people remove each other, etc. So I built what ended up being a generic CRDT engine that can work with any domain's semantics and rules; it's basically a Redux-like store but with a hook for conflict-resolution rules so it can be replicated across peers. I've pulled this machinery out into its own library, which I'm calling CRDX.

These are both works in progress - please don't use for anything real until they're released as the APIs might still change - but hopefully the thinking behind them is useful for what you're doing.
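
As a rough illustration of the membership example above (and not the API of localfirst/auth or CRDX), a domain-specific resolver might look like the sketch below. The `resolve` function, the `(actor, kind, target)` tuples, and the choice that mutual removal removes both people are assumptions made for this example.

```python
def resolve(members, concurrent_actions):
    """Merge concurrent membership actions under one hypothetical rule set.

    Each action is an (actor, kind, target) tuple with kind "add" or "remove".
    Rule 1: every removal takes effect.
    Rule 2: an addition made by someone who was concurrently removed is discarded.
    The result does not depend on the order of concurrent_actions.
    """
    removed = {target for _, kind, target in concurrent_actions if kind == "remove"}
    added = {
        target
        for actor, kind, target in concurrent_actions
        if kind == "add" and actor not in removed
    }
    return (set(members) | added) - removed


group = {"alice", "bob"}
actions = [
    ("bob", "add", "charlie"),   # Bob adds Charlie...
    ("alice", "remove", "bob"),  # ...while Alice concurrently removes Bob
]

print(sorted(resolve(group, actions)))                  # ['alice']
print(sorted(resolve(group, list(reversed(actions)))))  # ['alice']
```

A property-level merge would have kept Charlie, since the two actions touch different members; the domain rule is what discards Bob's addition.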

@ept
Member

ept commented Sep 13, 2021

"any observations about the wisdom of my option 2. above: i.e. application code making CRDT updates just as a user might, in an attempt to impose domain rules?"

I am a bit hesitant to recommend this approach because I think it could easily get very confusing. In particular, if several peers concurrently detect that something needs to be fixed up and independently perform the fix, the result is that the fix gets applied several times. Some CRDT operations are idempotent so this is not a problem (e.g. deleting an element from a list several times is the same as deleting it once), but other operations are not (e.g. if you insert into a list, you will end up with several copies of the inserted element, one for each peer).
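
A toy model shows the difference between the two kinds of fix. It is a simplification and not how Automerge represents lists internally: `materialize` and the `(kind, actor, payload)` operation tuples are hypothetical. The only assumptions are that a delete names an existing element by id and that each insert creates a fresh element owned by the peer that issued it.

```python
def materialize(base, ops):
    """Apply merged operations to a base list of (element_id, value) pairs.

    ops are (kind, actor, payload) tuples:
      ("delete", actor, element_id) removes the element with that id;
      ("insert", actor, value) appends a new element owned by that actor.
    """
    deleted = {payload for kind, _, payload in ops if kind == "delete"}
    inserted = [
        (f"{actor}:new", payload)
        for kind, actor, payload in sorted(ops)
        if kind == "insert"
    ]
    return [value for element_id, value in base + inserted if element_id not in deleted]


# Both peers notice the same duplicate entry and delete it independently.
base = [("e1", "Standup"), ("e2", "Standup (duplicate)")]
delete_fixes = [("delete", "peerA", "e2"), ("delete", "peerB", "e2")]
print(materialize(base, delete_fixes))  # ['Standup']

# Both peers notice a missing entry and insert it independently.
base = [("e1", "Standup")]
insert_fixes = [("insert", "peerA", "Reminder"), ("insert", "peerB", "Reminder")]
print(materialize(base, insert_fixes))  # ['Standup', 'Reminder', 'Reminder']
```

Deleting the same element twice converges to a single deletion, while the two inserts are distinct elements and both survive the merge.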

There is also the risk of introducing infinite loops if one peer makes a change, another undoes/modifies it, the first reacts again to the undo/modification, etc. You can avoid this if you're careful, but I'm still concerned about it.

If you want to introduce custom domain-specific logic, I think a more robust approach would be to use an "event sourcing" style, where the Automerge document only contains a log of events, and each peer uses application logic to derive the current application state from those events. By processing the events in the order they appear in the log, each peer can end up in the same state, and you are free to use arbitrary logic as long as it is deterministic. This way you're not using much of the CRDT machinery, but it'll probably be easier to reason about.
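
A minimal sketch of that event-sourcing style, using the calendar example from earlier in the thread, might look as follows. It assumes the synced document is nothing more than an ordered list of event records that every peer sees in the same order; `derive_calendar`, the event fields, and the rule of joining same-day titles with " / " are all invented for illustration.

```python
def derive_calendar(event_log):
    """Deterministically fold an ordered event log into {day: combined title}."""
    titles_by_day = {}
    for event in event_log:
        if event["type"] == "add_entry":
            titles_by_day.setdefault(event["day"], set()).add(event["title"])
        elif event["type"] == "remove_entry":
            titles_by_day.get(event["day"], set()).discard(event["title"])
    # Domain rule (hypothetical): entries on the same day are combined into one.
    return {
        day: " / ".join(sorted(titles))
        for day, titles in sorted(titles_by_day.items())
        if titles
    }


# The shared log after syncing; Alice and Bob each added an entry for the 14th.
log = [
    {"type": "add_entry", "actor": "alice", "day": "2021-09-14", "title": "Dentist"},
    {"type": "add_entry", "actor": "bob", "day": "2021-09-14", "title": "Team lunch"},
    {"type": "add_entry", "actor": "alice", "day": "2021-09-15", "title": "Review"},
]

peer_a_view = derive_calendar(log)
peer_b_view = derive_calendar(list(log))  # another peer running the same code

print(peer_a_view)
# {'2021-09-14': 'Dentist / Team lunch', '2021-09-15': 'Review'}
print(peer_a_view == peer_b_view)  # True
```

The combined entry is never written back into the shared document, so no peer issues a corrective update that another peer could duplicate or react to.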

@jjlee
Author

jjlee commented Sep 14, 2021

Thanks! Don't know how both of you answer questions as thoughtfully as you do while producing great research and software.

Event sourcing was where I started but I suspected I was making things more complicated than needed, so that's a very helpful push. Still not sure exactly how that will turn out with automerge and my domain, but it'll be fun to try.

I'll be sure to follow your work in both these areas, Herb, really interesting stuff and very relevant to this.
