Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RO syncs #1008

Merged
merged 4 commits into from
Jul 8, 2020
Merged

RO syncs #1008

merged 4 commits into from
Jul 8, 2020

Conversation

icristescu
Copy link
Contributor

A straightforward implementation for RO sync. I'll add more tests (concurrent ones too) so for now its a draft.

The way I think ro_sync should be used is to catch exceptions for RO operations, and retry ro_sync in case of failure.

Depends on mirage/index#175

let ro_sync t =
Contents.CA.ro_sync (contents_t t);
Node.CA.ro_sync (snd (node_t t));
Commit.CA.ro_sync (snd (commit_t t))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens in the case where the 3 stores are actually the same store? Do we have a way to make this more efficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left only one ro_sync call and added a comment.

@icristescu icristescu force-pushed the ro_sync branch 3 times, most recently from 747b57f to 3ae1bc8 Compare June 15, 2020 18:37
@icristescu
Copy link
Contributor Author

If during a merge, an RO instance calls ro_sync and then looks for a value, that was flushed to disk by the merge, it will lead to an exception raised instead of returning None (see https://github.com/mirage/irmin/pull/1008/files#diff-51f1262f591288eadf8719ee5f293b18R309).

I added some tests that use the force_merge function from index, so I switched to Private.Make in pack_index. I think this is fine, we are not allowed to use pack_index from outside irmin-pack, but maybe there is a cleaner way to do this?

@icristescu icristescu marked this pull request as ready for review June 15, 2020 18:54
Copy link
Member

@craigfe craigfe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some tests that use the force_merge function from index, so I switched to Private.Make in pack_index. I think this is fine, we are not allowed to use pack_index from outside irmin-pack, but maybe there is a cleaner way to do this?

I'm very uneasy about that. The private API of Index is now being exposed by Irmin_pack.Index, and the tests get access to that API very easily (no references to Private or such). Indeed, constructing a Pack instance now requires the user to construct an Index with the private API.

If testing Index dependencies requires them having access to the private API, perhaps we need to rethink the public API 🙂

@icristescu
Copy link
Contributor Author

I'm very uneasy about that. The private API of Index is now being exposed by Irmin_pack.Index, and the tests get access to that API very easily (no references to Private or such). Indeed, constructing a Pack instance now requires the user to construct an Index with the private API.

If testing Index dependencies requires them having access to the private API, perhaps we need to rethink the public API 🙂

I would like to have a Make_private functor for irmin pack that can use the private API of index. I tried and it leads to a lot of code duplication, so I gave up.

But we need to use the private API of index in this PR, the problematic behaviour of ro_sync only occurs during a merge.

@icristescu
Copy link
Contributor Author

I added some tests that use the force_merge function from index, so I switched to Private.Make in pack_index.

I removed the private API of index, and used Index.filter and Index.flush instead.

@craigfe
Copy link
Member

craigfe commented Jun 25, 2020

As discussed offline, the tricky case of "flush to index before indexed value has been flushed to pack" could be solved by having Index support a pre-flush callback. We would use that callback to flush to Pack.

@craigfe
Copy link
Member

craigfe commented Jun 25, 2020

(And we should be sure to fsync that flush inside Pack, so as to keep the ordering correct even in the presence of crashes.)

@icristescu
Copy link
Contributor Author

As discussed offline, the tricky case of "flush to index before indexed value has been flushed to pack" could be solved by having Index support a pre-flush callback. We would use that callback to flush to Pack.

I added the hook from mirage/index#189 in the latest commit, and the tests now work fine, no invalid_read is raised.

@icristescu icristescu force-pushed the ro_sync branch 7 times, most recently from 0e5d531 to 2f62593 Compare July 1, 2020 09:19
@icristescu icristescu force-pushed the ro_sync branch 2 times, most recently from fcc376b to 135f697 Compare July 8, 2020 13:32
@craigfe
Copy link
Member

craigfe commented Jul 8, 2020

Thanks. Will wait for Travis to wake up and then merge.

@craigfe craigfe merged commit 5b962e9 into mirage:master Jul 8, 2020
@icristescu icristescu deleted the ro_sync branch July 8, 2020 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants