Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an is_merging function #192

Merged
merged 3 commits into from
Jul 9, 2020
Merged

Conversation

icristescu
Copy link
Contributor

is_merging returns true is the index is running an async merge.

I used the merge_lock to detect that a merge is ongoing. Other operations take this lock (clear and close), but as they are not async, so it should be fine.

@icristescu icristescu marked this pull request as draft July 8, 2020 18:17
@craigfe
Copy link
Member

craigfe commented Jul 8, 2020

Note that read-only instances don't share the same merge_lock as read-write instances, so they will always get is_merging t = false (except perhaps when a sibling in another thread is in the process of closing the instance).

@icristescu
Copy link
Contributor Author

Note that read-only instances don't share the same merge_lock as read-write instances, so they will always get is_merging t = false (except perhaps when a sibling in another thread is in the process of closing the instance).

Indeed, I tried to clarify this point in the doc of is_merging. I added an exception if called by the RO instance to make this more clear.

@craigfe
Copy link
Member

craigfe commented Jul 8, 2020

We now have quite a number of functions that work for only one of the instance types. By my count:

  • either. find, mem, close, iter
  • RW-only. clear, flush, filter, replace, is_merging (private: force_merge / await)
  • RO-only. sync

And even the functions that are shared often either change behaviour according to the instance type or do things for both instance types that only make sense for once, such as taking certain locks.

This is out-of-scope for this PR, but if this trend of diverging interfaces continues I think we should consider splitting these out into different functors – or at least different in-memory representations for each instance type sharing the same functor (perhaps using phantom capabilities).

@icristescu icristescu marked this pull request as ready for review July 9, 2020 07:04
Comment on lines 348 to 351
let k1, v1 = (Key.v (), Value.v ()) in
Index.replace rw k1 v1;
let t = Index.force_merge ~hook:(before (f "before" true)) rw in
Index.await t |> check_completed;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a helper e.g. add_binding_and_merge ~hook, to be used 3 times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nicer with your suggestion, thanks!

@craigfe
Copy link
Member

craigfe commented Jul 9, 2020

  • CHANGES.md entry

Copy link
Member

@craigfe craigfe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge shortly.

@icristescu icristescu merged commit bd2c093 into mirage:master Jul 9, 2020
@icristescu icristescu deleted the is_merging branch July 9, 2020 11:14
craigfe added a commit to craigfe/opam-repository that referenced this pull request Oct 21, 2020
CHANGES:

## Added

- Added `flush_callback` parameter to the creation of a store, to register
  a callback before a flush. This callback can be temporarily disabled by
  `~no_callback:()` to `flush`. (mirage/index#189, mirage/index#216)

- Added `Stats.merge_durations` to list the duration of the last 10 merges.
  (mirage/index#193)

- Added `is_merging` to detect if a merge is running. (mirage/index#192)

- New `IO.Header.{get,set}` functions to read and write the file headers
  atomically (mirage/index#175, mirage/index#204, @icristescu, @craigfe, @samoht)

- Added a `throttle` configuration option to select the strategy to use
  when the cache are full and an async merge is already in progress. The
  current behavior is the (default) [`Block_writes] strategy. The new
  [`Overcommit_memory] does not block but continue to fill the cache instead.
  (mirage/index#209, @samoht)

- Add `IO.exists` obligation for IO implementations, to be used for lazy
  creation of IO instances. (mirage/index#233, @craigfe)

- `Index.close` now takes an `~immediately:()` argument. When passed, this
  causes `close` to terminate any ongoing asynchronous merge operation, rather
  than waiting for it to finish. (mirage/index#185, mirage/index#234)

- Added `Index.Checks.cli`, which provides offline integrity checking of Index
  stores. (mirage/index#236)

## Changed

- `sync` has to be called by the read-only instance to synchronise with the
  files on disk. (mirage/index#175)

- Caching of `Index` instances is now explicit: `Index.Make` requires a cache
  implementation, and `Index.v` may be passed a cache to be used for instance
  sharing. The default behaviour is _not_ to share instances. (mirage/index#188)

## Fixed

- Added values after a clear are found by read-only instances. (mirage/index#168)
- Fix a race between `merge` and `sync` (mirage/index#203, @samoht, @craigfe)
- Fix a potential loss of data if a crash occurs at the end of a merge (mirage/index#232)
craigfe added a commit to craigfe/opam-repository that referenced this pull request Jan 5, 2021
CHANGES:

## Added

- Added `flush_callback` parameter to the creation of a store, to register
  a callback before a flush. This callback can be temporarily disabled by
  `~no_callback:()` to `flush`. (mirage/index#189, mirage/index#216)

- Added `Stats.merge_durations` to list the duration of the last 10 merges.
  (mirage/index#193)

- Added `is_merging` to detect if a merge is running. (mirage/index#192)

- New `IO.Header.{get,set}` functions to read and write the file headers
  atomically (mirage/index#175, mirage/index#204, @icristescu, @craigfe, @samoht)

- Added a `throttle` configuration option to select the strategy to use
  when the cache are full and an async merge is already in progress. The
  current behavior is the (default) `` `Block_writes`` strategy. The new
  `` `Overcommit_memory`` does not block but continue to fill the cache instead.
  (mirage/index#209, @samoht)

- Add `IO.exists` obligation for IO implementations, to be used for lazy
  creation of IO instances. (mirage/index#233, @craigfe)

- `Index.close` now takes an `~immediately:()` argument. When passed, this
  causes `close` to terminate any ongoing asynchronous merge operation, rather
  than waiting for it to finish. (mirage/index#185, mirage/index#234)

- Added `Index.Checks.cli`, which provides offline integrity checking of Index
  stores. (mirage/index#236)

- `Index.replace` now takes a `~overcommit` argument to postpone a merge. (mirage/index#253)

- `Index.merge` is now part of the public API. (mirage/index#253)

- `Index.try_merge` is now part of the public API. `try_merge' is a no-op if
  the number of entries in the write-ahead log is smaller than `log_size`,
  otherwise it's `merge'. (mirage/index#253 @samoht)

## Changed

- `sync` has to be called by the read-only instance to synchronise with the
  files on disk. (mirage/index#175)
- Caching of `Index` instances is now explicit: `Index.Make` requires a cache
  implementation, and `Index.v` may be passed a cache to be used for instance
  sharing. The default behaviour is _not_ to share instances. (mirage/index#188)

## Fixed

- Added values after a clear are found by read-only instances. (mirage/index#168)
- Fix a race between `merge` and `sync` (mirage/index#203, @samoht, @craigfe)
- Fix a potential loss of data if a crash occurs at the end of a merge (mirage/index#232)
- Fix `Index.iter` to only iterate once over elements persisted on the disk
  (mirage/index#260, @samoht, @icristescu)
kit-ty-kate pushed a commit to craigfe/opam-repository that referenced this pull request Jan 6, 2021
CHANGES:

## Added

- Added `flush_callback` parameter to the creation of a store, to register
  a callback before a flush. This callback can be temporarily disabled by
  `~no_callback:()` to `flush`. (mirage/index#189, mirage/index#216)

- Added `Stats.merge_durations` to list the duration of the last 10 merges.
  (mirage/index#193)

- Added `is_merging` to detect if a merge is running. (mirage/index#192)

- New `IO.Header.{get,set}` functions to read and write the file headers
  atomically (mirage/index#175, mirage/index#204, @icristescu, @craigfe, @samoht)

- Added a `throttle` configuration option to select the strategy to use
  when the cache are full and an async merge is already in progress. The
  current behavior is the (default) `` `Block_writes`` strategy. The new
  `` `Overcommit_memory`` does not block but continue to fill the cache instead.
  (mirage/index#209, @samoht)

- Add `IO.exists` obligation for IO implementations, to be used for lazy
  creation of IO instances. (mirage/index#233, @craigfe)

- `Index.close` now takes an `~immediately:()` argument. When passed, this
  causes `close` to terminate any ongoing asynchronous merge operation, rather
  than waiting for it to finish. (mirage/index#185, mirage/index#234)

- Added `Index.Checks.cli`, which provides offline integrity checking of Index
  stores. (mirage/index#236)

- `Index.replace` now takes a `~overcommit` argument to postpone a merge. (mirage/index#253)

- `Index.merge` is now part of the public API. (mirage/index#253)

- `Index.try_merge` is now part of the public API. `try_merge' is a no-op if
  the number of entries in the write-ahead log is smaller than `log_size`,
  otherwise it's `merge'. (mirage/index#253 @samoht)

## Changed

- `sync` has to be called by the read-only instance to synchronise with the
  files on disk. (mirage/index#175)
- Caching of `Index` instances is now explicit: `Index.Make` requires a cache
  implementation, and `Index.v` may be passed a cache to be used for instance
  sharing. The default behaviour is _not_ to share instances. (mirage/index#188)

## Fixed

- Added values after a clear are found by read-only instances. (mirage/index#168)
- Fix a race between `merge` and `sync` (mirage/index#203, @samoht, @craigfe)
- Fix a potential loss of data if a crash occurs at the end of a merge (mirage/index#232)
- Fix `Index.iter` to only iterate once over elements persisted on the disk
  (mirage/index#260, @samoht, @icristescu)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants