Buffer Improvements: add adversarial filesystem test variants for disk buffer v2 #10324

tobz · 2021-12-07T18:45:07Z

Part of the Buffer Improvements RFC (RFC, #9476)

As part of the work on #10143, we opted to defer adding tests that exercise the new disk buffer implementation on top of an "adversarial" filesystem: one that can inject errors that are normally rare in practice, so that we can verify that we catch and handle them.

A likely first pass is some variant of buffer_perf that stores its data on the aforementioned adversarial filesystem, tracks which writes succeed and which fail, and checks whether that matches what we see from the reader side. Essentially, the goal becomes: if we got no error back when writing and flushing a record, we should be able to read it back, or correctly detect that the data was modified outside of our control, and we should be able to account for every single attempted write.
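The accounting described above could be sketched roughly as follows. This is only an illustration under assumptions: `Ledger`, `WriteOutcome`, and `Violation` are hypothetical names, records are identified by a plain `u64` ID, and none of this reflects the actual buffer_perf code. A write that returned an error is allowed to show up on the reader side or not; only acknowledged writes must be accounted for.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Outcome the writer observed for one record (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteOutcome {
    /// Both the write and the flush returned Ok.
    Acked,
    /// The write or the flush returned an error.
    Failed,
}

/// A way in which the reader side disagrees with the writer side.
#[derive(Debug, PartialEq, Eq)]
pub enum Violation {
    /// Acknowledged, but neither read back nor reported as corrupted.
    LostAckedRecord(u64),
    /// The reader produced a record the writer never attempted.
    UnknownRecord(u64),
}

/// Tracks every attempted write so it can be reconciled later.
#[derive(Default)]
pub struct Ledger {
    attempts: BTreeMap<u64, WriteOutcome>,
}

impl Ledger {
    /// Records the outcome of one attempted write.
    pub fn record(&mut self, id: u64, outcome: WriteOutcome) {
        self.attempts.insert(id, outcome);
    }

    /// Compares the attempts against the IDs the reader returned (`read`)
    /// and the IDs the reader flagged as corrupted (`corrupted`).
    pub fn reconcile(&self, read: &BTreeSet<u64>, corrupted: &BTreeSet<u64>) -> Vec<Violation> {
        let mut violations = Vec::new();
        for (&id, &outcome) in &self.attempts {
            let accounted = read.contains(&id) || corrupted.contains(&id);
            if outcome == WriteOutcome::Acked && !accounted {
                violations.push(Violation::LostAckedRecord(id));
            }
        }
        for &id in read {
            if !self.attempts.contains_key(&id) {
                violations.push(Violation::UnknownRecord(id));
            }
        }
        violations
    }
}
```

A test run would call `record` after every write/flush attempt, drain the reader, and then require `reconcile` to return an empty list.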

One option could be to explore CharybdeFS as the adversarial filesystem implementation: it is relatively well-maintained, should be battle-tested since it's written and used by ScyllaDB, and is programmatically controllable via Thrift RPC, which should be easy to integrate into a test harness.

Another option that is slightly easier to integrate would be to use something like fuser to create and control our target filesystem entirely in Rust. This could allow us to write all of the testing code in Rust, potentially as a single binary that is fed a seed for the RNG used to choose which FS operations succeed or fail.
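To make the seeded-RNG idea concrete, here is a minimal, std-only sketch of the decision logic. It does not use fuser at all: `FaultPlan` and `FsOp` are hypothetical names, the PRNG is a hand-rolled SplitMix64, and the assumption is that whatever filesystem layer we end up with would consult something like `decide` before servicing each operation.

```rust
use std::io;

/// The kinds of filesystem operations we might fail (hypothetical set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FsOp {
    Read,
    Write,
    Fsync,
}

/// Decides, deterministically from a seed, which operations fail.
pub struct FaultPlan {
    state: u64,
    /// Failure rate in parts per thousand (0 = never, 1000 = always).
    fail_per_mille: u64,
}

impl FaultPlan {
    pub fn new(seed: u64, fail_per_mille: u64) -> Self {
        FaultPlan { state: seed, fail_per_mille }
    }

    /// SplitMix64 step: small, dependency-free, and reproducible.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Returns Ok if the operation should proceed, or an injected error.
    pub fn decide(&mut self, op: FsOp) -> io::Result<()> {
        if self.next_u64() % 1000 < self.fail_per_mille {
            Err(io::Error::new(
                io::ErrorKind::Other,
                format!("injected fault on {:?}", op),
            ))
        } else {
            Ok(())
        }
    }
}
```

Because the whole sequence of decisions depends only on the seed, a failing run could be reproduced by passing the same seed again, provided the operations arrive in the same order.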


tobz · 2021-12-08T19:38:01Z

As a data point: we've encountered a few assertions in specific tests where using real file operations under test leads to nondeterminism around how many times a future has to be polled before it reaches the await point we expect it to, and so on.

An adversarial filesystem would help us root out this nondeterminism more easily, so that the tests could be written more robustly. Problems that never surface locally, where file operations are fast and never fail, are more easily triggered in CI, but even then, CI is not always slower, so having a filesystem we can make very slow would potentially be useful for developing more robust tests.

tobz · 2022-01-18T21:03:08Z

Chatting about this a little more with @blt, the plan as laid out above is likely unworkable for a single reason: unit tests are meant to assert deterministic scenarios, whereas adversarial filesystems are more amenable to black box testing.

While we should test something like the buffer_perf example binary against an adversarial filesystem, unit tests themselves require too much control. We would essentially be using the filesystem as a way to control what I/O operations respond with, which is useful, but it is not the same as throwing arbitrary/randomized errors back and checking whether committed writes were actually written.

Thus, I'm going to transform this issue to cover what sort of test we should run on top of an adversarial filesystem: likely something as described above, or as described in the CharybdeFS documentation itself. A new issue will be created to track the work to actually use property-based testing, along with some code refactoring, to meaningfully control both the input operations (read, write, flush, etc.) and the way the "filesystem" should respond, and to find sequences of filesystem operation responses that invalidate our expectations.
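As a rough illustration of the second idea, the sketch below scripts the "filesystem" response for every operation and searches for a script that breaks the expectation "every record that was written and then flushed without error is durable". Everything here is an assumption for illustration: `ToyWriter` is a toy stand-in rather than the real disk buffer, and the search is a plain exhaustive enumeration over short sequences instead of a property-testing library such as proptest.

```rust
/// Input operations driven by the test (hypothetical set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Write(u64),
    Flush,
}

/// Toy stand-in for the buffer writer; not the real implementation.
#[derive(Default)]
pub struct ToyWriter {
    pending: Vec<u64>,
    durable: Vec<u64>,
}

impl ToyWriter {
    /// Applies `op` given the scripted filesystem response `fs_ok`.
    /// Returns whether the operation reported success to the caller.
    pub fn apply(&mut self, op: Op, fs_ok: bool) -> bool {
        match (op, fs_ok) {
            (Op::Write(id), true) => {
                self.pending.push(id);
                true
            }
            (Op::Flush, true) => {
                self.durable.append(&mut self.pending);
                true
            }
            // A failed write stores nothing; a failed flush leaves the
            // pending records unacknowledged.
            (_, false) => false,
        }
    }
}

/// Tries every success/failure script for `ops` and returns the first
/// script under which an acknowledged record is not durable.
pub fn find_counterexample(ops: &[Op]) -> Option<Vec<bool>> {
    assert!(ops.len() < 20, "exhaustive search is only for short sequences");
    for bits in 0u32..(1u32 << ops.len()) {
        let script: Vec<bool> = (0..ops.len()).map(|i| bits & (1 << i) != 0).collect();
        let mut writer = ToyWriter::default();
        let mut unflushed = Vec::new(); // writes reported Ok, awaiting a flush
        let mut acked = Vec::new(); // writes followed by a successful flush
        for (&op, &fs_ok) in ops.iter().zip(&script) {
            let ok = writer.apply(op, fs_ok);
            match op {
                Op::Write(id) if ok => unflushed.push(id),
                Op::Flush if ok => acked.append(&mut unflushed),
                _ => {}
            }
        }
        if acked.iter().any(|id| !writer.durable.contains(id)) {
            return Some(script);
        }
    }
    None
}
```

A property-testing library would replace the exhaustive loop with random generation plus shrinking, which matters once sequences get longer than a handful of operations.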

tobz added the type: task, domain: buffers, and domain: reliability labels Dec 7, 2021

tobz changed the title ~~chore(buffers): add adversarial filesystem test variants for disk buffer v2~~ Buffer Improvements: add adversarial filesystem test variants for disk buffer v2 Dec 7, 2021

tobz mentioned this issue Dec 7, 2021

Buffer Improvements: write new disk buffer implementation #9920

Closed

tobz mentioned this issue Jan 10, 2022

Buffer improvements #9476

Closed

18 tasks
