Draft: Add support for tapered, precise, absolute faketime start time #456

aalekseyev · 2024-02-01T16:33:06Z

FAKETIME_KEEP_BEFORE_NSEC_SINCE_EPOCH behaves similarly to FAKETIME_START_AFTER_SECONDS, with two main differences:

* the timestamp is absolute, instead of being relative to process startup
* the timestamp is specified in nanoseconds

The reason we want this feature is the following use case.

We run a large test suite under faketime. That test suite has access to filesystem artifacts that were created prior to test start up. Among those artifacts are some caches which are considered up to date iff the timestamps of the files match what's recorded in a data structure.

This means that for those caches to be considered valid we need their timestamps to not be rewritten.

The reason we can't use FAKETIME_START_AFTER_SECONDS directly is that the test suite consists of multiple processes. For those processes to interact correctly with each other, they need a consistent timestamp mapping that is shared between them. In fact, even the simplest bash script already behaves incorrectly, because the commands use different process start times.

touch old
FAKETIME=+100d FAKETIME_START_AFTER_SECONDS=0 bash -c 'touch new; stat old new'

The expected behavior is that the timestamp of old is not rewritten, while the timestamp of new is rewritten.

That is in fact achievable now:

touch old
now_ns=$(date +%s.%N | sed -r 's_\.__')
FAKETIME=+100d FAKETIME_KEEP_BEFORE_NSEC_SINCE_EPOCH="$now_ns" bash -c 'touch new; stat old new'

aalekseyev · 2024-02-01T16:41:08Z

Sorry if this isn't exactly idiomatic or clean (in particular long long nanoseconds may be considered a code smell or non-portable?), but I thought I'd start somewhere.
Basically, we have a use case that I expected to be a very common use case, where FAKETIME_START_AFTER_SECONDS almost works well, but not quite.

I'd like to have some way of "starting" rewrites at a given time for a whole process tree, not an individual process, which seems to be a limitation in FAKETIME_START_AFTER_SECONDS.

aalekseyev · 2024-02-02T18:25:53Z

I keep finding more problems that the approach in this PR does not solve.
So far I've fixed the way the utimes family works, but I've run into a fundamental limitation caused by the step transition of time.
I'm now looking into implementing a "tapered" transition instead of a step transition. I'll close the PR for now, but I'll keep you posted.

wolfcw · 2024-02-02T19:49:29Z

Yes, FAKETIME_START_AFTER_SECONDS works per process. You might consider turning an absolute timestamp into a relative one during initialisation, basically resulting in a smaller value for each consecutively spawned child process.

The other aspect is getting everything to nanosecond resolution. Certainly doable, but probably needs changes in several places.
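A minimal sketch of the "absolute to relative" conversion suggested above, assuming the caller can obtain the real (unfaked) current time in nanoseconds. The helper name `start_after_seconds_from_absolute` and the `NSEC_PER_SEC` macro are hypothetical and are not part of libfaketime:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC INT64_C(1000000000)

/* Hypothetical helper: converts an absolute start time (ns since epoch)
 * into the number of whole seconds still remaining at `now_ns`.
 * Returns 0 once the absolute start time has already passed. */
static int64_t start_after_seconds_from_absolute(int64_t start_ns,
                                                 int64_t now_ns)
{
    assert(start_ns >= 0 && now_ns >= 0);
    if (now_ns >= start_ns)
        return 0;
    /* Round up so that faking never begins before the absolute timestamp. */
    return (start_ns - now_ns + NSEC_PER_SEC - 1) / NSEC_PER_SEC;
}
```

Each child process would call this once during initialisation, so later children get a smaller value. Rounding to whole seconds discards sub-second precision, which is where the nanosecond-resolution concern comes in.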

aalekseyev · 2024-02-05T18:46:52Z

@wolfcw, thanks for the quick response and for your suggestions, and sorry for a storm of increasingly ad-hoc patches.

I think tapering will be highly desirable for us after all, since without it programs are more likely to notice the weird time jump, even if it happens before process startup (programs notice it if they run touch -d "7 days ago", for example).

Should I open a separate issue to discuss what's the best way to support tapered start?

As a quick introduction to what tapering is, in case that's not already clear: instead of a jump transition, the time mapping uses a gradual transition (see the code below). The point is that such a mapping is reversible (up to some loss of precision), so you need really pedantic tests to run into issues.

int fake(int offset, int taper_begin, int taper_end, int time) {
  if (time <= taper_begin) {
    return time;  // before the taper: real time is left untouched
  }
  if (time >= taper_end) {
    return time + offset;  // after the taper: the full offset applies
  }
  // interpolate between (taper_begin, taper_begin) and (taper_end, taper_end+offset)
  int t = time - taper_begin;
  int w = taper_end - taper_begin;
  int h = w + offset;
  // widen the product so that t * h does not overflow int
  return taper_begin + (int)((long long)t * h / w);
}

In the branch of this PR I seem to have a working prototype, but there are probably many reasons you don't want to take that code as-is.

Adding two variables, specifying the beginning and the end of a tapered start interval, measured in nanoseconds since epoch:

```
FAKETIME_TAPER_BEGIN_NSEC_SINCE_EPOCH
FAKETIME_TAPER_END_NSEC_SINCE_EPOCH
```

The behavior these implement is similar to what `FAKETIME_START_AFTER_SECONDS` does, with a few key differences:

* timestamp is absolute, instead of being relative to process startup
* timestamp is specified in nanoseconds
* conversion interacts correctly with `utimes` family of functions
* conversion is tapered (see below), which makes mapping reversible (up to loss of precision)

The reason we want this feature is the following use case.

We run a large test suite under faketime. That test suite has access to filesystem artifacts that were created prior to test start up. Among those artifacts are some caches which are considered up to date iff the timestamps of the files match what's recorded in a data structure.

This means that for those caches to be considered valid we need their timestamps to not be fake.

The reason we can't use `FAKETIME_START_AFTER_SECONDS` directly is that the test suite consists of multiple processes, for those processes to correctly interact with each other they need a consistent timestamp mapping that is shared between them. In fact the simplest bash script already behaves incorrectly because the commands use different process start times.

```
touch old
LD_PRELOAD=... FAKETIME=+100d FAKETIME_START_AFTER_SECONDS=0 bash -c 'touch new; stat old new'
```

The expected behavior is that the timestamp of `old` is not rewritten, while the timestamp of `new` is rewritten.

That is in fact achievable now:

```
FAKETIME_TAPER_BEGIN_NSEC_SINCE_EPOCH=$(date +%s%N)
sleep 0.1
FAKETIME_TAPER_END_NSEC_SINCE_EPOCH=$(date +%s%N)
FAKETIME=+100d FAKETIME_KEEP_BEFORE_NSEC_SINCE_EPOCH="$now_ns" bash -c 'touch new; stat old new'
```

What is tapering and why do we need it?

The idea is to make the time transition smooth instead of abrupt, gradually increasing the offset amount from the start to the end of the tapering interval. The reason we want this is to make the time mapping reversible (up to some loss of precision). This means some programs that interact with the file system will no longer be confused. For example, if you do the equivalent of `touch -d "3 days ago"` and then read back the timestamp, you'll get approximately the expected timestamp, instead of something completely off.
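To illustrate the reversibility claim, here is a rough sketch of a tapered mapping and its inverse on 64-bit nanosecond timestamps. It is not the code from this branch: the names `mul_div`, `taper_fake`, and `taper_unfake` are hypothetical, the offset is assumed to be non-negative, and the multiply-then-divide helper is one simple way to avoid overflowing 64 bits in the intermediate product.

```c
#include <assert.h>
#include <stdint.h>

/* floor(a * b / c) without overflowing the intermediate product.
 * Requires 0 < c < 2^63 and a result that fits in uint64_t. */
static uint64_t mul_div(uint64_t a, uint64_t b, uint64_t c)
{
    assert(c > 0 && c < (UINT64_C(1) << 63));
    uint64_t q = 0, r = 0;            /* running value is q * c + r, r < c */
    uint64_t aq = a / c, ar = a % c;
    for (int i = 63; i >= 0; i--) {
        q *= 2;                       /* double the running value */
        r *= 2;
        if (r >= c) { r -= c; q++; }
        if ((b >> i) & 1) {           /* add a when bit i of b is set */
            q += aq;
            r += ar;
            if (r >= c) { r -= c; q++; }
        }
    }
    return q;
}

/* Real -> fake; all values are nanoseconds since the epoch. */
static int64_t taper_fake(int64_t offset, int64_t begin, int64_t end,
                          int64_t real)
{
    assert(offset >= 0 && begin < end);
    if (real <= begin) return real;
    if (real >= end)   return real + offset;
    uint64_t w = (uint64_t)(end - begin);   /* width of the taper window */
    uint64_t h = w + (uint64_t)offset;      /* width of its fake image */
    return begin + (int64_t)mul_div((uint64_t)(real - begin), h, w);
}

/* Fake -> real: the inverse interpolation, exact up to integer rounding. */
static int64_t taper_unfake(int64_t offset, int64_t begin, int64_t end,
                            int64_t fake)
{
    assert(offset >= 0 && begin < end);
    if (fake <= begin)        return fake;
    if (fake >= end + offset) return fake - offset;
    uint64_t w = (uint64_t)(end - begin);
    uint64_t h = w + (uint64_t)offset;
    return begin + (int64_t)mul_div((uint64_t)(fake - begin), w, h);
}
```

With a 0.1 s taper window and a +100d offset, a fake timestamp of "3 days ago" falls inside the stretched interval, so `taper_unfake` followed by `taper_fake` returns a value that differs from the requested one by less than the slope of the mapping, roughly 86 ms in this configuration. A step transition has no real timestamp that maps to such a value at all.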

aalekseyev changed the base branch from master to develop February 1, 2024 16:34

aalekseyev changed the base branch from develop to master February 1, 2024 16:34

aalekseyev force-pushed the keep-ancient-files-timestamps branch from 44215b6 to 934c66a Compare February 1, 2024 16:42

aalekseyev closed this Feb 2, 2024

aalekseyev reopened this Feb 5, 2024

aalekseyev changed the title ~~Add support for FAKETIME_KEEP_BEFORE_NSEC_SINCE_EPOCH~~ Draft: Add support for tapered, precise, absolute faketime start time Feb 5, 2024

aalekseyev force-pushed the keep-ancient-files-timestamps branch from 3343efd to 1beb455 Compare February 6, 2024 11:28

wolfcw self-assigned this Feb 6, 2024

add a simple mult_div_avoid_overflow implementation

3ce11e5
