Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.x: compensation for significant clock drifts in schedulePeriodically #3467

Merged
merged 1 commit into from
Dec 15, 2015

Conversation

akarnokd
Copy link
Member

There is a problem, reported in #3461 and #2943, in which if the system clock drifts, the periodic calculation inside Scheduler.Worker gets off and either taking a longer time for the next invocation of the task or doing "catching-up" with all the lost invocations.

The solution checks the wall clock difference between the last run and the current run and if it went back or forward significantly, it rebases the timer period and schedules the next execution relative to now.

If the clock goes back, the original code scheduled the next invocation way into the future. This PR will schedule it after the period.

If the clock goes forward, the original code scheduled executions for all the missed time between the last run and the new time immediately, yielding a bunch of 0 delays. This PR will simply schedule the next invocation after the period.

The algorithm for both cases is the same: make sure the next invocation is scheduled relative to now and recalculate the start timestamp as if the whole sequence run under the new drifted clock all along. The subsequent invocations will be scheduled at a fixed rate again.

I've added the system parameter rx.scheduler.drift-tolerance (unit: minutes, default: 15 minutes), which is used to determine if the clock drifted too far between invocations of the periodic task.

@headinthebox
Copy link
Contributor

Time drift is pretty nasty. We spent a lot of time on this in Rx.NET http://blogs.msdn.com/b/rxteam/archive/2012/06/20/reactive-extensions-v2-0-release-candidate-available-now.aspx

*/
static final long CLOCK_DRIFT_TOLERANCE_NANOS;
static {
CLOCK_DRIFT_TOLERANCE_NANOS = TimeUnit.MINUTES.toNanos(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to put this into separate static block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why, where is the other one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicitly using an initializer.

static final long CLOCK_DRIFT_TOLERANCE_NANOS =
    TimeUnit.MINUTES.toNanos(Long.getLong("rx.scheduler.drift-tolerance", 15));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, ^.

BTW, @akarnokd don't you think that minutes is not the best unit of time measurement in programming? I'd prefer millis, long seems enough for CLOCK_DRIFT_TOLERANCE, and anyway, you're converting it to nanos then so long is 100% enough for millis.

Also, 15 minutes seems too much for a default clock drift tolerance…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't inline it in case more parameters are asked for within this PR.

It is a question what is a reasonable tolerance for detecting drifts. One could argue it should be the function of the drift direction and the length of the period instead:

  • if now moves backwards by any number, do the correction,
  • if now moves ahead more than 1.5 times the period, do the correction.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akarnokd Is it your preference to create a separate factor rx.scheduler.drift-tolerance-backward and rx.scheduler.drift-tolerance-forward?

I think the tolerance setting would be more easily generalized as a percentage of the period but it would be slightly more difficult to grok (if that is ever actually a concern for users). I'm okay with converting this to a percentage value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave it as it is now. I'm not sure about the percentage because what if the task itself is slow instead of a clock drift?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akarnokd May you please allow to set CLOCK_DRIFT_TOLERANCE_NANOS in seconds/milliseconds? I have a code which controls player position to play short pieces (few seconds) of media and pause them when the position is reached. Such as player may have various playback issues (fast/slow playback, delay in initialization) i use Flowable.interval(50ms) to check its state/position periodically. However when system time synchronization occurs during playback flowable stops fire events until time become the same as it was before synchronization
Here how it looks in my log

14:21:43.273 z1 D PlaybackDurationControlCompletable.onTick: player position is unchanged, skipping check
14:21:37.727 z1 D some other event, time syncrhonization occurrs
// ... no onTick events for 6 seconds
14:21:43.372 z1 D PlaybackDurationControlCompletable: Initializing position 41769; elapsedTimerTime=5757 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please post a separate issue about this case.

@akarnokd akarnokd force-pushed the SchedulePeriodicallyClockShiftFix branch from 8ec6e81 to 3992b99 Compare December 14, 2015 22:59
@akarnokd
Copy link
Member Author

I've updated the code with the suggestion of @stealthcode

@stealthcode
Copy link

👍 LGTM

@akarnokd
Copy link
Member Author

Do you need additional changes? If not, feel free to merge this PR.

@abersnaze
Copy link
Contributor

👍

abersnaze added a commit that referenced this pull request Dec 15, 2015
1.x: compensation for significant clock drifts in schedulePeriodically
@abersnaze abersnaze merged commit 8602550 into ReactiveX:1.x Dec 15, 2015
@artem-zinnatullin
Copy link
Contributor

Yay! This was important issue.

So, I did test (similar to #3530):

Observable
  .interval(5, 10, SECONDS)
  .subscribe(aLong -> logger.d("Interval: %d", aLong));

And then changed system clock to +2 hours at runtime of the app.

Before fix
As expected computation scheduler gone crazy and fired about 700 events during one second.

screen shot 2015-12-15 at 08 44 57

screen shot 2015-12-15 at 08 43 54

After fix
Everything was just fine! No incorrect events, no CPU consuming

screen shot 2015-12-15 at 09 01 06

screen shot 2015-12-15 at 09 00 42


So, I think #3461 and #3530 can be closed now!

Thanks, @akarnokd!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants