Skip to content

Commit

Permalink
Upstream: zil_commit_waiter() unsigned test
Browse files Browse the repository at this point in the history
  • Loading branch information
lundman committed May 25, 2020
1 parent 18d5cd4 commit adeb452
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion module/zfs/zil.c
Original file line number Diff line number Diff line change
Expand Up @@ -2691,7 +2691,7 @@ zil_commit_waiter(zilog_t *zilog, zil_commit_waiter_t *zcw)
&zcw->zcw_lock, wakeup, USEC2NSEC(1),
CALLOUT_FLAG_ABSOLUTE);

if (timeleft >= 0 || zcw->zcw_done)
if (timeleft != -1 || zcw->zcw_done)

This comment has been minimized.

Copy link
@adamdmoss

adamdmoss Jun 15, 2020

Contributor

AFAICT, the magic -1 number only means 'timeout' in the user-space version of cv_timedwait_hires(), and not (at least for the freebsd spl) the kernel versions of cv_timedwait_hires().

So I'm a bit worried that this is incorrect on some (or all ;) ) platforms, and is equivalent to just removing the 'timeleft' comparison completely...

This comment has been minimized.

Copy link
@lundman

lundman Jun 15, 2020

Author Contributor

Since SPL is emulating the illumos API, we can see that cv_timedwait_hires() returns -1 for timeout:
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/condvar.c#L298

Linux and macOS both also do the same:
https://github.com/openzfs/zfs/blob/master/module/os/linux/spl/spl-condvar.c#L391
https://github.com/openzfsonosx/openzfs/blob/sorted2/module/os/macos/spl/spl-condvar.c#L243

Linux clock_t is signed, so they were fine. macOS clock_t is unsigned.
This test only cared that it had been signaled ( >= 0) or not. Which unfortunately, "-1" is also triggered(unsigned).
The only way out of this loop was to return -1.

I just negated the test, to check directly for -1 (or the unsigned value) The caller doesn't care about the "time left". This was to fix a hang in kernel code, since -1,0,timeleft all triggered ">=0" forever.

FreeBSD should probably correct their cv_timedwait_hires. Currently, you have no way to know if something was signaled (when returning 0 time left) and timeout expired, no signal. Over all, positive return value means "signal received" and negative means "no signal received".

continue;

timedout = B_TRUE;
Expand Down

0 comments on commit adeb452

Please sign in to comment.