Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux 4.3 compat: bio_end_io_t / BIO_UPTODATE #3828

Closed
wants to merge 1 commit into from
Closed

Conversation

l1k
Copy link
Contributor

@l1k l1k commented Sep 24, 2015

This seems to be the only compat issue with 4.3 so far, I have it running fine now with 4.3-rc2. Let's see what the build bot says.

I refactored the code in vdev_disk_physio_completion() a bit in an attempt to retain readability. If you want anything changed do not hesitate to post a comment. Thanks.

One thing that struck me as odd is that dio_request_t->dr_error (and consequently zio->io_error) stores a positive errno. Is this a Solaris convention?

Fixes #3799

@l1k
Copy link
Contributor Author

l1k commented Sep 24, 2015

Hm, 37f9dac (authored by @ryao) broke the build for pre 2.6.24 kernels because it assumes 2 arguments for bio_endio() and these older kernels used 3 arguments. This got merged into 0.6.5 and didn't show up on my machine because I'm still on 0.6.4. It's trivial to fix this for 4.3 but to fix it for pre 2.6.24 kernels I need to do some git archaeology first to understand what the size argument should be set to.

@behlendorf
Copy link
Contributor

I'll need to take a more careful look but it looks reasonable. A few quick comments.

stores a positive errno. Is this a Solaris convention

Yes, I find it strange as well coming from a Linux background but Illumos never used negative errnos. This means where the illumos code and Linux code need we often need to invert the logic. The biggest place this happened is between the zpl_* wrappers which hook in the Linux VFS and the original zfs_* implementations of this functionality. It's definitely something to keep in mind.

broke the build for pre 2.6.24

This was done knowingly. We only support back to 2.6.32 kernels to keep the amount of compatibility code and required testing to a reasonable level. So don't worry about that.

@l1k
Copy link
Contributor Author

l1k commented Sep 24, 2015

Thanks a lot for the explanation and feedback @behlendorf.

I fixed the build breakage introduced by 37f9dac, though I'm not 100% certain if this works on these ancient kernels: The third argument to bio_endio() that was dropped with torvalds/linux@6712ecf indicates the amount of bytes successfully read/written. Unless I'm missing something, this cannot easily be determined in zvol.c:zvol_request() so I resorted to passing 0.

This was done knowingly. We only support back to 2.6.32 kernels to keep the amount of compatibility code and required testing to a reasonable level. So don't worry about that.

Hm, in that case we can either keep the patch as is and continue supporting these old kernels on a best effort basis, or I could rework the patch to drop support for the three argument bio_endio() entirely. Your call. :)

@behlendorf
Copy link
Contributor

@l1k our unstated policy has been to drop dead code like this so can you rework the patch to drop the three argument version. Then I'll take another look at the patch.

@l1k
Copy link
Contributor Author

l1k commented Sep 25, 2015

@behlendorf: Voilà, sans support for 3 argument bio_endio().

I kept the small refactoring of vdev_disk_physio_completion() because I felt that it improves readability/clarity. But that's just me so if you want that or anything else changed just let me know. The alternative would have been to use the same approach as in vdev_disk_io_flush_completion(), i.e. set int error = bio->bi_error; if HAVE_1ARG_BIO_END_IO_T is defined.

Build bot fails on 3 platforms but pull #3833 which you opened yesterday fails on those same platforms at the exact same checks so I assume this is not related to my patch. Notably ubuntu-14.04-amd64-current-builder, which uses 4.3 and thus doesn't compile for #3833, compiles fine now but fails on "test import". Sadly I can't find any hints in the stdio log as to the cause.

@l1k
Copy link
Contributor Author

l1k commented Sep 25, 2015

By the way, vdev_disk_io_flush_completion() does not check BIO_UPTODATE (unlike vdev_disk_physio_completion). Is this correct?

@behlendorf
Copy link
Contributor

@l1k nice work, this LGTM. I think the approach you settled on works and does make the code more readable. Thanks for tackling this, it'll be nice to be caught back up to the kernel again.

As for the buildbot failures the two ztest failures are definitely not related to this change. As for the import failure that's also to be expected because it due to the tip of master failing to build on the current builder. Once this patch is merged that will be resolved.

Regarding vdev_disk_io_flush_completion() it problem should check BIO_UPTODATE as well but let's go for the minimal change here as you've done. I should be able to get this merged today.

@behlendorf behlendorf added the Type: Building Indicates an issue related to building binaries label Sep 25, 2015
@behlendorf behlendorf modified the milestones: 0.7.0, 0.6.5.2 Sep 25, 2015
@behlendorf
Copy link
Contributor

Merged as:

784a7fe Linux 4.3 compat: bio_end_io_t / BIO_UPTODATE

@behlendorf behlendorf closed this Sep 25, 2015
@behlendorf behlendorf reopened this Sep 25, 2015
@ryao
Copy link
Contributor

ryao commented Sep 26, 2015

If anyone wants ZoL to reintroduce support pre-2.6.32 kernels and can produce patches that make it work in a maintainable manner, feel free to open a pull request.

The reason pre-2.6.32 support was dropped was because no one used it and maintaining support was becoming too arduous. The cleanup from removing support was a reduction of something like 3 to 4 thousand lines of code across both zfsonlinux/spl and zfsonlinux/spl. No one seemed to use those kernels such that the downsides of dropping support were virtually zero and in comparison to refactoring.

openzfs/spl@ec18fe3 showed that there can be ways to refactor code that reduce complexity while maintaining support for both old and new kernels. Barring kernel API limitations in older kernels that I do not recall, I would not consider reintroducing support for pre-2.6.32 kernels to be infeasible if one were willing to spend time on it. If someone were to do this, he likely should do it by picking a modern interface and reimplementing it for the places where it is not available so that the code is remains maintainable.

@ryao
Copy link
Contributor

ryao commented Sep 26, 2015

0f37d0c might be a better example of that concept. Rather than dealing with a symbol being removed like that SPL patch did, this patch had to deal with an API change intended to make the kernel API better. The solution made the code cleaner by refactoring it to use the new interface and implementing a shim that allowed the same code to be used on older kernels.

@ryao
Copy link
Contributor

ryao commented Sep 26, 2015

It seems that the patch that the SPL patch refactored was in response to a kernel API improvement after looking at torvalds/linux@79714f7. The new private API is really what we wanted for maintainability, but it was not immediately obvious that we could implement it ourselves. The refactoring was done upon the realization that a basename operation on a string that we had made it possible.

bio_endio() already checks the BIO_UPTODATE flag and sets error = -EIO
before invoking the ->bi_end_io callback, obviating the need that we
check the flag once more.

It's been doing that since torvalds/linux@9cc54d40b8ca ("Only call
bi_end_io once for any bio") which was merged into 2.6.24. So the
additional check in our callback would only be needed for pre 2.6.24
kernels which we no longer support. Drop the additional check.

(In Linux 4.3, the BIO_UPTODATE flag was removed altogether with
torvalds/linux@4246a0b63bd8, "block: add a bi_error field to struct
bio".)

Signed-off-by: Lukas Wunner <[email protected]>
@l1k
Copy link
Contributor Author

l1k commented Sep 27, 2015

@behlendorf: In trying to find out why vdev_disk_io_flush_completion() doesn't check the BIO_UPTODATE flag while vdev_disk_physio_completion() does, I've realized that the check in the latter is obsolete: bio_endio() already checks the BIO_UPTODATE flag and sets error = -EIO before invoking the ->bi_end_io callback, obviating the need that we check the flag once more. And it's been doing that since 2.6.24 (specifically torvalds/linux@9cc54d40b8ca). So we'd only need the additional BIO_UPTODATE check for pre 2.6.24 kernels which are no longer supported. This new commit thus drops the additional check. However somebody should doublecheck this in case I'm missing something.

@ryao: Thank you for these tips, I will keep that in mind when I need to fix future compat issues. (Which may happen sooner rather than later because I dev on the drivers/gpu subsystem and use a root zpool, so I'm bitten by compat issues very early.)

@behlendorf behlendorf closed this Sep 30, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Building Indicates an issue related to building binaries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants