ZFS Crypto: size overflow detected #6714

Closed
sempervictus opened this issue Oct 4, 2017 · 7 comments

@sempervictus
Contributor

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Arch |
| Distribution Version | Rolling |
| Linux Kernel | 4.9.51 with minipli's unofficial grsec patch set and other changes |
| Architecture | x86_64 |
| ZFS Version | master + PRs |
| SPL Version | master |

Describe the problem you're observing

During a chef deployment run that was writing a reasonable amount of data to a raidz1 on 2T spinners, the writing process was killed when grsec detected a size overflow:

Oct 04 00:21:27 zfs-host00 kernel: PAX: size overflow detected in function zio_do_crypt_data /var/lib/dkms/zfs/0.7.0/build/module/zfs/zio_crypt.c:1506 cicus.247_667 max, count: 37, decl: iov_len; num: 0; context: iovec;
Oct 04 00:21:27 zfs-host00 kernel: CPU: 2 PID: 11859 Comm: z_wr_iss Tainted: P           OE   4.9.51-1-sv #1
Oct 04 00:21:28 zfs-host00 kernel: Hardware name: Dell Inc. PowerEdge 2950/0M332H, BIOS 2.7.0 10/30/2010
Oct 04 00:21:28 zfs-host00 kernel:  ffffc9002b8379d0 ffffffff813d41b9 ffffffffa06743f0 00000000000005e2
Oct 04 00:21:28 zfs-host00 kernel:  ffffc9002b837a00 ffffffff812098ff fefefefefefefefe ffffc9004ec6d090
Oct 04 00:21:28 zfs-host00 kernel:  ffffc900364600b8 ffffc900207de0b8 ffffc9002b837bb8 ffffffffa05d243f
Oct 04 00:21:28 zfs-host00 kernel: Call Trace:
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffff813d41b9>] dump_stack+0x68/0x8f
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa06743f0>] ? _fini+0x31a75/0x63680 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffff812098ff>] report_size_overflow+0x7f/0x90
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa05d243f>] zio_do_crypt_data+0x123f/0x1390 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa0288fec>] ? SHA2Final+0x9c/0x1b0 [icp]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa05d1178>] ? zio_crypt_do_indirect_mac_checksum+0xd8/0xf0 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa053d355>] spa_do_crypt_abd+0x345/0x370 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa05c74a0>] zio_encrypt+0x650/0x6f0 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa05c6677>] zio_execute+0x97/0x100 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa0244ca1>] taskq_thread+0x2b1/0x4d0 [spl]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffff810a0020>] ? wake_up_q+0x90/0x90
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa05c65e0>] ? zio_taskq_member.isra.10.constprop.16+0x70/0x70 [zfs]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffffa02449f0>] ? task_done+0xa0/0xa0 [spl]
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffff81095cbd>] kthread+0xfd/0x120
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffff81095bc0>] ? kthread_parkme+0x40/0x40
Oct 04 00:21:28 zfs-host00 kernel:  [<ffffffff8183dca7>] ret_from_fork+0x37/0x50
Oct 04 00:23:52 zfs-host00 kernel: INFO: task txg_quiesce:11977 blocked for more than 120 seconds.
Oct 04 00:23:52 zfs-host00 kernel:       Tainted: P           OE   4.9.51-1-sv #1
Oct 04 00:23:52 zfs-host00 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 04 00:23:52 zfs-host00 kernel: txg_quiesce     D    0 11977      2 0x00000080
Oct 04 00:23:52 zfs-host00 kernel:  ffffffff810a8700 0000000000000000 ffff880101965400 ffff8801f5bc6300
Oct 04 00:23:52 zfs-host00 kernel:  ffff88022fc16b40 ffffc9002bbfbd18 ffffffff8183835a ffffffff814009a3
Oct 04 00:23:52 zfs-host00 kernel:  ffff88022fc16b40 ffffc9002bbfbd40 ffff8801f5bc6300 ffff8801ecb78038
Oct 04 00:23:52 zfs-host00 kernel: Call Trace:
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff810a8700>] ? switched_to_idle+0x20/0x20
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff8183835a>] ? __schedule+0x24a/0x6e0
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff814009a3>] ? __list_add+0x33/0x60
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff81838834>] schedule+0x44/0x90
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa02494c4>] cv_wait_common+0x144/0x160 [spl]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff810bc730>] ? prepare_to_wait_event+0x110/0x110
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa02494ff>] __cv_wait+0x1f/0x30 [spl]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa0571153>] txg_quiesce_thread+0x2e3/0x3e0 [zfs]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa0570e70>] ? txg_sync_thread+0x4b0/0x4b0 [zfs]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa0243ae0>] ? __thread_exit+0x20/0x20 [spl]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa0243b5a>] thread_generic_wrapper+0x7a/0x90 [spl]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff81095cbd>] kthread+0xfd/0x120
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff81095bc0>] ? kthread_parkme+0x40/0x40
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff8183dca7>] ret_from_fork+0x37/0x50
Oct 04 00:23:52 zfs-host00 kernel: INFO: task mysqld:31782 blocked for more than 120 seconds.
Oct 04 00:23:52 zfs-host00 kernel:       Tainted: P           OE   4.9.51-1-sv #1
Oct 04 00:23:52 zfs-host00 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 04 00:23:52 zfs-host00 kernel: mysqld          D    0 31782  31607 0x00000180
Oct 04 00:23:52 zfs-host00 kernel:  8000000000000000 0000000000000000 ffff880101965400 ffff8801db3b2100
Oct 04 00:23:52 zfs-host00 kernel:  ffff88022fc16b40 ffffc90008483c90 ffffffff8183835a ffffffff814009a3
Oct 04 00:23:52 zfs-host00 kernel:  ffff88022fc16b40 ffffc90008483cb8 ffff8801db3b2100 ffff8802249b15e8
Oct 04 00:23:52 zfs-host00 kernel: Call Trace:
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff8183835a>] ? __schedule+0x24a/0x6e0
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff814009a3>] ? __list_add+0x33/0x60
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff81838834>] schedule+0x44/0x90
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa02494c4>] cv_wait_common+0x144/0x160 [spl]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff810bc730>] ? prepare_to_wait_event+0x110/0x110
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa02494ff>] __cv_wait+0x1f/0x30 [spl]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa05c3863>] zil_commit+0x2b3/0xd00 [zfs]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa05b6ef3>] zfs_fsync+0x73/0xe0 [zfs]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffffa05d5184>] zpl_fsync+0x74/0xb0 [zfs]
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff81240679>] vfs_fsync_range+0x59/0xc0
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff8124077b>] do_fsync+0x4b/0x80
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff81240ae3>] sys_fsync+0x23/0x30
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff8100191f>] do_syscall_64+0x7f/0x1b0
Oct 04 00:23:52 zfs-host00 kernel:  [<ffffffff8183dae5>] entry_SYSCALL64_slow_path+0x25/0x25

Ping @tcaputi @behlendorf: Might be a good target for #6595, mysql atop encrypted ZFS is going to be a pretty common use case.

@tcaputi
Contributor

tcaputi commented Oct 4, 2017

@sempervictus I'm not really sure what I'm looking at here. What happened exactly? I see the size overflow message, but what is printing it? That line of code is just an end brace for me, so I don't know what exactly it thinks overflowed. Do you have any more specific info about the workload or steps to reproduce?

@sempervictus
Contributor Author

sempervictus commented Oct 4, 2017 via email

@tcaputi
Contributor

tcaputi commented Oct 4, 2017

What about the line number? Can you tell me what line it thinks is causing the problem? Like I said, for me this is just an end brace.

@sempervictus
Contributor Author

@tcaputi: from my source tree, this is happening in:

1501         if (nr_dst != 0) {
1502                 dst_iovecs = kmem_alloc(nr_dst * sizeof (iovec_t), KM_SLEEP);
1503                 if (dst_iovecs == NULL) {
1504                         ret = SET_ERROR(ENOMEM);
1505                         goto error;
1506                 }
1507         }

There was a prior issue with overflow checks a few years back, #2505, which has a response from the grsec folks stating that we use some unsafe C semantics in our code. Those should be fixed to avoid reports that are logical false positives but functional true positives from the perspective of the GCC plugin.
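To make the "logical false positive, functional true positive" distinction concrete, here is a rough sketch of the kind of construct that can trip such a check. It assumes the checker behaves roughly like "redo the arithmetic in a wider type and complain if the result no longer fits". That is a simplification for illustration, not a description of the actual plugin internals, and every function name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Well-defined C: unsigned arithmetic wraps modulo 2^32. */
uint32_t wrapped_sub(uint32_t a, uint32_t b)
{
	return (a - b);
}

/*
 * Hypothetical checker: repeat the subtraction in a wider signed type
 * and report 1 if the exact result does not fit in a uint32_t.
 */
int wide_check_flags(uint32_t a, uint32_t b)
{
	int64_t wide = (int64_t)a - (int64_t)b;
	return (wide < 0 || wide > (int64_t)UINT32_MAX);
}

/* Small self-check: 3 - 5 is legal C, but the wide check still flags it. */
void wrap_demo(void)
{
	assert(wrapped_sub(3, 5) == 4294967294u);
	assert(wide_check_flags(3, 5) == 1);
	assert(wide_check_flags(5, 3) == 0);
}
```

The language allows the wrap, so the code is "correct", but a value that wrapped on its way into a length field is exactly what an overflow check exists to catch.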

@tcaputi
Contributor

tcaputi commented Oct 5, 2017

I've looked over this several times (along with a few of my coworkers) and I can't see any reason this might be happening. All variables involved are of type uint_t and are well bounded. The only thing I can think of is that you haven't reinserted your kernel module and are thus running an older version (even if the newer one is what's installed). This could explain why the line number grsec is concerned with is so strange. Could you try rebuilding everything from scratch (with the latest set of patches from #6595), installing it, and then running modprobe -r zfs; modprobe zfs? You can make sure you are running the correct version by checking cat /sys/module/zfs/version.

@tcaputi
Contributor

tcaputi commented Oct 5, 2017

Actually, looking at the current version of the code might have given me the answer. Can you try changing the type of lr_len in zio_crypt_init_uios_zil() from uint_t to uint64_t and see if that solves the problem?
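For anyone following along, here is a minimal sketch of why the width of a length variable can matter. It is not the actual ZFS code: demo_uint_t, the function names, and the test values are all hypothetical, and it assumes uint_t is 32 bits wide, as it is on x86_64 Linux. It only shows the general effect of holding a 64-bit length in a narrower variable, which may or may not be what grsec tripped on here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for uint_t, assumed to be 32 bits wide. */
typedef uint32_t demo_uint_t;

/* Keeps the length in a 32-bit variable; bits above bit 31 are dropped. */
uint64_t len_via_uint_t(uint64_t reclen)
{
	demo_uint_t lr_len = (demo_uint_t)reclen;
	return (lr_len);
}

/* Keeps the length in a 64-bit variable; the value is preserved. */
uint64_t len_via_uint64_t(uint64_t reclen)
{
	uint64_t lr_len = reclen;
	return (lr_len);
}

/* Returns 1 when the 32-bit copy differs from the 64-bit original. */
int narrowing_loses_bits(uint64_t reclen)
{
	return (len_via_uint_t(reclen) != len_via_uint64_t(reclen));
}

/* Small self-check covering a value that fits and one that does not. */
void narrowing_demo(void)
{
	assert(narrowing_loses_bits(4096) == 0);
	assert(narrowing_loses_bits(0x100000040ULL) == 1);
}
```

Widening the variable to match its 64-bit source removes the narrowing step entirely, so there is nothing left for an overflow check to object to at that assignment.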

tcaputi pushed a commit to datto/zfs that referenced this issue Oct 5, 2017
Signed-off-by: Tom Caputi <[email protected]>
@sempervictus
Contributor Author

With this morning's commits, zloop doesn't catch it, but it didn't catch it with the broken branch either. However, the fix looks rational. I'm pushing it to the originally affected system; let's see how it fares.

tcaputi pushed a commit to datto/zfs that referenced this issue Oct 11, 2017
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Oct 15, 2017
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
wli5 pushed a commit to wli5/zfs that referenced this issue Oct 20, 2017
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Nov 4, 2017
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Nov 6, 2017
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Jan 29, 2018
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this issue Feb 13, 2018
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>
FransUrbo pushed a commit to FransUrbo/zfs that referenced this issue Apr 28, 2019
This 2 line patch fixes a possible integer overflow reported by grsec.

Signed-off-by: Tom Caputi <[email protected]>