ztest crash: range_tree_space(alloctree) == 0 (0x2000 == 0) #15307

Closed
pcd1193182 opened this issue Sep 22, 2023 · 0 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

pcd1193182 commented Sep 22, 2023

As the title says, ztest crashes on an assertion that points to a problem in the metaslab accounting. This is the stack trace:

range_tree_space(alloctree) == 0 (0x2000 == 0)
ASSERT at ../module/zfs/metaslab.c:4079:metaslab_sync()
/lib/x86_64-linux-gnu/libasan.so.5(+0x6cd40)[0x7f5606df7d40]
/export/home/delphix/zfs/.libs/ztest(+0x2021a)[0x55e53db1221a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f560281b420]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f560265800b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f5602637859]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(+0x183e90)[0x7f560643ee90]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(metaslab_sync+0x172b)[0x7f560665922b]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(vdev_sync+0x2db)[0x7f56066dbe5b]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(+0x3ca388)[0x7f5606685388]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(spa_sync+0x676)[0x7f560668e4c6]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(+0x40fff0)[0x7f56066caff0]
/export/home/delphix/zfs/lib/.libs/libzpool.so.5(+0x188a6f)[0x7f5606443a6f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f560280f609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f5602734133]

Looking at the code, this is the assertion that the alloctree is empty if ms_new is set. That variable is set in metaslab_init and cleared the first time a sync finishes. However, it is not considered anywhere in the allocation path. Instead, we appear to rely on the fact that the metaslab will not be loaded or have a weight or ms_max_size set, which makes metaslab_should_allocate always return false for it. However, metaslabs can be loaded for a variety of reasons, including trim, initialize, and preloading. No mechanism prevents this from happening to brand-new metaslabs, and once loaded, nothing in the allocation path prevents them from being used for allocations.
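To make the suspected sequence concrete, here is a toy model of it. This is only a sketch and not the real code: `toy_metaslab_t` and every `toy_*` function are hypothetical stand-ins, the gate is reduced to the load state alone (the real metaslab_should_allocate also looks at weight and ms_max_size), and `alloc_space` stands in for range_tree_space(alloctree).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a metaslab; NOT the real metaslab_t layout. */
typedef struct toy_metaslab {
    bool     ms_new;       /* set at init, cleared after the first sync */
    bool     ms_loaded;    /* may be set by trim, initialize, or preload */
    uint64_t alloc_space;  /* stands in for range_tree_space(alloctree) */
} toy_metaslab_t;

/* Mirrors the described lifecycle: a fresh metaslab starts out new. */
static void
toy_metaslab_init(toy_metaslab_t *ms)
{
    assert(ms != NULL);
    ms->ms_new = true;
    ms->ms_loaded = false;
    ms->alloc_space = 0;
}

/* Simplified gate (assumption): looks only at load state, never at ms_new. */
static bool
toy_should_allocate(const toy_metaslab_t *ms)
{
    return (ms->ms_loaded);
}

/* Records an allocation if the gate lets the metaslab through. */
static bool
toy_alloc(toy_metaslab_t *ms, uint64_t size)
{
    if (!toy_should_allocate(ms))
        return (false);
    ms->alloc_space += size;
    return (true);
}

/* The sync-time invariant: a new metaslab must have an empty alloctree. */
static bool
toy_sync_invariant_holds(const toy_metaslab_t *ms)
{
    return (!ms->ms_new || ms->alloc_space == 0);
}
```

In this model, a freshly initialized metaslab refuses allocations and the invariant holds. Once something sets `ms_loaded` while `ms_new` is still true, a 0x2000 allocation goes through and the invariant fails, which is the shape of the assertion above.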

Analysis of the core file indicates that the metaslab was indeed selected for allocation for this txg; it is loaded, ms_primary is set, and ms_allocator is set to 2.

If my theory is correct, a simple fix is to add a check to metaslab_should_allocate that does not allow new metaslabs to be used. We may need to add similar checks in other places, to prevent new metaslabs from being trimmed or initialized.
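The proposed check could look roughly like the following. This is a sketch of the idea on a toy type, not the actual patch: `toy_metaslab_t`, `toy_should_allocate_fixed`, `toy_can_trim_or_initialize`, and `toy_sync_done` are hypothetical names, and the rest of the real gate (weight, ms_max_size) is collapsed into a single loaded flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a metaslab; NOT the real metaslab_t layout. */
typedef struct toy_metaslab {
    bool ms_new;     /* set at init, cleared after the first sync */
    bool ms_loaded;  /* may be set by trim, initialize, or preload */
} toy_metaslab_t;

/* Allocation gate with the proposed early-out for new metaslabs. */
static bool
toy_should_allocate_fixed(const toy_metaslab_t *ms)
{
    assert(ms != NULL);
    if (ms->ms_new)
        return (false);  /* never allocate before the first sync */
    return (ms->ms_loaded);  /* simplified remainder of the gate */
}

/* The same style of check for the trim and initialize paths. */
static bool
toy_can_trim_or_initialize(const toy_metaslab_t *ms)
{
    assert(ms != NULL);
    return (!ms->ms_new);
}

/* Models the end of the first sync, which clears ms_new. */
static void
toy_sync_done(toy_metaslab_t *ms)
{
    assert(ms != NULL);
    ms->ms_new = false;
}
```

With this gate, a metaslab that is loaded while still new is rejected for allocation, trim, and initialize, and becomes eligible only after `toy_sync_done` has cleared `ms_new`.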

@pcd1193182 pcd1193182 added the Type: Defect Incorrect behavior (e.g. crash, hang) label Sep 22, 2023
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Sep 28, 2023
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15307
Closes openzfs#15308
behlendorf pushed a commit that referenced this issue Sep 28, 2023
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #15307
Closes #15308
lundman pushed a commit to openzfsonwindows/openzfs that referenced this issue Dec 12, 2023
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15307
Closes openzfs#15308