Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backfilling metadnode degrades object create rates #4460

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions man/man5/zfs-module-parameters.5
Original file line number Diff line number Diff line change
Expand Up @@ -1170,6 +1170,19 @@ Disable meta data compression
Use \fB1\fR for yes and \fB0\fR for no (default).
.RE

.sp
.ne 2
.na
\fBzfs_metadnode_backfill\fR (int)
.ad
.RS 12n
Enable backfilling of the metadnode array to avoid sparse dnode blocks.
Sparse blocks can cost disk space, memory overhead, and increased disk
I/O, while backfilling can limit overall object creation rates.
.sp
Use \fB1\fR for yes and \fB0\fR for no (default).
.RE

.sp
.ne 2
.na
Expand Down
10 changes: 9 additions & 1 deletion module/zfs/dmu_object.c
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@
#include <sys/zap.h>
#include <sys/zfeature.h>

int zfs_metadnode_backfill = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to add a pool parameter for this instead of just a module option, so that it can be set persistently if users care more about dense allocations than create performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean to make it a zpool property? I think that would be a bad idea, since this is essentially a workaround for a performance bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we can reliably detect holes using openzfs/openzfs#82, and add appropriate handling of ESRCH from dmu_object_next(), then this patch shouldn't be needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that openzfs/openzfs#82 is sufficient to address the performance problem that this patch is working around. The problem is that dnode_next_offset (and dmu_object_next) don’t take into account objects allocated in memory but not yet synced to disk. Therefore if we allocate more than a L1 (the comment about L2 is inaccurate) worth of dnodes in one txg, we will end up calling dnode_hold_impl / dmu_object_next on every allocated-in-memory object when cross the L1 boundary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I misunderstood what that patch fixes. The symptoms are similar (dnode_next_offset() can detect fictional holes) but happen under different conditions.


uint64_t
dmu_object_alloc(objset_t *os, dmu_object_type_t ot, int blocksize,
dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx)
Expand Down Expand Up @@ -58,7 +60,9 @@ dmu_object_alloc(objset_t *os, dmu_object_type_t ot, int blocksize,
* described in traverse_visitbp.
*/
if (P2PHASE(object, L2_dnode_count) == 0) {
uint64_t offset = restarted ? object << DNODE_SHIFT : 0;
uint64_t offset =
(restarted || !zfs_metadnode_backfill) ?
object << DNODE_SHIFT : 0;
int error = dnode_next_offset(DMU_META_DNODE(os),
DNODE_FIND_HOLE,
&offset, 2, DNODES_PER_BLOCK >> 2, 0);
Expand Down Expand Up @@ -225,6 +229,10 @@ dmu_object_free_zapified(objset_t *mos, uint64_t object, dmu_tx_t *tx)
}

#if defined(_KERNEL) && defined(HAVE_SPL)
module_param(zfs_metadnode_backfill, int, 0644);
MODULE_PARM_DESC(zfs_metadnode_backfill,
"Enable backfilling of the metadnode array");

EXPORT_SYMBOL(dmu_object_alloc);
EXPORT_SYMBOL(dmu_object_claim);
EXPORT_SYMBOL(dmu_object_reclaim);
Expand Down