illumos 7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object #4973

ahrens · 2016-08-15T20:56:26Z

illumos 7004 dmu_tx_hold_zap() does dnode_hold() 7x on same object

Note: this depends on #4972

Using a benchmark which has 32 threads creating 2 million files in the
same directory, on a machine with 16 CPU cores, I observed poor
performance. I noticed that dmu_tx_hold_zap() was using about 30% of
all CPU, and doing dnode_hold() 7 times on the same object (the ZAP
object that is being held).

dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is
running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the
dnode_t that we already have in hand, rather than repeatedly calling
dnode_hold(). To do this, we need to pass the dnode_t down through
all the intermediate calls that dmu_tx_hold_zap() makes, making these
routines take the dnode_t* rather than an objset_t* and a uint64_t
object number. In particular, the following routines will need to have
analogous *_by_dnode() variants created:

dmu_buf_hold_noread()
dmu_buf_hold()
zap_lookup()
zap_lookup_norm()
zap_count_write()
zap_lockdir()
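
The by-number versus `*_by_dnode()` pattern described above can be illustrated with a small toy model. This is only a sketch: every `toy_` type and function is hypothetical and stands in for the real DMU/ZAP code, and the three-step call path is an arbitrary choice. It counts how many by-number lookups each calling style performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TOY_NDNODES	4

/* Hypothetical stand-ins; these are NOT the real dnode_t / objset_t. */
typedef struct toy_dnode {
	uint64_t dn_object;	/* object number */
	int dn_holds;		/* outstanding holds */
} toy_dnode_t;

typedef struct toy_objset {
	toy_dnode_t os_dnodes[TOY_NDNODES];
	int os_lookups;		/* number of by-number lookups performed */
} toy_objset_t;

/* Look the dnode up by object number and take a hold (the costly part). */
toy_dnode_t *
toy_dnode_hold(toy_objset_t *os, uint64_t object)
{
	os->os_lookups++;
	for (size_t i = 0; i < TOY_NDNODES; i++) {
		if (os->os_dnodes[i].dn_object == object) {
			os->os_dnodes[i].dn_holds++;
			return (&os->os_dnodes[i]);
		}
	}
	return (NULL);
}

void
toy_dnode_rele(toy_dnode_t *dn)
{
	assert(dn->dn_holds > 0);
	dn->dn_holds--;
}

/* "_by_dnode" variant: works on a dnode the caller already holds. */
int
toy_step_by_dnode(toy_dnode_t *dn)
{
	return (dn->dn_holds > 0 ? 0 : -1);
}

/* By-number variant: a thin wrapper that holds, delegates, releases. */
int
toy_step(toy_objset_t *os, uint64_t object)
{
	toy_dnode_t *dn = toy_dnode_hold(os, object);
	if (dn == NULL)
		return (-1);
	int err = toy_step_by_dnode(dn);
	toy_dnode_rele(dn);
	return (err);
}

/* Old style: every intermediate step repeats the lookup. */
int
toy_tx_hold_by_number(toy_objset_t *os, uint64_t object)
{
	for (int i = 0; i < 3; i++) {
		if (toy_step(os, object) != 0)
			return (-1);
	}
	return (0);
}

/* New style: hold once, then pass the dnode down to each step. */
int
toy_tx_hold_by_dnode(toy_objset_t *os, uint64_t object)
{
	toy_dnode_t *dn = toy_dnode_hold(os, object);
	int err = 0;
	if (dn == NULL)
		return (-1);
	for (int i = 0; i < 3 && err == 0; i++)
		err = toy_step_by_dnode(dn);
	toy_dnode_rele(dn);
	return (err);
}
```

In this model the by-number path performs one lookup per step, while the by-dnode path performs a single lookup regardless of how many steps follow. Keeping the by-number function as a wrapper around the `_by_dnode` variant means existing callers do not have to change.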

This can improve performance on the benchmark described above by 100%,
from 30,000 file creations per second to 60,000. (This improvement is on
top of that provided by working around the object allocation issue. Peak
performance of ~90,000 creations per second was observed with 8 CPUs;
adding CPUs past that decreased performance due to lock contention.) The
CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds
to 40 CPU-seconds.

Sponsored by: Intel Corp.

Closes #4641


kernelOfTruth · 2016-08-16T23:10:20Z

looks familiar: #4710

great improvement, btw! 👍

thanks

behlendorf · 2016-08-16T23:32:18Z

@kernelOfTruth yes, they're the same patches. We've been running these patches locally for a while and they work well.

I had been holding off on merging them until they were also applied to OpenZFS, just to make sure both repositories applied the same version of the patch. But I'm OK applying them now as long as the expectation is that OpenZFS will also apply them. @ahrens do you have a guess when that might happen?

The CentOS 6.x failure here was an infrastructure failure and is unrelated. And, as I mentioned above, we already have a lot of run time on these patches.

ahrens · 2016-08-16T23:33:49Z

Whoops, sorry, I didn't realize there was already a PR open for this. Let me see about merging the OpenZFS/illumos patches this week.

behlendorf · 2016-08-19T19:53:39Z

Merged as:

2bce804 OpenZFS 7004 - dmu_tx_hold_zap() does dnode_hold() 7x on same object

ahrens added 2 commits August 15, 2016 13:51

7003 zap_lockdir() should tag hold

e1a4e9c

zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which tags the hold on the zap. This will help diagnose programming errors which misuse the hold on the ZAP. Sponsored by: Intel Corp.
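
As a rough illustration of why a tag argument helps, here is a minimal sketch of a tagged hold. The `toy_` names and the single-holder model are assumptions made for the example and do not reflect the actual zap_lockdir() implementation; the point is only that an unlock with the wrong tag can be detected instead of silently succeeding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; NOT the real zap_t. */
typedef struct toy_zap {
	const void *zap_holder;	/* tag of the current holder, NULL if unheld */
} toy_zap_t;

/* Take the hold and remember who took it. Fails if already held. */
int
toy_zap_lockdir(toy_zap_t *zap, const void *tag)
{
	assert(zap != NULL && tag != NULL);
	if (zap->zap_holder != NULL)
		return (-1);
	zap->zap_holder = tag;
	return (0);
}

/* Drop the hold; a mismatched tag indicates a misused hold. */
int
toy_zap_unlockdir(toy_zap_t *zap, const void *tag)
{
	assert(zap != NULL && tag != NULL);
	if (zap->zap_holder != tag)
		return (-1);
	zap->zap_holder = NULL;
	return (0);
}
```

A caller passes the same pointer to both calls, for example the address of a string naming the calling function. Releasing with a different tag returns an error in this sketch, which is the kind of programming error the tag is meant to surface.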

behlendorf added this to the 0.7.0 milestone Aug 16, 2016

behlendorf closed this Aug 19, 2016
