lock contention on sa_lock during concurrent file creation #4806

Closed
ahrens opened this issue Jun 26, 2016 · 1 comment
Labels
Status: Design Review Needed Architecture or design is under discussion Status: Inactive Not being actively updated Status: Stale No recent activity for issue Type: Performance Performance improvement or performance problem

ahrens commented Jun 26, 2016

When many threads are concurrently creating files, I observed that we spend quite a bit of time acquiring the sa_lock mutex from sa_find_layout(), sa_build_index(), and sa_index_tab_rele(). In this workload, we are not actually modifying anything protected by the sa_lock. We could change this to be a rwlock, and acquire it for RW_READER in these code paths. However, each thread acquiring it for RW_READER must modify the lock to increment/decrement the hold count. This requires moving the cache line to the running CPU, inducing cache line contention.
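
To illustrate why reader acquisition still contends, here is a minimal sketch of a reader hold count. It is not the actual rwlock implementation on any platform: the `shared_rw_t` type and its functions are hypothetical, and the writer side is omitted. The point is only that every reader performs an atomic write to the same word, so the cache line holding it bounces between CPUs.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical, simplified lock state: a single shared hold count.
 * Writers are omitted; this only models the reader bookkeeping.
 */
typedef struct shared_rw {
	atomic_uint readers;	/* every reader writes this one word */
} shared_rw_t;

/* Taking a read hold is a write to shared memory. */
static void
shared_rw_enter_reader(shared_rw_t *l)
{
	atomic_fetch_add_explicit(&l->readers, 1, memory_order_acquire);
}

/* Dropping the hold is a second write to the same cache line. */
static void
shared_rw_exit_reader(shared_rw_t *l)
{
	unsigned prev = atomic_fetch_sub_explicit(&l->readers, 1,
	    memory_order_release);
	assert(prev > 0);
	(void) prev;
}
```

Even though no thread modifies the protected data, each enter/exit pair issues two atomic read-modify-write operations on `readers`, which is the source of the cache line traffic described above.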

A fix for this bug could involve using a "fanned out reader-writer lock", i.e. a lock which has several sub-locks on different cache lines, approximately one for each CPU. To acquire this type of lock for reader, only the CPU's assigned sub-lock need be acquired. To acquire this type of lock for writer, all sub-locks must be acquired. Therefore, acquiring the lock for writer is more expensive than with a normal rwlock. The uses of sa_lock where the protected values are being modified, and which would therefore need to acquire the lock for writer, do not appear to be exercised frequently in common workloads.

Addressing this problem can improve performance by up to 5%. This was measured by removing the lock entirely from these code paths (which works because nothing that the lock protects is modified in this workload). Note that the bottlenecks caused by #4636, #4641, #4703, #4804, and #4805 may need to be fixed before fixing this bug will show a performance improvement. In terms of cost/reward, once the other bottlenecks are addressed, fixing this bug will provide a small performance gain and require a small amount of effort to implement.

Analysis sponsored by Intel Corp.

@behlendorf behlendorf added this to the 0.7.0 milestone Jul 11, 2016
@behlendorf behlendorf added the Type: Performance Performance improvement or performance problem label Jul 11, 2016
@behlendorf behlendorf modified the milestones: 0.8.0, 0.7.0 Oct 11, 2016
@behlendorf behlendorf modified the milestones: 0.8.0, 0.9.0 Aug 3, 2018
@behlendorf behlendorf removed this from the 0.9.0 milestone Nov 11, 2019

stale bot commented Nov 11, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Nov 11, 2020
@ahrens ahrens closed this as completed Nov 11, 2020