Parallel zpool import hangs #16172
Comments
@grwilson I plan to work on this bug this week. But to get me started, do you have any good guesses about what could be wrong?
Update: I've identified the root cause of the bug. The process is running into a thread limit. In my test setup, I could conceivably need 3079 threads to fully populate both the thread pool created in Digging deeper, I see that the As a solution, I suggest:
Not too long ago I hit and fixed a similar problem, where we created an enormous number of threads to read spacemaps. That broke zdb on large FreeBSD systems because of the same inability to create zillions of threads. We should be more careful about the number of threads we create. Direct execution may not work if different threads are used to decouple locks or some other resource.
My only other ideas aren't perfect either:
Do you favor any of those, @amotin?
During parallel zpool import, /sbin/zpool will create a separate thread pool for each pool, used to mount that pool's datasets. If the total thread count exceeds the system's limit on threads per process, then tpool_dispatch may fail. If it does, directly execute the mount operation instead.
Sponsored by: Axcient
Signed-off-by: Alan Somers <[email protected]>
Fixes openzfs#16172
During parallel zpool import, /sbin/zpool will create a separate thread pool for each pool, used to mount that pool's datasets. If the total thread count exceeds the system's limit on threads per process, then tpool_dispatch may fail. If it does, directly execute the mount operation instead.
Sponsored by: Axcient
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes openzfs#16178
Fixes openzfs#16172
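The fallback described in the commit message can be illustrated with a small sketch. This is not the actual patch, which is C code in libzfs. It is a Python model under stated assumptions: `BoundedDispatcher`, `mount_all`, and the 0 / -1 return convention are hypothetical names and choices made for illustration, and the dispatcher starts one thread per job rather than queueing work the way a real thread pool would.

```python
import threading


class BoundedDispatcher:
    """Hypothetical stand-in for a thread pool whose dispatch can fail.

    It refuses to start more than max_threads threads in total, which
    models a process running into its per-process thread limit.
    """

    def __init__(self, max_threads):
        self.max_threads = max_threads
        self.threads = []

    def dispatch(self, func, arg):
        """Return 0 if a worker was started, -1 if none could be created."""
        if len(self.threads) >= self.max_threads:
            return -1
        worker = threading.Thread(target=func, args=(arg,))
        try:
            worker.start()
        except RuntimeError:
            # CPython raises RuntimeError when the OS refuses a new thread.
            return -1
        self.threads.append(worker)
        return 0

    def wait(self):
        """Block until every started worker has finished."""
        for worker in self.threads:
            worker.join()


def mount_all(dispatcher, datasets, mount_fn):
    """Mount every dataset; run the mount inline if dispatch fails.

    Returns the number of datasets that were mounted directly in the
    calling thread instead of on a worker.
    """
    direct = 0
    for dataset in datasets:
        if dispatcher.dispatch(mount_fn, dataset) != 0:
            # Fallback: do the work here rather than dropping the job.
            mount_fn(dataset)
            direct += 1
    dispatcher.wait()
    return direct
```

Calling `mount_all(BoundedDispatcher(max_threads=2), names, fn)` with five names runs two mounts on worker threads and three inline, so every dataset is still handled. As noted earlier in the thread, direct execution is only safe when the job does not rely on running in a separate thread to decouple locks or other resources.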
System information
Describe the problem you're observing
When attempting to import six pools in parallel, the zpool import command hangs. It seems to import all pools, but fails to mount datasets for some of them. It has hung on six of seven attempts so far. (The good news is that the one time it didn't hang, it ran twice as fast as before the parallel pool import feature.)
Describe how to reproduce the problem
I created six zpools, each consisting of a 4-disk RAIDZ2 array. All disks were 2.5" SAS 1.8TB SEAGATE ST2000NX0263. Then I created 1024 child file systems on each pool. All children are mounted, but they contain no files. To reproduce the hang, I just run zpool import -a.
Include any warning/errors/backtraces from the system logs
There are no errors in any logs.
Here are the stack traces from the hung zpool process:
Thread 1
Thread 2
Thread 3
Thread 4
Thread 5
Thread 6
Thread 7