Memory allocation spamming the system #917
Comments
Confirmed here. Also while scrubbing. |
I knew that, despite my best efforts to test the swap patches, I'd miss a few instances of KM_SLEEP -> KM_PUSHPAGE. So I added the above warning, which corrects the issue and logs it to the console so it gets fixed immediately. The above patch will fix those particular allocations, but there may still be a few more, so please let me know if you see any. |
This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit b8d06fc for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <[email protected]> Issue #917
This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit b8d06fc for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Chris Dunlop <[email protected]> Issue openzfs#917
With @b404a3f applied, on starting a replace:
What's the approach to fixing these? E.g. from the first non-'?' line in the call trace (what do the '?' lines actually mean?) after the
So this If so, I can scan through the rest of the call traces with:
which gets me:
...all encapsulated in chrisrd/zfs@1706452 |
Fix typo in 1706452 Signed-off-by: Chris Dunlop <[email protected]> Issue openzfs#917
FWIW, with @b404a3f + chrisrd/zfs@1706452 + chrisrd/zfs@ae3b38a, my resilver has been running for 1.5 hrs & 2.3 TB without any "fixed allocation" kernel messages. |
Yes, that's exactly the right fix. We need to change those KM_SLEEPs to KM_PUSHPAGEs to disable direct memory reclaim for these paths. Otherwise we risk an unlikely but possible deadlock. The new warning was added to detect this possibility, fix the flags to prevent it, and log a message so we can get it fixed. I'll get your fixes merged into master. I'm glad it's running quietly now. Please make sure you report any new instances; we may find some in less common code paths. |
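The detect, correct, and log behaviour described in this comment can be sketched in userspace C. This is a simplified model under stated assumptions, not the SPL source: `struct sk_task`, `sanitize_flags_sketch()`, and the `SK_*` bit values are hypothetical stand-ins for the kernel's task and allocation flags, and exactly which reclaim bits the real check masks is assumed here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical bit values; the real kernel constants differ. */
#define SK_GFP_IO   0x1u   /* allocation may start disk I/O to reclaim */
#define SK_GFP_FS   0x2u   /* allocation may call back into the filesystem */
#define SK_PF_NOFS  0x1u   /* task must not re-enter the filesystem */

struct sk_task {
    const char *comm;      /* task name, e.g. "txg_sync" */
    int pid;
    unsigned int flags;    /* task flags such as SK_PF_NOFS */
};

/*
 * If the task is marked NOFS but the allocation flags still permit
 * filesystem or I/O reclaim, strip those bits and report it.
 * Returns true when the flags had to be corrected.
 */
static bool sanitize_flags_sketch(const struct sk_task *t, unsigned int *gfp)
{
    const unsigned int reclaim = SK_GFP_IO | SK_GFP_FS;

    if ((t->flags & SK_PF_NOFS) && (*gfp & reclaim)) {
        fprintf(stderr, "Fixing allocation for task %s (%d) which used "
                "GFP flags 0x%x with PF_NOFS set\n", t->comm, t->pid, *gfp);
        *gfp &= ~reclaim;
        return true;
    }
    return false;
}
```

The point of the design is that the allocation still succeeds with safe flags, while the log line identifies the call path whose KM_SLEEP should be changed to KM_PUSHPAGE at the source.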
This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit b8d06fc for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Chris Dunlop <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #917
More:
|
Congrats, you found another. The above patch should resolve it, commit 17f3cc0648b8427d859c7573b3b2c9635d2eb8c8 |
Wash... rinse... repeat... It has occurred to me that it might be wise to rate limit these warnings. Getting one warning will be enough to fix the next instance and there's no reason to spam the machine. Thoughts? |
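One way to picture the rate limiting proposed here is a small counter that lets the first few warnings through and drops the rest. The sketch below is a userspace illustration only; `struct warn_limit`, `warn_allowed()`, and `warn_fixed_alloc()` are hypothetical names, and in-tree code would more naturally use the kernel's own facilities such as `printk_ratelimit()`.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical limiter: allow the first `burst` messages, then suppress. */
struct warn_limit {
    unsigned int burst;       /* messages allowed before suppression */
    unsigned int emitted;     /* messages actually printed */
    unsigned int suppressed;  /* messages dropped */
};

/* Returns true if the caller should print its warning. */
static bool warn_allowed(struct warn_limit *wl)
{
    if (wl->emitted < wl->burst) {
        wl->emitted++;
        return true;
    }
    wl->suppressed++;
    return false;
}

/* Print the fixed-allocation warning only while the limiter allows it. */
static void warn_fixed_alloc(struct warn_limit *wl, const char *comm, int pid)
{
    if (warn_allowed(wl))
        fprintf(stderr, "Fixing allocation for task %s (%d)\n", comm, pid);
}
```

With `burst` set to 1, a path that triggers thousands of times produces a single log line, which is still enough to locate the offending allocation.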
Whilst that's certainly better than filling people's log partition, filling the log partition has the nice side effect of |
I was thinking maybe we log once without the stack trace, then halve the limit down to, say, order-1 (if you have a lot of those failing, things are bad). That way you'll get an order-5 failure in most cases and nothing else, and in rare cases 4, 3, 2, then a series of order-1 failures, which I expect to be rare (amd64 has 8k stacks and various other things, so order-1 failures would be very noticeable). |
[ Sorry this is so long, I am trying to include everything that might be relevant. ] Last night I updated to the newest module available for Debian (v0.6.0.74-rc10) on my 64 bit machine, kernel 3.5.3 + Con Kolivas' BFS scheduler. About 3 hours after boot I got 5,507 stack traces after a single warning:
The stack trace:
A couple of hours after that I started getting page allocation failures from z_wr_iss, though one was from a Java backup system.
If I understand order correctly, the system was trying to allocate 512MB? That seems huge. My kernel.shmmax is set to 512MB for zoneminder, but I have no idea if this could be related. zoneminder is currently working correctly, but the Java backup system is hung. A full dump from the most recent allocation failure:
Did I miss any needed info? Thanks, |
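For readers checking the arithmetic behind the "order" question above: a page allocation of order n covers 2^n contiguous pages. The helpers below are a minimal sketch assuming 4 KiB pages (typical on x86-64, but an assumption); `SK_PAGE_SIZE`, `order_to_bytes()`, and `bytes_to_order()` are illustrative names, not kernel APIs.

```c
/* Assumed page size; 4 KiB is typical on x86-64 but not universal. */
#define SK_PAGE_SIZE 4096UL

/* Bytes covered by a page allocation of the given order (2^order pages). */
static unsigned long order_to_bytes(unsigned int order)
{
    return SK_PAGE_SIZE << order;
}

/* Smallest order whose allocation covers at least `bytes` bytes. */
static unsigned int bytes_to_order(unsigned long bytes)
{
    unsigned int order = 0;

    while (order_to_bytes(order) < bytes)
        order++;
    return order;
}
```

Under that assumption an order-7 request is 128 pages, or 512 KiB, while 512 MiB would correspond to order 17.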
The first issue, the stack traces, is already fixed in the latest source. You may need to pull from master. The second issue, the memory allocation failures, is a more interesting side effect. You're right, they are huge (blame gzip), and they are usually on a slab. But it looks like the emergency slab objects kicked in here. Probably the most straightforward fix is to simply add a slab flag to optionally disable this support for the gzip work slabs. That would basically revert this to the previous code, which was working well. I'll make a patch. |
I installed v0.6.0.75-rc10 last night and got neither error. Thanks! |
The workspace required by zlib to perform compression is roughly 512KB (order-7). These allocations are so large that we should never attempt to directly kmalloc an emergency object for them. It is far preferable to asynchronously vmalloc an additional slab in case it's needed. Then simply block waiting for an existing object to be released or for the new slab to be allocated. This can be accomplished by disabling emergency slab objects by passing the KMC_NOEMERGENCY flag at slab creation time. Signed-off-by: Brian Behlendorf <[email protected]> openzfs/zfs#917
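The policy in this commit message can be illustrated with a small decision function. This is a sketch under assumptions rather than the SPL implementation: only the name KMC_NOEMERGENCY comes from the thread, while `struct sk_cache`, `choose_alloc_path()`, the flag's value, and the choice to request a new slab on every miss are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical value; only the name KMC_NOEMERGENCY comes from the thread. */
#define SK_KMC_NOEMERGENCY 0x1u

enum sk_alloc_path {
    SK_FROM_SLAB,         /* a free object exists on an existing slab */
    SK_EMERGENCY_KMALLOC, /* directly allocate a one-off emergency object */
    SK_WAIT_FOR_SLAB      /* block until an object is freed or a slab arrives */
};

struct sk_cache {
    unsigned int flags;        /* e.g. SK_KMC_NOEMERGENCY */
    unsigned int free_objects; /* objects currently free on existing slabs */
    bool grow_requested;       /* an asynchronous slab allocation is pending */
};

/* Decide how to satisfy one object request from the cache. */
static enum sk_alloc_path choose_alloc_path(struct sk_cache *c)
{
    if (c->free_objects > 0) {
        c->free_objects--;
        return SK_FROM_SLAB;
    }
    /* No free object: ask for another slab in the background. */
    c->grow_requested = true;
    if (c->flags & SK_KMC_NOEMERGENCY)
        return SK_WAIT_FOR_SLAB;
    return SK_EMERGENCY_KMALLOC;
}
```

For a cache of very large objects such as the zlib workspaces, setting the flag trades a possible wait for never attempting a large physically contiguous allocation on the hot path.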
Has anyone running the latest source observed any more of these warnings? |
I have seen no warning from the SPL after upgrading to 0.6.0.75-0ubuntu1~oneiric1 (about 6 days ago). |
The issue seems to have gone away. We can reopen this if it occurs again. |
Good. To ensure we don't accidentally introduce any new instances of this, these messages are fatal when the packages are built with --enable-debug. That means I should catch most of them in my automated testing. |
Fatal? Ouch. Perhaps the fatal should be enabled with a specific non-default compile time flag for a release or two? |
It's part of --enable-debug, which is disabled by default. It's also fatal to ensure I notice. But it could have its own option if this becomes a problem. |
I've been running with --enable-debug, but I'd prefer not to have avoidable fatals (although perhaps I'm already subject to these). A possible feature suggestion would be to have --enable-debug as "slower with additional checks, but just as solid" and a new --enable-debug-fatal for "you'll really notice these!". |
@chrisrd If you're enabling debug then you're already subject to these. A large number of fatal assertions will be enabled in the code. These are all for things that should never happen, but if they do they are fatal. The same is true for this debugging. It should never happen. |
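The difference between the default and debug builds discussed above amounts to a compile-time switch around the report. A minimal sketch, assuming a hypothetical `SK_DEBUG` macro standing in for what --enable-debug turns on, and a hypothetical `report_bad_alloc()` helper:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Report an allocation whose flags had to be corrected.
 * Compile with -DSK_DEBUG (hypothetical stand-in for --enable-debug)
 * to make the report fatal.
 */
static void report_bad_alloc(const char *comm, int pid, int *reports)
{
    (*reports)++;
    fprintf(stderr, "Fixing allocation for task %s (%d)\n", comm, pid);
#ifdef SK_DEBUG
    abort();   /* debug builds stop here so the bug cannot be missed */
#endif
}
```

A separate macro for the fatal behaviour, along the lines of the --enable-debug-fatal suggestion, would only change which symbol guards the `abort()` call.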
Running latest daily build (from 14-09-2012), noticed this message in the kernel log: SPL: Fixing allocation for task txg_sync (782) which used GFP flags 0xbc7bb1c with PF_NOFS set |
@Dremon Thank you, fixed. |
During scrub:
SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set
SPL: Showing stack for process 6093
Pid: 6093, comm: txg_sync Tainted: G O 3.5.3-cw3 #5
Call Trace:
[] spl_debug_dumpstack+0x30/0x32 [spl]
[] sanitize_flags+0x73/0x82 [spl]
[] kmem_alloc_debug+0x13d/0x2d9 [spl]
[] ? load_balance+0xa8/0x5e2
[] fm_nvlist_create+0x36/0xb6 [zfs]
[] zfs_ereport_start+0x93/0x620 [zfs]
[] ? load_TLS+0xb/0xf
[] ? __switch_to+0x1a7/0x37c
[] ? paravirt_read_tsc+0x9/0xd
[] ? vdev_resilver_needed+0x8f/0x112 [zfs]
[] zfs_ereport_post+0x3f/0x5a [zfs]
[] spa_event_notify+0x22/0x24 [zfs]
[] dsl_scan_setup_sync+0xf6/0x1bd [zfs]
[] dsl_sync_task_group_sync+0x11a/0x175 [zfs]
[] dsl_pool_sync+0x1ac/0x3d2 [zfs]
[] spa_sync+0x475/0x82d [zfs]
[] txg_sync_thread+0x247/0x3b7 [zfs]
[] ? txg_do_callbacks+0x52/0x52 [zfs]
[] thread_generic_wrapper+0x71/0x7e [spl]
[] ? __thread_create+0x2db/0x2db [spl]
[] kthread+0x8b/0x93
[] kernel_thread_helper+0x4/0x10
[] ? retint_restore_args+0x13/0x13
[] ? kthread_worker_fn+0x149/0x149
[] ? gs_change+0x13/0x13
{ repeats over and over, crushing the system because I have a serial console enabled }