
*: node crash with OOM despite memory limits set #20609

Closed

guotie opened this issue Dec 11, 2017 · 43 comments
Labels
A-admission-control
C-investigation: Further steps needed to qualify. C-label will change.
O-community: Originated from the community
S-2-temp-unavailability: Temp crashes or other availability problems. Can be worked around or resolved by restarting.

Comments

@guotie commented Dec 11, 2017

Is this a question, feature request, or bug report?

QUESTION


BUG REPORT

  1. Please supply the header (i.e. the first few lines) of your most recent
    log file for each node in your cluster. On most unix-based systems
    running with defaults, this boils down to the output of

    grep -F '[config]' cockroach-data/logs/cockroach.log


I171211 00:37:44.628748 1 util/log/clog.go:992  [config] file created at: 2017/12/11 00:37:44
I171211 00:37:44.628748 1 util/log/clog.go:992  [config] running on machine: node-02
I171211 00:37:44.628748 1 util/log/clog.go:992  [config] binary: CockroachDB CCL v1.1.3 (linux amd64, built 2017/11/27 13:59:10, go1.8.3)
I171211 00:37:44.628748 1 util/log/clog.go:992  [config] arguments: [cockroach start --certs-dir=/home/guotie/certs --host=10.104.54.205 --join=10.104.189.106:26257,10.135.69.133:26257 --cache=25% --max-sql-memory=30%]

  2. Please describe the issue you observed:

I set the memory limits below 30%, but cockroach is now using over 70% of memory:


  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                     
 9136 guotie    20   0 5010372 2.656g  15084 S  46.7 71.7  85:57.31 cockroach  
@guotie (Author) commented Dec 11, 2017

Now:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                     
 9136 guotie    20   0 5010372 2.734g  15084 S   2.7 73.8  86:41.58 cockroach  

@jordanlewis (Member)

Hi @guotie, what workload are you running?

The --cache argument specifies what fraction of memory to give to RocksDB, and the --max-sql-memory specifies what fraction of memory to give to SQL execution. There will still be some overhead on top of that.

@guotie (Author) commented Dec 12, 2017

It finally crashed.

@guotie (Author) commented Dec 12, 2017

[883436.024047] [22499]  1000 22499  1053641   898995    2015        0             0 cockroach
[883436.032105] Out of memory: Kill process 22499 (cockroach) score 928 or sacrifice child
[883436.033973] Killed process 22499 (cockroach) total-vm:4214564kB, anon-rss:3595980kB, file-rss:0kB, shmem-rss:0kB

@guotie (Author) commented Dec 12, 2017

top - 08:51:05 up 10 days, 16:29,  1 user,  load average: 1.47, 0.63, 0.28

@guotie changed the title from "memory usage exceeds quota memory ratio" to "memory leak" Dec 12, 2017
@guotie (Author) commented Dec 13, 2017

Now cockroach panics on connect: the process was killed by the OS due to out-of-memory when a client connected to the server.

[guotie@node-01 logs]$ dmesg
[1033512.895767] YDService invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[1033512.897942] YDService cpuset=/ mems_allowed=0
[1033512.899496] CPU: 0 PID: 819 Comm: YDService Not tainted 3.10.0-514.26.2.el7.x86_64 #1
[1033512.901438] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[1033512.903078]  ffff88011655ce70 00000000048696c5 ffff880036927938 ffffffff81687133
[1033512.904979]  ffff8800369279c8 ffffffff816820de ffffffff810eb0dc ffff8800dba3bbb0
[1033512.906945]  ffff8800dba3bbc8 0000000000000206 ffff88011655ce70 ffff8800369279b8
[1033512.908649] Call Trace:
[1033512.909831]  [<ffffffff81687133>] dump_stack+0x19/0x1b
[1033512.911364]  [<ffffffff816820de>] dump_header+0x8e/0x225
[1033512.912906]  [<ffffffff810eb0dc>] ? ktime_get_ts64+0x4c/0xf0
[1033512.914591]  [<ffffffff8113d22f>] ? delayacct_end+0x8f/0xb0
[1033512.916267]  [<ffffffff81184d0e>] oom_kill_process+0x24e/0x3c0
[1033512.917841]  [<ffffffff811847ad>] ? oom_unkillable_task+0xcd/0x120
[1033512.919548]  [<ffffffff81184856>] ? find_lock_task_mm+0x56/0xc0
[1033512.921146]  [<ffffffff81093c0e>] ? has_capability_noaudit+0x1e/0x30
[1033512.922758]  [<ffffffff81185546>] out_of_memory+0x4b6/0x4f0
[1033512.924219]  [<ffffffff81682be7>] __alloc_pages_slowpath+0x5d7/0x725
[1033512.926064]  [<ffffffff8118b655>] __alloc_pages_nodemask+0x405/0x420
[1033512.927652]  [<ffffffff811cf9ca>] alloc_pages_current+0xaa/0x170
[1033512.929359]  [<ffffffff81180be7>] __page_cache_alloc+0x97/0xb0
[1033512.930949]  [<ffffffff81183760>] filemap_fault+0x170/0x410
[1033512.932845]  [<ffffffffa01b9016>] ext4_filemap_fault+0x36/0x50 [ext4]
[1033512.934575]  [<ffffffff811ac83c>] __do_fault+0x4c/0xc0
[1033512.936228]  [<ffffffff811accd3>] do_read_fault.isra.42+0x43/0x130
[1033512.937789]  [<ffffffff811b1461>] handle_mm_fault+0x6b1/0x1000
[1033512.939286]  [<ffffffff8168b6e7>] ? do_nanosleep+0xa7/0xf0
[1033512.940997]  [<ffffffff81692cc4>] __do_page_fault+0x154/0x450
[1033512.942586]  [<ffffffff81692ff5>] do_page_fault+0x35/0x90
[1033512.944158]  [<ffffffff8168f208>] page_fault+0x28/0x30
[1033512.945784] Mem-Info:
[1033512.947033] active_anon:920880 inactive_anon:12 isolated_anon:0
 active_file:1686 inactive_file:1911 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6527 slab_unreclaimable:2921
 mapped:420 shmem:92 pagetables:2917 bounce:0
 free:21045 free_pcp:195 free_cma:0
[1033512.957221] Node 0 DMA free:15340kB min:276kB low:344kB high:412kB active_anon:484kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:28kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:122 all_unreclaimable? yes
[1033512.966781] lowmem_reserve[]: 0 3327 3773 3773
[1033512.968668] Node 0 DMA32 free:60996kB min:59340kB low:74172kB high:89008kB active_anon:3275844kB inactive_anon:36kB active_file:6620kB inactive_file:7380kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3653624kB managed:3408868kB mlocked:0kB dirty:0kB writeback:0kB mapped:1668kB shmem:308kB slab_reclaimable:19676kB slab_unreclaimable:6208kB kernel_stack:1456kB pagetables:9940kB unstable:0kB bounce:0kB free_pcp:660kB local_pcp:540kB free_cma:0kB writeback_tmp:0kB pages_scanned:21860 all_unreclaimable? yes
[1033512.980250] lowmem_reserve[]: 0 0 446 446
[1033512.982165] Node 0 Normal free:7844kB min:7964kB low:9952kB high:11944kB active_anon:407192kB inactive_anon:12kB active_file:120kB inactive_file:260kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:457256kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:60kB slab_reclaimable:6404kB slab_unreclaimable:5472kB kernel_stack:656kB pagetables:1720kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:763 all_unreclaimable? yes
[1033512.993459] lowmem_reserve[]: 0 0 0 0
[1033512.995265] Node 0 DMA: 5*4kB (UE) 2*8kB (UM) 3*16kB (UM) 3*32kB (UEM) 3*64kB (UEM) 3*128kB (UEM) 3*256kB (UEM) 1*512kB (E) 3*1024kB (UEM) 1*2048kB (E) 2*4096kB (M) = 15348kB
[1033513.000564] Node 0 DMA32: 224*4kB (UEM) 497*8kB (UE) 461*16kB (EM) 294*32kB (UE) 223*64kB (UEM) 112*128kB (UE) 32*256kB (UEM) 3*512kB (UEM) 1*1024kB (M) 0*2048kB 0*4096kB = 61016kB
[1033513.005837] Node 0 Normal: 229*4kB (UEM) 204*8kB (UE) 96*16kB (UE) 48*32kB (UE) 15*64kB (E) 8*128kB (UE) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7860kB
[1033513.010732] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1033513.013472] 3737 total pagecache pages
[1033513.015367] 0 pages in swap cache
[1033513.017252] Swap cache stats: add 0, delete 0, find 0/0
[1033513.020491] Free swap  = 0kB
[1033513.022621] Total swap = 0kB
[1033513.024695] 1048476 pages RAM
[1033513.027233] 0 pages HighMem/MovableOnly
[1033513.029203] 77968 pages reserved
[1033513.031638] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[1033513.034736] [  342]     0   342    41054     1116      81        0             0 systemd-journal
[1033513.038191] [  365]     0   365    10887      210      22        0         -1000 systemd-udevd
[1033513.042083] [  366]     0   366    29703       81      26        0             0 lvmetad
[1033513.044754] [  447]     0   447    28641      116      25        0         -1000 auditd
[1033513.047401] [  469]     0   469    26364      246      53        0         -1000 sshd
[1033513.050054] [  471]     0   471   118972      806     124        0             0 rsyslogd
[1033513.052726] [  472]     0   472   138281     2644      89        0             0 tuned
[1033513.055132] [  474]     0   474     6080      129      15        0             0 systemd-logind
[1033513.057879] [  478]   996   478     2130       36       9        0             0 lsmd
[1033513.060353] [  480]    81   480     6620      107      17        0          -900 dbus-daemon
[1033513.063290] [  508]    38   508    11677      182      26        0             0 ntpd
[1033513.065803] [  567]     0   567     5941       49      15        0             0 atd
[1033513.068394] [  568]     0   568    31034      158      17        0             0 crond
[1033513.070853] [  577]     0   577    27508       32      10        0             0 agetty
[1033513.073262] [  578]     0   578    27508       31      12        0             0 agetty
[1033513.075911] [  579]     0   579     1082       35       7        0             0 acpid
[1033513.078218] [  810]     0   810   204373     1546      73        0             0 YDService
[1033513.080653] [ 1917]     0  1917    24321       86      18        0             0 sgagent
[1033513.083271] [ 1929]     0  1929    38803     1638      30        0             0 barad_agent
[1033513.085876] [ 1975]     0  1975    25165      151      18        0             0 YDLive
[1033513.088057] [ 2118]  1000  2118    15253     3501      31        0             0 consul
[1033513.090215] [32764]   997 32764   132394     1895      57        0             0 polkitd
[1033513.092463] [32436]     0 32436    41136     1852      34        0             0 barad_agent
[1033513.094820] [32437]     0 32437   201249     2121      50        0             0 barad_agent
[1033513.097143] [23485]     0 23485    36939      328      73        0             0 sshd
[1033513.099473] [23487]  1000 23487    37029      366      71        0             0 sshd
[1033513.101982] [23488]  1000 23488    29038      295      15        0             0 bash
[1033513.104257] [ 7676]  1000  7676   989741   901557    1872        0             0 cockroach
[1033513.106155] [ 7729]  1000  7729    26983       24      10        0             0 tail
[1033513.108109] Out of memory: Kill process 7676 (cockroach) score 930 or sacrifice child
[1033513.110066] Killed process 7676 (cockroach) total-vm:3958964kB, anon-rss:3605292kB, file-rss:936kB, shmem-rss:0kB

@jordanlewis (Member)

Hi @guotie, what workload were you running against the server? It's possible that you need to devote more RAM to your instance to prevent OOM errors.

@knz changed the title from "memory leak" to "*: node crash with OOM despite memory limits set" Jan 30, 2018
@knz (Contributor) commented Jan 30, 2018

@guotie in addition to Jordan's question, I would like to remind you that it is unsafe for the sum of the RocksDB cache size and the max SQL memory to exceed 50-60% of your total available RAM. This is because CockroachDB also keeps other data in memory beyond these two budgets.
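
For concreteness, a back-of-the-envelope sketch of that budget in Python (the ~4 GB total is an assumption based on the top output earlier in this thread, not a stated fact):

total_ram_gb = 4.0                                  # assumed from the reporter's VM
rocksdb_cache_gb = 0.25 * total_ram_gb              # --cache=25%          -> ~1.0 GB
max_sql_memory_gb = 0.30 * total_ram_gb             # --max-sql-memory=30% -> ~1.2 GB
budgeted_gb = rocksdb_cache_gb + max_sql_memory_gb  # ~2.2 GB, i.e. 55% of RAM
# The remaining ~45% must absorb the Go runtime, goroutine stacks, RocksDB
# memtables and index/filter blocks, and everything else, which is why
# pushing the two flags past 50-60% combined risks an OOM kill.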

@petermattis added the C-investigation label Jul 21, 2018
@tbg added this to the 2.1 milestone Jul 22, 2018
@jordanlewis (Member)

Closing this as unactionable. If you see something like this again, please check the logs/heap_profiler directory for a heap profile, which will help us understand the issue. We'll be happy to check it out.

@jordanlewis added the X-unactionable label Oct 11, 2018
@victorggsimida commented Dec 26, 2018

@jordanlewis I am hitting the same problem.

My server has 96 GB of RAM.

cockroachdb config:
--max-sql-memory=.25 --cache=.25

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
28218 xxx       20   0  0.105t  0.082t  54244 S 617.8 88.7   3045:26 cockroach

heap.tar.gz

The network is bad right now, so I can't upload it here.

@jordanlewis (Member)

Can you upload it to this issue? It shouldn't be too big. Or email it to me: [email protected]

@jordanlewis reopened this Dec 26, 2018
@victorggsimida

@jordanlewis Uploaded it just now, thank you! I also sent it to your email.

@knz added the O-community and S-2-temp-unavailability labels and removed the X-unactionable label Dec 27, 2018
@knz removed this from the 2.1 milestone Dec 27, 2018
@jordanlewis (Member)

@victorggsimida, what workload are you running to get this issue? There is a very large amount of memory tied up in the SQL parser. Are you running queries with very large strings?

[screenshot: heap profile showing the memory retained by the SQL parser]

@victorggsimida

@jordanlewis

  1. Workload: 60 threads inserting values into cockroachdb; in the web UI I see about 4k insert ops.

The picture you show means that parsing uses 12 GB of RAM, but the total memory used by cockroachdb is about 96 * 0.88 GB. Where is the rest of the memory being used?

  2. There are only INSERT statements, no SELECT statements.

I use haproxy to balance queries across the cockroachdb nodes, but only this node has the memory problem.

@knz (Contributor) commented Dec 28, 2018

I'll bet on long-lived connections with boatloads of prepared queries, never released -- the ASTs / input syntax strings are preserved but never reused.

Maybe we could test that by placing a limit on the max number of prepared statements in a session. (Postgres has one -- we could use the exact same.)

@knz (Contributor) commented Dec 28, 2018

@victorggsimida please explain how the client is sending the insert SQL to haproxy.

Also, please share the haproxy configuration. How long does it keep the connection open?

@victorggsimida

@knz
haproxy listens on port 1234 on a separate server (not one of the cockroachdb nodes) and redirects requests to the cockroachdb nodes (5 nodes).

Clients connect to that IP on port 1234, using a thread pool we developed ourselves. The connections are long-lived.

@knz (Contributor) commented Dec 28, 2018

See my comment above. I think your connection pool is using prepared statements and never releasing them. That would be the cause of the large memory usage.

@knz (Contributor) commented Dec 28, 2018

@victorggsimida please check http://initd.org/psycopg/articles/2012/10/01/prepared-statements-psycopg/ -- are you doing anything similar to this?
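
For reference, the pattern that article describes looks roughly like this (a sketch; the DSN and table are illustrative, not taken from this issue):

import psycopg2

conn = psycopg2.connect("postgresql://root@localhost:26257/mydb")  # assumed DSN
cur = conn.cursor()
cur.execute("PREPARE ins AS INSERT INTO t (a, b) VALUES ($1, $2)")  # parsed once, kept server-side
cur.execute("EXECUTE ins (%s, %s)", (1, 2))                         # reused on each call
# Without a matching cur.execute("DEALLOCATE ins"), a long-lived session
# keeps every prepared statement, and its parsed AST, alive on the server.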

@victorggsimida

@knz

  1. Version info:

Build Tag: v2.1.0
Build Time: 2018/10/30 12:32:34
Distribution: CCL
Platform: linux amd64 (x86_64-unknown-linux-gnu)
Go Version: go1.10.3
C Compiler: gcc 6.3.0
Build SHA-1: f0dd1dc
Build Type: release

  2. Are you doing anything similar to this?

No, absolutely not. I did not use Prepare in my SQL.

I use it like this:

conn = psycopg2.connect(...)  # connection details elided in the original
cursor = conn.cursor()
cursor.executemany(...)       # or cursor.execute(...); psycopg2's method is executemany

There is no PREPARE anywhere in my SQL.

@victorggsimida

@knz Hi, is there any more information I need to provide?

@knz (Contributor) commented Dec 30, 2018

I don't have further questions at this time, but perhaps Jordan does.

@victorggsimida

@jordanlewis Hi, is there any information I need to provide to help solve this problem?

@jordanlewis (Member)

@victorggsimida how many concurrent connections do you have in your app?

@victorggsimida

@jordanlewis
Hi, thank you for your reply. Please help me find the cause of this.

I tried setting the number of concurrent connections to 5 and to 60 separately, but the problem occurs at both concurrency levels.

The memory is always consumed on the same specific node.

The problem happens when I start the app (it extracts data from mysql and loads it into cockroachdb with UPSERT statements).

@knz (Contributor) commented Jan 3, 2019

Please share the exact syntax of the upsert statements, and also the output of EXPLAIN (VERBOSE) on them.

@victorggsimida

UPSERT INTO {db}.{table} ({fields},synctime) VALUES ({value_fields},'{now}')

@victorggsimida

@knz

@knz (Contributor) commented Jan 7, 2019

How many value fields do you have per statement?

How large is the table you are inserting into?

Are you running upsert between begin..commit, or as individual statements?

@victorggsimida

@knz

  1. Per statement: maybe several tens of thousands of values.
  2. The tables vary in size, from about 1 million (100w) to 10 million (1kw) rows; in the future, maybe 100 million (10kw).
  3. The upserts run between BEGIN and COMMIT.

@victorggsimida

@knz
Any progress on this issue? It has already caused the node to be killed by the system.

@knz (Contributor) commented Jan 18, 2019

OK, so a transaction with "tens of thousands" of values in a single mutation statement will not work and will cause several of the problems you observe, including server crashes if you do this multiple times over concurrent connections. We know of this problem, and the workaround is simple: modify/insert fewer rows/values at a time!

How about you try again with a max of 100-1000 values per statement, and a max of 10,000 rows modified per transaction? Would that work better?
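
A minimal sketch of that batching with psycopg2 (which this reporter uses); the DSN, database, table, and columns are invented for the example, and execute_values caps the rows per statement:

import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("postgresql://root@localhost:26257/mydb")  # assumed DSN

def upsert_batched(rows, batch_size=1000):
    # Upsert rows in chunks of at most batch_size values per statement,
    # committing after each chunk so every transaction stays small.
    for i in range(0, len(rows), batch_size):
        chunk = rows[i:i + batch_size]
        with conn.cursor() as cur:
            # execute_values expands the single "VALUES %s" placeholder into
            # a multi-row VALUES list, capped at page_size rows per statement.
            execute_values(
                cur,
                "UPSERT INTO mydb.mytable (a, b, synctime) VALUES %s",
                chunk,
                page_size=batch_size,
            )
        conn.commit()  # one small transaction per chunk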

@victorggsimida

@knz
Hi, so tens of thousands of values per statement cause the problem.

  1. But why does it always hit the same node? I use haproxy to balance, and the requests are in fact balanced.
  2. Will 100-1000 values per statement lower the qps?

@victorggsimida

@knz I still have one question: why is it always the same node that has the memory problem? I use haproxy as the load balancer, and haproxy is installed on another machine, not one in the cluster.

@knz (Contributor) commented Jan 31, 2019

You need to check in the web ui the number of SQL transactions/statements per node. I suspect that one node is taking more than its fair share despite the load balancer.

@jordanlewis (Member)

Closing again as unactionable. If you have a new problem we can help you with, please open a new issue and we'll take a look!

@coderplay

We are facing the same problem; it looks like cockroachdb can't limit its memory usage.

@jordanlewis (Member)

@coderplay many things have changed since 2017. You will need to provide much more information for us to be able to help you. You can open a new issue if you have a reproducible problem. Please include information like your schema, your workload, and the data in the heap_profiler directory of your log output. Thanks!

@coderplay

@jordanlewis The use case and the failures are almost the same as they were in 2017. Do you suggest posting the information here, or in another thread?

@jordanlewis (Member)

You can post here. If you have something reproducible, open a new thread.

@coderplay

@jordanlewis just posted on #76282
