
*: node crash with OOM despite memory limits set #20609

Closed

guotie opened this issue Dec 11, 2017 · 43 comments
Labels
A-admission-control
C-investigation: Further steps needed to qualify. C-label will change.
O-community: Originated from the community
S-2-temp-unavailability: Temp crashes or other availability problems. Can be worked around or resolved by restarting.

Comments

@guotie commented Dec 11, 2017

Is this a question, feature request, or bug report?

QUESTION


BUG REPORT

  1. Please supply the header (i.e. the first few lines) of your most recent
    log file for each node in your cluster. On most unix-based systems
    running with defaults, this boils down to the output of

    grep -F '[config]' cockroach-data/logs/cockroach.log


I171211 00:37:44.628748 1 util/log/clog.go:992  [config] file created at: 2017/12/11 00:37:44
I171211 00:37:44.628748 1 util/log/clog.go:992  [config] running on machine: node-02
I171211 00:37:44.628748 1 util/log/clog.go:992  [config] binary: CockroachDB CCL v1.1.3 (linux amd64, built 2017/11/27 13:59:10, go1.8.3)
I171211 00:37:44.628748 1 util/log/clog.go:992  [config] arguments: [cockroach start --certs-dir=/home/guotie/certs --host=10.104.54.205 --join=10.104.189.106:26257,10.135.69.133:26257 --cache=25% --max-sql-memory=30%]

  2. Please describe the issue you observed:

I set the memory limits below 30%, but cockroach is now using over 70% of memory:


  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                     
 9136 guotie    20   0 5010372 2.656g  15084 S  46.7 71.7  85:57.31 cockroach  
@guotie (Author) commented Dec 11, 2017

Now:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                     
 9136 guotie    20   0 5010372 2.734g  15084 S   2.7 73.8  86:41.58 cockroach  

@jordanlewis (Member)

Hi @guotie, what workload are you running?

The --cache argument specifies what fraction of memory to give to RocksDB, and the --max-sql-memory specifies what fraction of memory to give to SQL execution. There will still be some overhead on top of that.

@guotie (Author) commented Dec 12, 2017

It finally crashed.

@guotie (Author) commented Dec 12, 2017

[883436.024047] [22499]  1000 22499  1053641   898995    2015        0             0 cockroach
[883436.032105] Out of memory: Kill process 22499 (cockroach) score 928 or sacrifice child
[883436.033973] Killed process 22499 (cockroach) total-vm:4214564kB, anon-rss:3595980kB, file-rss:0kB, shmem-rss:0kB

@guotie (Author) commented Dec 12, 2017

top - 08:51:05 up 10 days, 16:29,  1 user,  load average: 1.47, 0.63, 0.28

@guotie changed the title from "memory usage exceeds quota memory ratio" to "memory leak" Dec 12, 2017
@guotie (Author) commented Dec 13, 2017

Now cockroach panics on connect: the process was killed by the OS due to out-of-memory when a client connected to the server.

[guotie@node-01 logs]$ dmesg
[1033512.895767] YDService invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[1033512.897942] YDService cpuset=/ mems_allowed=0
[1033512.899496] CPU: 0 PID: 819 Comm: YDService Not tainted 3.10.0-514.26.2.el7.x86_64 #1
[1033512.901438] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[1033512.903078]  ffff88011655ce70 00000000048696c5 ffff880036927938 ffffffff81687133
[1033512.904979]  ffff8800369279c8 ffffffff816820de ffffffff810eb0dc ffff8800dba3bbb0
[1033512.906945]  ffff8800dba3bbc8 0000000000000206 ffff88011655ce70 ffff8800369279b8
[1033512.908649] Call Trace:
[1033512.909831]  [<ffffffff81687133>] dump_stack+0x19/0x1b
[1033512.911364]  [<ffffffff816820de>] dump_header+0x8e/0x225
[1033512.912906]  [<ffffffff810eb0dc>] ? ktime_get_ts64+0x4c/0xf0
[1033512.914591]  [<ffffffff8113d22f>] ? delayacct_end+0x8f/0xb0
[1033512.916267]  [<ffffffff81184d0e>] oom_kill_process+0x24e/0x3c0
[1033512.917841]  [<ffffffff811847ad>] ? oom_unkillable_task+0xcd/0x120
[1033512.919548]  [<ffffffff81184856>] ? find_lock_task_mm+0x56/0xc0
[1033512.921146]  [<ffffffff81093c0e>] ? has_capability_noaudit+0x1e/0x30
[1033512.922758]  [<ffffffff81185546>] out_of_memory+0x4b6/0x4f0
[1033512.924219]  [<ffffffff81682be7>] __alloc_pages_slowpath+0x5d7/0x725
[1033512.926064]  [<ffffffff8118b655>] __alloc_pages_nodemask+0x405/0x420
[1033512.927652]  [<ffffffff811cf9ca>] alloc_pages_current+0xaa/0x170
[1033512.929359]  [<ffffffff81180be7>] __page_cache_alloc+0x97/0xb0
[1033512.930949]  [<ffffffff81183760>] filemap_fault+0x170/0x410
[1033512.932845]  [<ffffffffa01b9016>] ext4_filemap_fault+0x36/0x50 [ext4]
[1033512.934575]  [<ffffffff811ac83c>] __do_fault+0x4c/0xc0
[1033512.936228]  [<ffffffff811accd3>] do_read_fault.isra.42+0x43/0x130
[1033512.937789]  [<ffffffff811b1461>] handle_mm_fault+0x6b1/0x1000
[1033512.939286]  [<ffffffff8168b6e7>] ? do_nanosleep+0xa7/0xf0
[1033512.940997]  [<ffffffff81692cc4>] __do_page_fault+0x154/0x450
[1033512.942586]  [<ffffffff81692ff5>] do_page_fault+0x35/0x90
[1033512.944158]  [<ffffffff8168f208>] page_fault+0x28/0x30
[1033512.945784] Mem-Info:
[1033512.947033] active_anon:920880 inactive_anon:12 isolated_anon:0
 active_file:1686 inactive_file:1911 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6527 slab_unreclaimable:2921
 mapped:420 shmem:92 pagetables:2917 bounce:0
 free:21045 free_pcp:195 free_cma:0
[1033512.957221] Node 0 DMA free:15340kB min:276kB low:344kB high:412kB active_anon:484kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:28kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:122 all_unreclaimable? yes
[1033512.966781] lowmem_reserve[]: 0 3327 3773 3773
[1033512.968668] Node 0 DMA32 free:60996kB min:59340kB low:74172kB high:89008kB active_anon:3275844kB inactive_anon:36kB active_file:6620kB inactive_file:7380kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3653624kB managed:3408868kB mlocked:0kB dirty:0kB writeback:0kB mapped:1668kB shmem:308kB slab_reclaimable:19676kB slab_unreclaimable:6208kB kernel_stack:1456kB pagetables:9940kB unstable:0kB bounce:0kB free_pcp:660kB local_pcp:540kB free_cma:0kB writeback_tmp:0kB pages_scanned:21860 all_unreclaimable? yes
[1033512.980250] lowmem_reserve[]: 0 0 446 446
[1033512.982165] Node 0 Normal free:7844kB min:7964kB low:9952kB high:11944kB active_anon:407192kB inactive_anon:12kB active_file:120kB inactive_file:260kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:457256kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:60kB slab_reclaimable:6404kB slab_unreclaimable:5472kB kernel_stack:656kB pagetables:1720kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:763 all_unreclaimable? yes
[1033512.993459] lowmem_reserve[]: 0 0 0 0
[1033512.995265] Node 0 DMA: 5*4kB (UE) 2*8kB (UM) 3*16kB (UM) 3*32kB (UEM) 3*64kB (UEM) 3*128kB (UEM) 3*256kB (UEM) 1*512kB (E) 3*1024kB (UEM) 1*2048kB (E) 2*4096kB (M) = 15348kB
[1033513.000564] Node 0 DMA32: 224*4kB (UEM) 497*8kB (UE) 461*16kB (EM) 294*32kB (UE) 223*64kB (UEM) 112*128kB (UE) 32*256kB (UEM) 3*512kB (UEM) 1*1024kB (M) 0*2048kB 0*4096kB = 61016kB
[1033513.005837] Node 0 Normal: 229*4kB (UEM) 204*8kB (UE) 96*16kB (UE) 48*32kB (UE) 15*64kB (E) 8*128kB (UE) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7860kB
[1033513.010732] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1033513.013472] 3737 total pagecache pages
[1033513.015367] 0 pages in swap cache
[1033513.017252] Swap cache stats: add 0, delete 0, find 0/0
[1033513.020491] Free swap  = 0kB
[1033513.022621] Total swap = 0kB
[1033513.024695] 1048476 pages RAM
[1033513.027233] 0 pages HighMem/MovableOnly
[1033513.029203] 77968 pages reserved
[1033513.031638] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[1033513.034736] [  342]     0   342    41054     1116      81        0             0 systemd-journal
[1033513.038191] [  365]     0   365    10887      210      22        0         -1000 systemd-udevd
[1033513.042083] [  366]     0   366    29703       81      26        0             0 lvmetad
[1033513.044754] [  447]     0   447    28641      116      25        0         -1000 auditd
[1033513.047401] [  469]     0   469    26364      246      53        0         -1000 sshd
[1033513.050054] [  471]     0   471   118972      806     124        0             0 rsyslogd
[1033513.052726] [  472]     0   472   138281     2644      89        0             0 tuned
[1033513.055132] [  474]     0   474     6080      129      15        0             0 systemd-logind
[1033513.057879] [  478]   996   478     2130       36       9        0             0 lsmd
[1033513.060353] [  480]    81   480     6620      107      17        0          -900 dbus-daemon
[1033513.063290] [  508]    38   508    11677      182      26        0             0 ntpd
[1033513.065803] [  567]     0   567     5941       49      15        0             0 atd
[1033513.068394] [  568]     0   568    31034      158      17        0             0 crond
[1033513.070853] [  577]     0   577    27508       32      10        0             0 agetty
[1033513.073262] [  578]     0   578    27508       31      12        0             0 agetty
[1033513.075911] [  579]     0   579     1082       35       7        0             0 acpid
[1033513.078218] [  810]     0   810   204373     1546      73        0             0 YDService
[1033513.080653] [ 1917]     0  1917    24321       86      18        0             0 sgagent
[1033513.083271] [ 1929]     0  1929    38803     1638      30        0             0 barad_agent
[1033513.085876] [ 1975]     0  1975    25165      151      18        0             0 YDLive
[1033513.088057] [ 2118]  1000  2118    15253     3501      31        0             0 consul
[1033513.090215] [32764]   997 32764   132394     1895      57        0             0 polkitd
[1033513.092463] [32436]     0 32436    41136     1852      34        0             0 barad_agent
[1033513.094820] [32437]     0 32437   201249     2121      50        0             0 barad_agent
[1033513.097143] [23485]     0 23485    36939      328      73        0             0 sshd
[1033513.099473] [23487]  1000 23487    37029      366      71        0             0 sshd
[1033513.101982] [23488]  1000 23488    29038      295      15        0             0 bash
[1033513.104257] [ 7676]  1000  7676   989741   901557    1872        0             0 cockroach
[1033513.106155] [ 7729]  1000  7729    26983       24      10        0             0 tail
[1033513.108109] Out of memory: Kill process 7676 (cockroach) score 930 or sacrifice child
[1033513.110066] Killed process 7676 (cockroach) total-vm:3958964kB, anon-rss:3605292kB, file-rss:936kB, shmem-rss:0kB

@jordanlewis (Member)

Hi @guotie, what workload were you running against the server? It's possible that you need to devote more RAM to your instance to prevent OOM errors.

@knz changed the title from "memory leak" to "*: node crash with OOM despite memory limits set" Jan 30, 2018
@knz (Contributor) commented Jan 30, 2018

@guotie in addition to Jordan's question, I would like to remind you that it is unsafe for the sum of the RocksDB cache size and the max SQL memory to exceed 50-60% of your total available RAM. This is because CockroachDB also keeps other data in memory beyond these two budgets.
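
For concreteness, a back-of-the-envelope sketch of that budget in Python (the ~4 GB total is an assumption based on the top output earlier in this thread, not a stated fact):

total_ram_gb = 4.0                                  # assumed from the reporter's VM
rocksdb_cache_gb = 0.25 * total_ram_gb              # --cache=25%          -> ~1.0 GB
max_sql_memory_gb = 0.30 * total_ram_gb             # --max-sql-memory=30% -> ~1.2 GB
budgeted_gb = rocksdb_cache_gb + max_sql_memory_gb  # ~2.2 GB, i.e. 55% of RAM
# The remaining ~45% must absorb the Go runtime, goroutine stacks, RocksDB
# memtables and index/filter blocks, and everything else, which is why
# pushing the two flags past 50-60% combined risks an OOM kill.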

@petermattis added the C-investigation label Jul 21, 2018
@tbg added this to the 2.1 milestone Jul 22, 2018
@jordanlewis (Member)

Closing this as unactionable. If you see something like this again, please check the logs/heap_profiler directory for a heap profile, which will help us understand the issue. We'll be happy to check it out.

@jordanlewis added the X-unactionable label Oct 11, 2018
@victorggsimida commented Dec 26, 2018

@jordanlewis I am hitting the same problem.

My server has 96 GB of RAM.

cockroachdb config:
--max-sql-memory=.25 --cache=.25

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
28218 xxx       20   0  0.105t  0.082t  54244 S 617.8 88.7   3045:26 cockroach

heap.tar.gz

The network is bad right now, so I can't upload it here.

@jordanlewis (Member)

Can you upload it to this issue? It shouldn't be too big. Or email it to me: [email protected]

@jordanlewis reopened this Dec 26, 2018
@victorggsimida

@jordanlewis Uploaded it just now, thank you! I also sent it to your email.

@knz added the O-community and S-2-temp-unavailability labels and removed the X-unactionable label Dec 27, 2018
@knz removed this from the 2.1 milestone Dec 27, 2018
@jordanlewis (Member)

@victorggsimida, what workload are you running to get this issue? There is a very large amount of memory tied up in the SQL parser. Are you running queries with very large strings?

[screenshot: heap profile showing the memory retained by the SQL parser]

@victorggsimida

@jordanlewis

  1. Workload: 60 threads inserting values into cockroachdb; in the web UI I see about 4k insert ops.

The picture you show means that parsing uses 12 GB of RAM, but the total memory used by cockroachdb is about 96 * 0.88 GB. Where is the rest of the memory being used?

  2. There are only INSERT statements, no SELECT statements.

I use haproxy to balance queries across the cockroachdb nodes, but only this node has the memory problem.

@knz (Contributor) commented Dec 28, 2018

I'll bet on long-lived connections with boatloads of prepared queries, never released -- the ASTs / input syntax strings are preserved but never reused.

Maybe we could test that by placing a limit on the max number of prepared statements in a session. (Postgres has one -- we could use the exact same.)

@knz (Contributor) commented Dec 28, 2018

@victorggsimida please explain how the client is sending the insert SQL to haproxy.

Also, please share the haproxy configuration. How long does it keep the connection open?

@victorggsimida

@knz
haproxy listens on port 1234 on a separate server (not one of the cockroachdb nodes) and redirects requests to the cockroachdb nodes (5 nodes).

Clients connect to that IP on port 1234, using a thread pool we developed ourselves. The connections are long-lived.

@knz (Contributor) commented Dec 28, 2018

See my comment above. I think your connection pool is using prepared statements and never releasing them. That would be the cause of the large memory usage.

@knz (Contributor) commented Dec 28, 2018

@victorggsimida please check http://initd.org/psycopg/articles/2012/10/01/prepared-statements-psycopg/ -- are you doing anything similar to this?
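
For reference, the pattern that article describes looks roughly like this (a sketch; the DSN and table are illustrative, not taken from this issue):

import psycopg2

conn = psycopg2.connect("postgresql://root@localhost:26257/mydb")  # assumed DSN
cur = conn.cursor()
cur.execute("PREPARE ins AS INSERT INTO t (a, b) VALUES ($1, $2)")  # parsed once, kept server-side
cur.execute("EXECUTE ins (%s, %s)", (1, 2))                         # reused on each call
# Without a matching cur.execute("DEALLOCATE ins"), a long-lived session
# keeps every prepared statement, and its parsed AST, alive on the server.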

@victorggsimida

@knz

  1. Version info:

Build Tag: v2.1.0
Build Time: 2018/10/30 12:32:34
Distribution: CCL
Platform: linux amd64 (x86_64-unknown-linux-gnu)
Go Version: go1.10.3
C Compiler: gcc 6.3.0
Build SHA-1: f0dd1dc
Build Type: release

  2. Are you doing anything similar to this?

No, absolutely not. I did not use Prepare in my SQL.

I use it like this:

conn = psycopg2.connect(...)  # connection details elided in the original
cursor = conn.cursor()
cursor.executemany(...)       # or cursor.execute(...); psycopg2's method is executemany

There is no PREPARE anywhere in my SQL.

@victorggsimida

@knz Hi, is there any more information I need to provide?

@knz (Contributor) commented Dec 30, 2018

I don't have further questions at this time, but perhaps Jordan does.

@victorggsimida

@jordanlewis Hi, is there any information I need to provide to help solve this problem?

@jordanlewis (Member)

@victorggsimida how many concurrent connections do you have in your app?

@victorggsimida

@jordanlewis
Hi, thank you for your reply. Please help me find the cause of this.

I tried setting the number of concurrent connections to 5 and to 60 separately, but the problem occurs at both concurrency levels.

The memory is always consumed on the same specific node.

The problem happens when I start the app (it extracts data from mysql and loads it into cockroachdb with UPSERT statements).

@knz (Contributor) commented Jan 3, 2019

Please share the exact syntax of the upsert statements, and also the output of EXPLAIN (VERBOSE) on them.

@victorggsimida

UPSERT INTO {db}.{table} ({fields},synctime) VALUES ({value_fields},'{now}')

@victorggsimida

@knz

@knz (Contributor) commented Jan 7, 2019

How many value fields do you have per statement?

How large is the table you are inserting into?

Are you running upsert between begin..commit, or as individual statements?

@victorggsimida

@knz

  1. Per statement: maybe several tens of thousands of values.
  2. The tables vary in size, from about 1 million (100w) to 10 million (1kw) rows; in the future, maybe 100 million (10kw).
  3. The upserts run between BEGIN and COMMIT.

@victorggsimida

@knz
Any progress on this issue? It has already caused the node to be killed by the system.

@knz (Contributor) commented Jan 18, 2019

OK, so a transaction with "tens of thousands" of values in a single mutation statement will not work and will cause several of the problems you observe, including server crashes if you do this multiple times over concurrent connections. We know of this problem, and the workaround is simple: modify/insert fewer rows/values at a time!

How about you try again with a max of 100-1000 values per statement, and a max of 10,000 rows modified per transaction? Would that work better?
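
A minimal sketch of that batching with psycopg2 (which this reporter uses); the DSN, database, table, and columns are invented for the example, and execute_values caps the rows per statement:

import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("postgresql://root@localhost:26257/mydb")  # assumed DSN

def upsert_batched(rows, batch_size=1000):
    # Upsert rows in chunks of at most batch_size values per statement,
    # committing after each chunk so every transaction stays small.
    for i in range(0, len(rows), batch_size):
        chunk = rows[i:i + batch_size]
        with conn.cursor() as cur:
            # execute_values expands the single "VALUES %s" placeholder into
            # a multi-row VALUES list, capped at page_size rows per statement.
            execute_values(
                cur,
                "UPSERT INTO mydb.mytable (a, b, synctime) VALUES %s",
                chunk,
                page_size=batch_size,
            )
        conn.commit()  # one small transaction per chunk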

@victorggsimida

@knz
Hi, so tens of thousands of values per statement cause the problem.

  1. But why does it always hit the same node? I use haproxy to balance, and the requests are in fact balanced.
  2. Will 100-1000 values per statement lower the qps?

@victorggsimida

@knz I still have one question: why is it always the same node that has the memory problem? I use haproxy as the load balancer, and haproxy is installed on another machine, not one in the cluster.

@knz (Contributor) commented Jan 31, 2019

You need to check in the web ui the number of SQL transactions/statements per node. I suspect that one node is taking more than its fair share despite the load balancer.

@jordanlewis (Member)

Closing again as unactionable. If you have a new problem we can help you with, please open a new issue and we'll take a look!

@coderplay

We are facing the same problem; it looks like cockroachdb can't limit its memory usage.

@jordanlewis (Member)

@coderplay many things have changed since 2017. You will need to provide much more information for us to be able to help you. You can open a new issue if you have a reproducible problem. Please include information like your schema, your workload, and the data in the heap_profiler directory of your log output. Thanks!

@coderplay

@jordanlewis The use case and the failures are almost the same as they were in 2017. Do you suggest posting the information here, or in another thread?

@jordanlewis (Member)

You can post here. If you have something reproducible, open a new thread.

@coderplay

@jordanlewis just posted on #76282
