
[5.3] Queue worker memory leak #16783

Closed
steve-rhodes opened this issue Dec 13, 2016 · 30 comments

@steve-rhodes

steve-rhodes commented Dec 13, 2016

  • Laravel Version: 5.3.26
  • PHP Version: 7.0
  • Database Driver & Version:

I run a single queue worker on my production server (EC2, Amazon Linux, nginx, PHP 7.0) under supervisor.

The supervisor config is:

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/html/artisan queue:work sqs --sleep=3 --tries=3
autostart=true
autorestart=true
numprocs=1
redirect_stderr=true
stdout_logfile=/tmp/supervisor_worker.log

The PHP process then slowly starts eating up memory, and after 3-4 days the server runs out of memory and becomes unresponsive.

I'm not even running jobs yet! It's just idle. I'm tracking the memory usage now and can see that it slowly and steadily goes up.

[screenshot: graph of memory usage slowly and steadily increasing]

@it-can
Contributor

it-can commented Dec 13, 2016

Does this also happen when using a different driver, e.g. Beanstalkd, Redis, or the database?

@steve-rhodes
Author

I think I know what it is, but I want to watch the metrics for a few hours before reporting back with 100% certainty.

@steve-rhodes
Author

steve-rhodes commented Dec 13, 2016

OK, I've solved the problem. It was caused by my own stupidity, but it still should not have led to a memory leak.

As I said, I haven't dispatched any jobs yet. I was setting up a server for an upcoming project and wanted to get infrastructure like supervisor in place first.

Although I set QUEUE_DRIVER=sqs in .env (I'm using AWS SQS), I hadn't filled out the connection details in config/queue.php; they were still at the defaults.

After setting the keys and other bits for 'sqs', there was no more memory leak.

I still think this shouldn't happen, even if the 'sqs' details are incorrect, so I'll leave it up to you to decide whether the issue should be closed.
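
For reference, the block in question is the 'sqs' connection in config/queue.php. A minimal sketch of what needed filling in (placeholder values only; substitute your own credentials, account id, and region):

    // config/queue.php: the 'sqs' connection that was still at its defaults
    'sqs' => [
        'driver' => 'sqs',
        'key'    => 'your-public-key',
        'secret' => 'your-secret-key',
        'prefix' => 'https://sqs.us-east-1.amazonaws.com/your-account-id',
        'queue'  => 'your-queue-name',
        'region' => 'us-east-1',
    ],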

@steve-rhodes steve-rhodes changed the title Queue worker memory leak - Laravel 5.3 [5.3] Queue worker memory leak Dec 13, 2016
@GrahamCampbell
Member

You should lower the flag that tells Laravel when to restart the queue worker. It's designed that way because there will be memory leaks.

@GrahamCampbell
Member

GrahamCampbell commented Dec 15, 2016

--memory option. The default is 128 (MB) before it restarts. Maybe 32 would be better for your use case?
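
Applied to the supervisor config above, that would look something like this (same command, just with the flag added):

    command=php /var/www/html/artisan queue:work sqs --sleep=3 --tries=3 --memory=32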

@steve-rhodes
Author

I think it was going way above 128MB, otherwise my server would not have run out of memory...

@steve-rhodes
Author

@GrahamCampbell After I configured my SQS correctly it didn't go up for half a day, but then it slowly started to creep up again. I wanted to see if it would restart when reaching the default 128MB, but as you can see in the screenshot I took today, that's not happening. The workers are now at 16.1%, 15.6%, and 15.4% (I have four running). The server has a total of 1GB, so 16% = over 164MB. Why do they not restart?

This is my config, and below it a screenshot showing the memory usage on my server:

; Sample supervisor config file.
;
; For more information on the config file, please see:
; http://supervisord.org/configuration.html
;
; Notes:
;  - Shell expansion ("~" or "$HOME") is not supported.  Environment
;    variables can be expanded using this syntax: "%(ENV_HOME)s".
;  - Comments must have a leading space: "a=b ;comment" not "a=b;comment".

[unix_http_server]
file=/tmp/supervisor.sock   ; (the path to the socket file)
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; (default is no username (open server))
;password=123               ; (default is no password (open server))

;[inet_http_server]         ; inet (TCP) server disabled by default
;port=127.0.0.1:9001        ; (ip_address:port specifier, *:port for all iface)
;username=user              ; (default is no username (open server))
;password=123               ; (default is no password (open server))

[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB        ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10           ; (num of main logfile rotation backups;default 10)
loglevel=info                ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false               ; (start in foreground if true;default false)
minfds=1024                  ; (min. avail startup file descriptors;default 1024)
minprocs=200                 ; (min. avail process descriptors;default 200)
;umask=022                   ; (process file creation umask;default 022)
;user=chrism                 ; (default is current user, required if root)
;identifier=supervisor       ; (supervisord identifier, default is 'supervisor')
;directory=/tmp              ; (default is not to cd during start)
;nocleanup=true              ; (don't clean up tempfiles at start;default false)
;childlogdir=/tmp            ; ('AUTO' child log dir, default $TEMP)
;environment=KEY="value"     ; (key value pairs to add to environment)
;strip_ansi=false            ; (strip ansi escape codes in logs; def. false)

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as http_username if set
;password=123                ; should be same as http_password if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

; (remaining commented-out sample [program:x], [eventlistener:x], [group:x],
; and [include] sections from the stock supervisord config omitted)

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/html/artisan queue:work sqs --sleep=3 --tries=3
autostart=true
autorestart=true
numprocs=4
redirect_stderr=true
stdout_logfile=/tmp/supervisor_worker.log

[screenshot: top output showing the worker processes at around 16% memory each]

@GrahamCampbell
Member

Why do they not restart?

Try lowering the limit. What memory_get_usage returns is probably less than you expect.

@steve-rhodes
Author

@GrahamCampbell I don't understand. When you look at what top reports, you can see that there is only 924360k free out of a total of 1019316k. What does memory_get_usage have to do with my issue?

I lowered it to 64MB. Let's see if they will restart.

@GrahamCampbell
Member

memory_get_usage is the function Laravel calls from inside PHP, and it reports differently from top.
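
To illustrate the discrepancy (a minimal sketch; the exact numbers depend on your extensions and allocator):

    <?php
    // memory_get_usage(false) counts memory in active use by PHP's own allocator;
    // memory_get_usage(true) counts memory the allocator has reserved from the OS.
    // Neither includes allocations made by extensions outside the Zend allocator,
    // which is why top's RSS column can be far higher than either number.
    echo memory_get_usage(false) . " bytes in use\n";
    echo memory_get_usage(true) . " bytes reserved\n";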

@steve-rhodes
Author

@GrahamCampbell But isn't 164MB way past the point where a restart is due, even if there is a difference in what PHP reports? I still don't get what you are trying to tell me.

I understand that the way the worker measures its memory usage might differ from what I see in top. But what is the solution? I'm running out of RAM, and I got a couple of alarms from AWS today that my server is low on memory. Why is there such a big discrepancy?

@nokios

nokios commented Mar 23, 2017

I realize this issue is closed, but I am experiencing this problem too. I am running PHP 7.1.2 on Ubuntu 16.04 and CentOS (CentOS Linux release 7.3.1611 (Core)) systems, and I see a drastic difference between what top/htop/ps aux all report and what PHP's own memory_get_usage() reports (far less).

Thus, my system runs out of memory while the processes themselves think they're well under the 128MB limit. I am not sure if this is a PHP internals issue or what.

I will say my current workaround is a scheduled hourly soft restart with this in Console/Kernel.php:

    protected function schedule(Schedule $schedule)
    {
        // Used to combat memory creep issues that I can't solve otherwise at this moment.
        $schedule->command('queue:restart')->hourly();
    }
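
(Note: queue:restart works by writing a timestamp to the cache, under the key illuminate:queue:restart, which each worker checks between jobs; so a working cache driver is required, and a worker that is busy restarts only after its current job finishes.)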

@djoks

djoks commented Apr 6, 2017

This is most definitely an unresolved bug.

@Eugst

Eugst commented Jun 12, 2017

+1 unresolved bug!
@GrahamCampbell please reopen this issue.

@tedokon

tedokon commented Jul 18, 2017

I've got the same issue.
+1 unresolved bug.

@GrahamCampbell

@decadence
Contributor

5.3 is not supported. Raise an issue with a fresh Laravel version.

@kevindingwang

+1 unresolved bug.

@chasebolt

Also hitting this memory leak issue with Laravel 5.6 / PHP 7.2 on CentOS 7.

@kalemdzievski

Also having the same issue.

+1 unresolved bug!
@GrahamCampbell please reopen this issue.

@ahmeddabak
Contributor

+1

@ajosephjohnson

+1

@movAX13h

movAX13h commented Oct 8, 2018

+1 ... as soon as I call a page that uses the database, memory usage increases by 200MB/s. It also happens with php artisan routes. Tried on Windows with WAMP and on Ubuntu with the latest Apache/PHP/MySQL versions. No, the limit is not the issue. If I let it run, it consumes more than 10GB in both of my tests and never finishes, with no error reported that is of any use. This problem goes deep. I'm quite sure it is a misconfiguration somewhere or an incompatibility of a module, but without somewhat useful error messages it's impossible to find out.

@majuansari

This issue still persists in 5.5, even after using memory_get_usage(true):

// Worker::memoryExceeded(), modified here to use memory_get_usage(true)
public function memoryExceeded($memoryLimit)
{
    return (memory_get_usage(true) / 1024 / 1024) >= $memoryLimit;
}

197.266 MB php
199.75 MB php
201.355 MB php
207.438 MB php
211.863 MB php
215.105 MB php
217.805 MB php
219.723 MB php
229.617 MB php
235.758 MB php

I am using the default value Laravel configured (128MB).

@t202wes

t202wes commented Dec 24, 2018

I'm having the same problem with AWS and workers. The Laravel server just uses more and more memory until it fails. I have no idea what the issue is. All of the jobs are queued up via HTTP requests from the AWS worker environment, so they should all be closing and clearing memory after each one completes.

[screenshot: memory usage graph]

@briedis

briedis commented Jan 2, 2019

Running into memory issues after upgrading PHP from 7.0 to 7.2 (Laravel 5.4). Perhaps a GC issue?

I assume this --memory flag is not documented anywhere? If the default is 128, it does not look like it works, because my long-running workers just keep hogging more and more (until 16GB are exhausted).

It would be nice if the documentation were updated with a description of how this flag works. Does it kill the process mid-job, or does it just stop it between jobs?
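
For what it's worth, the check happens between jobs, never mid-job. A paraphrased sketch of the daemon loop in Illuminate\Queue\Worker (not the exact source; method names abridged):

    // Paraphrased from Illuminate\Queue\Worker::daemon(); a sketch, not the exact source.
    while (true) {
        $job = $this->getNextJob($connection, $queue);

        if ($job) {
            $this->process($connectionName, $job, $options);
        } else {
            $this->sleep($options->sleep);
        }

        // Both checks run only after a job (or a sleep) completes:
        if ($this->memoryExceeded($options->memory)) {
            $this->stop(); // exit cleanly; supervisor starts a fresh process
        }

        if ($this->queueShouldRestart($lastRestart)) {
            $this->stop();
        }
    }

So a worker stuck inside a long job will not be killed by --memory; it only exits once the job returns.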

@chris97pl

I just noticed the same thing with the Redis driver and Laravel 5.7. Memory usage goes up about 1MB every 10-25 seconds. Thankfully artisan queue:restart works great, but the problem still persists.

@majuansari

The problem here is that the calculated memory usage always returns the wrong value. What I am doing is running artisan queue:restart every 20 minutes, as we have some memory-intensive jobs, and it's working well so far.
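
A scheduler entry for that interval might look like this (a sketch; the raw cron expression is used since there is no built-in everyTwentyMinutes() helper in these versions):

    // In App\Console\Kernel::schedule(): soft-restart workers every 20 minutes
    $schedule->command('queue:restart')->cron('*/20 * * * *');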

@mfn
Contributor

mfn commented Jan 10, 2019

See also laravel/horizon#375

So far I've seen no one who experiences this actually sit down and properly debug it.

@majuansari

always returning the wrong value

This is usually the case when there's a leak in some extension/third-party code that does not count towards PHP's own memory, likely because it uses its own allocator. But that's just a guess; you didn't really provide much information (and maybe this issue isn't the best place to debug your code).

@chris97pl

Memory usage goes up about 1MB every 10-25 seconds.

This sounds like it increases with every processed job? It's just a guess, but that's what it sounds like. Only you can diagnose it, as you wrote the job/code.

@briedis

Perhaps an gc issue?

An interesting solution was posted fairly recently at laravel/ideas#1380 (comment); you can add it yourself for the time being if it solves your issue.
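
A minimal sketch of that idea: force collection of cyclic references after every processed job from a service provider (the linked comment's exact code may differ):

    <?php
    // app/Providers/AppServiceProvider.php (sketch)

    namespace App\Providers;

    use Illuminate\Support\Facades\Queue;
    use Illuminate\Support\ServiceProvider;

    class AppServiceProvider extends ServiceProvider
    {
        public function boot()
        {
            // Run PHP's cycle collector after each job the worker processes.
            Queue::after(function ($event) {
                gc_collect_cycles();
            });
        }
    }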

Just trying to help here, not dissing your problems 😄


@FoxxMD

FoxxMD commented Jul 24, 2019

Also experiencing this issue. @mfn's suggestion above to use gc_collect_cycles() works well!

I have another workaround that is helpful for anyone using supervisord with long-running workers: the superlance plugin for supervisord can monitor the memory usage of each program it runs, using memmon. superlance is installed the same way supervisord is (through pip).

In supervisord.conf, add an [eventlistener] section like so to automatically restart any process consuming more than 100MB of memory:

[eventlistener:memmonListener]
command=memmon -a 100MB
events=TICK_60
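
(The -a flag applies the limit to every process supervisord manages; something like -p laravel-worker=100MB would target a specific program instead.)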
