Increasing memory usage in `update_current_time` #1500

FUSAKLA · 2019-02-28T07:10:01Z

Describe the bug
I see constantly increasing memory usage of my running Sanic instance.
After using tracemalloc to identify cause of this I found top 5 memory allocations

/usr/local/lib/python3.5/dist-packages/sanic/server.py:547: size=16.5 MiB, count=113860, average=152 B
<frozen importlib._bootstrap_external>:484: size=1075 KiB, count=17138, average=64 B
/usr/lib/python3.5/copy.py:240: size=882 KiB, count=14116, average=64 B
/usr/local/lib/python3.5/dist-packages/sanic/server.py:328: size=540 KiB, count=4940, average=112 B
/usr/local/lib/python3.5/dist-packages/sanic/server.py:157: size=434 KiB, count=3967, average=112 B

If I look at the /usr/local/lib/python3.5/dist-packages/sanic/server.py:547 which has size=16.5 MiB it leads to
https://github.com/huge-success/sanic/blob/52deebaf65ab48d5fbfa92607f22d96ee1bdb7a7/sanic/server.py#L607

Memory consumption graph (gaps are OOM kills due to memory limits in Kubernetes)

Could there be an issue with this part of coude in terms of leaking memory?

Environment (please complete the following information):

Debian stretch running in Kubernetes (Docker)
Version sanic-18.12.0-py3

Additional context
Sorry I'm not a Python expert, maybe I'm using the tracemalloc the wrong way or the issue s somewhere else. In that case I'm sorry and would be glad for any Ideas where to look or how to profile it.

Thanks in advance

The text was updated successfully, but these errors were encountered:

kbcasagrande · 2019-02-28T12:45:37Z

I was about to open an issue about it. I've also just identified where the possible leak comes from;
I checked the memory usage using mem_top;

After few requests, the output is:

refs:
1210    <class 'dict'> {140519667199136: <weakref at 0x7fcd3b70d638; to 'type' at 0x7fcd48d8eca0 (type)>, 140519667196960: 
1122    <class 'list'> ['import logging\n', 'import logging.config\n', 'import os\n', 'import re\n', 'import warnings\n', '
910     <class 'dict'> {'application/octet-stream': ['.a', '.bin', '.dll', '.exe', '.o', '.obj', '.so', '.lha', '.lzh', '.c
444     <class 'dict'> {'builtins': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, '_frozen_importlib': 
366     <class 'dict'> {'__name__': 'socket', '__doc__': "This module provides socket operations and some related functions
339     <class 'dict'> {'__name__': 'os', '__doc__': "OS routines for NT or Posix depending on what system we're on.\n\nThi
337     <class 'dict'> {'__name__': '_socket', '__doc__': 'Implementation module for socket operations.\n\nSee the socket m
337     <class 'dict'> {'__name__': '_socket', '__doc__': 'Implementation module for socket operations.\n\nSee the socket m
336     <class 'list'> ['fromfd', 'getfqdn', 'create_connection', 'AddressFamily', 'SocketKind', 'AF_ALG', 'AF_APPLETALK', 
310     <class 'list'> ['altsep', 'curdir', 'pardir', 'sep', 'pathsep', 'linesep', 'defpath', 'name', 'path', 'devnull', 'S

bytes:
256544   <uvloop.Loop running=True closed=False debug=False>
36968    {'application/octet-stream': ['.a', '.bin', '.dll', '.exe', '.o', '.obj', '.so', '.lha', '.lzh', '.c
18528    {140519667199136: <weakref at 0x7fcd3b70d638; to 'type' at 0x7fcd48d8eca0 (type)>, 140519667196960: 
18528    {'builtins': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, '_frozen_importlib': 
18528    {'__name__': 'os', '__doc__': "OS routines for NT or Posix depending on what system we're on.\n\nThi
18528    {'__name__': 'socket', '__doc__': "This module provides socket operations and some related functions
9656     ['import logging\n', 'import logging.config\n', 'import os\n', 'import re\n', 'import warnings\n', '
9320     {'__name__': 'posix', '__doc__': 'This module provides access to operating system functionality that
9320     {'__name__': 'posix', '__doc__': 'This module provides access to operating system functionality that
9320     {'__name__': '_socket', '__doc__': 'Implementation module for socket operations.\n\nSee the socket m

types:
8104     <class 'function'>
4781     <class 'tuple'>
4560     <class 'dict'>
2488     <class 'weakref'>
2018     <class 'list'>
1416     <class 'method_descriptor'>
1361     <class 'wrapper_descriptor'>
1241     <class 'getset_descriptor'>
1090     <class 'type'>
1045     <class 'set'>

After many requests for few minutes it turns to:

refs:
1210    <class 'dict'> {139867876654240: <weakref at 0x7f3579b26638; to 'type' at 0x7f35871a7ca0 (type)>, 139867876652064: 
1122    <class 'list'> ['import logging\n', 'import logging.config\n', 'import os\n', 'import re\n', 'import warnings\n', '
1079    <class 'set'> {<TimerHandle functools.partial(<function update_current_time at 0x7f3578b510d0>, <uvloop.Loop runni
910     <class 'dict'> {'application/octet-stream': ['.a', '.bin', '.dll', '.exe', '.o', '.obj', '.so', '.lha', '.lzh', '.c
444     <class 'dict'> {'builtins': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, '_frozen_importlib': 
366     <class 'dict'> {'__name__': 'socket', '__doc__': "This module provides socket operations and some related functions
339     <class 'dict'> {'__name__': 'os', '__doc__': "OS routines for NT or Posix depending on what system we're on.\n\nThi
337     <class 'dict'> {'__name__': '_socket', '__doc__': 'Implementation module for socket operations.\n\nSee the socket m
337     <class 'dict'> {'__name__': '_socket', '__doc__': 'Implementation module for socket operations.\n\nSee the socket m
336     <class 'list'> ['fromfd', 'getfqdn', 'create_connection', 'AddressFamily', 'SocketKind', 'AF_ALG', 'AF_APPLETALK', 

bytes:
256544   <uvloop.Loop running=True closed=False debug=False>
36968    {'application/octet-stream': ['.a', '.bin', '.dll', '.exe', '.o', '.obj', '.so', '.lha', '.lzh', '.c
32992    {<TimerHandle functools.partial(<function update_current_time at 0x7f3578b510d0>, <uvloop.Loop runni
18528    {139867876654240: <weakref at 0x7f3579b26638; to 'type' at 0x7f35871a7ca0 (type)>, 139867876652064: 
18528    {'builtins': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, '_frozen_importlib': 
18528    {'__name__': 'os', '__doc__': "OS routines for NT or Posix depending on what system we're on.\n\nThi
18528    {'__name__': 'socket', '__doc__': "This module provides socket operations and some related functions
9656     ['import logging\n', 'import logging.config\n', 'import os\n', 'import re\n', 'import warnings\n', '
9320     {'__name__': 'posix', '__doc__': 'This module provides access to operating system functionality that
9320     {'__name__': 'posix', '__doc__': 'This module provides access to operating system functionality that

types:
8104     <class 'function'>
5849     <class 'tuple'>
4556     <class 'dict'>
2486     <class 'weakref'>
2018     <class 'list'>
1416     <class 'method_descriptor'>
1361     <class 'wrapper_descriptor'>
1241     <class 'getset_descriptor'>
1090     <class 'type'>
1084     <class 'functools.partial'>

The following image show the memory usage evolution of the same application running in the Kubernetes cluster:

During the time where the memory usage grew, the application basically received health check requests, which is a simple get that returns a simple json saying that the application is alive;

Environment (please complete the following information):

Alpine 3.8 running in Kubernetes (Docker)
Version sanic-18.12.0
Python version 3.6.6

vltr · 2019-02-28T14:06:17Z

I would take a wild guess that Python's GC is not collecting the hanging futures inside each function call to itself - but I might be wrong.

IMO, we don't need to update the clock every second; we just need to call time() where current_time is required.

FUSAKLA · 2019-02-28T15:41:09Z

Might be. According to the comment it was supposed to mitigate the possible overhead but I'm not sure how much it helps.

Cache the current time, since it is needed at the end of every
keep-alive request to update the request timeout time

vltr · 2019-02-28T15:50:29Z

Yep, but I think this is unnecessary since calling time() has an almost neglectable overhead:

import timeit

setup_stmt = """
from time import time


def call_time():
    assert time() is not None

"""
timed_stmt = {
    "call_time": 'call_time()',
}
times = 1_000_000

for name, stmt in timed_stmt.items():
    time_to_run = timeit.timeit(stmt, setup=setup_stmt, number=times)
    time_each = time_to_run / times
    print("%s: %d results in %f s (%f us per result, %f per sec)" %
          (name, times, time_to_run, time_each * times, 1. / time_each))

call_time: 1000000 results in 0.110726 s (0.110726 us per result, 9031334.549457 per sec)

langma · 2019-03-01T07:38:01Z

I can confirm that removing the update_current_time logic keeps the pod's memory in check. I think the leak is happening only when there are requests? I'm not totally sure, but I agree with @vltr that the benefit of caching time() is probably really small, so I'm not gonna dig deeper for now.

Removing current_time variable also removes the only global variable in the codebase :)

I can make a PR if this sounds like the proper way to go forward?

FUSAKLA · 2019-03-04T08:33:18Z

Great, thanks for resolving it so quickly!
IIUC the fix will be in 18.13 and possibly 18.12.1, is there any estimation on the release date?

ahopkins · 2019-03-05T09:15:47Z

@FUSAKLA Look for two releases for 18.12.1 (a few fixes for the LTS) and 19.3 (the next releases in the schedule) sometime in the next couple weeks.

FUSAKLA · 2019-03-25T09:28:09Z

FYI memory usage after 19.3.1 upgrade, thanks works great!

ahopkins · 2019-03-25T18:22:05Z

Thanks for sharing

RifeWang · 2019-04-10T03:51:54Z

I had the same issue with memory leak for 18.12.1 , and then I have updated to 19.3 ，but the memory is still increasing in a very slow speed.

memory usage in two days ：

and I use multiple workers, the machine is 4 cores and 6GB memory:

workers = multiprocessing.cpu_count()
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000, workers=workers)

the result when I exec ps -aux | grep python

the another question is why so many processes , I think it should be 4 processes ? I use pm2 fork mode , its relation with pm2 ?

ahopkins · 2019-04-11T04:08:52Z

@RifeWang I would look to pm2 first. I am not 100% familiar with it and how it handles python applications. Does it act as a proxy server? How and when does it decide to start a process? To continue this conversation, please start a new issue and link to this here.

ladler0320 · 2019-05-31T14:08:11Z

I also experience the memory leak issue when using more than one worker on Sanic 19.6.0 with python 3.7.3

sjsadowski · 2019-05-31T14:14:24Z

@ladler0320 19.6.0 is not out yet, do you mean 19.3?

ladler0320 · 2019-05-31T14:28:02Z

@sjsadowski I'm not quite sure, but it's probably the 19.6.0.
I installed it using
pip install git+https://github.com/huge-success/sanic.git@a57c14c70bca3514d502db3ce53fd3262d1af149
because of #1549 fix and it was shown that the Sanic version is 19.6.0

sjsadowski · 2019-05-31T14:50:03Z

Got it, you installed from git master. Can you share your code and process to determine memory leaks?

ladler0320 · 2019-06-10T09:05:21Z

@sjsadowski I do apologize, but I'm not sure I'm allowed to share the code. However, I can confirm that memory leak occurs when I'm using more than one worker and only when the service receives requests.
Sure, I'll try to investigate that issue and I'll let you know if there will be any update.

sjsadowski · 2019-06-10T13:44:46Z

@ladler0320 if you can do pseudocode, that would help too. It's hard to know what to fix if we can't identify what and where it's broken.

ladler0320 · 2019-06-11T06:27:20Z

@sjsadowski Sure.

def init_app():
   app_local = Sanic(name="App", log_config=LOGGING_CONFIG)
   app_local.config.update(APP_CONFIG) # "RESPONSE_TIMEOUT": 0.5

   app_local.error_handler.add(exceptions.ServiceUnavailable, timeout_error_handler)
   app_local.add_route(handler, "/models/<model>", methods=["POST"])

   return app_local


async def handler(request, model):
   loop = asyncio.get_event_loop()
   run = loop.run_in_executor

   try:
       with async_timeout.timeout(0.5, loop=loop):
           data = request.json
           resp = await run(pool, function, model, data)
           status = 200
           resp = js.dumps(resp, ensure_ascii=False)

   except (concurrent.futures.CancelledError,
           concurrent.futures.TimeoutError,
           asyncio.TimeoutError):
       resp = ""
       status = 504

   except Exception as e:
       logger.info(e)
       resp = ""
       status = 500

   return response.text(resp, status=status)


def timeout_error_handler(request, exception):
   return response.text("", 504)


if __name__ == '__main__':
   parser = argparse.ArgumentParser()
   parser.add_argument("--host", default=os.getenv("HOST", "0.0.0.0"))
   parser.add_argument("--port", type=int, default=os.getenv("PORT", 1234))

   args = parser.parse_args()

   host = args.host
   port: int = args.port
   workers = int(os.getenv("WORKERS")) 
   threads = int(os.getenv("THREADS"))

   app = init_app()
   pool = concurrent.futures.ThreadPoolExecutor(max_workers=threads)
   app.run(host=host, port=port, workers=workers, debug=False)

vankata-ashokkumar · 2021-06-09T12:45:28Z

Facing a similar issue with 20.6.3, Any update on this?

@sjsadowski

ahopkins · 2021-06-09T12:52:50Z

Update on what? This is closed 2 years. If you are experiencing something, please open a new issue and include some relevant details and source.

langma mentioned this issue Mar 1, 2019

Remove current_time prefetch #1502

Merged

sjsadowski closed this as completed in #1502 Mar 3, 2019

harshanarayana mentioned this issue Mar 17, 2019

Sanic server killed #1518

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Increasing memory usage in `update_current_time` #1500

Increasing memory usage in `update_current_time` #1500

FUSAKLA commented Feb 28, 2019

kbcasagrande commented Feb 28, 2019 •

edited

Loading

vltr commented Feb 28, 2019

FUSAKLA commented Feb 28, 2019

vltr commented Feb 28, 2019

langma commented Mar 1, 2019

FUSAKLA commented Mar 4, 2019

ahopkins commented Mar 5, 2019

FUSAKLA commented Mar 25, 2019

ahopkins commented Mar 25, 2019

RifeWang commented Apr 10, 2019

ahopkins commented Apr 11, 2019

ladler0320 commented May 31, 2019

sjsadowski commented May 31, 2019

ladler0320 commented May 31, 2019

sjsadowski commented May 31, 2019

ladler0320 commented Jun 10, 2019

sjsadowski commented Jun 10, 2019

ladler0320 commented Jun 11, 2019

vankata-ashokkumar commented Jun 9, 2021 •

edited

Loading

ahopkins commented Jun 9, 2021

Increasing memory usage in update_current_time #1500

Increasing memory usage in update_current_time #1500

Comments

FUSAKLA commented Feb 28, 2019

kbcasagrande commented Feb 28, 2019 • edited Loading

vltr commented Feb 28, 2019

FUSAKLA commented Feb 28, 2019

vltr commented Feb 28, 2019

langma commented Mar 1, 2019

FUSAKLA commented Mar 4, 2019

ahopkins commented Mar 5, 2019

FUSAKLA commented Mar 25, 2019

ahopkins commented Mar 25, 2019

RifeWang commented Apr 10, 2019

ahopkins commented Apr 11, 2019

ladler0320 commented May 31, 2019

sjsadowski commented May 31, 2019

ladler0320 commented May 31, 2019

sjsadowski commented May 31, 2019

ladler0320 commented Jun 10, 2019

sjsadowski commented Jun 10, 2019

ladler0320 commented Jun 11, 2019

vankata-ashokkumar commented Jun 9, 2021 • edited Loading

ahopkins commented Jun 9, 2021

Increasing memory usage in `update_current_time` #1500

Increasing memory usage in `update_current_time` #1500

kbcasagrande commented Feb 28, 2019 •

edited

Loading

vankata-ashokkumar commented Jun 9, 2021 •

edited

Loading