Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory leak of GenConverter with detailed_validation = True #290

Closed
Bryant-Yang opened this issue Jul 27, 2022 · 6 comments
Closed

memory leak of GenConverter with detailed_validation = True #290

Bryant-Yang opened this issue Jul 27, 2022 · 6 comments

Comments

@Bryant-Yang
Copy link

while True:
# creating GenConverter instance with all default params.
# registering some simple structure hook,
# loading some data,

the memory usage of process keep increasing.

If set detailed_validation = False, there is no memory leak. I'm not sure if it cost by line cache.

@Tinche
Copy link
Member

Tinche commented Jul 27, 2022

Interesting, the hooks should be generated only once in any case. Would be curious if you can get a small repro?

@Tinche
Copy link
Member

Tinche commented Jul 27, 2022

I tried getting a repro:

from attrs import define

from cattrs import GenConverter

c = GenConverter()


@define
class Test:
    a: int
    b: int
    c: int


while True:
    c.structure({"a": 1, "b": 2, "c": 3}, Test)

but it doesn't leak.

@Bryant-Yang
Copy link
Author

This issue is found in one of my test server which has little memory, in a scheduled task loop which cause memory usage too high, had to reboot after running 2 days.

I have made my converter usage as singleton.
It's just not a clear constraint that this converter object could not be gc collected in some situations.

I can give a simple test as below:

import gc
from datetime import datetime
from typing import List

import cattr
import time
from attr import attrs

def create_converter(detailed_validation):
    converter = cattr.GenConverter(detailed_validation=detailed_validation)
    converter.register_structure_hook(datetime,
                                      lambda val, _: datetime.strptime(val, '%Y-%m-%d %H:%M:%S')
                                      if val is not None else None)
    return converter


@attrs(auto_attribs=True)
class TestData:
    @attrs(auto_attribs=True)
    class InfoList:
        name: str

    name: str
    dt: datetime
    info_list: List[InfoList]


def test(detailed_validation):
    converter = create_converter(detailed_validation)
    data = {'name': "testAAA", "dt": datetime.now().strftime('%Y-%m-%d %H:%M:%S'), "info_list": [{'name': "testAAA"}] * 100}
    for _ in range(1000):
        converter.structure(data, TestData)


if __name__ == '__main__':
    import os
    import psutil
    process = psutil.Process(os.getpid())
    while True:
        test(detailed_validation=True)  # memory usage increasing
        # test(detailed_validation=False)   # memory usage not increasing
        gc.collect()
        time.sleep(.1)
        print(f'mem info: {process.memory_info().rss}')

Interesting, the hooks should be generated only once in any case. Would be curious if you can get a small repro?

@Tinche
Copy link
Member

Tinche commented Jul 27, 2022

Ah, you seem to be creating a converter in a loop. I strongly advise not doing that, not only with cattrs, but also with libraries like attrs and dataclasses. All of these libraries have heavy initialization steps (including code generation) to make them faster afterwards; if you're constantly creating new converters you're paying the price and getting no benefit.

This is probably a leak somewhere in linecache, yeah. The hooks are generated once per converter. I can take a look when I have spare cycles, but due to what I said in the first paragraph I think this is lower prio.

@Bryant-Yang
Copy link
Author

yes,don't create instance repeatedly if not necessary,agree that.

@Tinche
Copy link
Member

Tinche commented Dec 26, 2023

This should actually already be fixed by #464

@Tinche Tinche closed this as completed Dec 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants