Marshmallow Memory Leak #105

Closed
SoundsSerious opened this issue Apr 2, 2022 · 11 comments

@SoundsSerious
Contributor

@kashifpk I'm finding a pretty significant memory leak at scale with the marshmallow library.

marshmallow-code/marshmallow#1943

It certainly seems like that lib is the source of a few problems.

@SoundsSerious
Contributor Author

Haven't tested this yet, but reimplementing `__del__` is supposed to fix this issue.

@SoundsSerious
Contributor Author

OK a bit more info to go on...

Here is the tracemalloc output for the top memory-consuming items.

Exiting....
[ Top 100 Mem Items]
/home/olly/miniconda3/envs/smxenv/lib/python3.8/copy.py:279: size=8094 KiB, count=71450, average=116 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:108: size=5500 KiB, count=23632, average=238 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:1051: size=3310 KiB, count=43560, average=78 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:396: size=2884 KiB, count=7926, average=373 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:147: size=1763 KiB, count=6263, average=288 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:138: size=1725 KiB, count=8377, average=211 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/copyreg.py:91: size=1675 KiB, count=35725, average=48 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/copy.py:230: size=1543 KiB, count=2514, average=629 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:983: size=1503 KiB, count=2269, average=678 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:981: size=1503 KiB, count=2269, average=678 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:976: size=1503 KiB, count=2269, average=678 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/json/decoder.py:353: size=1370 KiB, count=19830, average=71 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:143: size=1281 KiB, count=1683, average=779 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:199: size=900 KiB, count=7948, average=116 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:390: size=834 KiB, count=3952, average=216 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:389: size=834 KiB, count=3952, average=216 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:387: size=834 KiB, count=3952, average=216 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:113: size=776 KiB, count=3681, average=216 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:978: size=531 KiB, count=8128, average=67 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:74: size=443 KiB, count=7947, average=57 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:190: size=342 KiB, count=7948, average=44 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/fields.py:364: size=289 KiB, count=1743, average=170 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:163: size=284 KiB, count=4002, average=73 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:72: size=267 KiB, count=1630, average=168 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:227: size=248 KiB, count=3974, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:972: size=247 KiB, count=3953, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:400: size=247 KiB, count=3952, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:393: size=247 KiB, count=3952, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:384: size=247 KiB, count=3952, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/copy.py:227: size=247 KiB, count=3952, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:115: size=230 KiB, count=3681, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/linecache.py:137: size=222 KiB, count=2082, average=109 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:1097: size=217 KiB, count=3960, average=56 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/utils.py:165: size=214 KiB, count=5480, average=40 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/utils.py:187: size=204 KiB, count=1242, average=168 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:1098: size=204 KiB, count=2067, average=101 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:114: size=187 KiB, count=3975, average=48 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:82: size=186 KiB, count=3953, average=48 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/twisted/internet/defer.py:286: size=181 KiB, count=1556, average=119 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:59: size=174 KiB, count=3670, average=49 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/arango_orm/collections.py:385: size=168 KiB, count=1027, average=168 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:656: size=165 KiB, count=1001, average=168 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/schema.py:401: size=138 KiB, count=843, average=168 B
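
For readers who want to produce a listing like the one above, here is a minimal sketch using the standard-library tracemalloc module. The workload (the `payload` list) and the helper name `top_allocations` are stand-ins invented for illustration, not part of the application discussed in this thread.

```python
import tracemalloc


def top_allocations(limit=10):
    """Return the `limit` largest allocation sites, grouped by file and line."""
    snapshot = tracemalloc.take_snapshot()
    return snapshot.statistics("lineno")[:limit]


tracemalloc.start()

# Hypothetical workload: allocate some objects so there is something to report.
payload = [{"index": i, "label": str(i)} for i in range(10_000)]

top = top_allocations(limit=5)
print("[ Top 5 Mem Items ]")
for stat in top:
    # Each entry prints as "<file>:<line>: size=..., count=..., average=..."
    print(stat)

tracemalloc.stop()
```

Grouping by "lineno" attributes memory to the line that performed the allocation, which is why library lines such as copy.py and schema.py dominate the listing rather than the application code that called them.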

@SoundsSerious
Contributor Author

Based on the info in marshmallow/#1934

Every time Collection.schema() is called, it creates a new type for the schema class!

We should use class-based caching here. Opening a fork to try a few things.

    def schema(cls, *args, **kwargs):

        SchemaClass = type(
            cls.__name__ + "Schema", (Schema,), cls._fields.copy()
        )

        # Extra fields related schema configuration
        unknown = EXCLUDE
        if cls._allow_extra_fields is True:
            unknown = INCLUDE

        SC = SchemaClass(*args, **kwargs)
        SC.unknown = unknown
        return SC
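
To make the problem concrete, here is a small dependency-free sketch of the same pattern. The `Schema` class below is a stand-in for marshmallow's, and `make_schema_class` is a hypothetical helper; the point is only that `type(...)` builds a brand-new class object on every call, even when the name and fields are identical.

```python
class Schema:
    """Stand-in for marshmallow.Schema so the example has no dependencies."""


def make_schema_class(name, fields):
    # Mirrors the pattern above: a new class object is built on every call.
    return type(name + "Schema", (Schema,), dict(fields))


fields = {"name": "placeholder field"}
first = make_schema_class("Person", fields)
second = make_schema_class("Person", fields)

print(first.__name__ == second.__name__)  # True: both are named PersonSchema
print(first is second)                    # False: two distinct class objects
```

If anything holds on to each generated class, as the linked marshmallow issue suggests, those classes accumulate with every call instead of being garbage collected.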

@SoundsSerious
Contributor Author

Found this issue as well, but it appears to be resolved:
marshmallow-code/marshmallow#732

@kashifpk
Contributor

kashifpk commented Apr 3, 2022

@SoundsSerious nice work. Looking forward to the results.

@SoundsSerious
Contributor Author

OK.

Testing this out now, and it appears to save a bunch of memory on access. The schema-retrieving class methods get called quite a lot, so I feel that is promising.

    @classmethod
    def schema(cls,only=None):
        '''schema caches Marshmallow schemas on this class to save memory'''
        if not hasattr(cls,'_cls_schema'):
            objects_dict = cls.get_objects_dict()
            cls._cls_schema = type(
                cls.__name__ + "Schema", (ObjectSchema,), objects_dict
            )
        
        # Extra fields related schema configuration
        unknown = EXCLUDE
        if cls._allow_extra_fields is True:
            unknown = INCLUDE            

        if not hasattr(cls,'_cls_schema_cache'):
            #print(f'making {cls.__name__} schema with only=None')
            SC = cls._cls_schema()
            SC.unknown = unknown
            SC.object_class = cls
            cls._cls_schema_cache = {None:SC}

        if only is not None:
            if only not in cls._cls_schema_cache:
                #print(f'making {cls.__name__} schema with only={only}')
                SC = cls._cls_schema(only=only)
                SC.unknown = unknown
                SC.object_class = cls
                cls._cls_schema_cache[only] = SC
            
            #print(f'retrieving {cls.__name__} schema with only={only}')
            return cls._cls_schema_cache[only]
        
        else:
            #print(f'retrieving {cls.__name__} schema with only=None')
            return cls._cls_schema_cache[None]

@SoundsSerious
Contributor Author

I don't think it's too harmful to remove the **kwargs, since "only" was the only option used.
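
Two edge cases in a cache like the one above are worth a sketch. First, `hasattr()` also finds attributes inherited from a parent class, so a subclass could reuse its parent's cached schema. Second, a list passed as `only` is unhashable and cannot be a dict key. The code below is a simplified illustration, not the arango_orm implementation: `Schema`, `Collection`, `Person`, and `Employee` are stand-ins, and the `unknown` / `object_class` handling is omitted. It assumes `only` is an iterable of field names.

```python
class Schema:
    """Stand-in for a marshmallow schema; records the `only` it was built with."""

    def __init__(self, only=None):
        self.only = only


class Collection:
    _fields = {}

    @classmethod
    def schema(cls, only=None):
        # Look in cls.__dict__ rather than using hasattr(), so a subclass does
        # not pick up the cache that belongs to its parent class.
        if "_cls_schema" not in cls.__dict__:
            cls._cls_schema = type(cls.__name__ + "Schema", (Schema,), dict(cls._fields))
            cls._cls_schema_cache = {}

        # Lists are unhashable, so normalise `only` into a tuple cache key.
        key = None if only is None else tuple(only)
        cache = cls._cls_schema_cache
        if key not in cache:
            cache[key] = cls._cls_schema(only=key)
        return cache[key]


class Person(Collection):
    _fields = {"name": "placeholder"}


class Employee(Person):
    _fields = {"name": "placeholder", "title": "placeholder"}


print(Person.schema() is Person.schema())                               # True
print(Person.schema(only=["name"]) is Person.schema(only=("name",)))    # True
print(type(Employee.schema()).__name__)                                 # EmployeeSchema
```

Sharing one schema instance per key assumes the schema is not mutated after creation; if callers change attributes on the returned object, those changes are visible to every later caller.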

@SoundsSerious
Contributor Author

This certainly seems to have had an effect: the top memory usage (per tracemalloc) is no longer marshmallow-related, so that's promising. Will continue investigating.

@SoundsSerious
Contributor Author

SoundsSerious commented Apr 3, 2022

Testing on my application workload, I can report that on a high-level, per-item basis this seems to improve memory consumption by about 2x, but there is still a growth issue. I'd say it also runs about 25% faster. I guess I'll have to figure out how to PR from a fork now :)

@SoundsSerious
Contributor Author

Looks like progress, but memory is still growing somewhere. Here's the tracemalloc output after about an hour:

/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:184: size=175 MiB, count=1035693, average=177 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/json/decoder.py:353: size=170 MiB, count=2400541, average=74 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:154: size=107 MiB, count=517066, average=216 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:189: size=80.3 MiB, count=152332, average=553 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:156: size=31.6 MiB, count=517057, average=64 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/utils.py:165: size=29.2 MiB, count=766704, average=40 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:67: size=23.8 MiB, count=517624, average=48 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/cloud_managers/arango.py:173: size=18.2 MiB, count=298713, average=64 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:485: size=8633 KiB, count=152332, average=58 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/graph.py:257: size=8328 KiB, count=142026, average=60 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/graph.py:230: size=7875 KiB, count=140842, average=57 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/utils.py:188: size=7440 KiB, count=238073, average=32 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/urllib3/poolmanager.py:311: size=5905 KiB, count=35990, average=168 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/graph.py:260: size=4446 KiB, count=140841, average=32 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/cloud_managers/arango.py:233: size=4125 KiB, count=65993, average=64 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:1098: size=3807 KiB, count=38381, average=102 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/fields.py:775: size=2875 KiB, count=1577, average=1867 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/relationships.py:63: size=2585 KiB, count=50906, average=52 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/query.py:267: size=1835 KiB, count=20, average=91.7 KiB
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:347: size=1563 KiB, count=9036, average=177 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/fields.py:1592: size=1407 KiB, count=22282, average=65 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/fields.py:940: size=1360 KiB, count=58011, average=24 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/twisted/internet/defer.py:286: size=1356 KiB, count=8995, average=154 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/relationships.py:119: size=1239 KiB, count=27165, average=47 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/relationships.py:235: size=1194 KiB, count=25454, average=48 B
<attrs generated init static_models.StaticModelIntraday>:2: size=1085 KiB, count=9579, average=116 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/smartx_common.py:295: size=699 KiB, count=17897, average=40 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:718: size=693 KiB, count=5910, average=120 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/twisted/internet/defer.py:1081: size=622 KiB, count=3986, average=160 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/relationships.py:247: size=604 KiB, count=8584, average=72 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:704: size=596 KiB, count=5910, average=103 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/cloud_managers/arango.py:237: size=527 KiB, count=9, average=58.5 KiB
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:612: size=510 KiB, count=4343, average=120 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/crud_manager.py:281: size=500 KiB, count=11, average=45.4 KiB
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/twisted/internet/base.py:70: size=487 KiB, count=2996, average=166 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/libs/arango-orm/arango_orm/collections.py:431: size=454 KiB, count=2767, average=168 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/schema.py:323: size=426 KiB, count=4343, average=100 B
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/relationships.py:375: size=315 KiB, count=1535, average=210 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/http/client.py:1266: size=295 KiB, count=1795, average=168 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/weakref.py:162: size=290 KiB, count=4, average=72.5 KiB
/mnt/c/Users/Sup/Ottermatics Dropbox/Projects/SMART_X/smartx_performance/cloud_sink/crud_manager.py:271: size=288 KiB, count=1, average=288 KiB
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/urllib3/util/timeout.py:193: size=248 KiB, count=1514, average=168 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/linecache.py:137: size=222 KiB, count=2087, average=109 B
/home/olly/miniconda3/envs/smxenv/lib/python3.8/site-packages/marshmallow/fields.py:364: size=205 KiB, count=1225, average=171 B
<attrs generated init static_models.StaticEnterprise>:2: size=202 KiB, count=2871, average=72 B
<attrs generated init static_models.StaticModelInfo>:2: size=197 KiB, count=2805, average=72 B
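
A single snapshot like the one above shows what is large, not what is growing. One way to isolate growth is to diff two snapshots with tracemalloc's `compare_to`. The sketch below uses a hypothetical stand-in workload (the `retained` list) in place of the real application.

```python
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Hypothetical stand-in for the long-running workload: objects that stay referenced.
retained = []
for i in range(50_000):
    retained.append({"id": i})

later = tracemalloc.take_snapshot()
tracemalloc.stop()

# Each StatisticDiff carries size_diff and count_diff relative to the baseline;
# a positive size_diff means that line holds more memory than before.
growth = later.compare_to(baseline, "lineno")
for diff in growth[:5]:
    print(diff)
```

Taking the baseline after start-up and the second snapshot after a fixed amount of work keeps one-off allocations, such as imports and caches that fill once, out of the diff.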

@SoundsSerious
Contributor Author

Looks like this is working for me in the new version.
