
Sampler doesn't free memory after deleting #169

Closed
mstechly opened this issue Mar 19, 2019 · 9 comments

@mstechly

Description
I noticed that after running my script for a while, its memory footprint starts to grow over time.
It seems to be a problem with both DWaveSampler/EmbeddingComposite and QBSolv.
Deleting the objects and running the garbage collector doesn't free the memory.

Here is the output of the code posted below for use_dwave=True:

MEMORY BEFORE: 52.2 MB
MEMORY AFTER: 110.14 MB
MEMORY GROWTH: 57.94 MB
MEMORY FREED:  1.0 MB

and for use_dwave=False (using QBSolv):

MEMORY BEFORE: 96.0 MB
MEMORY AFTER: 119.31 MB
MEMORY GROWTH: 23.31 MB
MEMORY FREED:  2.51 MB

As you can see, the memory impact of QBSolv is much smaller (I needed a bigger QUBO to even observe this effect), but it's still there.

To Reproduce
https://gist.github.com/mstechly/1a0b077382d64fb417651f5cf405e67b

Expected behavior
I would expect deleting an object and running the garbage collector to free the memory.

Environment:

  • OS: OSX 10.14.2
  • Python version: 3.7.2
  • dwave-qbsolv==0.2.9
  • dwave-system==0.7.2
@randomir
Member

randomir commented Mar 19, 2019

There's a known issue with DWaveSampler/dwave-cloud-client and releasing system resources. Entrance to the rabbit hole here: #91. The solution could be dwavesystems/dwave-cloud-client#118.

In short, thread pools are the reason we can't rely on automatic garbage collection. Until force close is implemented in the cloud client, and DWaveSampler is provided as a context manager, there are two workarounds:

  1. treat your DWaveSampler-based sampler as a singleton object - don't create disposable ones
  2. after job(s) are completed (problems downloaded), you can manually close the client (DWaveSampler.client.close()) and force-garbage-collect the client/solver/sampler.

The first approach is preferred at this point.
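For illustration, the singleton approach (workaround 1) can be sketched with a cached factory. The `get_sampler` helper below is hypothetical, and a stand-in object replaces the real `EmbeddingComposite(DWaveSampler(...))` so the sketch runs without a live connection:

```python
import functools

@functools.lru_cache(maxsize=None)
def get_sampler():
    # In real code this would be something like:
    #   return EmbeddingComposite(DWaveSampler(token=token,
    #                                          endpoint=dwave_endpoint,
    #                                          solver={'qpu': True}))
    # A stand-in object is used here so the sketch runs offline.
    return object()

s1 = get_sampler()
s2 = get_sampler()
assert s1 is s2  # every caller shares one sampler; no disposable instances to leak
```

The point is that the sampler (and its thread pools) is created exactly once and reused, rather than instantiated per job.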

And regarding the possible memory leak in qbsolv - we need to investigate that further.

@mstechly
Author

@randomir Thanks!
I tried the first approach earlier but it didn't work for some reason - it might have been an implementation error on my part. I'll let you know once I implement it again.

Regarding the second one, I did:

    sampler_1 = DWaveSampler(token=token, endpoint=dwave_endpoint, solver={'qpu': True})
    sampler_2 = EmbeddingComposite(sampler_1)
    sampler_1.client.close()

and it had no effect on memory.

Writing DWaveSampler.client.close() explicitly as you suggested resulted in an error: AttributeError: type object 'DWaveSampler' has no attribute 'client', so I'm not sure whether it doesn't work or I got the syntax wrong.

@mstechly
Author

@randomir Actually, the first method doesn't seem to work either:

mem0 = proc.memory_info().rss
print("MEMORY BEFORE:", convert_size(mem0))
if use_dwave:
    sampler_1 = DWaveSampler(token=token, endpoint=dwave_endpoint, solver={'qpu': True})
    sampler_2 = EmbeddingComposite(sampler_1)
    for i in range(5):
        result = sampler_2.sample_qubo(qubo, num_reads=1000, chain_strength=800)
        mem = proc.memory_info().rss
        print("MEMORY", i , convert_size(mem))

resulted in:

MEMORY BEFORE: 52.27 MB
MEMORY 0 109.8 MB
MEMORY 1 128.14 MB
MEMORY 2 145.49 MB
MEMORY 3 167.12 MB
MEMORY 4 175.41 MB
MEMORY AFTER: 175.41 MB
MEMORY GROWTH: 123.15 MB
MEMORY FREED:  1.0 MB

Or am I doing it wrong?

@randomir
Member

In the first example, yes, I meant "DWaveSampler::client::close()" - client is a member variable of a DWaveSampler instance, so call close() on the instance, not on the class. But in addition to closing the client, you probably also need to lose all references to it (solver, samples, etc.) and invoke GC. (Also, I haven't tested any of this, so there might be additional nuances.)

In your second example, you're doing it wrong 😃. Python doesn't collect garbage very often. You need to either wait much longer / allocate a lot more (iterate far more than 5 times), or invoke the GC yourself.
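Putting both points together, the manual-cleanup sequence from workaround 2 would look roughly like the sketch below. `FakeClient` is a stand-in for the real client (a live DWaveSampler needs a connection), and the weakref only exists to prove the object is actually reclaimed:

```python
import gc
import weakref

class FakeClient:           # stand-in for the cloud client inside DWaveSampler
    def close(self):
        pass

sampler = FakeClient()      # stand-in for DWaveSampler(...)
ref = weakref.ref(sampler)

sampler.close()             # 1. close the client (sampler.client.close() on a real sampler)
del sampler                 # 2. drop *all* references (sampler, composite, results, ...)
gc.collect()                # 3. force a collection instead of waiting for the GC

assert ref() is None        # the object really was reclaimed
```

Calling `gc.collect()` inside the sampling loop is the same idea: it forces collection after each iteration rather than letting uncollected garbage accumulate across many allocations.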

@mstechly
Author

@randomir I implemented the singleton approach and it works better now - good enough for my case, though I'm not sure what exactly is happening under the hood.
Thanks!

@randomir
Member

@mstechly, when you say better, what exactly do you mean? Are you seeing a constant memory overhead (no memory increase on each sampling), or something else?

FWIW, we plan to make the underlying thread pools (in the cloud client) singleton objects. That should make DWaveSampler objects relatively cheap (there will still be the overhead of solver selection on each instantiation).

@mstechly
Author

@randomir It means that the memory still grows with each iteration, but more slowly with each consecutive one, and it eventually stops growing altogether.
I'm not sure why it behaves like this, since I run gc.collect() every time, but it caps at a level that fits in my memory.

@randomir
Member

That's interesting. We'll keep that in mind (tests) when implementing the singleton Client/thread pools. Thanks, @mstechly!

@randomir
Member

I'm closing this issue because it's a duplicate / known bug.
