Dangling references prevent GC to collect instances #255

Closed
vakker opened this issue Oct 7, 2021 · 5 comments
Labels
  • bug: Something isn't working
  • cannot_reproduce: This issue cannot be reproduced with the provided code snippet, meaning it is given a low priority

Comments

vakker commented Oct 7, 2021

Environment

  • Grid2op version: 1.6.3
  • System: ubuntu18.04

Bug description

When the env instance is deleted (e.g. by del), the GC should collect it.
This does happen in most scenarios, but in some cases there are lingering references to the env instance that prevent obsolete instances from being collected.

How to reproduce

Running the following shows that the GC does its job:

import gc
import sys

import grid2op


class TestClass:
    pass


def main():
    for i in range(50):
        env = grid2op.make('l2rpn_icaps_2021_small')
        d = TestClass()
        print(
            'Grid2Op instances:',
            len([
                o for o in gc.get_objects()
                if isinstance(o, grid2op.Environment.BaseEnv)
            ]))
        print('Test instances:',
              len([o for o in gc.get_objects() if isinstance(o, TestClass)]))
        # print('Ref', sys.getrefcount(d))
        # print('Refs', gc.get_referrers(d))
        del d
        del env
        gc.collect()
        print(
            'Grid2Op instances:',
            len([
                o for o in gc.get_objects()
                if isinstance(o, grid2op.Environment.BaseEnv)
            ]))
        print('Test instances:',
              len([o for o in gc.get_objects() if isinstance(o, TestClass)]))
        print('#########')


main()

However, if there's a wrapper around the env (as RLlib requires), the wrapper calls env.step, and the wrapper (or other parts of the framework) might keep a reference to the env through the returned observation, reward or info.

In my case using deepcopy (see below) resolves the issue:

obs_grid, reward, done, info = copy.deepcopy(self.env_grid.step(action_grid))

Otherwise, a pile of dangling instances never gets collected.
Currently I'm using the following, which is called in the reset method of my wrapper when needed (e.g. for curriculum learning):

    def _reset_env(self):
        self.env_grid.close()

        self.env_grid = None
        self._init_all()  # runs grid2op.make 
        self.episodes = 0

        gc.collect()

        # print(
        #     'Grid2Op instances:',
        #     len([
        #         o for o in gc.get_objects()
        #         if isinstance(o, grid2op.Environment.BaseEnv)
        #     ]))

The commented-out bit prints the number of grid2op env instances in the given process.
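
For anyone trying to track down what keeps an instance alive, a small helper along these lines might be handy. It is only a sketch built on the standard gc module; the names count_instances and referrer_types are made up for illustration and are not part of grid2op.

```python
import gc


def count_instances(cls):
    """Count the live, GC-tracked objects that are instances of `cls`."""
    return sum(1 for o in gc.get_objects() if isinstance(o, cls))


def referrer_types(obj):
    """Return the sorted type names of the objects that refer to `obj`."""
    return sorted({type(r).__name__ for r in gc.get_referrers(obj)})
```

Calling count_instances(grid2op.Environment.BaseEnv) before and after a reset gives the same number as the commented-out print, and referrer_types(env) hints at which kind of container (a dict, a list, another object) is still holding on to the env.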

vakker added the bug label Oct 7, 2021
Collaborator

BDonnot commented Oct 13, 2021

I think one of the problems is that the "observation" contains a reference to the observation_space (in the environment), which has a backend.
This is used for "simulate" and I am not sure that it is "closed" properly. I would need to double-check that to be sure it's the cause of this issue.
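
To illustrate the kind of reference chain suspected here, a toy model with no grid2op involved could look like the following. All classes (Backend, ObservationSpace, Observation, Env) and the function backend_survives are hypothetical stand-ins; whether the real grid2op objects are wired this way is exactly the assumption that would need checking.

```python
import gc
import weakref


class Backend:
    """Stand-in for a heavy resource such as a power-flow solver."""


class ObservationSpace:
    def __init__(self, backend):
        self.backend = backend


class Observation:
    def __init__(self, space):
        # Back-reference of the kind suspected above (an assumption).
        self.space = space
        self.values = [0.0, 1.0]


class Env:
    def __init__(self):
        self.observation_space = ObservationSpace(Backend())

    def step(self):
        return Observation(self.observation_space)


def backend_survives(keep_obs):
    """Return True if the backend is still alive after the env is deleted."""
    env = Env()
    backend_ref = weakref.ref(env.observation_space.backend)
    obs = env.step()
    # Either hold on to the observation itself, or only to its plain data.
    kept = obs if keep_obs else list(obs.values)
    del obs, env
    gc.collect()
    return backend_ref() is not None
```

In this toy, holding on to the observation keeps the backend alive after the env is deleted, while keeping only the plain values lets it be freed. That would be consistent with a wrapper that stores the returned observation preventing collection.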

BDonnot added the cannot_reproduce label Nov 5, 2021
Collaborator

BDonnot commented Nov 5, 2021

Hello,
With the current information, I am totally unable to reproduce the bug. I know we discussed it on Discord a few weeks ago, but I cannot manage to reproduce it, though I'm convinced that there is a bug there.

As I cannot reproduce it, it is likely that this bug will still be present in the next release of grid2op.

Author

vakker commented Nov 5, 2021

You mean you cannot reproduce the bug with an RLlib wrapper? If it helps, I can put together a simple repro example. It's also possible that I'm doing something wrong, so let's see.

Collaborator

BDonnot commented Nov 5, 2021

Can you put some simple code here? If possible without using RLlib :-/ The smaller and simpler the piece of code, the easier it will be for me to identify the bug and then fix it :-)

Collaborator

BDonnot commented Jun 6, 2023

Closing as it seems solved

BDonnot closed this as completed Jun 6, 2023