Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make groupoids garbage collectable #12357

Closed
simon-king-jena opened this issue Jan 25, 2012 · 39 comments
Closed

Make groupoids garbage collectable #12357

simon-king-jena opened this issue Jan 25, 2012 · 39 comments

Comments

@simon-king-jena
Copy link
Member

Currently, the groupoid of an object P can not be garbage collected, even when deleting P.

Even worse: The persistence of Groupoid(P) also prevents P from being garbage collected.

The attached patch aims at solving it: An external reference to either P or Groupoid(P) is enough to keep both alive. But without an external reference, both P and Groupoid(P) become collectable.

Example from the docs:

            sage: P = GF(151)['x','y']
            sage: n = id(Groupoid(P))
            sage: import gc
            sage: _ = gc.collect()
            sage: n == id(Groupoid(P))   # indirect doctest
            True

        Thus, the groupoid is cached. But when deleting ``P``, its
        groupoid can be garbage collected as well::

            sage: del P
            sage: _ = gc.collect()
            sage: P = GF(151)['x','y']
            sage: n == id(Groupoid(P))
            False

        TESTS:

        We test against some corner cases::

            sage: Groupoid(None)
            Traceback (most recent call last):
            ...
            TypeError: Groupoid of None is not defined
            sage: Groupoid(1)
            Traceback (most recent call last):
            ...
            TypeError: 1 must either allow attribute assignment or be instances of <type 'sage.structure.parent.Parent'>
            sage: class A: pass
            sage: a = A()
            sage: Groupoid(a)
            Groupoid with underlying set <__main__.A instance at ...>
            sage: Groupoid(a) is Groupoid(a)
            True

Superseded by #12313.

Depends on #715
Depends on #11521
Depends on #12313
Depends on #11943

CC: @nthiery @jpflori

Component: memleak

Keywords: groupoid cache Cernay2012

Reviewer: Simon King, Jean-Pierre Flori

Issue created by migration from https://trac.sagemath.org/ticket/12357

@simon-king-jena
Copy link
Member Author

comment:1

I did not run the full test suite yet. It could be that some silly old doctests use the groupoids in a way they are not intended, such as Groupoid(1). If this is the case, then my patch has to change. Otherwise: Needs review!

@simon-king-jena
Copy link
Member Author

comment:2

To my surprise, there were no nasty old doctests. Hence, make ptest went well, with the patch from here and its dependencies!

@simon-king-jena
Copy link
Member Author

comment:3

Concerning timings:

Without the patch, the groupoid of P is stored in a dictionary, indexed by P. Hence, access to the cache is slow if hash(P) is slow.

Some data points:

sage: P = GF(151)['x','y']
sage: %timeit G = Groupoid(P)  # slow hash
625 loops, best of 3: 20.9 µs per loop
sage: %timeit G = Groupoid(ZZ) # fast hash
625 loops, best of 3: 1.75 µs per loop
sage: class Bar(Parent): pass
....: 
sage: P = Bar()
sage: %timeit G = Groupoid(P)  # slow hash
625 loops, best of 3: 15.3 µs per loop

But with the patch, it is stored as an attribute of P. Hence, it is slow if attribute access is slow. The point is that even slow attribute access is faster than slow hash, and slow attribute access is only little slower than fast hash:

sage: P = GF(151)['x','y']
sage: %timeit G = Groupoid(P)  # slow attribute access
625 loops, best of 3: 3.11 µs per loop
sage: %timeit G = Groupoid(ZZ) # slow attribute access
625 loops, best of 3: 3.1 µs per loop
sage: class Bar(Parent): pass
....: 
sage: P = Bar()
sage: %timeit G = Groupoid(P)  # fast attribute access
625 loops, best of 3: 1.59 µs per loop

Hence, I believe that the patch is not only fixing a memory leak, but has the potential to generate a speed-up.

@jpflori
Copy link

jpflori commented Feb 8, 2012

Changed keywords from none to groupoid cache Cernay2012

@jpflori
Copy link

jpflori commented Feb 8, 2012

Work Issues: tests for None?

@jpflori
Copy link

jpflori commented Feb 8, 2012

comment:4

This is the simplest patch of the cache problems suite, and it does not depend on !#715, so I begin with this one.

I'm a little bit confused by all those tests for None in your patch.

Isn't the first one sufficient?

More specifically, the second test "if S is not None" isn't superfluous ?

And if it is not None, can G be None ? (I guess there is some cython voodoo involved here)

To answer my own question, I guess it is none if S is not a Parent (as in your example with 1).

Once you answer me back, I'll post a reviewer patch with some additional minor corrections.

@simon-king-jena
Copy link
Member Author

comment:5

Yes, the test "if S is not None" in line 103 seems redundant, as S being None would result in an error being raised in line 92.

Concerning Cython voodoo: If S is not None and not of type Parent, then G == S in line 106 would result in a type error (since G is cdefined) - which is desired behaviour. But if S were None, then G == S in line 106 would not complain, and then line 109 would segfault. This is why there must be a test for the corner case - of course, one test would be enough.

@jpflori
Copy link

jpflori commented Feb 10, 2012

Changed work issues from tests for None? to none

@jpflori
Copy link

jpflori commented Feb 10, 2012

comment:6

Ok, I've posted a reviewer patch removing the second test and adding some comments on the code.

Everything is fine for me (as long as further modification in #715 and #12313 do not affect the ticket here).

So I'll put is as positive review now.

@jpflori
Copy link

jpflori commented Feb 10, 2012

Reviewer patch

@simon-king-jena
Copy link
Member Author

comment:7

Attachment: trac12357_reviewer.patch.gz

The reviewer patch looks fine to me. Thank you!

@jdemeyer
Copy link

Changed dependencies from #12313 to #715, #12313

@jdemeyer
Copy link

comment:9

Moving this to sage-pending for now since it depends on two non-positively reviewed tickets.

@jdemeyer jdemeyer removed this from the sage-5.0 milestone Feb 12, 2012
@simon-king-jena
Copy link
Member Author

Changed dependencies from #715, #12313 to #715, #12313, #11943

@simon-king-jena
Copy link
Member Author

comment:10

There is a conflict with #11943.

Therefore, I suggest to rebase my patch with respect to #11943.

@simon-king-jena
Copy link
Member Author

Work Issues: rebase rel #11943

@jdemeyer jdemeyer added this to the sage-5.1 milestone Apr 26, 2012
@jdemeyer jdemeyer removed the pending label Apr 26, 2012
@jdemeyer jdemeyer removed this from the sage-5.1 milestone May 15, 2012
@jdemeyer
Copy link

comment:20

Unfortunalely, this regularly (not always but quite often) gives Segmentation Faults in sage/schemes/elliptic_curves/gal_reps.py after the file has been tested:

$ ./sage -t --verbose "devel/sage/sage/schemes/elliptic_curves/gal_reps.py"
[...]
4 items had no tests:
    __main__
    __main__.change_warning_output
    __main__.check_with_tolerance
    __main__.warning_function
25 items passed all tests:
  26 tests in __main__.example_0
[...]
  10 tests in __main__.example_9
263 tests in 29 items.
263 passed and 0 failed.
Test passed.
Segmentation fault
         [21.5 s]

----------------------------------------------------------------------
The following tests failed:


        sage -t --verbose "devel/sage/sage/schemes/elliptic_curves/gal_reps.py"
Total time for all tests: 21.6 seconds

This is with sage-5.1.beta4 + #12357 (beta4 is not yet released, but essentially finished).

@simon-king-jena
Copy link
Member Author

comment:21

Jeroen, does the segmentation fault persists, now that #715 seems fine due to Cython upgrade?

Anyway, I am now running make ptest with MALLOCK_CHECK_=3 on sage-5.6.rc0 debug version (#13864) plus #12215, #12313 and #12357.

@simon-king-jena
Copy link
Member Author

comment:22

Patchbot, apply trac12357_internal_strong_groupoid_cache.patch attachment: trac12357_reviewer.patch

@simon-king-jena
Copy link
Member Author

comment:23

I got one doctest error, which unfortunately shows that the patch does not solve the problem it was created for:

File "/home/simon/SAGE/debug/sage-5.6.rc0/devel/sage/sage/categories/groupoid.pyx", line 64:
    sage: n in map(id, gc.get_objects())
Expected:
    False
Got:
    True

Hence, some garbage that was supposed to get collected did not become collected. And there also seems to be a timeout in

sage -t -force_lib "devel/sage/doc/en/bordeaux_2008/birds_other.rst"

@simon-king-jena
Copy link
Member Author

comment:24

In fact, the polynomial ring in the failing example does not become collected, even without creating its groupoid. I thought that this would have been fixed in one of the dependencies?

sage: P = GF(151)['x','y'] 
sage: import gc
sage: m = id(P)
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
True

@simon-king-jena
Copy link
Member Author

comment:25

Perhaps I should use a test that avoids polynomial rings, but uses a custom UniqueRepresentation. Namely, #12215 makes UniqueRepresentation collectable, and the patch from here makes the groupoid collectable as well, namely:

sage: class MyParent(UniqueRepresentation, Parent):
....:     pass
....: 
sage: P = MyParent()
sage: import gc
sage: m = id(P)
sage: n = id(Groupoid(P)) 
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
False
sage: n in map(id, gc.get_objects()) 
False

What do you think?

By the way, the timeout is no problem: I had closed my laptop during the tests. Hence, the tests were interrupted, but the time was running.

@simon-king-jena
Copy link
Member Author

comment:26

PS: In unpatched sage-5.6.rc0, one would obtain

sage: P = Parent()
sage: import gc
sage: m = id(P)
sage: n = id(Groupoid(P)) 
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True

And this is even without using a cache for P!!

Hence, the tests does demonstrate that some leak was fixed.

@simon-king-jena
Copy link
Member Author

comment:27

Apart from changing the test, I'd like to change something more. Namely, it is stated that UniqueRepresentation is using cached_function - but that is wrong by #12215, which is a dependency of this ticket.

The main idea of storing the groupoid of P as an attribute of P is to avoid creating an external strong reference on the groupoid. But that would not be the case, when using a weak_cached_function for caching.

Hence, it could actually be that #12215 already fixes what is attempted here! I'll test that.

@simon-king-jena
Copy link
Member Author

comment:28

Indeed. With just #12215 and #12313 on top of sage-5.6.rc0, one obtains:

sage: P = Parent()
sage: m = id(P)
sage: n = id(Groupoid(P)) 
sage: import gc
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
False
sage: n in map(id, gc.get_objects()) 
False

Hence, I suggest to resolve this as a duplicate. Jean-Pierre, please change if you disagree.

@jdemeyer
Copy link

Changed reviewer from Jean-Pierre Flori to Simon King, Jean-Pierre Flori

@jdemeyer

This comment has been minimized.

@jdemeyer
Copy link

Changed author from Simon King to none

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants