Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix two bugs in sage.misc.c3's implementation of the algorithm C3 #13501

Closed
nthiery opened this issue Sep 17, 2012 · 46 comments
Closed

Fix two bugs in sage.misc.c3's implementation of the algorithm C3 #13501

nthiery opened this issue Sep 17, 2012 · 46 comments

Comments

@nthiery
Copy link
Contributor

nthiery commented Sep 17, 2012

The current implementation of C3's algorithm in sage.misc.c3 (which is
used to compute the list of all the super categories of a category) is
incorrect.

Define indeed the following classes::

sage: class C(object): pass
....: 
sage: class F(object): pass
....: 
sage: class G(object): pass
....: 
sage: class B(C,F): pass
....: 
sage: class D(F,G): pass
....: 
sage: class E(F): pass
....: 
sage: class A(B,D,E): pass
....: 

Which gives the following mro:

sage: A.mro()
[<class '__main__.A'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.D'>, <class '__main__.E'>, <class '__main__.F'>, <class '__main__.G'>, <type 'object'>]

Here is now the same hierarchy using categories::

class Cs(Category): 
    def super_categories(self): return []

class Fs(Category): 
    def super_categories(self): return []

class Gs(Category): 
    def super_categories(self): return []

class Bs(Category): 
    def super_categories(self): return [Cs(), Fs()]

class Ds(Category): 
    def super_categories(self): return [Fs(), Gs()]

class Es(Category): 
    def super_categories(self): return [Fs()]

class As(Category): 
    def super_categories(self): return [Bs(), Ds(), Es()]

The super categories are not given in the same order (g,e,f instead of
e f g)::

sage: As().all_super_categories()
[Category of as, Category of bs, Category of cs, Category of ds, Category of gs, Category of es, Category of fs]

This is due to popping X from the tail too early around line 186: O is
a candidate for being the next in line, but has not yet been
definitely chosen.

And indeed this is caught by the test suite::

sage: TestSuite(As()).run()
Failure in _test_category_graph:
Traceback (most recent call last):
...
AssertionError: False is not true

Here is another hierarchy with an inconsistent MRO which is not
detected by Sage's current C3::

class B(object): pass

class C(B): pass

class A(B, C): pass

Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
TypeError: Error when calling the metaclass bases
    Cannot create a consistent method resolution
order (MRO) for bases C, B
class Bs(Category): 
    def super_categories(self): return []

class Cs(Category): 
    def super_categories(self): return [Bs()]

class As(Category): 
    def super_categories(self): return [Bs(), Cs()]
    class subcategory_class(object): # just to bypass the MRO error when building the subcategory class
        pass

sage: As()
Category of as
sage: As().all_super_categories()
[Category of as, Category of cs, Category of bs]

Here the issue comes from the fact that the C3 algorithm is supposed
to work on the mro's of the bases together with the list of the
bases, which the current Sage's implementation does not.

Apply:

CC: @sagetrac-sage-combinat @simon-king-jena

Component: categories

Keywords: method resolution order

Author: Nicolas M. Thiéry, Simon King

Reviewer: Simon King

Merged: sage-5.5.beta1

Issue created by migration from https://trac.sagemath.org/ticket/13501

@nthiery nthiery added this to the sage-5.4 milestone Sep 17, 2012
@nthiery nthiery self-assigned this Sep 17, 2012
@nthiery
Copy link
Contributor Author

nthiery commented Sep 17, 2012

comment:1

Hi Simon,

I'll be working on this ticket today or so. We are currently exploring with Florent the theoretical limits of C3, and ways to fix the MRO issues once for all for categories. For that, we need a clean C3 implementation for further instrumentation. More on that topic on sage-combinat-devel shortly!

@nthiery
Copy link
Contributor Author

nthiery commented Sep 17, 2012

comment:2

Here is a preliminary patch. It fixes the bug and simplifies a bit the logic. I now need to benchmark it, and reintroduce a couple PyList... where timings will call for it.

@nthiery
Copy link
Contributor Author

nthiery commented Sep 18, 2012

comment:3

Hi!

I optimized and cleaned a bit the code; here are some timings, using the following for benchmarking (better suggestions welcome):

sage: l = GradedHopfAlgebrasWithBasis(GF(3)).all_super_categories() # to force preloading everything
sage: def f(n):
...       for i in range(5,n+5):
...            GradedHopfAlgebras(GF(nth_prime(i))).all_super_categories()

Before:

sage: %prun f(200)
...
2800/200    0.189    0.000    1.298    0.006 {sage.misc.c3.C3_algorithm}

With attachment: trac_13501-categories-c3_fix-nt.patch:

sage: %prun f(200)
...
 2800/200    0.249    0.000    1.430    0.007 {sage.misc.c3.C3_algorithm}

With further attachment: trac_13501-categories-c3_fix-useless_optimization-nt.patch:

sage: %prun f(200)
...
 2800/200    0.249    0.000    1.429    0.007 {sage.misc.c3.C3_algorithm}

So, we have a 15% loss compared to the previous implementation. I am
not sure exactly where it's slower, but I guess it's just the price
for correctness (the previous implementation was popping out stuff too
early). As the second attachment shows, using PyList_GET_ITEM and
friends does not seem to bring anything substantial, so I'd vote for
sticking to more readable Python notations.

Apply:

Simon: let me know in case you are busy, I can possibly ask Florent for the review.

@nthiery
Copy link
Contributor Author

nthiery commented Sep 18, 2012

Reviewer: Simon King

@nthiery
Copy link
Contributor Author

nthiery commented Sep 18, 2012

Author: Nicolas M. Thiéry

@nthiery

This comment has been minimized.

@simon-king-jena
Copy link
Member

comment:5

I'll see whether using the low-level functions PyList_GET_ITEM and friends will help to reduce the slow-down. After all, when doing tailsets[j], we already now that j is not out of bounds. Hence, we can drop the checks done in O in tailsets[j] and could as well do O in <set>PyList_GET_ITEM(tailsets, j).

@simon-king-jena
Copy link
Member

comment:6

Replying to @simon-king-jena:

I'll see whether using the low-level functions PyList_GET_ITEM and friends will help to reduce the slow-down. After all, when doing tailsets[j], we already now that j is not out of bounds. Hence, we can drop the checks done in O in tailsets[j] and could as well do O in <set>PyList_GET_ITEM(tailsets, j).

Or perhaps this would be even faster: PySet_Contains(<object>PyList_GET_ITEM(tailsets, j), O) is int(1). Let me do some tests...

@nthiery
Copy link
Contributor Author

nthiery commented Sep 18, 2012

comment:7

Replying to @simon-king-jena:

Replying to @simon-king-jena:

I'll see whether using the low-level functions PyList_GET_ITEM and friends will help to reduce the slow-down. After all, when doing tailsets[j], we already now that j is not out of bounds. Hence, we can drop the checks done in O in tailsets[j] and could as well do O in <set>PyList_GET_ITEM(tailsets, j).

Or perhaps this would be even faster: PySet_Contains(<object>PyList_GET_ITEM(tailsets, j), O) is int(1). Let me do some tests...

That's basically what the second patch does, and it does not seem to do much of a change. But feel free to explore!

@simon-king-jena
Copy link
Member

comment:8

Another optimization:

for j from 0 <= j < nbheads: 
    if j == i: 
        continue 
    if O in tailsets[j]: 
        next_item_found = False 
if next_item_found: 
    ...

Apparently, after the line next_item_found = False, one should insert a break, because otherwise the loop is finished to no avail.

@simon-king-jena
Copy link
Member

comment:9

Replying to @nthiery:

Or perhaps this would be even faster: PySet_Contains(<object>PyList_GET_ITEM(tailsets, j), O) is int(1). Let me do some tests...

That's basically what the second patch does, and it does not seem to do much of a change. But feel free to explore!

Sorry, I didn't look at the second patch. Yes, if that doesn't help, it would probably be useless. But I think breaking the loop would make sense.

@nthiery
Copy link
Contributor Author

nthiery commented Sep 18, 2012

comment:10

Replying to @simon-king-jena:

Another optimization:

for j from 0 <= j < nbheads: 
    if j == i: 
        continue 
    if O in tailsets[j]: 
        next_item_found = False 
if next_item_found: 
    ...

Apparently, after the line next_item_found = False, one should insert a break, because otherwise the loop is finished to no avail.

Good point; I am trying it right now!

@simon-king-jena
Copy link
Member

comment:11

Yes, the effect of using the C Api really seems marginal. I defined

def test_pythonic(O,list L):
    cdef int j 
    cdef int l = len(L)
    next_item_found = True
    for j from 0 <= j < l:
        if O in L[j]:
            next_item_found = False
            
def test_cythonic(O, list L):
    cdef int j
    cdef int l = len(L)
    next_item_found = True
    for j from 0 <= j < l:
        if PySet_Contains(<object>PyList_GET_ITEM(L, j), O):
            next_item_found = True

and obtained:

sage: L = [set(range(i,i+1000)) for i in range(1000)]
sage: %timeit test_pythonic(1001,L)
625 loops, best of 3: 116 µs per loop
sage: %timeit test_cythonic(1001,L)
625 loops, best of 3: 111 µs per loop

@simon-king-jena
Copy link
Member

comment:12

PS: The difference being 5%, one could still argue that 5% is the lowest number that is called "significant".

@nthiery
Copy link
Contributor Author

nthiery commented Sep 18, 2012

comment:13

Replying to @nthiery:

Apparently, after the line next_item_found = False, one should insert a break, because otherwise the loop is finished to no avail.

Good point; I am trying it right now!

Here it is: without the break:

 2800/200    0.247    0.000    1.417    0.007 {sage.misc.c3.C3_algorithm}

with the break:

 2800/200    0.243    0.000    1.413    0.007 {sage.misc.c3.C3_algorithm}

Tiny difference, but it's natural to put it here anyway. I am about to
update the patch.

Note: I switched from 5.2 to 5.3 in the mean time, so the timings from this comment can't really be compared with the timings before.

Cheers,
Nicolas

@nthiery nthiery changed the title Bugs in C3'salgorithm implementation in sage.misc.c3 Fix two bugs in sage.misc.c3's implementation of the algorithm C3 Sep 18, 2012
@simon-king-jena
Copy link
Member

comment:15

For the record: With sage-5.4.beta0 and

trac_715_combined.patch
trac_715_local_refcache.patch
trac_715_safer.patch
trac_715_specification.patch
trac_11521_homset_weakcache_combined.patch
trac_11521_callback.patch
13145.patch
trac_13447-sanitise_ring_refcount.patch
trac12215_weak_cached_function-sk.patch
trac12215_segfault_fixes.patch
trac_12313-mono_dict-combined-random-sk.patch
trac_12313_quit_sage.patch
trac13370_deprecate_is_field.patch
trac_13378-convert_map_shortcut.patch
trac_13412_category_for_power_series_rings.patch

and with %prun f(500) (because 200 may not be enough), I get

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 7000/500    0.212    0.000    0.773    0.002 {sage.misc.c3.C3_algorithm}

Adding your main patch, I get

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 7000/500    0.307    0.000    0.848    0.002 {sage.misc.c3.C3_algorithm}

And with the optimization patch, I get

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 7000/500    0.304    0.000    0.853    0.002 {sage.misc.c3.C3_algorithm}

So, the slow-down is little and using the C Api is not noticeable.

@simon-king-jena
Copy link
Member

comment:16

With a little cdef'ing, namely

diff --git a/sage/misc/c3.pyx b/sage/misc/c3.pyx
--- a/sage/misc/c3.pyx
+++ b/sage/misc/c3.pyx
@@ -196,6 +196,8 @@
     nbheads = len(heads)
     cdef object O, X
     cdef bint next_item_found
+    cdef set tail_set
+    cdef list tail_list

     while nbheads:
         for i from 0 <= i < nbheads:
@@ -205,7 +207,8 @@
             for j from 0 <= j < nbheads:
                 if j == i:
                     continue
-                if O in tailsets[j]:
+                tail_set = tailsets[j]
+                if O in tail_set:
                     next_item_found = False
                     break
             if next_item_found:
@@ -215,11 +218,13 @@
                 # j goes down so that ``del heads[j]`` does not screw up the numbering
                 for j from nbheads > j >= 0:
                     if heads[j] is O:
-                        try:
-                            X = tails[j].pop()
+                        tail_list = tails[j]
+                        if tail_list:
+                            X = tail_list.pop()
                             heads[j] = X
-                            tailsets[j].remove(X)
-                        except IndexError:
+                            tail_set = tailsets[j]
+                            tail_set.remove(X)
+                        else:
                             del heads[j]
                             del tails[j]
                             del tailsets[j]

I get

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 7000/500    0.277    0.000    0.819    0.002 {sage.misc.c3.C3_algorithm}

The differences really are tiny. However, here is why I removed the try: ... except IndexError::

sage: cython("""
....: def test1(list L):
....:     cdef list l
....:     for l in L:
....:         try:
....:             l.pop()
....:         except IndexError:
....:             l.append(0)
....: def test2(list L):
....:     cdef list l
....:     for l in L:
....:         if l:
....:             l.pop()
....:         else:
....:             l.append(0)
....: """)        
sage: L = [[] for _ in range(10000)]
sage: %timeit test1(L)
125 loops, best of 3: 7.07 ms per loop
sage: L = [[] for _ in range(10000)]
sage: %timeit test2(L)
125 loops, best of 3: 1.94 ms per loop

@simon-king-jena
Copy link
Member

comment:17

Another thought: We want to compare the entries on the list by identity, not by equality, isn't it? But isn't then if O in tail_set unnecessarily slow, because it compares by equality rather than identity?

I wonder if one could instrument MonoDict from #12313.

@simon-king-jena
Copy link
Member

comment:18

Replying to @simon-king-jena:

I wonder if one could instrument MonoDict from #12313.

I tested: MonoDict is slower than a set. But if we want to compare by identity, then perhaps the tailsets should store the memory addresses of the objects, rather than the objects?

@simon-king-jena
Copy link
Member

comment:19

Here is evidence that using a set of memory locations is way faster than a set of objects, if what we actually want is comparison by identity.

Namely:

sage: cython("""
....: def test1(set S, object O):
....:     return O in S
....: def test2(set S, object O):
....:     return <size_t><void *>O in S
....: """)
sage: S = set(Algebras(GF(p)) for p in prime_range(10000))
sage: Sid = set(id(O) for O in S)
sage: len(S)
1229
sage: len(Sid)
1229
sage: O = Algebras(GF(151))
sage: test1(S,O)
True
sage: test2(Sid,O)
True
sage: %timeit test1(S,O)
625 loops, best of 3: 507 ns per loop
sage: %timeit test2(Sid,O)
625 loops, best of 3: 186 ns per loop

@simon-king-jena
Copy link
Member

comment:20

Using <size_t><void *> rather than the objects seems to work, speed-wise! With your benchmark, I get

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 7000/500    0.197    0.000    0.671    0.001 {sage.misc.c3.C3_algorithm}

using the following patch:

diff --git a/sage/misc/c3.pyx b/sage/misc/c3.pyx
--- a/sage/misc/c3.pyx
+++ b/sage/misc/c3.pyx
@@ -186,16 +186,20 @@
     # ``tails'') . Each tail is stored reversed, so that we can use a
     # cheap pop() in lieue of pop(0). A duplicate of the tail is
     # stored as a set in ``tailsets`` for cheap membership testing.
+    # Since we actually want comparison by identity, not equality,
+    # what we store is the set of memory locations of objects
+    cdef object O, X
     cdef list tails = [getattr(obj, attribute) for obj in args]
     tails.append(args)
-    tails              = [list(reversed(tail)) for tail in tails]
-    cdef list heads    = [tail.pop()           for tail in tails]
-    cdef list tailsets = [set(tail)            for tail in tails]
+    tails              = [list(reversed(tail))     for tail in tails]
+    cdef list heads    = [tail.pop()               for tail in tails]
+    cdef list tailsets = [set([<size_t><void *>O for O in tail]) for tail in tails]
 
     cdef int i, j, nbheads
     nbheads = len(heads)
-    cdef object O, X
     cdef bint next_item_found
+    cdef set tail_set
+    cdef list tail_list
 
     while nbheads:
         for i from 0 <= i < nbheads:
@@ -205,7 +209,8 @@
             for j from 0 <= j < nbheads:
                 if j == i:
                     continue
-                if O in tailsets[j]:
+                tail_set = tailsets[j]
+                if <size_t><void *>O in tail_set:
                     next_item_found = False
                     break
             if next_item_found:
@@ -215,11 +220,13 @@
                 # j goes down so that ``del heads[j]`` does not screw up the numbering
                 for j from nbheads > j >= 0:
                     if heads[j] is O:
-                        try:
-                            X = tails[j].pop()
+                        tail_list = tails[j]
+                        if tail_list:
+                            X = tail_list.pop()
                             heads[j] = X
-                            tailsets[j].remove(X)
-                        except IndexError:
+                            tail_set = tailsets[j]
+                            tail_set.remove(<size_t><void *>X)
+                        else:
                             del heads[j]
                             del tails[j]
                             del tailsets[j]

I am now running make ptest with that patch, and if it works, then I'll post it here "officially", and we can start cross-reviewing.

@simon-king-jena
Copy link
Member

comment:21

PS: Note that according to the timing from my previous post, my patch would even yield a small speed-up, compared with vanilla sage.

@nthiery
Copy link
Contributor Author

nthiery commented Sep 19, 2012

comment:30

I looked at the log, and it's indeed due to another patch (I ran the test with 5.3 + the sage-combinat patches merged in 5.4, but without the correponing updated pickle jar).

So Positive Review!

Thanks Simon :-)

@nthiery
Copy link
Contributor Author

nthiery commented Sep 19, 2012

Changed dependencies from #12895 to none

@nthiery
Copy link
Contributor Author

nthiery commented Sep 19, 2012

comment:31

Running the tests just made me realize that this patch commutes with #12895. I thus remove the dependency.

@jdemeyer
Copy link
Contributor

comment:32

I'm getting a long doctest error:

sage -t  --long -force_lib devel/sage/sage/misc/c3.pyx
**********************************************************************
File "/release/merger/sage-5.4.beta2/devel/sage-main/sage/misc/c3.pyx", line 152:
    sage: class A(B, C): pass
Expected:
    Traceback (most recent call last):
    ...
    TypeError: Error when calling the metaclass bases
        Cannot create a consistent method resolution
    order (MRO) for bases B, C
Got:
    Traceback (most recent call last):
      File "/release/merger/sage-5.4.beta2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/release/merger/sage-5.4.beta2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/release/merger/sage-5.4.beta2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_1[35]>", line 1, in <module>
        class A(B, C): pass###line 152:
    sage: class A(B, C): pass
    TypeError: Error when calling the metaclass bases
        Cannot create a consistent method resolution
    order (MRO) for bases C, B
**********************************************************************

@simon-king-jena
Copy link
Member

comment:33

Replying to @jdemeyer:

I'm getting a long doctest error:

    sage: class A(B, C): pass
Expected:
    Traceback (most recent call last):
    ...
    TypeError: Error when calling the metaclass bases
        Cannot create a consistent method resolution
    order (MRO) for bases B, C
Got:

But that's pure Python, and unrelated with our implementation of C3!

Could it be that the error message randomly choses between "Cannot create ... for bases B, C" and "Cannot create ... for bases C, B"?

If that is the case, I suggest to change the expected error message into "Cannot create a consistent method resolution order (MRO) for bases ..."

@nthiery
Copy link
Contributor Author

nthiery commented Sep 21, 2012

@nthiery
Copy link
Contributor Author

nthiery commented Sep 21, 2012

comment:34

Replying to @simon-king-jena:

Could it be that the error message randomly choses between "Cannot create ... for bases B, C" and "Cannot create ... for bases C, B"?

That sounds a bit strange indeed (I would have expected C3 to be
completely deterministic, including in its error messages), but I
vaguely remember having to change this order once.

If that is the case, I suggest to change the expected error message
into "Cannot create a consistent method resolution order (MRO) for
bases ..."

+1

I just did that. I used the occasion to avoid using a temporary
tail_set variable, as recommended by your benchmark. See
attachment: trac_13501-c3-review-nt.patch for the changes. The
updated attachment: trac_13501-categories-c3_fix-nt.patch contains
all three patches folded together.

Apply:

Cheers,
Nicolas

@nthiery
Copy link
Contributor Author

nthiery commented Sep 21, 2012

comment:36

Replying to @nthiery:

I used the occasion to avoid using a temporary
tail_set variable, as recommended by your benchmark.

I just ran a benchmark, and it actually seems to make no difference ... But it's slighly more readable.

@nthiery
Copy link
Contributor Author

nthiery commented Oct 17, 2012

comment:37

Hi Simon!

Could you give it a quick look at my little change and set it back to positive review if that's ok?

Thanks!

@simon-king-jena
Copy link
Member

comment:38

The patch looks fine, and all doctests (except the one for cmdline.py, which has been disabled for security reasons) pass on bsd.math. So, I can set it back to a positive review.

@jdemeyer
Copy link
Contributor

comment:40

This doesn't apply to sage-5.4.rc2:

applying /release/merger/patches/trac_13501-categories-c3_fix-nt.patch
applying /release/merger/patches/trac_13501-c3-speedup-sk.patch
patching file sage/misc/c3.pyx
Hunk #1 FAILED at 185
Hunk #2 FAILED at 204
Hunk #3 FAILED at 214
3 out of 3 hunks FAILED -- saving rejects to file sage/misc/c3.pyx.rej
abort: patch failed to apply

@nthiery

This comment has been minimized.

@nthiery
Copy link
Contributor Author

nthiery commented Oct 23, 2012

comment:42

Oops, sorry, Jeroen. I forgot to update the Apply list in the description. As noted in comment 33, Simon's c3-speedup-sk.patch is already folded in the main patch. Fixed and back to positive review.

@jdemeyer
Copy link
Contributor

jdemeyer commented Nov 1, 2012

Merged: sage-5.5.beta1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants