-
-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix two bugs in sage.misc.c3's implementation of the algorithm C3 #13501
Comments
comment:1
Hi Simon, I'll be working on this ticket today or so. We are currently exploring with Florent the theoretical limits of C3, and ways to fix the MRO issues once for all for categories. For that, we need a clean C3 implementation for further instrumentation. More on that topic on sage-combinat-devel shortly! |
comment:2
Here is a preliminary patch. It fixes the bug and simplifies a bit the logic. I now need to benchmark it, and reintroduce a couple PyList... where timings will call for it. |
comment:3
Hi! I optimized and cleaned a bit the code; here are some timings, using the following for benchmarking (better suggestions welcome):
Before:
With attachment: trac_13501-categories-c3_fix-nt.patch:
With further attachment: trac_13501-categories-c3_fix-useless_optimization-nt.patch:
So, we have a 15% loss compared to the previous implementation. I am Apply: Simon: let me know in case you are busy, I can possibly ask Florent for the review. |
Reviewer: Simon King |
Author: Nicolas M. Thiéry |
This comment has been minimized.
This comment has been minimized.
comment:5
I'll see whether using the low-level functions |
comment:6
Replying to @simon-king-jena:
Or perhaps this would be even faster: |
comment:7
Replying to @simon-king-jena:
That's basically what the second patch does, and it does not seem to do much of a change. But feel free to explore! |
comment:8
Another optimization:
Apparently, after the line |
comment:9
Replying to @nthiery:
Sorry, I didn't look at the second patch. Yes, if that doesn't help, it would probably be useless. But I think breaking the loop would make sense. |
comment:10
Replying to @simon-king-jena:
Good point; I am trying it right now! |
comment:11
Yes, the effect of using the C Api really seems marginal. I defined
and obtained:
|
comment:12
PS: The difference being 5%, one could still argue that 5% is the lowest number that is called "significant". |
comment:13
Replying to @nthiery:
Here it is: without the break:
with the break:
Tiny difference, but it's natural to put it here anyway. I am about to Note: I switched from 5.2 to 5.3 in the mean time, so the timings from this comment can't really be compared with the timings before. Cheers, |
comment:15
For the record: With sage-5.4.beta0 and
and with
Adding your main patch, I get
And with the optimization patch, I get
So, the slow-down is little and using the C Api is not noticeable. |
comment:16
With a little cdef'ing, namely diff --git a/sage/misc/c3.pyx b/sage/misc/c3.pyx
--- a/sage/misc/c3.pyx
+++ b/sage/misc/c3.pyx
@@ -196,6 +196,8 @@
nbheads = len(heads)
cdef object O, X
cdef bint next_item_found
+ cdef set tail_set
+ cdef list tail_list
while nbheads:
for i from 0 <= i < nbheads:
@@ -205,7 +207,8 @@
for j from 0 <= j < nbheads:
if j == i:
continue
- if O in tailsets[j]:
+ tail_set = tailsets[j]
+ if O in tail_set:
next_item_found = False
break
if next_item_found:
@@ -215,11 +218,13 @@
# j goes down so that ``del heads[j]`` does not screw up the numbering
for j from nbheads > j >= 0:
if heads[j] is O:
- try:
- X = tails[j].pop()
+ tail_list = tails[j]
+ if tail_list:
+ X = tail_list.pop()
heads[j] = X
- tailsets[j].remove(X)
- except IndexError:
+ tail_set = tailsets[j]
+ tail_set.remove(X)
+ else:
del heads[j]
del tails[j]
del tailsets[j] I get
The differences really are tiny. However, here is why I removed the
|
comment:17
Another thought: We want to compare the entries on the list by identity, not by equality, isn't it? But isn't then I wonder if one could instrument |
comment:18
Replying to @simon-king-jena:
I tested: |
comment:19
Here is evidence that using a set of memory locations is way faster than a set of objects, if what we actually want is comparison by identity. Namely:
|
comment:20
Using
using the following patch: diff --git a/sage/misc/c3.pyx b/sage/misc/c3.pyx
--- a/sage/misc/c3.pyx
+++ b/sage/misc/c3.pyx
@@ -186,16 +186,20 @@
# ``tails'') . Each tail is stored reversed, so that we can use a
# cheap pop() in lieue of pop(0). A duplicate of the tail is
# stored as a set in ``tailsets`` for cheap membership testing.
+ # Since we actually want comparison by identity, not equality,
+ # what we store is the set of memory locations of objects
+ cdef object O, X
cdef list tails = [getattr(obj, attribute) for obj in args]
tails.append(args)
- tails = [list(reversed(tail)) for tail in tails]
- cdef list heads = [tail.pop() for tail in tails]
- cdef list tailsets = [set(tail) for tail in tails]
+ tails = [list(reversed(tail)) for tail in tails]
+ cdef list heads = [tail.pop() for tail in tails]
+ cdef list tailsets = [set([<size_t><void *>O for O in tail]) for tail in tails]
cdef int i, j, nbheads
nbheads = len(heads)
- cdef object O, X
cdef bint next_item_found
+ cdef set tail_set
+ cdef list tail_list
while nbheads:
for i from 0 <= i < nbheads:
@@ -205,7 +209,8 @@
for j from 0 <= j < nbheads:
if j == i:
continue
- if O in tailsets[j]:
+ tail_set = tailsets[j]
+ if <size_t><void *>O in tail_set:
next_item_found = False
break
if next_item_found:
@@ -215,11 +220,13 @@
# j goes down so that ``del heads[j]`` does not screw up the numbering
for j from nbheads > j >= 0:
if heads[j] is O:
- try:
- X = tails[j].pop()
+ tail_list = tails[j]
+ if tail_list:
+ X = tail_list.pop()
heads[j] = X
- tailsets[j].remove(X)
- except IndexError:
+ tail_set = tailsets[j]
+ tail_set.remove(<size_t><void *>X)
+ else:
del heads[j]
del tails[j]
del tailsets[j] I am now running |
comment:21
PS: Note that according to the timing from my previous post, my patch would even yield a small speed-up, compared with vanilla sage. |
comment:30
I looked at the log, and it's indeed due to another patch (I ran the test with 5.3 + the sage-combinat patches merged in 5.4, but without the correponing updated pickle jar). So Positive Review! Thanks Simon :-) |
Changed dependencies from #12895 to none |
comment:31
Running the tests just made me realize that this patch commutes with #12895. I thus remove the dependency. |
comment:32
I'm getting a long doctest error:
|
comment:33
Replying to @jdemeyer:
But that's pure Python, and unrelated with our implementation of C3! Could it be that the error message randomly choses between "Cannot create ... for bases B, C" and "Cannot create ... for bases C, B"? If that is the case, I suggest to change the expected error message into "Cannot create a consistent method resolution order (MRO) for bases ..." |
Attachment: trac_13501-categories-c3_fix-nt.patch.gz Attachment: trac_13501-c3-review-nt.patch.gz |
comment:34
Replying to @simon-king-jena:
That sounds a bit strange indeed (I would have expected C3 to be
+1 I just did that. I used the occasion to avoid using a temporary Apply: Cheers, |
comment:36
Replying to @nthiery:
I just ran a benchmark, and it actually seems to make no difference ... But it's slighly more readable. |
comment:37
Hi Simon! Could you give it a quick look at my little change and set it back to positive review if that's ok? Thanks! |
comment:38
The patch looks fine, and all doctests (except the one for cmdline.py, which has been disabled for security reasons) pass on bsd.math. So, I can set it back to a positive review. |
comment:40
This doesn't apply to sage-5.4.rc2:
|
This comment has been minimized.
This comment has been minimized.
comment:42
Oops, sorry, Jeroen. I forgot to update the Apply list in the description. As noted in comment 33, Simon's c3-speedup-sk.patch is already folded in the main patch. Fixed and back to positive review. |
Merged: sage-5.5.beta1 |
The current implementation of C3's algorithm in sage.misc.c3 (which is
used to compute the list of all the super categories of a category) is
incorrect.
Define indeed the following classes::
Which gives the following mro:
Here is now the same hierarchy using categories::
The super categories are not given in the same order (g,e,f instead of
e f g)::
This is due to popping X from the tail too early around line 186: O is
a candidate for being the next in line, but has not yet been
definitely chosen.
And indeed this is caught by the test suite::
Here is another hierarchy with an inconsistent MRO which is not
detected by Sage's current C3::
Here the issue comes from the fact that the C3 algorithm is supposed
to work on the mro's of the bases together with the list of the
bases, which the current Sage's implementation does not.
Apply:
CC: @sagetrac-sage-combinat @simon-king-jena
Component: categories
Keywords: method resolution order
Author: Nicolas M. Thiéry, Simon King
Reviewer: Simon King
Merged: sage-5.5.beta1
Issue created by migration from https://trac.sagemath.org/ticket/13501
The text was updated successfully, but these errors were encountered: