Triangle count algorithms return different results #15

szarnyasg · 2020-01-24T21:12:15Z

Seems #8 was a non-issue but the triangle count algorithms still return different results.

Using an INT64 pattern matrix fixes the inconsistency:

-    "M = Matrix.from_random(BOOL, 100, 100, 10000, no_diagonal=True, make_symmetric=True, seed=42)"
+    "M = Matrix.from_random(INT64, 100, 100, 1000, no_diagonal=True, make_symmetric=True, make_pattern=True, seed=42)"

However, I can't say I fully understand what's happening here. I also tried passing semiring=semiring.plus_times_int64 to the mxm operations, but it did not change the output in any case.

michelp · 2020-01-26T02:16:16Z

I was looking at this with @jim22k today and we noticed that it also works if you change the seed to 43, so we suspected there may be a problem with LAGraph_random? Although again, it's hard to figure out how that would actually be happening. I'll dig a bit into the function with gdb and see if I can come up with anything. 🤷‍♂️

szarnyasg · 2020-02-25T14:34:23Z

I did some debugging today with @marci543 on this issue. First, a minimal example to reproduce the problem:

from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op

def cohen(A, U, L):
    return L.mxm(U, mask=A).reduce_int() // 2

def sandia(A, L):
    return L.mxm(L, mask=L).reduce_int()

M = Matrix.from_lists([0, 0, 1, 1, 2, 2, 2, 4, 4, 4], [2, 4, 2, 4, 0, 1, 4, 0, 1, 2], [1]*10, typ=BOOL)
M += M.transpose()

print(M.to_string())

print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))

draw(M)

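This produces:

[image: image] <https://user-images.githubusercontent.com/1402801/75255240-e46f9f00-57e1-11ea-8057-299fd430565a.png>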

The trick - we believe - is that the algorithms need to use the UINT64.PLUS_TIMES semiring. However, even if we used mxm(..., semiring=UINT64.PLUS_TIMES), the typecast did not happen automatically. What works instead is manually converting the matrices from BOOL to UINT64.

def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2

def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()

This works fine:

from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op

def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2

def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()

M = Matrix.from_random(BOOL, 100, 100, 10000, no_diagonal=True, make_symmetric=True, make_pattern=True, seed=42)
M += M.transpose()

print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))

Both triangle count algorithms produce 102362 as their result.

Slightly related - the User Guide (v3.2.0) explains that GrB_transpose can be used to perform a cast operation.

8.15 GrB_transpose: transpose a matrix
This step also does any typecasting needed, so GrB_transpose can be used to typecast a matrix A into another matrix C. To do this, simply use NULL for the Mask and accum, and provide a nondefault descriptor desc that sets the transpose option

WDYT @michelp?

jim22k · 2020-02-25T18:24:08Z

L.mxm(U, semiring=UINT64.PLUS_TIMES) -> returns a BOOL object

I believe I know what's going on here. In the absence of an `out=...` parameter, pygraphblas will create a new output object based on the type of L. It doesn't consider U or the semiring in making that choice. The call then goes down to SuiteSparse:GraphBLAS as:

GrB_mxm(out[type=BOOL], NULL, NULL, semiring[type=UINT64], L[type=BOOL], U[type=BOOL])

Tim can do the calculation using the UINT64 semiring, but then he has to store it in a BOOL output, which naturally truncates the results to 1 or 0.

As you found, one workaround is to convert L to UINT64. Another approach would be to create `out` manually with the right type and pass it in, along with the UINT64 semiring.

In grblas, I always consider the types of L, U, and semiring when deciding on the output's type. pygraphblas may want to do something similar.
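To make the truncation concrete, here is a minimal pure-Python sketch of the two algorithms on the five-node graph from the minimal example above. It does not use pygraphblas: `masked_mxm`, `tril`, `triu` and the `out_is_bool` flag are hypothetical helpers written only for this illustration, and the cast to 0/1 stands in for storing a plus-times result in a BOOL output.

```python
# Pure-Python illustration only; none of these helpers are pygraphblas APIs.
N = 5
EDGES = [(0, 2), (0, 4), (1, 2), (1, 4), (2, 4)]  # undirected edges of the minimal example

# Dense symmetric adjacency matrix with 0/1 entries.
A = [[0] * N for _ in range(N)]
for i, j in EDGES:
    A[i][j] = A[j][i] = 1


def tril(m):
    """Strictly lower triangular part (the graph has no diagonal entries)."""
    return [[m[i][j] if j < i else 0 for j in range(N)] for i in range(N)]


def triu(m):
    """Strictly upper triangular part."""
    return [[m[i][j] if j > i else 0 for j in range(N)] for i in range(N)]


def masked_mxm(x, y, mask, out_is_bool):
    """Sum of the plus-times product x*y over the positions where mask is nonzero.

    With out_is_bool=True every computed entry is cast to 0 or 1 before it is
    summed, mimicking a result that was stored in a BOOL matrix.
    """
    total = 0
    for i in range(N):
        for j in range(N):
            if mask[i][j]:
                value = sum(x[i][k] * y[k][j] for k in range(N))
                total += int(bool(value)) if out_is_bool else value
    return total


def cohen(a, out_is_bool):
    return masked_mxm(tril(a), triu(a), a, out_is_bool) // 2


def sandia(a, out_is_bool):
    low = tril(a)
    return masked_mxm(low, low, low, out_is_bool)


print("BOOL output:  ", cohen(A, True), sandia(A, True))    # 1 2
print("UINT64 output:", cohen(A, False), sandia(A, False))  # 2 2
```

The graph has two triangles (0-2-4 and 1-2-4). In this sketch the Cohen variant has two masked entries equal to 2; casting them to 1 halves the sum, so it reports 1 triangle, while the Sandia variant only ever produces entries of 0 or 1 on this graph and is unaffected. That is one way the two algorithms can disagree when the output type is BOOL.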

michelp · 2020-02-25T18:39:38Z

Yep @jim22k, that's it. This has bitten me a couple of times; pygraphblas should use something like C's implicit conversion rules to consider both sides of the operation.

https://en.cppreference.com/w/c/language/conversion
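As a rough sketch of that idea, the output type could be chosen as the "widest" type among both operands and the semiring, loosely in the spirit of C's usual arithmetic conversions. Everything here is an assumption for illustration: the `TYPE_RANK` table, its simplified ordering, and the `output_type` function are hypothetical and are not taken from pygraphblas, grblas, or the fix that eventually landed.

```python
# Hypothetical promotion table; a simplified ordering, not a real library's rules.
TYPE_RANK = {"BOOL": 0, "INT32": 1, "INT64": 2, "UINT64": 3, "FP64": 4}


def output_type(left, right, semiring=None):
    """Pick the highest-ranked type among both operands and the semiring, if given."""
    candidates = [left, right]
    if semiring is not None:
        candidates.append(semiring)
    return max(candidates, key=TYPE_RANK.__getitem__)


print(output_type("BOOL", "BOOL"))            # BOOL
print(output_type("BOOL", "BOOL", "UINT64"))  # UINT64
print(output_type("BOOL", "INT64"))           # INT64
```

With a rule like this, L.mxm(U, semiring=UINT64.PLUS_TIMES) on two BOOL matrices would get a UINT64 output instead of inheriting BOOL from L, so the counts would not be truncated.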

michelp · 2020-08-20T22:38:05Z

This is fixed in #72, and now all algorithms return the same result, 102362.

szarnyasg mentioned this issue Feb 24, 2020

Fix input matrix for triangle count test #28

Closed

attilanagy234 mentioned this issue Apr 11, 2020

Commutativity of vector multiplication when types are different #42

Closed

michelp closed this as completed Aug 20, 2020
