Triangle count algorithms return different results #15
Comments
I was looking at this with @jim22k today and we noticed that it also works if you change the seed to 43, so we suspected there may be a problem with LAGraph_random. Then again, it's hard to figure out how that would actually happen. I'll dig into the function a bit with gdb and see if I can come up with anything. 🤷‍♂️ |
I did some debugging today with @marci543 on this issue. First, a minimal example to reproduce the problem:
from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op
def cohen(A, U, L):
    return L.mxm(U, mask=A).reduce_int() // 2
def sandia(A, L):
    return L.mxm(L, mask=L).reduce_int()
M = Matrix.from_lists([0, 0, 1, 1, 2, 2, 2, 4, 4, 4], [2, 4, 2, 4, 0, 1, 4, 0, 1, 2], [1]*10, typ=BOOL)
M += M.transpose()
print(M.to_string())
print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))
draw(M)
The trick, we believe, is that the algorithms need to use the UINT64.PLUS_TIMES semiring. However, even when we used mxm(..., semiring=UINT64.PLUS_TIMES), the typecast did not happen automatically. What works instead is manually converting the matrices from BOOL to UINT64:
def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2
def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()
This works fine:
from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op
def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2
def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()
M = Matrix.from_random(BOOL, 100, 100, 10000, no_diagonal=True, make_symmetric=True, make_pattern=True, seed=42)
M += M.transpose()
print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))
Both triangle count algorithms produce 102362 as their result.
Slightly related: the User Guide (v3.2.0) explains that GrB_transpose can be used to perform a cast operation.
WDYT @michelp? |
L.mxm(U, semiring=UINT64.PLUS_TIMES) -> returns a BOOL object
I believe I know what's going on here. In the absence of an `out=...`
parameter, pygraphblas will create a new output object based on the type of
L. It doesn't consider U or the semiring in making that choice. The call
then goes down to SuiteSparse:GraphBLAS as:
GrB_mxm(out[type=BOOL], NULL, NULL, semiring[type=UINT64], L[type=BOOL],
U[type=BOOL])
Tim can do the calculation using the UINT64 semiring, but then he has to
store it in a BOOL output, which naturally truncates the results to 1 or 0.
As you found, one workaround is to convert L to UINT64. Another approach
would be to create `out` manually with the right type and pass it in, along
with the UINT64 semiring.
In grblas, I always consider the types of L, U, and semiring when deciding
on the output's type. pygraphblas may want to do something similar.
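To make the truncation concrete, here is a minimal pure-Python sketch on the 5-node graph from the reproduction above. It is only a model: `masked_mxm` and its `out_type` argument are hypothetical stand-ins for a masked PLUS_TIMES product written into an output of a given type, not the pygraphblas or SuiteSparse API.

```python
# Pure-Python stand-in for a masked matrix multiply; not the pygraphblas API.
EDGES = [(0, 2), (0, 4), (1, 2), (1, 4), (2, 4)]  # undirected edges of the example graph
N = 5

def adjacency(edges, n):
    """Symmetric 0/1 adjacency matrix as nested lists."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

def tril(A):
    """Strictly lower triangular part (the graph has no self-loops)."""
    n = len(A)
    return [[A[i][j] if j < i else 0 for j in range(n)] for i in range(n)]

def triu(A):
    """Strictly upper triangular part."""
    n = len(A)
    return [[A[i][j] if j > i else 0 for j in range(n)] for i in range(n)]

def masked_mxm(X, Y, mask, out_type=int):
    """PLUS_TIMES product of X and Y, kept only where mask is nonzero.

    out_type models the type of the output matrix: int keeps the counts,
    bool collapses every nonzero count to True (that is, 1).
    """
    n = len(X)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if mask[i][j]:
                total = sum(X[i][k] * Y[k][j] for k in range(n))
                C[i][j] = out_type(total)
    return C

def reduce_int(C):
    """Sum of all entries as an integer."""
    return sum(int(v) for row in C for v in row)

A = adjacency(EDGES, N)
L, U = tril(A), triu(A)

cohen_int = reduce_int(masked_mxm(L, U, A, int)) // 2
cohen_bool = reduce_int(masked_mxm(L, U, A, bool)) // 2
sandia_int = reduce_int(masked_mxm(L, L, L, int))
sandia_bool = reduce_int(masked_mxm(L, L, L, bool))
print(cohen_int, cohen_bool, sandia_int, sandia_bool)  # 2 1 2 2
```

The graph has two triangles (0-2-4 and 1-2-4). With an integer output both formulations agree; with a boolean output the Cohen variant loses the entry that held a count of 2, which is one way the two algorithms can end up disagreeing.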
…On Tue, Feb 25, 2020 at 8:34 AM Gabor Szarnyas ***@***.***> wrote:
I did some debugging today with @marci543 <https://github.com/marci543>
on this issue. First, a minimal example to reproduce the problem:
from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op
def cohen(A, U, L):
    return L.mxm(U, mask=A).reduce_int() // 2
def sandia(A, L):
    return L.mxm(L, mask=L).reduce_int()
M = Matrix.from_lists([0, 0, 1, 1, 2, 2, 2, 4, 4, 4], [2, 4, 2, 4, 0, 1, 4, 0, 1, 2], [1]*10, typ=BOOL)
M += M.transpose()
print(M.to_string())
print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))
draw(M)
This produces:
[image: image]
<https://user-images.githubusercontent.com/1402801/75255240-e46f9f00-57e1-11ea-8057-299fd430565a.png>
The trick - we believe - is that the algorithms need to use the
UINT64.PLUS_TIMES semiring. However, even if we used mxm(...,
semiring=UINT64.PLUS_TIMES), the typecast did not happen automatically.
What works instead is manually converting the matrices from BOOL to UINT64.
def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2
def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()
This works fine:
from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op
def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2
def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()
M = Matrix.from_random(BOOL, 100, 100, 10000, no_diagonal=True, make_symmetric=True, make_pattern=True, seed=42)
M += M.transpose()
print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))
Both triangle count algorithms produce 102362 as their result.
Slightly related - the User Guide (v3.2.0)
<http://mit.bme.hu/~szarnyas/grb/GraphBLAS_UserGuide.pdf> explains that
GrB_transpose can be used to perform a cast operation.
8.15 GrB_transpose: transpose a matrix
This step also does any typecasting needed, so GrB_transpose can be used
to typecast a matrix A into another matrix C. To do this, simply use NULL
for the Mask and accum, and provide a nondefault descriptor desc that sets
the transpose option
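A rough way to see why the descriptor matters in the passage quoted above: if the input is transposed first and the operation transposes again, the two cancel and only the typecast into the output's type remains. The sketch below is a dense pure-Python model of that idea. `transpose_model` and `inp0_tran` are hypothetical names for illustration, not GraphBLAS calls.

```python
def transpose_model(A, out_type, inp0_tran=False):
    """Dense model of C = cast(A^T); hypothetical helper, not a GraphBLAS call.

    inp0_tran mimics a descriptor that transposes the input first, so the
    two transposes cancel and only the typecast remains.
    """
    src = [list(col) for col in zip(*A)] if inp0_tran else A
    return [[out_type(v) for v in col] for col in zip(*src)]

A_bool = [[True, False, True],
          [False, True, True]]

C_cast = transpose_model(A_bool, int, inp0_tran=True)  # same shape, integer entries
C_plain = transpose_model(A_bool, int)                  # transposed shape
print(C_cast)
print(C_plain)
```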
WDYT @michelp <https://github.com/michelp>?
|
Yep @jim22k, that's it. This has bitten me a couple of times. pygraphblas should use something like C's implicit conversion and consider both sides of the operation. |
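As a sketch of what "consider both sides of the operation" could look like, the helper below picks the widest type among the two operands and, if given, the semiring. Everything here is an assumption for illustration: `result_type` is a hypothetical name, and `TYPE_ORDER` is a simplified widening order rather than C's exact conversion rules or the logic used by pygraphblas or grblas.

```python
# Simplified widening order; an assumption for illustration, not C's exact
# integer-promotion rules and not the pygraphblas or grblas implementation.
TYPE_ORDER = ["BOOL", "INT8", "UINT8", "INT16", "UINT16",
              "INT32", "UINT32", "INT64", "UINT64", "FP32", "FP64"]

def result_type(left, right, semiring=None):
    """Return the widest type among both operands and the semiring (if given)."""
    candidates = [left, right] + ([semiring] if semiring is not None else [])
    return max(candidates, key=TYPE_ORDER.index)

print(result_type("BOOL", "BOOL"))                     # BOOL
print(result_type("BOOL", "BOOL", semiring="UINT64"))  # UINT64
print(result_type("BOOL", "INT64"))                    # INT64
```

Under a rule like this, the BOOL-times-BOOL product with a UINT64 semiring from the discussion above would get a UINT64 output instead of a BOOL one.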
This is fixed in #72 and now all return the same result |
Seems #8 was a non-issue, but the triangle count algorithms still return different results.
Using an INT64 pattern matrix fixes the inconsistency.
However, I cannot say that I fully understand what's happening here. I've also tried introducing semiring=semiring.plus_times_int64 in the mxm operations, but it did not change the output in any case.