Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Triangle count algorithms return different results #15

Closed
szarnyasg opened this issue Jan 24, 2020 · 5 comments
Closed

Triangle count algorithms return different results #15

szarnyasg opened this issue Jan 24, 2020 · 5 comments

Comments

@szarnyasg
Copy link
Contributor

szarnyasg commented Jan 24, 2020

Seems #8 was a non-issue but the triangle count algorithms still return different results.

Using az INT64 pattern matrix fixes the inconsistency:

-    "M = Matrix.from_random(BOOL, 100, 100, 10000, no_diagonal=True, make_symmetric=True, seed=42)"
+    "M = Matrix.from_random(INT64, 100, 100, 1000, no_diagonal=True, make_symmetric=True, make_pattern=True, seed=42)"

However, I cannot say that I can fully understand what's happening here. I've also tried introducing semiring=semiring.plus_times_int64 in the mxm operations but it did not change the output in any case.

@michelp
Copy link
Collaborator

michelp commented Jan 26, 2020

I was looking at this with @jim22k today and we noticed that it also works if you change the seed to 43, so we suspected there is maybe a problem with LAGraph_random? Although again, it hard to figure out how that would actually be happening. I'll dig a bit into the function with gdb and see if I can come up with anything. 🤷‍♂️

@szarnyasg
Copy link
Contributor Author

I did some debugging today with @marci543 on this issue. First, a minimal example to reproduce the problem:

from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op

def cohen(A, U, L):
    return L.mxm(U, mask=A).reduce_int() // 2

def sandia(A, L):
    return L.mxm(L, mask=L).reduce_int()

M = Matrix.from_lists([0, 0, 1, 1, 2, 2, 2, 4, 4, 4], [2, 4, 2, 4, 0, 1, 4, 0, 1, 2], [1]*10, typ=BOOL)
M += M.transpose()

print(M.to_string())

print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))

draw(M)

This produces:
image

The trick - we believe - is that the algorithms need to use the UINT64.PLUS_TIMES semiring. However, even if we used mxm(..., semiring=UINT64.PLUS_TIMES), the typecast did not happen automatically. What works instead is manually converting the matrices from BOOL to UINT64.

def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2

def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()

This works fine:

from pygraphblas import *
from pygraphblas.demo.gviz import draw, draw_op

def cohen(A, U, L):
    U = Matrix.from_lists(*U.to_lists(), nrows=U.nrows, ncols=U.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(U, mask=A).reduce_int() // 2

def sandia(A, L):
    A = Matrix.from_lists(*A.to_lists(), nrows=A.nrows, ncols=A.ncols, typ=UINT64)
    L = Matrix.from_lists(*L.to_lists(), nrows=L.nrows, ncols=L.ncols, typ=UINT64)
    return L.mxm(L, mask=L).reduce_int()

M = Matrix.from_random(BOOL, 100, 100, 10000, no_diagonal=True, make_symmetric=True, make_pattern=True, seed=42)
M += M.transpose()

print(cohen(M, M.triu(), M.tril()))
print(sandia(M, M.tril()))

Both triangle count algorithms produce 102362 as their result.

Slightly related - the User Guide (v3.2.0) explains that GrB_transpose can be used perform a cast operation.

8.15 GrB_transpose: transpose a matrix
This step also does any typecasting needed, so GrB_transpose can be used to typecast a matrix A into another matrix C. To do this, simply use NULL for the Mask and accum, and provide a nondefault descriptor desc that sets the transpose option

WDYT @michelp?

@jim22k
Copy link

jim22k commented Feb 25, 2020 via email

@michelp
Copy link
Collaborator

michelp commented Feb 25, 2020

Yep @jim22k that's it, this has bitten me a couple times, should use something like C implicit conversion to consider both sides of the operation.

https://en.cppreference.com/w/c/language/conversion

@michelp
Copy link
Collaborator

michelp commented Aug 20, 2020

This is fixed in #72 and now all return the same result 102362.

@michelp michelp closed this as completed Aug 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants