You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
All the Similarity algorithms return incorrect results when calling the method number_of_vertices() before calling the algorithm if the vertices are not numbered from 0 to number_of_vertices -1. In fact, Jaccard for instance gets the number of vertices from the offsets array since the legacy version creates a graph from an adjacency list internally But when we explicitly call the method number_of_vertices() prior , this sets the number of vertices to a value different than what the offsets counterpart would do if the vertices are not renumbered (the offsets and indices array are different based on the number of vertices provided when creating those arrays.
Case 1: don’t add 1 to gdf:
graph_offsets=[0,16,25,35,41,44,48,52,56,61,63,66,67,69,74,76,78,80,82,84,87,89,91,93,98,101,104,106,110,113,117,121,127,139];
graph_indices=[1,2,3,4,5,6,7,8,10,11,12,13,17,19,21,31,0,2,3,7,13,17,19,21,30,0,1,3,7,8,9,13,27,28,32,0,1,2,7,12,13,0,6,10,0,6,10,16,0,4,5,16,0,1,2,3,0,2,30,32,33,2,33,0,4,5,0,0,3,0,1,2,3,33,32,33,32,33,5,6,0,1,32,33,0,1,33,32,33,0,1,32,33,25,27,29,32,33,25,27,31,23,24,31,29,33,2,23,24,33,2,31,33,23,26,32,33,1,8,32,33,0,24,25,28,32,33,2,8,14,15,18,20,22,23,29,30,31,33,8,9,13,14,15,18,19,20,22,23,26,27,28,29,30,31,32];
Case 2: add 1 to gdf:
graph_offsets=[0,0,16,25,35,41,44,48,52,56,61,63,66,67,69,74,76,78,80,82,84,87,89,91,93,98,101,104,106,110,113,117,121,127];
graph_indices=[2,3,4,5,6,7,8,9,11,12,13,14,18,20,22,32,1,3,4,8,14,18,20,22,31,1,2,4,8,9,10,14,28,29,33,1,2,3,8,13,14,1,7,11,1,7,11,17,1,5,6,17,1,2,3,4,1,3,31,33,34,3,34,1,5,6,1,1,4,1,2,3,4,34,33,34,33,34,6,7,1,2,33,34,1,2,34,33,34,1,2,33,34,26,28,30,33,34,26,28,32,24,25,32,30,34,3,24,25,34,3,32,34,24,27,33,34,2,9,33,34,1,25,26,29,33,34,3,9,15,16,19,21,23,24,30,31,32,34,9,10,14,15,16,19,20,21,23,24,27,28,29,30,31,32,33];
The print from case 2 suggests that the graph is not, in fact, renumbered. Thus there are 35 vertices (from 0 to 34). The fact that we overwrite the number of vertices to 34 results in incorrect behavior
Code of Conduct
I agree to follow cuGraph's Code of Conduct
I have searched the open bugs and have found no duplicates for this bug report
The text was updated successfully, but these errors were encountered:
BradReesWork
changed the title
[BUG]: Incorrect results return by the similarity algorithms
Incorrect results return by the similarity algorithms
Aug 3, 2023
This issue is not resolved yet but we made progress in 23.08 to deliver a full fledged solution. In fact, we got the last piece that enables support for weights in the C++ layer which was merged during burndown. Prior to that as a safety guard, we merged a PR highlighting that the similarity algorithms don't currently support certain datasets and we throw an exception if this condition is not satisfied. Once the roadmap of 23.10 release is establish, we will have a better estimate as to when this will be resolved.
Version
23.08
Which installation method(s) does this occur on?
Docker, Conda, Pip, Source
Describe the bug.
All the Similarity algorithms return incorrect results when calling the method
number_of_vertices()
before calling the algorithm if the vertices are not numbered from0
tonumber_of_vertices -1
. In fact,Jaccard
for instance gets the number of vertices from the offsets array since the legacy version creates a graph from an adjacency list internally But when we explicitly call the methodnumber_of_vertices()
prior , this sets the number of vertices to a value different than what the offsets counterpart would do if the vertices are not renumbered (the offsets and indices array are different based on the number of vertices provided when creating those arrays.Minimum reproducible example
Relevant log output
Code of Conduct
The text was updated successfully, but these errors were encountered: