[cuGRAPH] Sampling Optimizations #2

Open
wants to merge 2 commits into
base: rapidsai-master

VibhuJawa

Description

This PR speeds up the sampling workflow by roughly 2.6x in the profile below by keeping sampling on the GPU end to end.

Before this PR, sampling on the as-skitter dataset took 29.4 s; it now takes 11.3 s. More importantly, 99.6% of the run time is now spent in cugraph.community.egonet.batched_ego_graphs(g, current_seeds, radius=1), up from 34.8% before.

Sampling time should therefore scale with the performance of batched_ego_graphs rather than with overhead from other libraries.
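
As a quick check of the numbers above, the short script below recomputes the end-to-end speedup and the time spent outside batched_ego_graphs. The figures are copied from the two line_profiler reports in this description; the variable names are only illustrative.

```python
# Figures copied from the two line_profiler reports in this description.
total_before_s = 29.3832            # total time, before the PR
total_after_s = 11.3084             # total time, after the PR
ego_before_s = 10229989.0 * 1e-6    # time in batched_ego_graphs, before
ego_after_s = 11258415.0 * 1e-6     # time in batched_ego_graphs, after

# End-to-end speedup of the sampler.
speedup = total_before_s / total_after_s

# Everything that is not the batched_ego_graphs call itself.
overhead_before_s = total_before_s - ego_before_s
overhead_after_s = total_after_s - ego_after_s

print(f"end-to-end speedup: {speedup:.2f}x")
print(f"time outside batched_ego_graphs: "
      f"{overhead_before_s:.2f} s -> {overhead_after_s:.3f} s")
```

The time spent in batched_ego_graphs itself is about the same in both runs, so the gain comes almost entirely from removing the work around it.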

Profiling:

Before the PR:

Timer unit: 1e-06 s

Total time: 29.3832 s
File: /tmp/ipykernel_62775/1201298652.py
Function: cugraphSampler_old at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def cugraphSampler_old(g, nodes, fanouts, edge_dir='in', prob=None, replace=False,
     2                                                                copy_ndata=True, copy_edata=True, _dist_training=False, exclude_edges=None):
     3                                               # from here get in a new for loop
     4                                               # ego_net return edge list
     5                                               # print("edge data")
     6                                               # print(g.edge_data)
     7         1         27.0     27.0      0.0      num_nodes = len(nodes)
     8         1         15.0     15.0      0.0      if torch.is_tensor(nodes):
     9         1       1206.0   1206.0      0.0          current_seeds = cupy.asarray(nodes)
    10         1        518.0    518.0      0.0          current_seeds = cudf.Series(current_seeds)
    11                                               else:
    12                                                   current_seeds = nodes.reindex(index = np.arange(0, num_nodes))
    13                                           
    14         1          3.0      3.0      0.0      blocks = []
    15                                               #seeds = cudf.Series(nodes.to_array())
    16                                           
    17         3          4.0      1.3      0.0      for fanout in fanouts:
    18         2   10229989.0 5114994.5     34.8          ego_edge_list, seeds_offsets = cugraph.community.egonet.batched_ego_graphs(g, current_seeds, radius = 1)
    19                                                   #print ("current_seeds", current_seeds)
    20         2        453.0    226.5      0.0          print ("fanout", fanout)
    21                                                   #all_parents = cupy.ndarray(fanout*len(current_seeds))
    22                                                   #all_children = cupy.ndarray(fanout*len(current_seeds))
    23         2        193.0     96.5      0.0          all_parents = cupy.ndarray(0)
    24         2         35.0     17.5      0.0          all_children = cupy.ndarray(0)
    25                                                   #print ("all parents", all_parents)
    26                                               # filter and get a certain size neighborhood
    27      2002       2407.0      1.2      0.0          for i in range(1, len(seeds_offsets)):
    28      2000     530286.0    265.1      1.8              pos0 = seeds_offsets.values_host[i-1]
    29      2000     424828.0    212.4      1.4              pos1 = seeds_offsets.values_host[i]
    30      2000     791732.0    395.9      2.7              edge_list = ego_edge_list[pos0:pos1]
    31                                                       # get randomness fanout
    32      2000   12797656.0   6398.8     43.6              filtered_list = edge_list[edge_list ['dst']== current_seeds[i-1]]
    33                                                        
    34                                                       # get sampled_list
    35      2000      28990.0     14.5      0.1              if len(filtered_list) > fanout:
    36       659      88522.0    134.3      0.3                  sampled_indices = random.sample(filtered_list.index.to_arrow().to_pylist(), fanout)
    37       659    3559557.0   5401.5     12.1                  filtered_list = filtered_list.reindex(index = sampled_indices)
    38                                                           
    39      2000     291726.0    145.9      1.0              children = cupy.asarray(filtered_list['src'])
    40      2000     259716.0    129.9      0.9              parents = cupy.asarray(filtered_list['dst'])
    41                                                       # copy the src and dst to cupy array
    42      2000     217204.0    108.6      0.7              all_parents = cupy.append(all_parents, parents)
    43      2000     154211.0     77.1      0.5              all_children = cupy.append(all_children, children)
    44                                                       #print (len(test_parents)) 
    45                                           
    46                                                   # generate dgl.graph and  blocks
    47         2       1971.0    985.5      0.0          sampled_graph = dgl.graph ((all_children,all_parents))
    48                                                   #print(all_parents)
    49                                                   #print(all_children)
    50                                                   #print(sampled_graph.edges())
    51                                                   #print(seeds.to_array())
    52                                                   # '_ID' is EID
    53         2          3.0      1.5      0.0          num_edges = len(all_children) 
    54         2        468.0    234.0      0.0          sampled_graph.edata['_ID'] = torch.tensor(np.arange (num_edges))
    55         2       1446.0    723.0      0.0          print(sampled_graph.edata)
    56                                                   #block =dgl.to_block(sampled_graph,current_seeds.to_array())
    57                                                   #block.edata[dgl.EID] = eid
    58                                                   #current_seeds = block.srcdata[dgl.NID]
    59                                                   #current_seeds = cudf.Series(current_seeds.cpu().detach().numpy())
    60                                           
    61                                                   #blocks.insert(0, block)
    62                                                   # end of for
    63                                           
    64         1          1.0      1.0      0.0      return sampled_graph
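
The inner loop in the profile above (lines 27 to 43) runs once per seed, 1000 times per fanout in this run, and each iteration reads the offsets back on the host and issues its own filter and sample. The sketch below reproduces what that loop computes on toy data. It uses NumPy as a CPU stand-in for cuDF/CuPy, and the function name sample_per_seed and the toy arrays are assumptions made for illustration, not code from this PR.

```python
import random
import numpy as np

def sample_per_seed(src, dst, offsets, seeds, fanout, rng):
    """Reference loop: one slice, one filter, one sample per seed."""
    parents, children = [], []
    for i in range(1, len(offsets)):
        lo, hi = offsets[i - 1], offsets[i]
        # Keep only the edges of this ego graph that point at its seed.
        keep = np.flatnonzero(dst[lo:hi] == seeds[i - 1]) + lo
        # Down-sample to at most `fanout` edges for this seed.
        if len(keep) > fanout:
            keep = np.array(rng.sample(keep.tolist(), fanout))
        children.append(src[keep])
        parents.append(dst[keep])
    return np.concatenate(children), np.concatenate(parents)

# Toy data: two ego graphs stored back to back, for seeds 0 and 1.
src = np.array([1, 2, 3, 0, 0, 2])
dst = np.array([0, 0, 0, 1, 2, 1])
offsets = np.array([0, 4, 6])
seeds = np.array([0, 1])

children, parents = sample_per_seed(
    src, dst, offsets, seeds, fanout=2, rng=random.Random(0)
)
print(children, parents)
```

On the GPU, every iteration of such a loop launches several small operations, which is why the filter on profile line 32 alone accounts for 43.6% of the run time.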

After the PR:

Timer unit: 1e-06 s

Total time: 11.3084 s
File: /tmp/ipykernel_62775/3202272512.py
Function: cugraphSampler_vjawa at line 31

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    31                                           def cugraphSampler_vjawa(
    32                                               g,
    33                                               nodes,
    34                                               fanouts,
    35                                               edge_dir="in",
    36                                               prob=None,
    37                                               replace=False,
    38                                               copy_ndata=True,
    39                                               copy_edata=True,
    40                                               _dist_training=False,
    41                                               exclude_edges=None,
    42                                           ):
    43                                               # from here get in a new for loop
    44                                               # ego_net return edge list
    45                                           
    46                                               # vjawa: Below fails
    47                                               # print("edge data")
    48                                               # print(g.edge_data)
    49         1         14.0     14.0      0.0      num_nodes = len(nodes)
    50         1          4.0      4.0      0.0      if torch.is_tensor(nodes):
    51         1        809.0    809.0      0.0          current_seeds = cp.asarray(nodes)
    52         1        201.0    201.0      0.0          current_seeds = cudf.Series(current_seeds)
    53                                               else:
    54                                                   current_seeds = nodes.reindex(index=cp.arange(0, num_nodes))
    55                                           
    56                                               # blocks = []
    57                                               # seeds = cudf.Series(nodes.to_array())
    58         3          4.0      1.3      0.0      for fanout in fanouts:
    59         2          4.0      2.0      0.0          (
    60         2         44.0     22.0      0.0              ego_edge_list,
    61         2          7.0      3.5      0.0              seeds_offsets,
    62         4   11258415.0 2814603.8     99.6          ) = cugraph.community.egonet.batched_ego_graphs(
    63         2          2.0      1.0      0.0              g, current_seeds, radius=1
    64                                                   )
    65                                                   # filter and get a certain size neighborhood
    66                                                   # Step 1
    67                                                   # Get Filtered List of ego_edge_list corresponding to current_seeds
    68                                                   # We filter by creating a series of destination nodes
    69                                                   # corresponding to the offsets and filtering non-matching values
    70                                           
    71         2       1838.0    919.0      0.0          seeds_offsets_s = cudf.Series(seeds_offsets).values
    72         2        495.0    247.5      0.0          offset_lens = seeds_offsets_s[1:] - seeds_offsets_s[0:-1]
    73         2       3131.0   1565.5      0.0          dst_seeds = current_seeds.repeat(offset_lens)
    74         2        156.0     78.0      0.0          dst_seeds.index = ego_edge_list.index
    75         2       3785.0   1892.5      0.0          filtered_list = ego_edge_list[ego_edge_list["dst"] == dst_seeds]
    76                                           
    77                                                   # Step 2
    78                                                   # Sample Fan Out
    79                                                   # for each dst take maximum of fanout samples
    80         2      35537.0  17768.5      0.3          filtered_list = group_sample(filtered_list, by="dst", n_samples=fanout)
    81                                           
    82         2        970.0    485.0      0.0          all_children = create_tensor_from_cupy_cudf_objs(filtered_list["src"])
    83         2        670.0    335.0      0.0          all_parents = create_tensor_from_cupy_cudf_objs(filtered_list["dst"])
    84                                           
    85         2       1582.0    791.0      0.0          sampled_graph = dgl.graph((all_children, all_parents))
    86                                           
    87                                                   # print(all_parents)
    88                                                   # print(all_children)
    89                                                   # print(sampled_graph.edges())
    90                                                   # print(seeds.to_array())
    91                                                   # '_ID' is EID
    92                                           
    93         2         15.0      7.5      0.0          num_edges = len(all_children)
    94         4        471.0    117.8      0.0          sampled_graph.edata["_ID"] = from_dlpack(
    95         2        214.0    107.0      0.0              cp.arange(num_edges).toDlpack()
    96                                                   )
    97                                                   # print(sampled_graph.edata)
    98                                                   # block =dgl.to_block(sampled_graph,current_seeds.to_array())
    99                                                   # block.edata[dgl.EID] = eid
   100                                                   # current_seeds = block.srcdata[dgl.NID]
   101                                                   # current_seeds = cudf.Series(current_seeds.cpu().detach().numpy())
   102                                           
   103                                                   # blocks.insert(0, block)
   104                                                   # end of for
   105                                           
   106         1          1.0      1.0      0.0      return sampled_graph
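
Steps 1 and 2 in the profile above replace the per-seed loop with a handful of whole-array operations. The sketch below shows the same idea with NumPy as a CPU stand-in for cuDF/CuPy. The body of group_sample is not shown in this description, so the version here is a hypothetical implementation (random keys, sort by group, keep the first n per group), and filter_edges_to_seeds and the toy arrays are likewise illustrative names, not code from this PR.

```python
import numpy as np

def filter_edges_to_seeds(dst, offsets, seeds):
    """Step 1: mask of edges whose destination is the seed of their ego graph."""
    lengths = np.diff(offsets)                  # number of edges per ego graph
    seed_per_edge = np.repeat(seeds, lengths)   # seed aligned with every edge
    return dst == seed_per_edge

def group_sample(src, dst, n_samples, rng):
    """Step 2 (hypothetical): keep at most n_samples random edges per dst."""
    keys = rng.random(len(dst))
    order = np.lexsort((keys, dst))             # by dst, random order within dst
    sorted_dst = dst[order]
    # Rank of each edge inside its dst group: position minus group start.
    starts = np.flatnonzero(np.r_[True, sorted_dst[1:] != sorted_dst[:-1]])
    group_sizes = np.diff(np.r_[starts, len(sorted_dst)])
    rank = np.arange(len(sorted_dst)) - np.repeat(starts, group_sizes)
    keep = order[rank < n_samples]
    return src[keep], dst[keep]

# Toy data: two ego graphs stored back to back, for seeds 0 and 1.
src = np.array([1, 2, 3, 0, 0, 2])
dst = np.array([0, 0, 0, 1, 2, 1])
offsets = np.array([0, 4, 6])
seeds = np.array([0, 1])

mask = filter_edges_to_seeds(dst, offsets, seeds)
children, parents = group_sample(
    src[mask], dst[mask], n_samples=2, rng=np.random.default_rng(0)
)
print(mask, children, parents)
```

Because every step operates on the full edge list at once, the cost no longer grows with the number of Python-level iterations, which matches the profile: Steps 1 and 2 together take well under 1% of the run time.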
