Commit

Updates to Link Notebooks (rapidsai#2456)
Move, test and update Link analysis and link prediction notebooks to the new organization. Also respond to some review comments on some earlier notebook changes. This is part of the epic related to rapidsai#1405 but does not close it.

Authors:
  - Don Acosta (https://github.com/acostadon)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#2456
acostadon authored Aug 3, 2022
1 parent 4dc286e commit 5c32161
Showing 18 changed files with 817 additions and 563 deletions.
24 changes: 5 additions & 19 deletions notebooks/README.md
@@ -30,11 +30,11 @@ This repository contains a collection of Jupyter Notebooks that outline how to r
Layout | | |
| | [Force-Atlas2](algorithms/layout/Force-Atlas2.ipynb) |A large graph visualization achieved with cuGraph. |
| Link Analysis | | |
| | [Pagerank](link_analysis/Pagerank.ipynb) | Compute the PageRank of every vertex in a graph |
| | [HITS](link_analysis/HITS.ipynb) | Compute the HITS' Hub and Authority scores for every vertex in a graph |
| | [Pagerank](algorithms/link_analysis/Pagerank.ipynb) | Compute the PageRank of every vertex in a graph |
| | [HITS](algorithms/link_analysis/HITS.ipynb) | Compute the HITS' Hub and Authority scores for every vertex in a graph |
| Link Prediction | | |
| | [Jaccard Similarity](link_prediction/Jaccard-Similarity.ipynb) | Compute vertex similarity score using both:<br />- Jaccard Similarity<br />- Weighted Jaccard |
| | [Overlap Similarity](link_prediction/Overlap-Similarity.ipynb) | Compute vertex similarity score using the Overlap Coefficient |
| | [Jaccard Similarity](algorithms/link_prediction/Jaccard-Similarity.ipynb) | Compute vertex similarity score using both:<br />- Jaccard Similarity<br />- Weighted Jaccard |
| | [Overlap Similarity](algorithms/link_prediction/Overlap-Similarity.ipynb) | Compute vertex similarity score using the Overlap Coefficient |
| Sampling |
| | [Random Walk](sampling/RandomWalk.ipynb) | Compute Random Walk for a various number of seeds and path lengths |
| Traversal | | |
@@ -61,21 +61,7 @@ Running the example in these notebooks requires:
* CUDA 11.4+
* NVIDIA driver 450.51+



#### Notebook Credits

- Original Authors: Bradley Rees
- Last Edit: 04/19/2021

RAPIDS Versions: 0.19

Test Hardware
- GV100 32G, CUDA 9,2



##### Copyright
#### Copyright

Copyright (c) 2019-2022, NVIDIA CORPORATION. All rights reserved.

20 changes: 10 additions & 10 deletions notebooks/algorithms/README.md
@@ -10,33 +10,33 @@ This repository contains a collection of Jupyter Notebooks that outline how to r

| Folder | Notebook | Description |
| --------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| Centrality | | |
| [Centrality](centrality/README.md) | | |
| | [Centrality](centrality/Centrality.ipynb) | Compute and compare multiple (currently 5) centrality scores |
| | [Katz](centrality/Katz.ipynb) | Compute the Katz centrality for every vertex |
| | [Betweenness](centrality/Betweenness.ipynb) | Compute both Edge and Vertex Betweenness centrality |
| | [Degree](centrality/Degree.ipynb) | Compute Degree Centraility for each vertex |
| | [Eigenvector](centrality/Eigenvector.ipynb) | Compute Eigenvector for every vertex |
| Community | | |
|[Community](community/README.md) | | |
| | [Louvain](community/Louvain.ipynb) | Identify clusters in a graph using both the Louvain and Leiden algorithms |
| | [ECG](community/ECG.ipynb) | Identify clusters in a graph using the Ensemble Clustering for Graph |
| | [K-Truss](community/ktruss.ipynb) | Extracts the K-Truss cluster |
| | [Spectral-Clustering](community/Spectral-Clustering.ipynb) | Identify clusters in a graph using Spectral Clustering with both<br> - Balanced Cut<br> - Modularity Modularity |
| | [Subgraph Extraction](community/Subgraph-Extraction.ipynb) | Compute a subgraph of the existing graph including only the specified vertices |
| | [Triangle Counting](community/Triangle-Counting.ipynb) | Count the number of Triangle in a graph |
Components | | |
|[Components](components/README.md) | | |
| | [Connected Components](components/ConnectedComponents.ipynb) | Find weakly and strongly connected components in a graph |
| Cores | | |
| [Cores](cores/README.md) | | |
| | [core-number](cores/Core-number.ipynb) | Computes the core number for every vertex of a graph G. The core number of a vertex is a maximal subgraph that contains only that vertex and others of degree k or more. |
| | [kcore](cores/kcore.ipynb) |Find the k-core of a graph which is a maximal subgraph that contains nodes of degree k or more.|
Layout | | |
| | [Force-Atlas2](layout/Force-Atlas2.ipynb) |A large graph visualization achieved with cuGraph. |
<!--| Link Analysis | | |
| [Link Analysis](link_analysis/README.md) | | |
| | [Pagerank](link_analysis/Pagerank.ipynb) | Compute the PageRank of every vertex in a graph |
| | [HITS](link_analysis/HITS.ipynb) | Compute the HITS' Hub and Authority scores for every vertex in a graph |
| Link Prediction | | |
| | [Jaccard Similarity](link_prediction/Jaccard-Similarity.ipynb) | Compute vertex similarity score using both:<br />- Jaccard Similarity<br />- Weighted Jaccard |
| | [Overlap Similarity](link_prediction/Overlap-Similarity.ipynb) | Compute vertex similarity score using the Overlap Coefficient |
| Sampling |
| [Link Prediction](link_prediction/README.md) | | |
| | [Jaccard Similarity](algorithms/link_prediction/Jaccard-Similarity.ipynb) | Compute vertex similarity score using both:<br />- Jaccard Similarity<br />- Weighted Jaccard |
| | [Overlap Similarity](algorithms/link_prediction/Overlap-Similarity.ipynb) | Compute vertex similarity score using the Overlap Coefficient |
<!--| Sampling |
| | [Random Walk](sampling/RandomWalk.ipynb) | Compute Random Walk for a various number of seeds and path lengths |
| Traversal | | |
| | [BFS](traversal/BFS.ipynb) | Compute the Breadth First Search path from a starting vertex to every other vertex in a graph |
@@ -51,7 +51,7 @@ Layout |
| Author Credit | Date | Update | cuGraph Version | Test Hardware |
| --------------|------------|------------------|-----------------|----------------|
| Brad Rees | 04/19/2021 | created | 0.19 | GV100, CUDA 11.0
| Don Acosta | 07/05/2022 | tested / updated | 22.08 nightly | DGX Tesla V100 CUDA 11.5
| Don Acosta | 08/02/2022 | tested / updated | 22.08 nightly | DGX Tesla V100 CUDA 11.5

### Copyright

55 changes: 11 additions & 44 deletions notebooks/algorithms/centrality/Centrality.ipynb
@@ -17,7 +17,7 @@
"| --------------|------------|------------------|-----------------|----------------|\n",
"| Brad Rees | 04/16/2021 | created | 0.19 | GV100, CUDA 11.0\n",
"| Brad Rees | 08/05/2021 | tested / updated | 21.10 nightly | RTX 3090 CUDA 11.4\n",
"| Don Acosta | 07/05/2022 | tested / updated | 22.08 nightly | DGX Tesla V100 CUDA 11.5\n",
"| Don Acosta | 07/29/2022 | tested / updated | 22.08 nightly | DGX Tesla V100 CUDA 11.5\n",
" "
]
},
@@ -27,7 +27,7 @@
"source": [
"Centrality is measure of how important, or central, a node or edge is within a graph. It is useful for identifying influencer in social networks, key routing nodes in communication/computer network infrastructures, \n",
"\n",
"The seminal paper on centrality is: Freeman, L. C. (1978). Centrality in social networks conceptual clarification. Social networks, 1(3), 215-239.\n",
"The seminal paper on centrality is: Freeman, L. C. (1978). [Centrality in social networks conceptual clarification.](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.320.5551&rep=rep1&type=pdf) Social networks, 1(3), 215-239.\n",
"\n",
"__Degree Centrality__ <br>\n",
"Degree centrality is based on the notion that whoever has the most connections must be important. \n",
@@ -117,9 +117,10 @@
"metadata": {},
"outputs": [],
"source": [
"# Import the cugraph modules\n",
"# Import the cugraph modules\n",
"import cugraph\n",
"import cudf"
"import cudf\n",
"from cugraph.experimental.datasets import karate"
]
},
{
@@ -128,7 +129,7 @@
"metadata": {},
"outputs": [],
"source": [
"#import the networkX required modules\n",
"# import the non-Rapids required modules\n",
"import numpy as np\n",
"import pandas as pd \n",
"from IPython.display import display_html "
@@ -196,46 +197,13 @@
" display_html(df1_styler._repr_html_()+df2_styler._repr_html_()+df3_styler._repr_html_()+df4_styler._repr_html_()+df5_styler._repr_html_(), raw=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define the path to the test data \n",
"datafile='../../data/karate-data.csv'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"cuGraph does not do any data reading or writing and is dependent on other tools for that, with cuDF being the preferred solution. \n",
"\n",
"The data file contains an edge list, which represents the connection of a vertex to another. The `source` to `destination` pairs is in what is known as Coordinate Format (COO). In this test case, the data is just two columns. However a third, `weight`, column is also possible"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gdf = cudf.read_csv(datafile, delimiter='\\t', names=['src', 'dst'], dtype=['int32', 'int32'] )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"it was that easy to load data"
"The data file contains an edge list, which represents the connections between vertices. The `source` to `destination` pairs is in what is known as Coordinate Format (COO). In this test case, the data is just two columns. However a third, `weight`, column is also possible"
]
},
{
@@ -251,16 +219,15 @@
"metadata": {},
"outputs": [],
"source": [
"# create a Graph using the source (src) and destination (dst) vertex pairs from the Dataframe \n",
"G = cugraph.Graph()\n",
"G.from_cudf_edgelist(gdf, source='src', destination='dst')"
"# Create a graph using the imported Dataset object\n",
"G = karate.get_graph(fetch=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Compute Centrality"
"## Compute the Centrality measures in a single function."
]
},
{
@@ -295,7 +262,7 @@
"metadata": {},
"source": [
"### A Different Dataset\n",
"The Karate dataset is not that large or complex, which makes it a perfect test dataset since it is easy to visually verify results. Let's look at a larger dataset with a lot more edges"
"The Karate dataset is not large or complex, which makes it a perfect test dataset since it is easy to visually verify results. Let's look at a larger dataset with a lot more edges"
]
},
{
17 changes: 12 additions & 5 deletions notebooks/algorithms/layout/Force-Atlas2.ipynb
@@ -5,7 +5,7 @@
"metadata": {},
"source": [
"# Force Atlas 2\n",
"# Skip notebook test"
"# Skip notebook test\n"
]
},
{
@@ -21,7 +21,14 @@
"| Hugo Linsenmaier | 11/16/2020 | created | 0.17 | GV100, CUDA 11.0\n",
"| Brad Rees | 01/11/2022 | tested / updated | 22.02 nightly | RTX A6000 CUDA 11.5\n",
"| Ralph Liu | 06/22/2022 | updated/tested | 22.08 | TV100, CUDA 11.5\n",
" "
"| Don Acosta | 08/01/2022 | tested / updated | 22.08 nightly | DGX Tesla A100 CUDA 11.5 "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### This notebook will not currently run because there is a conflict between the version of CuPy required by cugraph (11.0) and the version supported in cuxfilter (7.8 to 10.0). Notebook will be updated when cuxfilter supports CuPy 11."
]
},
{
@@ -225,7 +232,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.7 ('base')",
"display_name": "Python 3.9.13 ('cugraph_dev')",
"language": "python",
"name": "python3"
},
@@ -239,11 +246,11 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
"version": "3.9.13"
},
"vscode": {
"interpreter": {
"hash": "f708a36acfaef0acf74ccd43dfb58100269bf08fb79032a1e0a6f35bd9856f51"
"hash": "cee8a395f2f0c5a5bcf513ae8b620111f4346eff6dc64e1ea99c951b2ec68604"
}
}
},