Moving Centrality notebooks to new structure and updating/testing (ra…
acostadon authored Jul 8, 2022
1 parent 5dd0267 commit 3d38d4d
Showing 12 changed files with 1,292 additions and 857 deletions.
16 changes: 9 additions & 7 deletions notebooks/README.md
@@ -10,9 +10,11 @@ This repository contains a collection of Jupyter Notebooks that outline how to r
| Folder | Notebook | Description |
| --------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| Centrality | | |
| | [Centrality](centrality/Centrality.ipynb) | Compute and compare multiple (currently 4) centrality scores |
| | [Katz](centrality/Katz.ipynb) | Compute the Katz centrality for every vertex |
| | [Betweenness](centrality/Betweenness.ipynb) | Compute both Edge and Vertex Betweenness centrality |
| | [Centrality](algorithms/centrality/Centrality.ipynb) | Compute and compare multiple (currently 5) centrality scores |
| | [Katz](algorithms/centrality/Katz.ipynb) | Compute the Katz centrality for every vertex |
| | [Betweenness](algorithms/centrality/Betweenness.ipynb) | Compute both Edge and Vertex Betweenness centrality |
| | [Degree](algorithms/centrality/Degree.ipynb) | Compute Degree Centrality for each vertex |
| | [Eigenvector](algorithms/centrality/Eigenvector.ipynb) | Compute Eigenvector centrality for every vertex |
| Community | | |
| | [Louvain](community/Louvain.ipynb) and Leiden | Identify clusters in a graph using both the Louvain and Leiden algorithms |
| | [ECG](community/ECG.ipynb) | Identify clusters in a graph using the Ensemble Clustering for Graph |
@@ -51,10 +53,10 @@ Running the example in these notebooks requires:
* The latest version of RAPIDS with cuGraph.
* Download via Docker, Conda (See [__Getting Started__](https://rapids.ai/start.html))

* cuGraph is dependent on the latest version of cuDF. Please install all components of RAPIDS
* Python 3.7+
* cuGraph is dependent on the latest version of cuDF. Please install all components of RAPIDS
* Python 3.8+
* A system with an NVIDIA GPU: Pascal architecture or better
* CUDA 11.0+
* CUDA 11.4+
* NVIDIA driver 450.51+


@@ -73,7 +75,7 @@ Test Hardware

##### Copyright

Copyright (c) 2019-2020, NVIDIA CORPORATION. All rights reserved.
Copyright (c) 2019-2022, NVIDIA CORPORATION. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

69 changes: 69 additions & 0 deletions notebooks/algorithms/README.md
@@ -0,0 +1,69 @@
# cuGraph Algorithm Notebooks

As the algorithm notebooks are updated and migrated to this area, they will be listed in this README. Until then, they are available [here](../README.md).

![GraphAnalyticsFigure](../img/GraphAnalyticsFigure.jpg)

This repository contains a collection of Jupyter Notebooks that outline how to run various cuGraph analytics. The notebooks do not address a complete data science problem; they are simply examples of how to run the graph analytics. Manipulation of the data before or after the graph analytic is not covered here. Extended, more problem-focused notebooks are being created and are available at https://github.com/rapidsai/notebooks-extended

## Summary

| Folder | Notebook | Description |
| --------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| Centrality | | |
| | [Centrality](centrality/Centrality.ipynb) | Compute and compare multiple (currently 5) centrality scores |
| | [Katz](centrality/Katz.ipynb) | Compute the Katz centrality for every vertex |
| | [Betweenness](centrality/Betweenness.ipynb) | Compute both Edge and Vertex Betweenness centrality |
| | [Degree](centrality/Degree.ipynb) | Compute Degree Centrality for each vertex |
| | [Eigenvector](centrality/Eigenvector.ipynb) | Compute Eigenvector centrality for every vertex |
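
As a rough illustration of what a centrality score measures, the sketch below computes degree centrality in plain Python, using the common normalized definition (number of neighbors divided by `n - 1`). The function name `degree_centrality` and the toy edge list are assumptions made for this example; this is not the cuGraph implementation, and the normalization cuGraph applies should be checked in the Degree notebook.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {vertex: degree / (n - 1)} for an undirected edge list.

    Hypothetical helper for illustration only; not part of cuGraph.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops in this simple sketch
            neighbors[u].add(v)
            neighbors[v].add(u)

    n = len(neighbors)
    if n <= 1:
        return {v: 0.0 for v in neighbors}
    return {v: len(nbrs) / (n - 1) for v, nbrs in neighbors.items()}


# Toy graph: vertex 1 is connected to every other vertex.
edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
scores = degree_centrality(edges)
most_central = max(scores, key=scores.get)
```

In this toy graph, vertex 1 touches all three other vertices, so it receives the highest score.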

<!-- | Community | | |
| | [Louvain](community/Louvain.ipynb) and Leiden | Identify clusters in a graph using both the Louvain and Leiden algorithms |
| | [ECG](community/ECG.ipynb) | Identify clusters in a graph using the Ensemble Clustering for Graph |
| | [K-Truss](community/ktruss.ipynb) | Extracts the K-Truss cluster |
| | [Spectral-Clustering](community/Spectral-Clustering.ipynb) | Identify clusters in a graph using Spectral Clustering with both<br> - Balanced Cut<br> - Modularity Maximization |
| | [Subgraph Extraction](community/Subgraph-Extraction.ipynb) | Compute a subgraph of the existing graph including only the specified vertices |
| | [Triangle Counting](community/Triangle-Counting.ipynb) | Count the number of triangles in a graph |
| Components | | |
| | [Connected Components](components/ConnectedComponents.ipynb) | Find weakly and strongly connected components in a graph |
| Core | | |
| | [K-Core](cores/kcore.ipynb) | Extracts the K-core cluster |
| | [Core Number](cores/core-number.ipynb) | Compute the core number for each vertex in a graph |
| Link Analysis | | |
| | [Pagerank](link_analysis/Pagerank.ipynb) | Compute the PageRank of every vertex in a graph |
| | [HITS](link_analysis/HITS.ipynb) | Compute the HITS' Hub and Authority scores for every vertex in a graph |
| Link Prediction | | |
| | [Jaccard Similarity](link_prediction/Jaccard-Similarity.ipynb) | Compute vertex similarity score using both:<br />- Jaccard Similarity<br />- Weighted Jaccard |
| | [Overlap Similarity](link_prediction/Overlap-Similarity.ipynb) | Compute vertex similarity score using the Overlap Coefficient |
| Sampling | | |
| | [Random Walk](sampling/RandomWalk.ipynb) | Compute random walks for various numbers of seeds and path lengths |
| Traversal | | |
| | [BFS](traversal/BFS.ipynb) | Compute the Breadth First Search path from a starting vertex to every other vertex in a graph |
| | [SSSP](traversal/SSSP.ipynb) | Single Source Shortest Path - compute the shortest path from a starting vertex to every other vertex |
| Structure | | |
| | [Renumbering](structure/Renumber.ipynb) <br> [Renumbering 2](structure/Renumber-2.ipynb) | Renumber the vertex IDs in a graph (two sample notebooks) |
| | [Symmetrize](structure/Symmetrize.ipynb) | Symmetrize the edges in a graph |
-->

[System Requirements](../README.md#requirements)

| Author Credit | Date | Update | cuGraph Version | Test Hardware |
| --------------|------------|------------------|-----------------|----------------|
| Brad Rees | 04/19/2021 | created | 0.19 | GV100, CUDA 11.0 |
| Don Acosta | 07/05/2022 | tested / updated | 22.08 nightly | DGX Tesla V100 CUDA 11.5 |

### Copyright

Copyright (c) 2019-2022, NVIDIA CORPORATION. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.





![RAPIDS](img/rapids_logo.png)
@@ -8,16 +8,11 @@
"\n",
"In this notebook, we will compute the Betweenness centrality for both vertices and edges in our test database using cuGraph and NetworkX. The NetworkX and cuGraph processes will be interleaved so that each step can be compared.\n",
"\n",
"Notebook Credits\n",
"* Original Authors: Bradley Rees\n",
"* Created: 04/24/2019\n",
"* Last Edit: 08/16/2020\n",
"\n",
"RAPIDS Versions: 0.15 \n",
"\n",
"Test Hardware\n",
"\n",
"* GV100 32G, CUDA 10.2\n"
"| Author Credit | Date | Update | cuGraph Version | Test Hardware |\n",
"| --------------|------------|------------------|-----------------|----------------|\n",
"| Brad Rees | 04/24/2019 | created | 0.15 | GV100, CUDA 11.0\n",
"| Brad Rees | 08/16/2020 | tested / updated | 21.10 nightly | RTX 3090 CUDA 11.4\n",
"| Don Acosta | 07/05/2022 | tested / updated | 22.08 nightly | DGX Tesla V100 CUDA 11.5"
]
},
{
@@ -79,7 +74,7 @@
"metadata": {},
"source": [
"#### Some notes about vertex IDs...\n",
"* The current version of cuGraph requires that vertex IDs be representable as 32-bit integers, meaning graphs currently can contain at most 2^32 unique vertex IDs. However, this limitation is being actively addressed and a version of cuGraph that accommodates more than 2^32 vertices will be available in the near future.\n",
"\n",
"* cuGraph will automatically renumber graphs to an internal format consisting of a contiguous series of integers starting from 0, and convert back to the original IDs when returning data to the caller. If the vertex IDs of the data are already a contiguous series of integers starting from 0, the auto-renumbering step can be skipped for faster graph creation times.\n",
" * To skip auto-renumbering, set the `renumber` boolean argument to `False` when calling the appropriate graph creation API (e.g. `G.from_cudf_edgelist(gdf_r, source='src', destination='dst', renumber=False)`).\n",
" * For more advanced renumbering support, see the examples in `structure/renumber.ipynb` and `structure/renumber-2.ipynb`\n"
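
The renumbering idea described in the notes above can be sketched in plain Python: map arbitrary vertex IDs to a contiguous range starting at 0, work on the internal IDs, then map back. The helper name `renumber` and the sample edges are assumptions for illustration; cuGraph performs this step on the GPU with its own implementation, which this sketch does not reproduce.

```python
def renumber(edges):
    """Map arbitrary vertex IDs to contiguous integers starting at 0.

    Hypothetical helper for illustration only; not the cuGraph API.
    Returns the renumbered edge list and a list that maps each
    internal ID back to its original (external) ID.
    """
    to_internal = {}
    for src, dst in edges:
        for vertex in (src, dst):
            if vertex not in to_internal:
                # Assign the next unused internal ID in order of first appearance.
                to_internal[vertex] = len(to_internal)

    # Invert the mapping: position i holds the original ID of internal vertex i.
    to_external = [None] * len(to_internal)
    for external_id, internal_id in to_internal.items():
        to_external[internal_id] = external_id

    renumbered = [(to_internal[s], to_internal[d]) for s, d in edges]
    return renumbered, to_external


# Sample edges with IDs that start at 1 and are not contiguous.
edges = [(1, 2), (1, 3), (2, 3), (3, 34)]
internal_edges, to_external = renumber(edges)

# "Un-renumber": convert results back to the caller's original IDs.
restored = [(to_external[s], to_external[d]) for s, d in internal_edges]
```

If the input IDs are already `0..n-1` in this order, the mapping is the identity, which is why skipping the step with `renumber=False` saves work in that case.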
@@ -95,7 +90,7 @@
"Anthropological Research 33, 452-473 (1977).*\n",
"\n",
"\n",
"![Karate Club](../img/zachary_black_lines.png)\n",
"<img src=\"../../img/zachary_black_lines.png\" width=\"35%\"/>\n",
"\n",
"\n",
"Because the test data has vertex IDs starting at 1, the auto-renumber feature of cuGraph (mentioned above) will be used so the starting vertex ID is zero for maximum efficiency. The resulting data will then be auto-unrenumbered, making the entire renumbering process transparent to users.\n"
@@ -143,7 +138,7 @@
"outputs": [],
"source": [
"# Define the path to the test data \n",
"datafile='../data/karate-data.csv'"
"datafile='../../data/karate-data.csv'"
]
},
{
@@ -221,33 +216,6 @@
"Let's now look at the results"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Find the most important vertex using the scores\n",
"# This methods should only be used for small graph\n",
"def print_top_scores(_df, txt) :\n",
" m = _df['betweenness_centrality'].max()\n",
" _d = _df.query('betweenness_centrality == @m')\n",
" print(txt)\n",
" print(_d)\n",
" print()\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print_top_scores(vertex_bc, \"top vertex centrality scores\")\n",
"print_top_scores(edge_bc, \"top edge centrality scores\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -342,7 +310,7 @@
"metadata": {},
"source": [
"___\n",
"Copyright (c) 2019-2020, NVIDIA CORPORATION.\n",
"Copyright (c) 2019-2022, NVIDIA CORPORATION.\n",
"\n",
"Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0\n",
"\n",
@@ -353,9 +321,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "cugraph_dev",
"display_name": "Python 3.8.13 ('cugraph_dev')",
"language": "python",
"name": "cugraph_dev"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -367,7 +335,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "cee8a395f2f0c5a5bcf513ae8b620111f4346eff6dc64e1ea99c951b2ec68604"
}
}
},
"nbformat": 4,