Updating docs
BradReesWork committed Jul 29, 2020
1 parent 428e0cb commit cffa80a
Showing 4 changed files with 34 additions and 32 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -59,6 +59,7 @@
- PR #1012 Fix Local build script README
- PR #1017 Fix more mg bugs
- PR #1022 Fix support for using a cudf.DataFrame with a MG graph
- PR #1027 Fix documentation

# cuGraph 0.14.0 (03 Jun 2020)

23 changes: 13 additions & 10 deletions CONTRIBUTING.md
@@ -13,8 +13,8 @@ __Style Formatting Tools:__
* `flake8` version 3.5.0+


<a name="issue"></a>
## 1) File an Issue for the RAPIDS cuGraph team to work

## 1) File an Issue for the RAPIDS cuGraph team to work <a name="issue"></a>
To file an issue, go to the RAPIDS cuGraph [issue](https://github.com/rapidsai/cugraph/issues/new/choose) page and select the appropriate issue type. Once an issue is filed, the RAPIDS cuGraph team will evaluate and triage the issue. If you believe the issue needs priority attention, please include that in the issue to notify the team.

***Bug Report***</pr>
@@ -36,8 +36,8 @@ There are several ways to ask questions, including [Stack Overflow]( https://sta
- describing your question


<a name="implement"></a>
## 2) Propose a New Feature and Implement It

## 2) Propose a New Feature and Implement It <a name="implement"></a>

We love when people want to get involved, and if you have a suggestion for a new feature or enhancement and want to be the one doing the development work, we fully encourage that.

@@ -46,17 +46,17 @@ We love when people want to get involved, and if you have a suggestion for a new
- Once we agree that the plan looks good, go ahead and implement it
- Follow the [code contributions](#code-contributions) guide below.

<a name="bugfix"></a>
## 3) You want to implement a feature or bug-fix for an outstanding issue

## 3) You want to implement a feature or bug-fix for an outstanding issue <a name="bugfix"></a>
- Find an open Issue, and post that you would like to work on that issue
- Once we agree that the plan looks good, go ahead and implement it
- Follow the [code contributions](#code-contributions) guide below.

If you need more context on a particular issue, please ask.

----
<a name="code"></a>
# So you want to contribute code

# So you want to contribute code <a name="code"></a>

**TL;DR General Development Process**
1. Read the documentation on [building from source](SOURCEBUILD.md) to learn how to set up and validate the development environment
@@ -74,11 +74,14 @@ If you need more context on a particular issue, please ask.
Remember, if you are unsure about anything, don't hesitate to comment on issues
and ask for clarifications!

**The _TODO_** comment<pr>

Use the _TODO_ comment to capture technical debt. It should not be used to flag areas that need to be fixed. FixMe blocks need to be cleaned up before code is submitted.
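
As an illustration of the guidance above, here is a minimal sketch of a _TODO_ comment that records technical debt rather than a defect. The function `count_edges` and the weight feature it mentions are hypothetical and are not part of cuGraph; they only show where such a comment might sit.

```python
def count_edges(src, dst):
    """Return the number of edges described by two equal-length vertex lists."""
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")

    # TODO: Accept an optional list of edge weights. This is technical debt,
    # not a defect: the function is correct as written, but the interface is
    # narrower than we eventually want.
    return len(src)


print(count_edges([0, 1, 2], [1, 2, 0]))
```

Note that the comment describes future work on code that already behaves correctly. A comment marking something that is actually broken would be a FixMe and should be resolved before the code is submitted.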



## Fork a private copy of cuGraph
<a name="fork"></a>

## Fork a private copy of cuGraph <a name="fork"></a>
The RAPIDS cuGraph repo cannot be modified directly. Contributions must come in the form of a *Pull Request* from a forked version of cugraph. GitHub has a nice write-up on the process: https://help.github.com/en/github/getting-started-with-github/fork-a-repo

1. Fork the cugraph repo to your GitHub account
2 changes: 1 addition & 1 deletion PRTAGS.md
@@ -9,4 +9,4 @@ PR = Pull Request
| skip-ci | _Do Not Run CI_ - This flag prevents CI from being run. It is good practice to include this with the **WIP** tag since code is typically not at a point where it will pass CI. |
| skip ci | same as above |
| API-REVIEW | This tag requests a code review of just the API portion of the code. This is beneficial to ensure that all required arguments are captured. Doing this early can save from having to refactor later. |
| REVIEW | The code is ready for a full code review. Only code that has passed a code review is merged into the baseline |
| REVIEW | The code is ready for a full code review. Only code that has passed a code review is merged into the baseline |
40 changes: 19 additions & 21 deletions README.md
@@ -2,7 +2,7 @@

[![Build Status](https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cugraph/job/branches/job/cugraph-branch-pipeline/badge/icon)](https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cugraph/job/branches/job/cugraph-branch-pipeline/)

The [RAPIDS](https://rapids.ai) cuGraph library is a collection of GPU accelerated graph algorithms that process data found in [GPU DataFrames](https://github.com/rapidsai/cudf). The vision of cuGraph is _to make graph analysis ubiquitous to the point that users just think in terms of analysis and not technologies or frameworks_. To realize that vision, cuGraph operates, at the Python layer, on GPU DataFrames, allowing for seamless passing of data between ETL tasks in [cuDF](https://github.com/rapidsai/cudf) and machine learning tasks in [cuML](https://github.com/rapidsai/cuml). Data scientists familiar with Python will quickly pick up how cuGraph integrates with the Pandas-like API of cuDF. Likewise, users familiar with NetworkX will quickly recognize the NetworkX-like API provided in cuGraph, with the goal to allow existing code to be ported with minimal effort into RAPIDS. For users familiar with C++/CUDA and graph structures, a C++ API is also provided. However, there is less type and structure checking at the C++ layer.
The [RAPIDS](https://rapids.ai) cuGraph library is a collection of GPU accelerated graph algorithms that process data found in [GPU DataFrames](https://github.com/rapidsai/cudf). The vision of cuGraph is _to make graph analysis ubiquitous to the point that users just think in terms of analysis and not technologies or frameworks_. To realize that vision, cuGraph operates, at the Python layer, on GPU DataFrames, thereby allowing for seamless passing of data between ETL tasks in [cuDF](https://github.com/rapidsai/cudf) and machine learning tasks in [cuML](https://github.com/rapidsai/cuml). Data scientists familiar with Python will quickly pick up how cuGraph integrates with the Pandas-like API of cuDF. Likewise, users familiar with NetworkX will quickly recognize the NetworkX-like API provided in cuGraph, with the goal to allow existing code to be ported with minimal effort into RAPIDS. For users familiar with C++/CUDA and graph structures, a C++ API is also provided. However, there is less type and structure checking at the C++ layer.

For more project details, see [rapids.ai](https://rapids.ai/).

@@ -14,12 +14,12 @@ The [RAPIDS](https://rapids.ai) cuGraph library is a collection of GPU accelerat
import cugraph

# read data into a cuDF DataFrame using read_csv
cu_M = cudf.read_csv("graph_data.csv", names=["src", "dst"], dtype=["int32", "int32"])
gdf = cudf.read_csv("graph_data.csv", names=["src", "dst"], dtype=["int32", "int32"])

# We now have data as edge pairs
# create a Graph using the source (src) and destination (dst) vertex pairs
G = cugraph.Graph()
G.from_cudf_edgelist(cu_M, source='src', destination='dst')
G.from_cudf_edgelist(gdf, source='src', destination='dst')

# Let's now get the PageRank score of each vertex by calling cugraph.pagerank
df_page = cugraph.pagerank(G)
@@ -43,21 +43,22 @@ for i in range(len(df_page)):
| | Louvain | Single-GPU | |
| | Ensemble Clustering for Graphs | Single-GPU | |
| | Spectral-Clustering - Balanced Cut | Single-GPU | |
| | Spectral-Clustering | Single-GPU | |
| | Spectral-Clustering - Modularity | Single-GPU | |
| | Subgraph Extraction | Single-GPU | |
| | Triangle Counting | Single-GPU | |
| | K-Truss | Single-GPU | |
| Components | | | |
| | Weakly Connected Components | Single-GPU | |
| | Strongly Connected Components | Single-GPU | |
| Core | | | |
| | K-Core | Single-GPU | |
| | Core Number | Single-GPU | |
| | K-Truss | Single-GPU | |
| Layout | | | |
| | Force Atlas 2 | Single-GPU | |
| Link Analysis| | | |
| | Pagerank | Single-GPU | |
| | Pagerank | Multiple-GPU | |
| | Personal Pagerank | Single-GPU | |
| | HITS | Single-GPU | leverages Gunrock |
| Link Prediction | | | |
| | Jaccard Similarity | Single-GPU | |
| | Weighted Jaccard Similarity | Single-GPU | |
@@ -79,26 +80,25 @@ for i in range(len(df_page)):
## cuGraph Notice
The current version of cuGraph has some limitations:

- Vertex IDs need to be 32-bit integers.
- Vertex IDs need to be 32-bit integers (that restriction is going away in 0.16)
- Vertex IDs are expected to be contiguous integers starting from 0.
-- If the starting index is not zero, cuGraph will add disconnected vertices to fill in the missing range. (Auto-) Renumbering fixes this issue

cuGraph provides the renumber function to mitigate this problem. Input vertex IDs for the renumber function can be any type, can be non-contiguous, and can start from an arbitrary number. The renumber function maps the provided input vertex IDs to 32-bit contiguous integers starting from 0. cuGraph still requires the renumbered vertex IDs to be representable in 32-bit integers. These limitations are being addressed and will be fixed soon.
cuGraph provides the renumber function to mitigate this problem; by default, it is called automatically when data is added to a graph. Input vertex IDs for the renumber function can be any type, can be non-contiguous, can be multiple columns, and can start from an arbitrary number. The renumber function maps the provided input vertex IDs to 32-bit contiguous integers starting from 0. cuGraph still requires the renumbered vertex IDs to be representable in 32-bit integers. These limitations are being addressed and will be fixed soon.

cuGraph provides an auto-renumbering feature, enabled by default, during Graph creating. Renumbered vertices are automatically un-renumbered.
Additionally, when using the auto-renumbering feature, vertices are automatically un-renumbered in results.
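
To make the idea of renumbering concrete, here is a conceptual sketch in plain Python. It is not cuGraph's implementation and does not use the cuGraph API: the names `renumber_sketch` and `unrenumber_sketch` are hypothetical, and a dictionary stands in for whatever GPU data structures the library actually uses. It only shows the mapping described above: arbitrary vertex IDs become contiguous integers starting at 0, and results are mapped back afterwards.

```python
INT32_MAX = 2**31 - 1  # renumbered IDs must still fit in a 32-bit integer


def renumber_sketch(src, dst):
    """Map arbitrary hashable vertex IDs to contiguous integers from 0.

    Returns the renumbered source list, the renumbered destination list,
    and a lookup table where table[i] is the original ID of vertex i.
    """
    id_to_index = {}
    index_to_id = []

    def lookup(vertex):
        # Assign the next free index the first time a vertex ID is seen.
        if vertex not in id_to_index:
            if len(index_to_id) > INT32_MAX:
                raise ValueError("too many vertices for 32-bit IDs")
            id_to_index[vertex] = len(index_to_id)
            index_to_id.append(vertex)
        return id_to_index[vertex]

    new_src, new_dst = [], []
    for s, d in zip(src, dst):
        new_src.append(lookup(s))
        new_dst.append(lookup(d))
    return new_src, new_dst, index_to_id


def unrenumber_sketch(indices, index_to_id):
    """Translate renumbered vertex indices back to the original IDs."""
    return [index_to_id[i] for i in indices]


# Non-contiguous IDs that do not start at 0.
new_src, new_dst, table = renumber_sketch([1000, 1000, 42], [42, 7, 1000])
print(new_src, new_dst)                   # [0, 0, 1] [1, 2, 0]
print(unrenumber_sketch(new_src, table))  # [1000, 1000, 42]
```

Because the sketch only requires IDs to be hashable, a multi-column ID can be modelled as a tuple such as `("user", 17)`.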

cuGraph is constantly being updated and improved. Please see the [Transition Guide](TRANSITIONGUIDE.md) if errors are encountered with newer versions.

## Graph Sizes and GPU Memory Size
The amount of memory required depends on the graph structure and the analytics being executed. As a simple rule of thumb, the amount of GPU memory should be about twice the size of the data. That gives overhead for the CSV reader and other transform functions. There are ways around the rule by using smaller data chunks.


| Size | Recommended GPU Memory |
|-------------------|------------------------|
| 500 million edges | 32 GB                  |
| 250 million edges | 16 GB |

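
The rule of thumb above can be written as a small calculation. This is only a sketch: the function names are hypothetical, "data size" is assumed to mean the size in bytes of the input data being loaded, and the factor of 2 is the rough guideline from the text rather than a guarantee. It is not intended to reproduce the table, which reflects recommendations for specific edge counts.

```python
GIB = 1024**3  # bytes in one gibibyte


def recommended_gpu_memory_bytes(data_size_bytes, overhead_factor=2.0):
    """Apply the 'about twice the data size' rule of thumb."""
    if data_size_bytes < 0:
        raise ValueError("data_size_bytes must be non-negative")
    return int(data_size_bytes * overhead_factor)


def fits_on_gpu(data_size_bytes, gpu_memory_bytes, overhead_factor=2.0):
    """Return True if the rule of thumb says the data should fit."""
    needed = recommended_gpu_memory_bytes(data_size_bytes, overhead_factor)
    return needed <= gpu_memory_bytes


print(fits_on_gpu(6 * GIB, 16 * GIB))   # True: 12 GiB needed, 16 GiB available
print(fits_on_gpu(10 * GIB, 16 * GIB))  # False: 20 GiB needed, 16 GiB available
```

When the check fails, the options mentioned in this section still apply: process the data in smaller chunks, or rely on managed memory for oversubscription.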

Managed memory can also be used for oversubscription to exceed the above memory limitations. See the recent blog on _Tackling Large Graphs with RAPIDS cuGraph and CUDA Unified Memory on GPUs_: https://medium.com/rapids-ai/tackling-large-graphs-with-rapids-cugraph-and-unified-virtual-memory-b5b69a065d4


## Getting cuGraph
@@ -109,35 +109,33 @@ There are 3 ways to get cuGraph :
3. [Build from Source](#source)


<a name="quick"></a>

## Quick Start

## Quick Start <a name="quick"></a>
Please see the [Demo Docker Repository](https://hub.docker.com/r/rapidsai/rapidsai/), choosing a tag based on the NVIDIA CUDA version you’re running. This provides a ready to run Docker container with example notebooks and data, showcasing how you can utilize all of the RAPIDS libraries: cuDF, cuML, and cuGraph.


<a name="conda"></a>
### Conda
### Conda <a name="conda"></a>
It is easy to install cuGraph using conda. You can get a minimal conda installation with [Miniconda](https://conda.io/miniconda.html) or get the full installation with [Anaconda](https://www.anaconda.com/download).

Install and update cuGraph using the conda command:

```bash

# CUDA 10.0
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=10.0

# CUDA 10.1
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=10.1

# CUDA 10.2
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=10.2

# CUDA 11.0
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=11.0
```

Note: This conda installation only applies to Linux and Python versions 3.6/3.7.
Note: This conda installation only applies to Linux and Python versions 3.7/3.8.


<a name="source"></a>
### Build from Source and Contributing
### Build from Source and Contributing <a name="source"></a>

Please see our [guide for building cuGraph from source](SOURCEBUILD.md)</pr>
