Docs #268

Merged
merged 16 commits into from
Sep 12, 2023
182 changes: 12 additions & 170 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,181 +1,23 @@
# KvikIO: C++ and Python bindings to cuFile
# KvikIO: High Performance File IO

## Summary

This provides C++ and Python bindings to cuFile, which enables GPUDirect Storage (GDS).
KvikIO also works efficiently when GDS isn't available and can read/write both host and
device data seamlessly.
KvikIO is a Python and C++ library for high performance file IO. It provides C++ and Python
bindings to [cuFile](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html),
which enables [GPUDirect Storage (GDS)](https://developer.nvidia.com/blog/gpudirect-storage/).
KvikIO also works efficiently when GDS isn't available and can read/write both host and device data seamlessly.
The C++ library is header-only, which makes it easy to include in [existing projects](https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/downstream/).


### Features

* Object Oriented API.
* Exception handling.
* Object oriented API of [cuFile](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html) with C++/Python exception handling.
* A Python Zarr backend for reading and writing GPU data to file seamlessly.
* Concurrent reads and writes using an internal thread pool.
* Non-blocking API.
* Python Zarr reader.
* Handle both host and device IO seamlessly.
* Provides Python bindings to [nvCOMP](https://github.com/NVIDIA/nvcomp).
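
The thread-pool and non-blocking items above can be illustrated with a small standard-library sketch. This is not KvikIO's implementation and does not use its API: `parallel_pread` and its `task_size` parameter are hypothetical names chosen for illustration, and `os.pread` on ordinary host memory stands in for the GPU-aware reads. The idea it shows is splitting one large read into chunks, submitting each chunk to a pool, and handing back futures right away.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def parallel_pread(pool, fd, buf, file_offset=0, task_size=4):
    """Hypothetical helper: fill `buf` from `fd`, one pool task per chunk."""
    view = memoryview(buf)

    def read_chunk(start):
        # os.pread takes an explicit offset, so tasks never share a file position.
        nbytes = min(task_size, len(view) - start)
        data = os.pread(fd, nbytes, file_offset + start)
        view[start:start + len(data)] = data
        return len(data)

    # submit() returns immediately; the reads proceed in the background.
    return [pool.submit(read_chunk, s) for s in range(0, len(view), task_size)]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test-file")
    payload = bytes(range(100))
    with open(path, "wb") as f:
        f.write(payload)

    buf = bytearray(len(payload))
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=16) as pool:
            futures = parallel_pread(pool, fd, buf)       # Non-blocking
            total = sum(fut.result() for fut in futures)  # Blocking
    finally:
        os.close(fd)
```

Because every task reads at its own offset into its own slice of the buffer, the tasks need no locking, which is what makes this pattern a natural fit for a pool.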

## Requirements

To install, users need a working Linux machine with the CUDA Toolkit
(v11.4+) installed and a working compiler toolchain (C++17 and CMake).

### C++

The C++ bindings are header-only and depend on the CUDA Driver API.
To build and run the example code, CMake and the CUDA Runtime
API are required.

### Python

The Python package depends on the following packages:

* cython
* pip
* setuptools
* scikit-build

For nvCOMP, benchmarks, examples, and tests:

* pytest
* numpy
* cupy

## Install

### Conda

Install the stable release from the `rapidsai` channel like:

```
conda create -n kvikio_env -c rapidsai -c conda-forge kvikio
```

Install the `kvikio` conda package from the `rapidsai-nightly` channel like:

```
conda create -n kvikio_env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=11.8 kvikio
```

If the nightly install doesn't work, set `channel_priority: flexible` in your `.condarc`.

To set up a development environment, run:
```
conda env create --name kvikio-dev --file conda/environments/all_cuda-118_arch-x86_64.yaml
```

### C++ (build from source)

To build the C++ example run:

```
./build.sh libkvikio
```

Then run the example:

```
./examples/basic_io
```

### Python (build from source)

To build and install the extension run:

```
./build.sh kvikio
```

You might have to set `CUDA_HOME` to the path of the CUDA installation.

In order to test the installation, run the following:

```
pytest tests/
```

And to test performance, run the following:

```
python benchmarks/single-node-io.py
```

## Examples


### Notebooks
- [How to read and write GPU memory directly to/from Zarr files](notebooks/zarr.ipynb)


### C++

```c++
#include <cstddef>
#include <future>
#include <cuda_runtime.h>
#include <kvikio/file_handle.hpp>

int main()
{
  // Allocate two device buffers `a` and `b`
  constexpr std::size_t size = 100;
  void* a = nullptr;
  void* b = nullptr;
  cudaMalloc(&a, size);
  cudaMalloc(&b, size);

  // Write `a` to file
  kvikio::FileHandle fw("test-file", "w");
  std::size_t written = fw.write(a, size);
  fw.close();

  // Read file into `b`
  kvikio::FileHandle fr("test-file", "r");
  std::size_t read = fr.read(b, size);
  fr.close();

  // Read file into `b` in parallel using 16 threads
  kvikio::default_thread_pool::reset(16);
  {
    kvikio::FileHandle f("test-file", "r");
    // Read `size` bytes starting at file offset 0
    std::future<std::size_t> future = f.pread(b, size, 0);  // Non-blocking
    std::size_t nbytes = future.get();                      // Blocking
    // Notice, `f` closes automatically on destruction.
  }

  // Release the device buffers
  cudaFree(a);
  cudaFree(b);
}
```

### Python

```python
import cupy
import kvikio

a = cupy.arange(100)
f = kvikio.CuFile("test-file", "w")
# Write whole array to file
f.write(a)
f.close()

b = cupy.empty_like(a)
f = kvikio.CuFile("test-file", "r")
# Read whole array from file
f.read(b)
f.close()
assert (a == b).all()

# Use a context manager
c = cupy.empty_like(a)
with kvikio.CuFile("test-file", "r") as f:
    f.read(c)
assert (a == c).all()

# Non-blocking read
d = cupy.empty_like(a)
with kvikio.CuFile("test-file", "r") as f:
    future1 = f.pread(d[:50])
    future2 = f.pread(d[50:], file_offset=d[:50].nbytes)
    future1.get()  # Wait for first read
    future2.get()  # Wait for second read
assert (a == d).all()
```
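
The split read in the example passes `file_offset` in bytes, not elements, which is why the second half starts at `d[:50].nbytes`. The arithmetic can be checked on host memory with a standard-library sketch. It assumes 64-bit integers (matching the default dtype of `cupy.arange(100)` on most platforms) and uses the `array` module and `os.pread` as stand-ins, not KvikIO itself.

```python
import os
import tempfile
from array import array

a = array("q", range(100))              # 100 signed 64-bit integers
half = len(a) // 2
second_half_offset = half * a.itemsize  # Offset in bytes, not elements

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test-file")
    with open(path, "wb") as f:
        f.write(a.tobytes())

    fd = os.open(path, os.O_RDONLY)
    try:
        # Each read names its own byte count and byte offset.
        first = array("q", os.pread(fd, half * a.itemsize, 0))
        second = array(
            "q", os.pread(fd, (len(a) - half) * a.itemsize, second_half_offset)
        )
    finally:
        os.close(fd)

d = first + second
```

Passing an element index (50) instead of the byte offset (400) would start the second read in the middle of the first half of the file.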
### Documentation
* Python: <https://docs.rapids.ai/api/kvikio/nightly/>
* C++: <https://docs.rapids.ai/api/libkvikio/nightly/>
4 changes: 3 additions & 1 deletion conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -23,16 +23,18 @@ dependencies:
- libcufile=1.4.0.31
- ninja
- numpy>=1.21
- numpydoc
- nvcc_linux-64=11.8
- nvcomp==2.6.1
- packaging
- pre-commit
- pydata-sphinx-theme
- pytest
- pytest-cov
- python>=3.9,<3.11
- scikit-build>=0.13.1
- sphinx
- sphinx-click
- sphinx_rtd_theme
- sysroot_linux-64=2.17
- zarr
name: all_cuda-118_arch-x86_64
4 changes: 3 additions & 1 deletion conda/environments/all_cuda-120_arch-x86_64.yaml
@@ -23,15 +23,17 @@ dependencies:
- libcufile-dev
- ninja
- numpy>=1.21
- numpydoc
- nvcomp==2.6.1
- packaging
- pre-commit
- pydata-sphinx-theme
- pytest
- pytest-cov
- python>=3.9,<3.11
- scikit-build>=0.13.1
- sphinx
- sphinx-click
- sphinx_rtd_theme
- sysroot_linux-64=2.17
- zarr
name: all_cuda-120_arch-x86_64
118 changes: 115 additions & 3 deletions cpp/doxygen/main_page.md
@@ -1,4 +1,116 @@
# libkvikio
# Welcome to KvikIO's C++ documentation!

libkvikio is a C++ header-only library providing bindings to
cuFile, which enables GPUDirectStorage (GDS).
KvikIO is a Python and C++ library for high performance file IO. It provides C++ and Python
bindings to [cuFile](https://docs.nvidia.com/gpudirect-storage/api-reference-guide/index.html)
which enables [GPUDirect Storage (GDS)](https://developer.nvidia.com/blog/gpudirect-storage/).
KvikIO also works efficiently when GDS isn't available and can read/write both host and device data seamlessly.

KvikIO C++ is a header-only library that is part of the [RAPIDS](https://rapids.ai/) suite of open-source software libraries for GPU-accelerated data science.

---
**Notice** this is the documentation for the C++ library. For the Python documentation of KvikIO, see under **KvikIO**.

---

## Features

* Object Oriented API.
* Exception handling.
* Concurrent reads and writes using an internal thread pool.
* Non-blocking API.
* Handle both host and device IO seamlessly.

## Installation

KvikIO is a header-only library and as such doesn't need installation.
However, for convenience we release Conda packages that make it easy
to include KvikIO in your CMake projects.

### Conda/Mamba

We strongly recommend using [mamba](https://github.com/mamba-org/mamba) in place of conda, which we will do throughout the documentation.

Install the **stable release** from the ``rapidsai`` channel like:
```sh
# Install in existing environment
mamba install -c rapidsai -c conda-forge libkvikio
# Create new environment (CUDA 11.8)
mamba create -n libkvikio-env -c rapidsai -c conda-forge cuda-version=11.8 libkvikio
# Create new environment (CUDA 12.0)
mamba create -n libkvikio-env -c rapidsai -c conda-forge cuda-version=12.0 libkvikio
```

Install the **nightly release** from the ``rapidsai-nightly`` channel like:

```sh
# Install in existing environment
mamba install -c rapidsai-nightly -c conda-forge libkvikio
# Create new environment (CUDA 11.8)
mamba create -n libkvikio-env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=11.8 libkvikio
# Create new environment (CUDA 12.0)
mamba create -n libkvikio-env -c rapidsai-nightly -c conda-forge python=3.10 cuda-version=12.0 libkvikio
```

---
**Notice** if the nightly install doesn't work, set ``channel_priority: flexible`` in your ``.condarc``.

---

### Include KvikIO in a CMake project
An example of how to include KvikIO in an existing CMake project can be found here: <https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/downstream/>.


### Build from source

To build the C++ example run:

```
./build.sh libkvikio
```

Then run the example:

```
./examples/basic_io
```


## Example

```cpp
#include <cstddef>
#include <future>
#include <cuda_runtime.h>
#include <kvikio/file_handle.hpp>

int main()
{
  // Allocate two device buffers `a` and `b`
  constexpr std::size_t size = 100;
  void* a = nullptr;
  void* b = nullptr;
  cudaMalloc(&a, size);
  cudaMalloc(&b, size);

  // Write `a` to file
  kvikio::FileHandle fw("test-file", "w");
  std::size_t written = fw.write(a, size);
  fw.close();

  // Read file into `b`
  kvikio::FileHandle fr("test-file", "r");
  std::size_t read = fr.read(b, size);
  fr.close();

  // Read file into `b` in parallel using 16 threads
  kvikio::default_thread_pool::reset(16);
  {
    kvikio::FileHandle f("test-file", "r");
    // Read `size` bytes starting at file offset 0
    std::future<std::size_t> future = f.pread(b, size, 0);  // Non-blocking
    std::size_t nbytes = future.get();                      // Blocking
    // Notice, `f` closes automatically on destruction.
  }

  // Release the device buffers
  cudaFree(a);
  cudaFree(b);
}
```

For a full runnable example see <https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/basic_io.cpp>.
4 changes: 3 additions & 1 deletion dependencies.yaml
@@ -236,8 +236,10 @@ dependencies:
common:
- output_types: [conda, requirements]
packages:
- pydata-sphinx-theme
- numpydoc
- sphinx
- sphinx-click
- sphinx_rtd_theme
- output_types: conda
packages:
- doxygen=1.8.20 # pre-commit hook needs a specific version.
1 change: 0 additions & 1 deletion docs/source/api.rst
@@ -18,7 +18,6 @@ Zarr
.. autoclass:: GDSStore
:members:


Defaults
--------
.. currentmodule:: kvikio.defaults