Add cudf.options #11193

Merged
merged 18 commits into from
Jul 28, 2022
1 change: 1 addition & 0 deletions docs/cudf/source/api_docs/index.rst
@@ -19,3 +19,4 @@ This page provides a list of all publicly accessible modules, methods and classe
io
subword_tokenize
string_handling
options
12 changes: 12 additions & 0 deletions docs/cudf/source/api_docs/options.rst
@@ -0,0 +1,12 @@
.. _api.options:

============
cudf Options
============

.. autosummary::
   :toctree: api/

   cudf.get_option
   cudf.set_option
   cudf.describe_option
1 change: 1 addition & 0 deletions docs/cudf/source/developer_guide/index.md
@@ -5,3 +5,4 @@

library_design
documentation
options
22 changes: 22 additions & 0 deletions docs/cudf/source/developer_guide/options.md
@@ -0,0 +1,22 @@
# Options

The usage of options is explained in the [user guide](options_user_guide).
This document explains how developers work with options internally.

Options are stored as a dictionary in the `cudf.options` module.
Each option name is its key in the dictionary.
The value stored under each key is an instance of the `Option` dataclass.

An `Option` object has the following attributes:
- `default`: the default value of the option
- `value`: the current value of the option
- `description`: a text description of the option
- `validator`: a function called on a candidate value; it should raise an
  error if the value is invalid.

Developers can use `cudf.options._register_option` to add options to the dictionary.
{py:func}`cudf.get_option` is provided to get option values from the dictionary.

When testing the behavior of a certain option,
it is advised to use a [`yield` fixture](https://docs.pytest.org/en/7.1.x/how-to/fixtures.html#yield-fixtures-recommended) to set up and clean up the option.

See the [API reference](api.options) for more details.
1 change: 1 addition & 0 deletions docs/cudf/source/user_guide/index.md
@@ -12,5 +12,6 @@ groupby
guide-to-udfs
cupy-interop
dask-cudf
options
PandasCompat
```
14 changes: 14 additions & 0 deletions docs/cudf/source/user_guide/options.md
@@ -0,0 +1,14 @@
(options_user_guide)=

# Options

cuDF has an options API to configure and customize global behavior.
This API complements the [pandas.options](https://pandas.pydata.org/docs/user_guide/options.html) API with features specific to cuDF.

{py:func}`cudf.describe_option` prints an option's description,
its current value, and its default value.
When no argument is provided,
all options are printed.
To set the value of an option, use {py:func}`cudf.set_option`;
to read it back, use {py:func}`cudf.get_option`.

See the [API reference](api.options) for more details.
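
A typical session might look like the sketch below. Two caveats apply. The option name `example_option` is hypothetical, since this PR adds the API but does not register any options. The three functions are also re-created here as a tiny stand-in so the snippet runs without cuDF or a GPU; with cuDF installed you would call `cudf.get_option`, `cudf.set_option`, and `cudf.describe_option` instead.

```python
from contextlib import redirect_stdout
from io import StringIO

# Stand-in registry (assumption): name -> [default, current, description].
_options = {"example_option": [1, 1, "A hypothetical option."]}


def get_option(name):
    return _options[name][1]


def set_option(name, val):
    _options[name][1] = val


def describe_option(name=None):
    # Print one option, or every option when no name is given.
    for key in (_options if name is None else [name]):
        default, current, description = _options[key]
        print(
            f"{key}:\n"
            f"\t{description}\n"
            f"\t[Default: {default}] [Current: {current}]"
        )


set_option("example_option", 3)
current = get_option("example_option")

# Capture what describe_option prints so it can be inspected.
buffer = StringIO()
with redirect_stdout(buffer):
    describe_option("example_option")
report = buffer.getvalue()
```

The stand-in skips validation for brevity; the real `cudf.set_option` runs the option's validator before storing the new value.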
9 changes: 9 additions & 0 deletions python/cudf/cudf/__init__.py
@@ -82,6 +82,12 @@
from cudf.utils.dtypes import _NA_REP
from cudf.utils.utils import set_allocator

from cudf.options import (
    get_option,
    set_option,
    describe_option,
)

try:
    from ptxcompiler.patch import patch_numba_codegen_if_needed
except ImportError:
@@ -144,11 +150,13 @@
    "concat",
    "cut",
    "date_range",
    "describe_option",
    "factorize",
    "from_dataframe",
    "from_dlpack",
    "from_pandas",
    "get_dummies",
    "get_option",
    "interval_range",
    "isclose",
    "melt",
@@ -163,6 +171,7 @@
    "read_parquet",
    "read_text",
    "set_allocator",
    "set_option",
    "testing",
    "to_datetime",
    "to_numeric",
114 changes: 114 additions & 0 deletions python/cudf/cudf/options.py
@@ -0,0 +1,114 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Option:
    default: Any
    value: Any
    description: str
    validator: Callable


_OPTIONS: Dict[str, Option] = {}


def _register_option(
    name: str, default_value: Any, description: str, validator: Callable
):
    """Register an option.

    Parameters
    ----------
    name : str
        The name of the option.
    default_value : Any
        The default value of the option.
    description : str
        A text description of the option.
    validator : Callable
        Called on the option value to check its validity. Should raise an
        error if the value is invalid.

    Raises
    ------
    BaseException
        Raised by validator if the value is invalid.
    """
    validator(default_value)
    _OPTIONS[name] = Option(
        default_value, default_value, description, validator
    )


def get_option(name: str) -> Any:
    """Get the value of an option.

    Parameters
    ----------
    name : str
        The name of the option.

    Returns
    -------
    The value of the option.

    Raises
    ------
    KeyError
        If option ``name`` does not exist.
    """
    try:
        return _OPTIONS[name].value
    except KeyError:
        raise KeyError(f'"{name}" does not exist.')


def set_option(name: str, val: Any):
    """Set the value of an option.

    Parameters
    ----------
    name : str
        The name of the option.
    val : Any
        The value to set.

    Raises
    ------
    KeyError
        If option ``name`` does not exist.
    BaseException
        Raised by validator if the value is invalid.
    """
    try:
        option = _OPTIONS[name]
    except KeyError:
        raise KeyError(f'"{name}" does not exist.')
    # Validate before assigning so an invalid value never replaces the old one.
    option.validator(val)
    option.value = val


def _build_option_description(name, opt):
    return (
        f"{name}:\n"
        f"\t{opt.description}\n"
        f"\t[Default: {opt.default}] [Current: {opt.value}]"
    )

Comment on lines +94 to +99

Contributor:
Does it make more sense to implement a custom __str__ on Option?

Contributor:
It might be alright to do that, but the Option doesn't know its own name, so it's missing a crucial piece of information. I'm happy with the current state here.
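
For reference, the alternative raised in this thread could look roughly like the sketch below. It is hypothetical and not part of this PR: `NamedOption` adds a `name` field that the PR's `Option` does not have, and `_accept_anything` is a placeholder validator. The `__str__` layout copies `_build_option_description`.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class NamedOption:
    # Hypothetical variant: unlike Option in this PR, it stores its own name.
    name: str
    default: Any
    value: Any
    description: str
    validator: Callable

    def __str__(self) -> str:
        # Same layout as _build_option_description in this PR.
        return (
            f"{self.name}:\n"
            f"\t{self.description}\n"
            f"\t[Default: {self.default}] [Current: {self.value}]"
        )


def _accept_anything(value):
    # Placeholder validator for the example; it never raises.
    return None


opt = NamedOption("odd_option", 1, 1, "An odd option.", _accept_anything)
text = str(opt)
```

The cost of this variant is that the name is stored twice, once as the dictionary key and once on the object, and the two would have to be kept in sync.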



def describe_option(name: Optional[str] = None):
    """Prints the description of an option.

    If `name` is unspecified, prints the description of all available options.

    Parameters
    ----------
    name : Optional[str]
        The name of the option.
    """
    names = _OPTIONS.keys() if name is None else [name]
    for name in names:
        print(_build_option_description(name, _OPTIONS[name]))
77 changes: 77 additions & 0 deletions python/cudf/cudf/tests/test_options.py
@@ -0,0 +1,77 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from contextlib import redirect_stdout
from io import StringIO

import pytest

import cudf


@pytest.fixture(scope="module", autouse=True)
def empty_option_environment():
    old_option_environment = cudf.options._OPTIONS
    cudf.options._OPTIONS = {}
    yield
    cudf.options._OPTIONS = old_option_environment


@pytest.fixture
def odd_option(empty_option_environment):
    def validator(x):
        if not x % 2 == 1:
            raise ValueError(f"Invalid option value {x}")

    cudf.options._register_option(
        "odd_option",
        1,
        "An odd option.",
        validator,
    )
    yield
    del cudf.options._OPTIONS["odd_option"]


@pytest.fixture
def even_option(empty_option_environment):
    def validator(x):
        if not x % 2 == 0:
            raise ValueError(f"Invalid option value {x}")

    cudf.options._register_option(
        "even_option", 0, "An even option.", validator
    )
    yield
    del cudf.options._OPTIONS["even_option"]


def test_option_get_set(odd_option):
    assert cudf.get_option("odd_option") == 1
    cudf.set_option("odd_option", 101)
    assert cudf.get_option("odd_option") == 101


def test_option_set_invalid(odd_option):
    with pytest.raises(ValueError, match="Invalid option value 0"):
        cudf.set_option("odd_option", 0)


def test_option_description(odd_option):
    s = StringIO()
    with redirect_stdout(s):
        cudf.describe_option("odd_option")
    s.seek(0)
    expected = "odd_option:\n\tAn odd option.\n\t[Default: 1] [Current: 1]\n"
    assert expected == s.read()


def test_option_description_all(odd_option, even_option):
    s = StringIO()
    with redirect_stdout(s):
        cudf.describe_option()
    s.seek(0)
    expected = (
        "odd_option:\n\tAn odd option.\n\t[Default: 1] [Current: 1]\n"
        "even_option:\n\tAn even option.\n\t[Default: 0] [Current: 0]\n"
    )
    assert expected == s.read()