Add cudf.options #11193

Merged
merged 18 commits into from
Jul 28, 2022
Changes from 12 commits
1 change: 1 addition & 0 deletions docs/cudf/source/api_docs/index.rst
@@ -19,3 +19,4 @@ This page provides a list of all publicly accessible modules, methods and classe
io
subword_tokenize
string_handling
options
12 changes: 12 additions & 0 deletions docs/cudf/source/api_docs/options.rst
@@ -0,0 +1,12 @@
.. _api.options:

============
cudf Options
============

.. autosummary::
:toctree: api/

cudf.get_option
cudf.set_option
cudf.describe_option
1 change: 1 addition & 0 deletions docs/cudf/source/developer_guide/index.md
@@ -5,3 +5,4 @@

library_design
documentation
options
19 changes: 19 additions & 0 deletions docs/cudf/source/developer_guide/options.md
@@ -0,0 +1,19 @@
# Options

Options are stored as a dictionary in the `cudf.options` module.
Each option name is also its key in the dictionary.
Each value in the dictionary is an instance of the `Option` class.

An `Option` object has the following attributes:
- `default`: the default value of the option
- `value`: the current value of the option
- `description`: a text description of the option
- `validator`: a callable that checks `value` and raises an error if it is invalid

Developers can use `cudf.options._register_option` to add options to the dictionary.
{py:func}`cudf.get_option` is provided to get option values from the dictionary.

When testing the behavior of an option,
use a [`yield` fixture](https://docs.pytest.org/en/7.1.x/how-to/fixtures.html#yield-fixtures-recommended) to set up the option and restore it afterwards.

See the [API reference](api.options) for more details.
1 change: 1 addition & 0 deletions docs/cudf/source/user_guide/index.md
@@ -12,5 +12,6 @@ groupby
guide-to-udfs
cupy-interop
dask-cudf
options
PandasCompat
```
12 changes: 12 additions & 0 deletions docs/cudf/source/user_guide/options.md
@@ -0,0 +1,12 @@
# Options

cuDF has an options API to configure and customize global behavior.
This API complements the [pandas.options](https://pandas.pydata.org/docs/user_guide/options.html) API with features specific to cuDF.

{py:func}`cudf.describe_option` prints an option's description,
along with its current and default values.
When no argument is provided,
all options are printed.
To set the value of an option, use {py:func}`cudf.set_option`.

See the [API reference](api.options) for more details.
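
A short sketch of the describe, set, and get flow described above. The diff shown here does not register any concrete option, so the snippet defines a minimal stand-in registry modelled on `python/cudf/cudf/options.py` and a hypothetical option named `odd_option` (borrowed from the PR's tests). With cuDF installed, the corresponding calls would be `cudf.describe_option`, `cudf.set_option`, and `cudf.get_option`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Option:
    default: Any
    value: Any
    description: str
    validator: Callable


# Stand-in registry; in cuDF this lives in cudf.options._OPTIONS.
_OPTIONS: Dict[str, Option] = {}


def _must_be_odd(x):
    # Validators raise on invalid input instead of returning a bool.
    if x % 2 != 1:
        raise ValueError(f"Invalid option {x}")


# Hypothetical option: validate the default, then store it.
_must_be_odd(1)
_OPTIONS["odd_option"] = Option(1, 1, "An odd option.", _must_be_odd)


def get_option(name):
    return _OPTIONS[name].value


def set_option(name, val):
    option = _OPTIONS[name]
    option.validator(val)  # raises before the value is changed
    option.value = val


def describe_option(name):
    # Returns the text instead of printing it, to keep the sketch testable.
    opt = _OPTIONS[name]
    return (
        f"{name}:\n\t{opt.description}\n"
        f"\t[Default: {opt.default}] [Current: {opt.value}]"
    )


set_option("odd_option", 101)
current = get_option("odd_option")
summary = describe_option("odd_option")

try:
    set_option("odd_option", 0)  # even, so the validator rejects it
except ValueError as err:
    rejected = str(err)
```

Because validation happens before assignment, a rejected value leaves the option at its previous setting.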
9 changes: 9 additions & 0 deletions python/cudf/cudf/__init__.py
@@ -82,6 +82,12 @@
from cudf.utils.dtypes import _NA_REP
from cudf.utils.utils import set_allocator

from cudf.options import (
    get_option,
    set_option,
    describe_option,
)

try:
    from ptxcompiler.patch import patch_numba_codegen_if_needed
except ImportError:
@@ -167,4 +173,7 @@
    "to_datetime",
    "to_numeric",
    "unstack",
    "get_option",
    "set_option",
    "describe_option",
]
100 changes: 100 additions & 0 deletions python/cudf/cudf/options.py
@@ -0,0 +1,100 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Option:
    default: Any
    value: Any
    description: str
    validator: Callable


_OPTIONS: Dict[str, Option] = {}


def _register_option(
    name: str, default_value: Any, description: str, validator: Callable
):
    """Register an option.

    Parameters
    ----------
    name : str
        The name of the option.
    default_value : Any
        The default value of the option.
    description : str
        A text description of the option.
    validator : Callable
        Called on the option value to check its validity. Should raise an
        error if the value is invalid.

    """
    validator(default_value)
    _OPTIONS[name] = Option(
        default_value, default_value, description, validator
    )


def get_option(name: str) -> Any:
    """Get the value of an option.

    Parameters
    ----------
    name : str
        The name of the option.

    Returns
    -------
    The value of the option.
    """
    return _OPTIONS[name].value


def set_option(name: str, val: Any):
    """Set the value of an option.

    Raises ``ValueError`` if the provided value is invalid.

    Parameters
    ----------
    name : str
        The name of the option.
    val : Any
        The value to set.
    """
    option = _OPTIONS[name]
    option.validator(val)
    option.value = val


def _build_option_description(name, opt):
    return (
        f"{name}:\n"
        f"\t{opt.description}\n"
        f"\t[Default: {opt.default}] [Current: {opt.value}]"
    )
Comment on lines +94 to +99

Does it make more sense to implement a custom __str__ on Option?

It might be alright to do that, but the Option doesn't know its own name, so it's missing a crucial piece of information. I'm happy with the current state here.
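
The trade-off discussed in this thread can be illustrated with a hypothetical variant, which is not what the PR implements: if `Option` also stored its own name, a custom `__str__` could produce the same text as `_build_option_description`. The `NamedOption` class, its `name` field, and the `_accept_anything` validator below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class NamedOption:
    # Hypothetical: the PR's Option has no `name` field.
    name: str
    default: Any
    value: Any
    description: str
    validator: Callable

    def __str__(self) -> str:
        # Same layout as _build_option_description in the diff.
        return (
            f"{self.name}:\n"
            f"\t{self.description}\n"
            f"\t[Default: {self.default}] [Current: {self.value}]"
        )


def _accept_anything(_value):
    """Placeholder validator that never raises."""


opt = NamedOption("odd_option", 1, 1, "An odd option.", _accept_anything)
text = str(opt)
```

The cost of this variant is that the name would be stored twice, once as the dictionary key and once on the object, which is consistent with the reply's preference for the standalone helper.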



def describe_option(name: Optional[str] = None):
    """Prints a specific option description or all option descriptions.

    If `name` is unspecified, prints all available option descriptions.

    Parameters
    ----------
    name : Optional[str]
        The name of the option.
    """
    s = ""
    if name is None:
        s = "\n".join(
            _build_option_description(name, opt)
            for name, opt in _OPTIONS.items()
        )
    else:
        s = _build_option_description(name, _OPTIONS[name])
    print(s)
79 changes: 79 additions & 0 deletions python/cudf/cudf/tests/test_options.py
@@ -0,0 +1,79 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from contextlib import redirect_stdout
from io import StringIO

import pytest

import cudf


@pytest.fixture(scope="module", autouse=True)
def empty_option_environment():
    # Swap in an empty registry for this module, then restore the original.
    old_option_environment = cudf.options._OPTIONS
    cudf.options._OPTIONS = {}
    yield
    cudf.options._OPTIONS = old_option_environment


@pytest.fixture(scope="module")
def odd_option(empty_option_environment):
    def validator(x):
        if not x % 2 == 1:
            raise ValueError(f"Invalid option {x}")

    cudf.options._register_option(
        "odd_option",
        1,
        "An odd option.",
        validator,
    )
    yield
    del cudf.options._OPTIONS["odd_option"]


@pytest.fixture(scope="module")
def even_option(empty_option_environment):
    def validator(x):
        if not x % 2 == 0:
            raise ValueError(f"Invalid option {x}")

    cudf.options._register_option(
        "even_option", 0, "An even option.", validator
    )
    yield
    del cudf.options._OPTIONS["even_option"]


def test_option_get_set(odd_option):
    assert cudf.get_option("odd_option") == 1
    cudf.set_option("odd_option", 101)
    assert cudf.get_option("odd_option") == 101
    # Restore default value for other tests
    cudf.set_option("odd_option", 1)


def test_option_set_invalid(odd_option):
    with pytest.raises(ValueError, match="Invalid option 0"):
        cudf.set_option("odd_option", 0)


def test_option_description(odd_option):
    s = StringIO()
    with redirect_stdout(s):
        cudf.describe_option("odd_option")
    s.seek(0)
    expected = "odd_option:\n\tAn odd option.\n\t[Default: 1] [Current: 1]\n"
    assert expected == s.read()


def test_option_description_all(odd_option, even_option):
    s = StringIO()
    with redirect_stdout(s):
        cudf.describe_option()
    s.seek(0)
    expected = (
        "odd_option:\n\tAn odd option.\n\t[Default: 1] [Current: 1]\n"
        "even_option:\n\tAn even option.\n\t[Default: 0] [Current: 0]\n"
    )
    assert expected == s.read()