[REVIEW][FEA] Port nvtx.pyx to use non-legacy libcudf APIs #4235

Merged 45 commits on Mar 18, 2020
Commits
bc16dc8
initial commit for libcudf++ nvtx.pyx
Feb 22, 2020
3afc2a6
initial commit for libcudf++ nvtx.pyx
Feb 22, 2020
cf409c7
style fix and CHANGELOG
Feb 22, 2020
f24d6a4
initial commit for libcudf++ nvtx.pyx
Feb 22, 2020
eb2cbfb
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Feb 26, 2020
a622c3a
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Feb 27, 2020
031f347
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Mar 5, 2020
92cdee2
correct enum handling for nvtx
Mar 5, 2020
eac8fcd
style check
Mar 5, 2020
9a37846
uncommenting special color references
Mar 5, 2020
fe4c688
addressing some of the comments on the PR
Mar 5, 2020
28207e2
style fix
Mar 5, 2020
3e72f38
fumblight through typecasting
Mar 10, 2020
264a45b
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Mar 11, 2020
a0355b6
still fumbling through Enum types for nvtx
Mar 11, 2020
c508a62
More fumbling with nvtx enums
Mar 11, 2020
cad9ea1
nvtx libcudf++ solution
Mar 11, 2020
b8c275d
style fix
Mar 11, 2020
210cecb
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Mar 11, 2020
716370e
adding custom hex support for nvtx
Mar 11, 2020
238e07f
style fix
Mar 11, 2020
a4a9540
style fix
Mar 11, 2020
39c0dcb
updated all instances of python nvtx to use libcudf++ nvtx
Mar 11, 2020
56952f1
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Mar 11, 2020
8ee8954
style fixes
Mar 11, 2020
99e6a3c
hidden legacy nvtx.range_pop
Mar 11, 2020
b895537
Update python/cudf/cudf/_libxx/nvtx.pyx
millerhooks Mar 12, 2020
90c05eb
Update python/cudf/cudf/_libxx/cpp/nvtx.pxd
millerhooks Mar 12, 2020
4beaa06
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Mar 12, 2020
4e3b0d1
moving nvtx utils
Mar 12, 2020
1876d6f
added utilities init.py.
Mar 12, 2020
91621b4
Merge branch 'branch-0.13' into fea-cython-nvtx
millerhooks Mar 12, 2020
b8c1b8e
Update python/cudf/cudf/_libxx/nvtx.pyx
millerhooks Mar 12, 2020
f4f0b4e
nvtx name encoding safety
Mar 12, 2020
b57ff3e
Merge branch 'fea-cython-nvtx' of github.com:millerhooks/cudf into fe…
Mar 12, 2020
52f34db
Merge branch 'branch-0.13' of https://github.com/rapidsai/cudf into f…
Mar 12, 2020
645e88c
style fix
Mar 12, 2020
d0b7f50
utilities directory nvtx file organization
Mar 12, 2020
242dbb9
copyright 2020
Mar 12, 2020
5091225
Update python/cudf/cudf/_libxx/cpp/utilities/nvtx_utils.pxd
millerhooks Mar 13, 2020
edfb90f
Merge branch 'branch-0.13' into fea-cython-nvtx
Mar 17, 2020
366ab78
fix location of underlying type definition
Mar 17, 2020
d36de09
style
Mar 17, 2020
195e544
remove unused utilities from init
Mar 17, 2020
5e7b079
fix bad imports
Mar 17, 2020
15 changes: 2 additions & 13 deletions CHANGELOG.md
@@ -4,20 +4,18 @@

- PR #4360 Added Java bindings for bitwise shift operators
- PR #3577 Add initial dictionary support to column classes
- PR #3917 Add dictionary add_keys function
- PR #3777 Add support for dictionary column in gather
- PR #3693 add string support, skipna to scan operation
- PR #3662 Define and implement `shift`.
- PR #3842 ORC writer: add support for column statistics
- PR #3861 Added Series.sum feature for String
- PR #4069 Added cast of numeric columns from/to String
- PR #3681 Add cudf::experimental::boolean_mask_scatter
- PR #4088 Added asString() on ColumnVector in Java that takes a format string
- PR #4040 Add support for n-way merge of sorted tables
- PR #4053 Multi-column quantiles.
- PR #4100 Add set_keys function for dictionary columns
- PR #3894 Add remove_keys functions for dictionary columns
- PR #4107 Add groupby nunique aggregation
- PR #4235 Port nvtx.pyx to use non-legacy libcudf APIs
- PR #4153 Support Dask serialization protocol on cuDF objects
- PR #4127 Add python API for n-way sorted merge (merge_sorted)
- PR #4164 Add Buffer "constructor-kwargs" header
@@ -64,19 +62,15 @@
- PR #3911 Adding null boolean handling for copy_if_else
- PR #4003 Drop old `to_device` utility wrapper function
- PR #4002 Adding to_frame and fix for categorical column issue
- PR #4035 Port NVText tokenize function to libcudf++
- PR #4009 build script update to enable cudf build without installing
- PR #3897 Port cuIO JSON reader to cudf::column types
- PR #4008 Eliminate extra copy in column constructor
- PR #4013 Add cython definition for io readers cudf/io/io_types.hpp
- PR #4028 Port json.pyx to use new libcudf APIs
- PR #4014 ORC/Parquet: add count parameter to stripe/rowgroup-based reader API
- PR #4042 Port cudf/io/functions.hpp to Cython for use in IO bindings
- PR #3880 Add aggregation infrastructure support for reduction
- PR #3880 Add aggregation infrastructure support for cudf::reduce
- PR #4059 Add aggregation infrastructure support for cudf::scan
- PR #4059 Add aggregation infrastructure support for cudf::scan
- PR #4021 Change quantiles signature for clarity.
- PR #4058 Port hash.pyx to use libcudf++ APIs
- PR #4057 Handle offsets in cython Column class
- PR #4045 Reorganize `libxx` directory
- PR #4029 Port stream_compaction.pyx to use libcudf++ APIs
@@ -102,7 +96,6 @@
- PR #4098 Remove legacy calls from libcudf strings column code
- PR #4044 Port join.pyx to use libcudf++ APIs
- PR #4111 Use `Buffer`'s to serialize `StringColumn`
- PR #4133 Mask cleanup and fixes: use `int32` dtype, ensure 64 byte padding, handle offsets
- PR #4113 Get `len` of `StringColumn`s without `nvstrings`
- PR #4147 Remove workaround for UNKNOWN_NULL_COUNT in contiguous_split.
- PR #4130 Renames in-place `cudf::experimental::fill` to `cudf::experimental::fill_in_place`
@@ -203,13 +196,9 @@
- PR #4089 Fix dask groupby mutliindex test case issues in join
- PR #4097 Fix strings concatenate logic with column offsets
- PR #4076 All null string entries should have null data buffer
- PR #4145 Support empty index case in DataFrame._from_table
- PR #4109 Use rmm::device_vector instead of thrust::device_vector
- PR #4113 Use `.nvstrings` in `StringColumn.sum(...)`
- PR #4116 Fix a bug in contiguous_split() where tables with mixed column types could corrupt string output
- PR #4108 Fix dtype bugs in dask_cudf metadata (metadata_nonempty overhaul)
- PR #4138 Really fix strings concatenate logic with column offsets
- PR #4119 Fix binary ops slowdown using jitify -remove-unused-globals
- PR #4125 Fix type enum to account for added Dictionary type in `types.hpp`
- PR #4132 Fix `hash_partition` null mask allocation
- PR #4137 Update Java for mutating fill and rolling window changes
5 changes: 4 additions & 1 deletion python/cudf/cudf/_lib/avro.pyx
@@ -14,7 +14,10 @@ from libcpp.vector cimport vector
 from libcpp.memory cimport unique_ptr

 from cudf.utils import ioutils
-from cudf._lib.nvtx import nvtx_range_push, nvtx_range_pop
+from cudf._libxx.nvtx import (
+    range_push as nvtx_range_push,
+    range_pop as nvtx_range_pop
+)

 from io import BytesIO
 import errno
9 changes: 6 additions & 3 deletions python/cudf/cudf/_lib/csv.pyx
@@ -6,7 +6,10 @@ from cudf._lib.cudf cimport *
 from cudf._lib.cudf import *
 from cudf._lib.utils cimport *
 from cudf._lib.utils import *
-from cudf._lib.nvtx import nvtx_range_push, nvtx_range_pop
+from cudf._libxx.nvtx import (
+    range_push as nvtx_range_push,
+    range_pop as nvtx_range_pop
+)
 from cudf._lib.includes.csv cimport (
     reader as csv_reader,
     reader_options as csv_reader_options
@@ -87,7 +90,7 @@ cpdef read_csv(
     if delimiter is None:
         delimiter = sep

-    nvtx_range_push("CUDF_READ_CSV", "purple")
+    nvtx_range_push("CUDF_READ_CSV", "PURPLE")

     # Setup reader options
     cdef csv_reader_options args = csv_reader_options()
@@ -262,7 +265,7 @@ cpdef write_csv(
     cudf.io.csv.write_csv
     """

-    nvtx_range_push("CUDF_WRITE_CSV", "purple")
+    nvtx_range_push("CUDF_WRITE_CSV", "PURPLE")

     from cudf.core.series import Series

13 changes: 0 additions & 13 deletions python/cudf/cudf/_lib/cudf.pxd
@@ -317,19 +317,6 @@ cdef extern from "cudf/cudf.h" nogil:
         size_type* out_indices
     ) except +

-    cdef gdf_error gdf_nvtx_range_push(
-        const char* const name,
-        gdf_color color
-    ) except +
-
-    cdef gdf_error gdf_nvtx_range_push_hex(
-        const char* const name,
-        unsigned int color
-    ) except +
-
-    cdef gdf_error gdf_nvtx_range_pop() except +
-
-
 cdef extern from "cudf/legacy/bitmask.hpp" nogil:

     cdef gdf_error gdf_count_nonzero_mask(
1 change: 1 addition & 0 deletions python/cudf/cudf/_libxx/__init__.py
@@ -14,6 +14,7 @@
     merge,
     null_mask,
     nvtext,
+    nvtx,
     orc,
     quantiles,
     reduce,
Empty file.
Empty file.
34 changes: 34 additions & 0 deletions python/cudf/cudf/_libxx/cpp/utilities/nvtx_utils.pxd
@@ -0,0 +1,34 @@
+# Copyright (c) 2020, NVIDIA CORPORATION.
+
+from libc.stdint cimport uint32_t
+
+
+cdef extern from "cudf/utilities/nvtx_utils.hpp" namespace "cudf::nvtx" nogil:
+    ctypedef enum color:
+        GREEN 'cudf::nvtx::color::GREEN'
+        BLUE 'cudf::nvtx::color::BLUE'
+        YELLOW 'cudf::nvtx::color::YELLOW'
+        PURPLE 'cudf::nvtx::color::PURPLE'
+        CYAN 'cudf::nvtx::color::CYAN'
+        RED 'cudf::nvtx::color::RED'
+        WHITE 'cudf::nvtx::color::WHITE'
+        DARK_GREEN 'cudf::nvtx::color::DARK_GREEN'
+        ORANGE 'cudf::nvtx::color::ORANGE'
+
+    cdef color JOIN_COLOR 'cudf::nvtx::JOIN_COLOR'
+    cdef color GROUP_COLOR 'cudf::nvtx::GROUP_COLOR'
+    cdef color BINARY_OP_COLOR 'cudf::nvtx::BINARY_OP_COLOR'
+    cdef color PARTITION_COLOR 'cudf::nvtx::PARTITION_COLOR'
+    cdef color READ_CSV_COLOR 'cudf::nvtx::READ_CSV_COLOR'
+
+    cdef void range_push(
+        const char* const name,
+        color color
+    ) except +
+
+    cdef void range_push_hex(
+        const char* const name,
+        uint32_t color
+    ) except +
+
+    cdef void range_pop() except +
6 changes: 6 additions & 0 deletions python/cudf/cudf/_libxx/nvtx.pxd
@@ -0,0 +1,6 @@
+# Copyright (c) 2020, NVIDIA CORPORATION.
+
+from libc.stdint cimport uint32_t
+
+
+ctypedef uint32_t underlying_type_t_color
58 changes: 58 additions & 0 deletions python/cudf/cudf/_libxx/nvtx.pyx
@@ -0,0 +1,58 @@
+# Copyright (c) 2020, NVIDIA CORPORATION.
+
+from enum import IntEnum
+from libcpp.string cimport string
+from cudf._libxx.cpp.utilities.nvtx_utils cimport (
+    range_push as cpp_range_push,
+    range_push_hex as cpp_range_push_hex,
+    range_pop as cpp_range_pop,
+    color as color_types,
+)
+from cudf._libxx.nvtx cimport underlying_type_t_color
+
+
+class Color(IntEnum):
+    GREEN = <underlying_type_t_color> color_types.GREEN
+    BLUE = <underlying_type_t_color> color_types.BLUE
+    YELLOW = <underlying_type_t_color> color_types.YELLOW
+    PURPLE = <underlying_type_t_color> color_types.PURPLE
+    CYAN = <underlying_type_t_color> color_types.CYAN
+    RED = <underlying_type_t_color> color_types.RED
+    WHITE = <underlying_type_t_color> color_types.WHITE
+    DARK_GREEN = <underlying_type_t_color> color_types.DARK_GREEN
+    ORANGE = <underlying_type_t_color> color_types.ORANGE
+
+
+def range_push(object name, object color='GREEN'):
+    """
+    Demarcate the beginning of a user-defined NVTX range.
+
+    Parameters
+    ----------
+    name : str
+        The name of the NVTX range
+    color : str
+        The color to use for the range.
+        Can be named color or hex RGB string.
+    """
+    try:
+        color = int(color, 16)
+    except ValueError:
+        color = int(Color[color.upper()].value)
+
+    cdef const char *_name
+    name = name.encode()
+    _name = name
+
+    cdef underlying_type_t_color _color = color
+
+    with nogil:
+        cpp_range_push_hex(_name, _color)
+
+
+def range_pop():
+    """
+    Demarcate the end of a user-defined NVTX range.
+    """
+    with nogil:
+        cpp_range_pop()
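Review note on the color handling in `range_push`: the function tries to parse `color` as a hex string first and only falls back to the `Color` enum when that fails. The following is a minimal pure-Python sketch of that lookup order, not the Cython code from this PR. `resolve_color` is a hypothetical helper name, and the three ARGB values are assumptions for illustration; the real values come from `cudf/utilities/nvtx_utils.hpp`.

```python
from enum import IntEnum


class Color(IntEnum):
    # Values assumed for illustration only; the real ones are defined in
    # cudf/utilities/nvtx_utils.hpp and reach Python through the Cython enum.
    GREEN = 0xFF00FF00
    PURPLE = 0xFFFF00FF
    ORANGE = 0xFFFFA500


def resolve_color(color="GREEN"):
    """Hypothetical helper: return the integer color for a hex string or a name."""
    try:
        # Hex parsing runs first, so "ffa500" never reaches the enum lookup.
        return int(color, 16)
    except ValueError:
        # Not valid hex: fall back to a case-insensitive name lookup.
        # An unknown name raises KeyError here.
        return int(Color[color.upper()].value)


print(hex(resolve_color("purple")))  # 0xffff00ff
print(hex(resolve_color("ffa500")))  # 0xffa500
```

One consequence of the hex-first order is that a color name made up only of hex digits would be read as a number rather than looked up in the enum. None of the nine names in `Color` falls into that category, so lowercase call sites such as `"orange"` and `"blue"` resolve through the enum as intended.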
4 changes: 2 additions & 2 deletions python/cudf/cudf/core/column/datetime.py
@@ -268,7 +268,7 @@ def is_unique(self):


 def binop(lhs, rhs, op, out_dtype):
-    libcudf.nvtx.nvtx_range_push("CUDF_BINARY_OP", "orange")
+    libcudfxx.nvtx.range_push("CUDF_BINARY_OP", "orange")
     out = libcudfxx.binaryop.binaryop(lhs, rhs, op, out_dtype)
-    libcudf.nvtx.nvtx_range_pop()
+    libcudfxx.nvtx.range_pop()
     return out
5 changes: 2 additions & 3 deletions python/cudf/cudf/core/column/numerical.py
@@ -5,7 +5,6 @@
 import pyarrow as pa
 from pandas.api.types import is_integer_dtype

-import cudf._lib as libcudf
 import cudf._libxx as libcudfxx
 from cudf.core.buffer import Buffer
 from cudf.core.column import as_column, column
@@ -409,7 +408,7 @@ def can_cast_safely(self, to_dtype):
 def _numeric_column_binop(lhs, rhs, op, out_dtype, reflect=False):
     if reflect:
         lhs, rhs = rhs, lhs
-    libcudf.nvtx.nvtx_range_push("CUDF_BINARY_OP", "orange")
+    libcudfxx.nvtx.range_push("CUDF_BINARY_OP", "orange")

     is_op_comparison = op in ["lt", "gt", "le", "ge", "eq", "ne"]

@@ -421,7 +420,7 @@ def _numeric_column_binop(lhs, rhs, op, out_dtype, reflect=False):
     if is_op_comparison:
         out = out.fillna(op == "ne")

-    libcudf.nvtx.nvtx_range_pop()
+    libcudfxx.nvtx.range_pop()
     return out


5 changes: 4 additions & 1 deletion python/cudf/cudf/core/column/string.py
@@ -13,7 +13,6 @@

 import cudf._libxx as libcudfxx
 import cudf._libxx.string_casting as str_cast
-from cudf._lib.nvtx import nvtx_range_pop, nvtx_range_push
 from cudf._libxx.nvtext.generate_ngrams import (
     generate_ngrams as cpp_generate_ngrams,
 )
@@ -27,6 +26,10 @@
     count_tokens as cpp_count_tokens,
     tokenize as cpp_tokenize,
 )
+from cudf._libxx.nvtx import (
+    range_pop as nvtx_range_pop,
+    range_push as nvtx_range_push,
+)
 from cudf._libxx.strings.attributes import (
     code_points as cpp_code_points,
     count_characters as cpp_count_characters,
18 changes: 9 additions & 9 deletions python/cudf/cudf/core/dataframe.py
@@ -1863,7 +1863,7 @@ def nans_to_nulls(self):
     @classmethod
     def _concat(cls, objs, axis=0, ignore_index=False):

-        libcudf.nvtx.nvtx_range_push("CUDF_CONCAT", "orange")
+        libcudfxx.nvtx.range_push("CUDF_CONCAT", "orange")

         if ignore_index:
             index = RangeIndex(sum(map(len, objs)))
@@ -1901,7 +1901,7 @@ def _concat(cls, objs, axis=0, ignore_index=False):
         else:
             out.columns = unique_columns_ordered_ls

-        libcudf.nvtx.nvtx_range_pop()
+        libcudfxx.nvtx.range_pop()
         return out

     def as_gpu_matrix(self, columns=None, order="F"):
@@ -2303,7 +2303,7 @@ def merge(
         4 3 13.0
         2 4 14.0 12.0
         """
-        libcudf.nvtx.nvtx_range_push("CUDF_JOIN", "blue")
+        libcudfxx.nvtx.range_push("CUDF_JOIN", "blue")
         if indicator:
             raise NotImplementedError(
                 "Only indicator=False is currently supported"
@@ -2344,7 +2344,7 @@ def merge(
             how,
             method,
         )
-
+        libcudfxx.nvtx.range_pop()
         return gdf_result

     def join(
@@ -2383,7 +2383,7 @@ def join(
         - *on* is not supported yet due to lack of multi-index support.
         """

-        libcudf.nvtx.nvtx_range_push("CUDF_JOIN", "blue")
+        libcudfxx.nvtx.range_push("CUDF_JOIN", "blue")

         # Outer joins still use the old implementation
         if type != "":
@@ -2518,7 +2518,7 @@ def _set_categories(col, cats):
             df.index.names = index_frame_l.columns
             for new_key, old_key in zip(index_frame_l.columns, idx_col_names):
                 df.index._data[new_key] = df.index._data.pop(old_key)
-
+        libcudfxx.nvtx.range_pop()
         return df

     def groupby(
@@ -2584,7 +2584,7 @@ def groupby(

         # The corresponding pop() is in
         # DataFrameGroupBy._apply_aggregation()
-        libcudf.nvtx.nvtx_range_push("CUDF_GROUPBY", "purple")
+        libcudfxx.nvtx.range_push("CUDF_GROUPBY", "purple")

         result = DataFrameGroupBy(
             self,
@@ -2682,7 +2682,7 @@ def query(self, expr, local_dict={}):
                 )
             )

-        libcudf.nvtx.nvtx_range_push("CUDF_QUERY", "purple")
+        libcudfxx.nvtx.range_push("CUDF_QUERY", "purple")
         # Get calling environment
         callframe = inspect.currentframe().f_back
         callenv = {
@@ -2699,7 +2699,7 @@ def query(self, expr, local_dict={}):
             newseries = self[col][selected]
             newdf[col] = newseries
         result = newdf
-        libcudf.nvtx.nvtx_range_pop()
+        libcudfxx.nvtx.range_pop()
         return result

     @applyutils.doc_apply()
3 changes: 2 additions & 1 deletion python/cudf/cudf/core/groupby/groupby.py
@@ -6,6 +6,7 @@

 import cudf
 import cudf._lib as libcudf
+import cudf._libxx as libcudfxx
 from cudf import MultiIndex
 from cudf.core.column import deserialize_columns, serialize_columns
 from cudf.utils.dtypes import is_scalar
@@ -130,7 +131,7 @@ def _apply_aggregation(self, agg):
         Applies the aggregation function(s) ``agg`` on all columns
         """
         result = self._groupby.compute_result(agg)
-        libcudf.nvtx.nvtx_range_pop()
+        libcudfxx.nvtx.range_pop()
         return result

     def __getitem__(self, arg):
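Review note on the groupby range: the push lives in `DataFrame.groupby` and the matching pop in `DataFrameGroupBy._apply_aggregation`, so the two only balance when each groupby object is aggregated exactly once. The sketch below counts pushes and pops to illustrate this, assuming every aggregation goes through `_apply_aggregation`. `RangeCounter` and `FakeGroupBy` are hypothetical stand-ins; how the real NVTX runtime reacts to an extra pop is not modelled.

```python
class RangeCounter:
    """Hypothetical stand-in that only counts open ranges."""

    def __init__(self):
        self.depth = 0

    def range_push(self, name, color="green"):
        self.depth += 1

    def range_pop(self):
        self.depth -= 1


class FakeGroupBy:
    """Mimics only the pop side of DataFrameGroupBy._apply_aggregation."""

    def __init__(self, nvtx):
        self._nvtx = nvtx

    def _apply_aggregation(self, agg):
        result = f"result of {agg}"
        self._nvtx.range_pop()
        return result


def groupby(nvtx):
    # Mimics only the push side of DataFrame.groupby.
    nvtx.range_push("CUDF_GROUPBY", "purple")
    return FakeGroupBy(nvtx)


nvtx = RangeCounter()
gb = groupby(nvtx)
depth_after_groupby = nvtx.depth      # range still open
gb._apply_aggregation("sum")
depth_after_first_agg = nvtx.depth    # balanced
gb._apply_aggregation("mean")
depth_after_second_agg = nvtx.depth   # one pop too many

print(depth_after_groupby, depth_after_first_agg, depth_after_second_agg)
# prints: 1 0 -1
```

A groupby object that is never aggregated leaves the count at 1, and a second aggregation on the same object drives it to -1.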