[FEA] Support passing scalar string args to string_udfs #11737

Open
randerzander opened this issue Sep 21, 2022 · 2 comments
Labels: feature request, numba, Python

randerzander commented Sep 21, 2022

Using the newly merged strings_udf support, I'm trying to pass scalar arguments to a string UDF:

import cudf

df = cudf.DataFrame({"str_col": ["a", "abb", "abc"]})

def delim_count(row, delim):
    return row["str_col"].count(delim)

df.apply(delim_count, args=("b",), axis=1)

But compilation fails with ValueError: user defined function compilation failed.
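
For reference, the intended result can be sketched on the CPU with plain Python and no cuDF at all. This assumes that the string UDF's count would match Python's built-in str.count for this input; the list of dicts standing in for DataFrame rows is purely illustrative.

```python
# CPU-only reference for the intended semantics (no cuDF or GPU needed).
# Assumption: the UDF's count should behave like Python's str.count here.
rows = [{"str_col": s} for s in ["a", "abb", "abc"]]


def delim_count(row, delim):
    # Count non-overlapping occurrences of `delim` in the row's string.
    return row["str_col"].count(delim)


# Apply the function row by row with the scalar argument "b".
expected = [delim_count(row, "b") for row in rows]
print(expected)  # [0, 2, 1]
```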

Trace:

---------------------------------------------------------------------------
NumbaNotImplementedError                  Traceback (most recent call last)
/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/base.py in cast(self, builder, val, fromty, toty)
    711         try:
--> 712             impl = self._casts.find((fromty, toty))
    713             return impl(self, builder, fromty, toty, val)

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/base.py in find(self, sig)
     48         if out is None:
---> 49             out = self._find(sig)
     50             self._cache[sig] = out

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/base.py in _find(self, sig)
     57         else:
---> 58             raise errors.NumbaNotImplementedError(f'{self}, {sig}')
     59 

NumbaNotImplementedError: <numba.core.base.OverloadSelector object at 0x7f2498b3aca0>, (unicode_type, string_view)

During handling of the above exception, another exception occurred:

NumbaNotImplementedError                  Traceback (most recent call last)
/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/base.py in cast(self, builder, val, fromty, toty)
    712             impl = self._casts.find((fromty, toty))
--> 713             return impl(self, builder, fromty, toty, val)
    714         except errors.NumbaNotImplementedError:

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/udf/masked_lowering.py in cast_primitive_to_masked(context, builder, fromty, toty, val)
    336 def cast_primitive_to_masked(context, builder, fromty, toty, val):
--> 337     casted = context.cast(builder, val, fromty, toty.value_type)
    338     ext = cgutils.create_struct_proxy(toty)(context, builder)

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/base.py in cast(self, builder, val, fromty, toty)
    714         except errors.NumbaNotImplementedError:
--> 715             raise errors.NumbaNotImplementedError(
    716                 "Cannot cast %s to %s: %s" % (fromty, toty, val))

NumbaNotImplementedError: Cannot cast unicode_type to string_view: %"inserted.parent" = insertvalue {i8*, i64, i32, i32, i64, i8*, i8*} %"inserted.meminfo", i8* %"arg.delim.6", 6

During handling of the above exception, another exception occurred:

NumbaNotImplementedError                  Traceback (most recent call last)
/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/indexed_frame.py in _apply(self, func, kernel_getter, *args, **kwargs)
   1817         try:
-> 1818             kernel, retty = _compile_or_get(
   1819                 self, func, args, kernel_getter=kernel_getter

/opt/conda/envs/rapids/lib/python3.9/contextlib.py in inner(*args, **kwds)
     78             with self._recreate_cm():
---> 79                 return func(*args, **kwds)
     80         return inner

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/udf/utils.py in _compile_or_get(frame, func, args, kernel_getter)
    214 
--> 215     kernel, scalar_return_type = kernel_getter(frame, func, args)
    216     np_return_type = numpy_support.as_dtype(scalar_return_type)

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/udf/row_function.py in _get_row_kernel(frame, func, args)
    132     )
--> 133     scalar_return_type = _get_udf_return_type(row_type, func, args)
    134     # this is the signature for the final full kernel compilation

/opt/conda/envs/rapids/lib/python3.9/contextlib.py in inner(*args, **kwds)
     78             with self._recreate_cm():
---> 79                 return func(*args, **kwds)
     80         return inner

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/udf/utils.py in _get_udf_return_type(argty, func, args)
     55     # needed here.
---> 56     ptx, output_type = cudautils.compile_udf(func, compile_sig)
     57     if not isinstance(output_type, MaskedType):

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/utils/cudautils.py in compile_udf(udf, type_signature)
    249     # compilation with Numba
--> 250     ptx_code, return_type = cuda.compile_ptx_for_current_device(
    251         udf, type_signature, device=True

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/cuda/compiler.py in compile_ptx_for_current_device(pyfunc, args, debug, lineinfo, device, fastmath, opt)
    289     cc = get_current_device().compute_capability
--> 290     return compile_ptx(pyfunc, args, debug=debug, lineinfo=lineinfo,
    291                        device=device, fastmath=fastmath, cc=cc, opt=True)

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     34             with self:
---> 35                 return func(*args, **kwargs)
     36         return _acquire_compile_lock

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/cuda/compiler.py in compile_ptx(pyfunc, args, debug, lineinfo, device, fastmath, cc, opt)
    266 
--> 267     cres = compile_cuda(pyfunc, None, args, debug=debug, lineinfo=lineinfo,
    268                         nvvm_options=nvvm_options)

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     34             with self:
---> 35                 return func(*args, **kwargs)
     36         return _acquire_compile_lock

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/cuda/compiler.py in compile_cuda(pyfunc, return_type, args, debug, lineinfo, inline, fastmath, nvvm_options)
    201     # Run compilation pipeline
--> 202     cres = compiler.compile_extra(typingctx=typingctx,
    203                                   targetctx=targetctx,

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler.py in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library, pipeline_class)
    692                               args, return_type, flags, locals)
--> 693     return pipeline.compile_extra(func)
    694 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler.py in compile_extra(self, func)
    428         self.state.lifted_from = None
--> 429         return self._compile_bytecode()
    430 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler.py in _compile_bytecode(self)
    496         assert self.state.func_ir is None
--> 497         return self._compile_core()
    498 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler.py in _compile_core(self)
    475                     if is_final_pipeline:
--> 476                         raise e
    477             else:

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler.py in _compile_core(self)
    462                 try:
--> 463                     pm.run(self.state)
    464                     if self.state.cr is not None:

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_machinery.py in run(self, state)
    352                 patched_exception = self._patch_error(msg, e)
--> 353                 raise patched_exception
    354 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_machinery.py in run(self, state)
    340                 if isinstance(pass_inst, CompilerPass):
--> 341                     self._runPass(idx, pass_inst, state)
    342                 else:

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     34             with self:
---> 35                 return func(*args, **kwargs)
     36         return _acquire_compile_lock

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_machinery.py in _runPass(self, index, pss, internal_state)
    295         with SimpleTimer() as pass_time:
--> 296             mutated |= check(pss.run_pass, internal_state)
    297         with SimpleTimer() as finalize_time:

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/compiler_machinery.py in check(func, compiler_state)
    268         def check(func, compiler_state):
--> 269             mangled = func(compiler_state)
    270             if mangled not in (True, False):

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/typed_passes.py in run_pass(self, state)
    393                                        metadata=metadata)
--> 394                 lower.lower()
    395                 if not flags.no_cpython_wrapper:

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower(self)
    195             self.genlower = None
--> 196             self.lower_normal_function(self.fndesc)
    197         else:

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_normal_function(self, fndesc)
    249         self.extract_function_arguments()
--> 250         entry_block_tail = self.lower_function_body()
    251 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_function_body(self)
    278             self.builder.position_at_end(bb)
--> 279             self.lower_block(block)
    280         self.post_lower()

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_block(self, block)
    292                                    loc=self.loc, errcls_=defaulterrcls):
--> 293                 self.lower_inst(inst)
    294         self.post_block(block)

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_inst(self, inst)
    437             ty = self.typeof(inst.target.name)
--> 438             val = self.lower_assign(ty, inst)
    439             argidx = None

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_assign(self, ty, inst)
    623         elif isinstance(value, ir.Expr):
--> 624             return self.lower_expr(ty, value)
    625 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_expr(self, resty, expr)
   1158         elif expr.op == 'call':
-> 1159             res = self.lower_call(resty, expr)
   1160             return res

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in lower_call(self, resty, expr)
    888         else:
--> 889             res = self._lower_call_normal(fnty, expr, signature)
    890 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in _lower_call_normal(self, fnty, expr, signature)
   1111         else:
-> 1112             argvals = self.fold_call_args(
   1113                 fnty, signature, expr.args, expr.vararg, expr.kws,

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in fold_call_args(self, fnty, signature, pos_args, vararg, kw_args)
    810                                           "when calling %s" % (fnty,))
--> 811             argvals = [self._cast_var(var, sigty)
    812                        for var, sigty in zip(pos_args, signature.args)]

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in <listcomp>(.0)
    810                                           "when calling %s" % (fnty,))
--> 811             argvals = [self._cast_var(var, sigty)
    812                        for var, sigty in zip(pos_args, signature.args)]

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/lowering.py in _cast_var(self, var, ty)
    793             val = self.loadvar(var.name)
--> 794         return self.context.cast(self.builder, val, varty, ty)
    795 

/opt/conda/envs/rapids/lib/python3.9/site-packages/numba/core/base.py in cast(self, builder, val, fromty, toty)
    714         except errors.NumbaNotImplementedError:
--> 715             raise errors.NumbaNotImplementedError(
    716                 "Cannot cast %s to %s: %s" % (fromty, toty, val))

NumbaNotImplementedError: Failed in cuda mode pipeline (step: native lowering)
Cannot cast unicode_type to Masked(string_view): %"inserted.parent" = insertvalue {i8*, i64, i32, i32, i64, i8*, i8*} %"inserted.meminfo", i8* %"arg.delim.6", 6
During: lowering "$12call_method.5 = call $8load_method.3(delim, func=$8load_method.3, args=[Var(delim, 3005899295.py:6)], kws=(), vararg=None, target=None)" at /tmp/ipykernel_2334/3005899295.py (6)

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_2334/3005899295.py in <module>
      6     return row["str_col"].count(delim)
      7 
----> 8 df.apply(delim_count, args=("b",), axis=1)

/opt/conda/envs/rapids/lib/python3.9/contextlib.py in inner(*args, **kwds)
     77         def inner(*args, **kwds):
     78             with self._recreate_cm():
---> 79                 return func(*args, **kwds)
     80         return inner
     81 

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/dataframe.py in apply(self, func, axis, raw, result_type, args, **kwargs)
   4089             raise ValueError("The `result_type` kwarg is not yet supported.")
   4090 
-> 4091         return self._apply(func, _get_row_kernel, *args, **kwargs)
   4092 
   4093     def applymap(

/opt/conda/envs/rapids/lib/python3.9/contextlib.py in inner(*args, **kwds)
     77         def inner(*args, **kwds):
     78             with self._recreate_cm():
---> 79                 return func(*args, **kwds)
     80         return inner
     81 

/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/indexed_frame.py in _apply(self, func, kernel_getter, *args, **kwargs)
   1820             )
   1821         except Exception as e:
-> 1822             raise ValueError(
   1823                 "user defined function compilation failed."
   1824             ) from e

ValueError: user defined function compilation failed.
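
The chain above bottoms out in a cast lookup keyed on the pair (unicode_type, string_view), for which no implementation is registered. The toy model below illustrates that kind of lookup failure. It is not Numba's or cuDF's actual code: CastRegistry, register, and cast are hypothetical names, and the registered ("string_literal", "string_view") pair is made up for illustration.

```python
# Toy model of a cast registry keyed on (from_type, to_type).
# All names here are hypothetical; this is not Numba's implementation.
class CastRegistry:
    def __init__(self):
        self._casts = {}

    def register(self, fromty, toty):
        # Decorator that records an implementation for one type pair.
        def decorator(impl):
            self._casts[(fromty, toty)] = impl
            return impl
        return decorator

    def cast(self, val, fromty, toty):
        try:
            impl = self._casts[(fromty, toty)]
        except KeyError:
            # No implementation for this pair: fail like the trace does.
            raise NotImplementedError(
                f"Cannot cast {fromty} to {toty}: {val!r}"
            ) from None
        return impl(val)


registry = CastRegistry()


@registry.register("string_literal", "string_view")
def literal_to_view(val):
    # Illustrative stand-in for a registered conversion.
    return ("string_view", val)


ok = registry.cast("b", "string_literal", "string_view")

try:
    registry.cast("b", "unicode_type", "string_view")
except NotImplementedError as exc:
    message = str(exc)

print(ok)
print(message)
```

Under this model, supporting scalar string arguments would amount to registering an implementation for the missing pair.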

numba version: '0.55.2'
cudf version: '22.10.00a+242.g387c5ff96d' (built from source)

cc @brandon-b-miller

randerzander added the feature request and Needs Triage labels Sep 21, 2022
brandon-b-miller self-assigned this Sep 21, 2022
brandon-b-miller added the numba and Python labels and removed the Needs Triage label Sep 21, 2022
randerzander changed the title from "[FEA] Support passing scalar args to string_udfs" to "[FEA] Support passing scalar string args to string_udfs" Sep 21, 2022
@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

vyasr commented May 17, 2024

See also #9639
