Add (implicit) handling for torch tensors in is_scalar (#14623)
PyTorch tensors advertise that they support the number protocol, so pd.api.types.is_scalar(torch_tensor) returns True even when the tensor has a non-empty shape. This trips up some of our data ingest: as_index checks whether its input is a scalar (and raises) before handing off to as_column. To handle this, when pandas' is_scalar returns True, additionally check that the object's shape attribute, if it has one, is empty.
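
The check described above can be sketched in pure Python without PyTorch or pandas installed. This is an illustration only, not the cudf implementation: `FakeTensor`, `looks_like_number`, and `is_scalar_with_shape_check` are hypothetical names, and `looks_like_number` is only a rough approximation of the number-protocol test, standing in for `pd_types.is_scalar`.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: number-like, but carries a shape."""

    def __init__(self, values, shape):
        self.values = list(values)
        self.shape = tuple(shape)

    def __float__(self):
        # The method exists on the type regardless of how many elements
        # the instance holds, which is why a protocol check says "number".
        if len(self.values) != 1:
            raise TypeError("only one-element tensors convert to float")
        return float(self.values[0])


def looks_like_number(val):
    # Assumption: approximates a number-protocol check by looking for
    # __float__ or __index__ on the type. Not the pandas implementation.
    return hasattr(type(val), "__float__") or hasattr(type(val), "__index__")


def is_scalar_with_shape_check(val):
    # Same extra condition as the commit: a "shape" attribute, when
    # present, must be empty. Objects without one default to ().
    return looks_like_number(val) and len(getattr(val, "shape", ())) == 0


vector = FakeTensor([1.0, 2.0, 3.0], shape=(3,))
zero_d = FakeTensor([1.0], shape=())

print(looks_like_number(vector))           # number-protocol check alone
print(is_scalar_with_shape_check(vector))  # with the shape guard
print(is_scalar_with_shape_check(zero_d))
print(is_scalar_with_shape_check(3.5))
```

The `getattr(val, "shape", ())` default means ordinary Python numbers, which have no `shape`, are unaffected by the extra condition.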

See also:

- pytorch/pytorch#99646
- pandas-dev/pandas#52701

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)

URL: #14623
wence- authored Dec 13, 2023
1 parent 420dc5d commit a894ca0
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion python/cudf/cudf/api/types.py
@@ -135,7 +135,17 @@ def is_scalar(val):
             cudf._lib.scalar.DeviceScalar,
             cudf.core.tools.datetimes.DateOffset,
         ),
-    ) or pd_types.is_scalar(val)
+    ) or (
+        pd_types.is_scalar(val)
+        # Pytorch tensors advertise that they support the number
+        # protocol, and therefore return True for PyNumber_Check even
+        # when they have a shape. So, if we get through this, let's
+        # additionally check that if they have a shape property that
+        # it is empty.
+        # See https://github.com/pytorch/pytorch/issues/99646
+        # and https://github.com/pandas-dev/pandas/issues/52701
+        and len(getattr(val, "shape", ())) == 0
+    )


 def _is_scalar_or_zero_d_array(val):