Describe the bug
When a Series has duplicate values in its index, constructing a new Series from it with a new index should not silently reindex.
Steps/Code to reproduce bug
In [1]: import cudf

In [2]: s = cudf.Series(['a', 'b', 'c', 'd'], index=[0, 0, 0, 0])

In [3]: s
Out[3]:
0    a
0    b
0    c
0    d
dtype: object

In [4]: s = cudf.Series(s, index=[10, 11, 12, 13])

In [5]: s
Out[5]:
10    <NA>
11    <NA>
12    <NA>
13    <NA>
dtype: object
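The all-`<NA>` output is consistent with the constructor aligning the old Series to the new index by label rather than by position: none of the labels 10 to 13 occurs among the old labels, so every lookup misses. Below is a minimal pure-Python sketch of that alignment, assuming this is what happens internally. `reindex_by_label` is a hypothetical helper for illustration, not a cuDF or pandas function.

```python
def reindex_by_label(values, old_labels, new_labels, fill=None):
    """Hypothetical helper: align values to new_labels by label lookup."""
    # With duplicate labels, later values overwrite earlier ones in the dict,
    # so the result for a duplicated label is arbitrary.
    lookup = dict(zip(old_labels, values))
    # Labels absent from the old index get the fill value (None stands in for <NA>).
    return [lookup.get(label, fill) for label in new_labels]


result = reindex_by_label(['a', 'b', 'c', 'd'], [0, 0, 0, 0], [10, 11, 12, 13])
print(result)
```

With a unique old index the lookup is well defined. With duplicates, any single value chosen for label `0` would be arbitrary, which is why raising an error is the safer behavior.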
Expected behavior
Raise an error as pandas does:
In [6]: import pandas as pd

In [7]: ps = pd.Series(['a', 'b', 'c', 'd'], index=[0, 0, 0, 0])

In [8]: ps = pd.Series(ps, index=[10, 11, 12, 13])
<ipython-input-8-491d9281f1fc>:1: FutureWarning: reindexing with a non-unique Index is deprecated and will raise in a future version.
  ps = pd.Series(ps, index=[10, 11, 12, 13])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 ps = pd.Series(ps, index=[10, 11, 12, 13])

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/series.py:432, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    430         index = data.index
    431     else:
--> 432         data = data.reindex(index, copy=copy)
    433         copy = False
    434         data = data._mgr

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/series.py:5094, in Series.reindex(self, *args, **kwargs)
   5090         raise TypeError(
   5091             "'index' passed as both positional and keyword argument"
   5092         )
   5093     kwargs.update({"index": index})
-> 5094 return super().reindex(**kwargs)

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/generic.py:5289, in NDFrame.reindex(self, *args, **kwargs)
   5286     return self._reindex_multi(axes, copy, fill_value)
   5288 # perform the reindex on the axes
-> 5289 return self._reindex_axes(
   5290     axes, level, limit, tolerance, method, fill_value, copy
   5291 ).__finalize__(self, method="reindex")

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/generic.py:5309, in NDFrame._reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   5304 new_index, indexer = ax.reindex(
   5305     labels, level=level, limit=limit, tolerance=tolerance, method=method
   5306 )
   5308 axis = self._get_axis_number(a)
-> 5309 obj = obj._reindex_with_indexers(
   5310     {axis: [new_index, indexer]},
   5311     fill_value=fill_value,
   5312     copy=copy,
   5313     allow_dups=False,
   5314 )
   5315 # If we've made a copy once, no need to make another one
   5316 copy = False

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/generic.py:5355, in NDFrame._reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
   5352 indexer = ensure_platform_int(indexer)
   5354 # TODO: speed up on homogeneous DataFrame objects (see _reindex_multi)
-> 5355 new_data = new_data.reindex_indexer(
   5356     index,
   5357     indexer,
   5358     axis=baxis,
   5359     fill_value=fill_value,
   5360     allow_dups=allow_dups,
   5361     copy=copy,
   5362 )
   5363 # If we've made a copy once, no need to make another one
   5364 copy = False

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/internals/managers.py:737, in BaseBlockManager.reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, only_slice, use_na_proxy)
    735 # some axes don't allow reindexing with dups
    736 if not allow_dups:
--> 737     self.axes[axis]._validate_can_reindex(indexer)
    739 if axis >= self.ndim:
    740     raise IndexError("Requested axis not found in manager")

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/indexes/base.py:4316, in Index._validate_can_reindex(self, indexer)
   4314 # trying to reindex on an axis with duplicates
   4315 if not self._index_as_unique and len(indexer):
-> 4316     raise ValueError("cannot reindex on an axis with duplicate labels")

ValueError: cannot reindex on an axis with duplicate labels
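The last frame of the traceback shows the guard pandas applies: it raises when the existing index is not unique and there is at least one label to look up. A rough pure-Python sketch of an equivalent check follows. `validate_can_reindex` is a hypothetical name chosen for illustration, and this is not the code used by cuDF or pandas.

```python
def validate_can_reindex(old_labels, new_labels):
    """Hypothetical guard: refuse to reindex from labels that contain duplicates."""
    # Assumes labels are hashable; a set drops repeated labels.
    has_duplicates = len(set(old_labels)) != len(old_labels)
    # Mirror the condition in the traceback: only raise when there is
    # something to look up.
    if has_duplicates and len(new_labels):
        raise ValueError("cannot reindex on an axis with duplicate labels")


try:
    validate_can_reindex([0, 0, 0, 0], [10, 11, 12, 13])
    message = None
except ValueError as err:
    message = str(err)
print(message)
```

Running a check like this before any label alignment would turn the silent all-`<NA>` result into an explicit error.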
Environment overview (please complete the following information)
Raise error in reindex when index is not unique (#14400)
8deb3dd
Fixes: #14398

This PR raises an error in `reindex` API when reindexing is performed on a non-unique index column.

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Matthew Roeschke (https://github.com/mroeschke)
- Lawrence Mitchell (https://github.com/wence-)

URL: #14400
Raise error in reindex when index is not unique (rapidsai#14400)
314ac0e
Fixes: rapidsai#14398

This PR raises an error in `reindex` API when reindexing is performed on a non-unique index column.

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Matthew Roeschke (https://github.com/mroeschke)
- Lawrence Mitchell (https://github.com/wence-)

URL: rapidsai#14400
Raise error in reindex when index is not unique (#14400) (#14429)
4dc8300
Backport of #14400

Fixes: #14398

This PR raises an error in `reindex` API when reindexing is performed on a non-unique index column.

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Matthew Roeschke (https://github.com/mroeschke)
- Lawrence Mitchell (https://github.com/wence-)

URL: #14400

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Richard (Rick) Zamora (https://github.com/rjzamora)
- Ashwin Srinath (https://github.com/shwina)
- Ray Douglass (https://github.com/raydouglass)
galipremsagar