BUG: Issues with return type hint and/or definition of pandas.core.generic.NDFrame.__hash__
#40013
Closed
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [x] (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Problem description
It is already known that calling `NDFrame.__hash__` always raises an exception, and thus `NDFrame` objects are not hashable. My issue is with the return type hint, which is currently `int`:
pandas/pandas/core/generic.py
Lines 1819 to 1823 in 84a9a65
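For readers without the referenced lines at hand, the pattern being described can be sketched roughly as follows. This is a minimal illustration only: `FrameLike`, `try_hash`, and the error message are hypothetical stand-ins, not the actual pandas source.

```python
import typing


class FrameLike:
    """Hypothetical stand-in for an NDFrame-style class (not pandas code)."""

    def __hash__(self) -> int:  # the annotation promises an int...
        # ...but the body never returns normally; it always raises.
        raise TypeError(f"{type(self).__name__!r} objects cannot be hashed")


def try_hash(obj: object) -> str:
    """Describe what hash(obj) does without letting the exception escape."""
    try:
        return f"hash -> {hash(obj)}"
    except TypeError as exc:
        return f"TypeError: {exc}"


# The declared return type, as an introspection tool or type checker sees it.
declared = typing.get_type_hints(FrameLike.__hash__)["return"]
# What actually happens at runtime.
outcome = try_hash(FrameLike())

print(declared)
print(outcome)
```

The declared return type and the runtime behaviour disagree, which is the mismatch this issue is about.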
I believe that the correct return type hint should be `typing.NoReturn`, according to the example given at https://docs.python.org/3/library/typing.html#typing.NoReturn.
From a static type checker point of view (e.g. `mypy`), it may be even better to override the attribute as `__hash__ = None`, as this is what native Python expects for unhashable objects (see https://github.com/python/cpython/blob/1f433406bd46fbd00b88223ad64daea6bc9eaadc/Lib/_collections_abc.py#L76-L102). Any non-`None` implementation currently produces non-intuitive behaviour; for example, the following passes `mypy` static type checking:
To get behaviour identical to `list` and `dict` objects, it looks like changing `NDFrame` and `Series`* to use `__hash__ = None` will do; after this change, we can get e.g. in `mypy`:
*Note: For some reason, `Series` redefines `__hash__` as the same as `NDFrame.__hash__`, even though `Series` inherits from `NDFrame`.
pandas/pandas/core/series.py
Line 268 in 84a9a65
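The runtime side of this argument can be checked without pandas. The sketch below compares three hypothetical classes (`RaisesInt`, `RaisesNoReturn`, and `NoneHash` are illustrative names, not pandas classes) against `collections.abc.Hashable`, whose subclass check is what the linked `_collections_abc.py` lines implement. It only demonstrates runtime `isinstance` behaviour and makes no claim about the exact output of `mypy`.

```python
from collections.abc import Hashable
from typing import NoReturn


class RaisesInt:
    """Always raises, but is annotated as returning int."""

    def __hash__(self) -> int:
        raise TypeError("unhashable (hypothetical message)")


class RaisesNoReturn:
    """Always raises, and is annotated as never returning."""

    def __hash__(self) -> NoReturn:
        raise TypeError("unhashable (hypothetical message)")


class NoneHash:
    """Marks itself unhashable the way list and dict do."""

    __hash__ = None


def hash_raises(obj: object) -> bool:
    """Return True if hash(obj) raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return True
    return False


# Does the Hashable ABC consider an instance of each class hashable?
results = {
    cls.__name__: isinstance(cls(), Hashable)
    for cls in (RaisesInt, RaisesNoReturn, NoneHash, list, dict)
}
print(results)

# All three custom classes fail at hash() time regardless.
print([hash_raises(cls()) for cls in (RaisesInt, RaisesNoReturn, NoneHash)])
```

Under these assumptions, any class that defines `__hash__` as a method is still reported as `Hashable`, whatever its annotation, while only the `__hash__ = None` variant matches `list` and `dict`.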
Expected Output
(Not applicable)
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit : 7d32926
python : 3.8.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.8.0-44-generic
Version : #50~20.04.1-Ubuntu SMP Wed Feb 10 21:07:30 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_NZ.UTF-8
LOCALE : en_NZ.UTF-8
pandas : 1.2.2
numpy : 1.20.1
pytz : 2020.5
dateutil : 2.8.1
pip : 20.0.2
setuptools : 44.0.0
Cython : None
pytest : 6.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.2
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 8.0.0.dev
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.0
sqlalchemy : 1.3.22
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None