BUG: Passing string as axis
argument leads to incorrect behavior
#5094
Labels
bug 🦗
Something isn't working
P2
Minor bugs or low-priority feature requests
pandas concordance 🐼
Functionality that does not match pandas
Modin version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest released version of Modin.
I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)
Reproducible Example
Issue Description
Many query compiler, dataframe, and partition manager functions assume that the
axis
argument of a function will either be 0, 1,None
, ormodin.core.dataframe.base.dataframe.utils.Axis
. However, the pandas API (and some of our own test cases) allow passingaxis="index"
oraxis="columns"
, which has the potential to break some of our code.The provided example is one such case: on this line in the partition manager, a comparison is made for
axis == 0
, butaxis
here is still the string"index"
. This can manifest itself in more insidious ways in other functions (for example, this line in the dataframe attempts to XORaxis
even though it may be a string), though I'm not sure what the appropriate API call to hit that case would be.Expected Behavior
The above code should match pandas. The
axis
argument should be normalized to an integer orAxis
enum before it is used in comparisons in the query compiler/dataframe/partition manager.Error Logs
N/A
Installed Versions
INSTALLED VERSIONS
commit : 7871c7b
python : 3.10.4.final.0
python-bits : 64
OS : Darwin
OS-release : 21.6.0
Version : Darwin Kernel Version 21.6.0: Wed Aug 10 14:28:23 PDT 2022; root:xnu-8020.141.5~2/RELEASE_ARM64_T6000
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
Modin dependencies
modin : 0.15.0+171.g7871c7bc
ray : 1.13.0
dask : 2022.8.0
distributed : 2022.8.0
hdk : None
pandas dependencies
pandas : 1.5.0
numpy : 1.23.1
pytz : 2022.1
dateutil : 2.8.2
setuptools : 61.2.0
pip : 22.1.2
Cython : None
pytest : 7.1.2
hypothesis : None
sphinx : 5.1.1
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : 0.8.2
fsspec : 2022.7.1
gcsfs : None
matplotlib : 3.5.3
numba : None
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.17.8
pyarrow : 8.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2022.7.1
scipy : 1.9.0
snappy : None
sqlalchemy : 1.4.40
tables : 3.7.0
tabulate : None
xarray : 2022.6.0
xlrd : 2.0.1
xlwt : None
zstandard : None
tzdata : None
The text was updated successfully, but these errors were encountered: