join type changes silently when left-joining with a list of dataframes #19607
Comments
I think this is the same issue as #17257, and maybe also related to #16304. You are welcome to have a look and open a PR to fix it. These should be the same.
Well, I had a look at #17257 before reporting this. IMHO, they are not related: there, it's the [...] Seems easy to fix, but might break existing code. I will look into how to fix it.
Prevents changing the join type from 'left' to 'outer' when merge() is used. Fixes pandas-dev#19607.
…uence of dataframes (pandas-dev#19607)
Code sample, copy-pastable
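A minimal sketch of the scenario this issue describes, not the reporter's exact sample. The frames `d1` and `d2`, their column names and their index labels are made up for illustration. `d1` has a non-unique index, and `d2` contains a label (`'z'`) that is absent from `d1`, so an outer join would add a row that a left join must not.

```python
import pandas as pd

# Hypothetical data: d1 has a non-unique index ('x' appears twice),
# d2 carries a label 'z' that does not exist in d1.
d1 = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "x", "y"])
d2 = pd.DataFrame({"b": [10, 20]}, index=["x", "z"])

# Join with a *sequence* of dataframes, as in the report.
joined = d1.join([d2], how="left")

# Reference result: an explicit left merge on the indices.
merged = pd.merge(d1, d2, how="left", left_index=True, right_index=True)

# Labels present in the join result but not in d1; a true left join has none.
extra_labels = set(joined.index) - set(d1.index)
same_as_merge = joined.equals(merged)

print("pandas version:", pd.__version__)
print("extra labels from join():", extra_labels)
print("join() matches merge():", same_as_merge)
```

On an affected version (the report is against pandas 0.21.0), `extra_labels` would be expected to contain `'z'` and `same_as_merge` to be `False`. With a correct left join, `extra_labels` is empty.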
Problem description
`df.join()` silently changes a 'left' join to an 'outer' join when these conditions are met: `join()` is called with a sequence of dataframes, e.g. `d1.join([d2], how='left', ...)`
Note that the equivalent call to `merge(d1, d2, how='left', ...)` works just fine.
When looking at the `join()` code, I see `concat()` being discarded when dataframe indices are not unique and `merge()` being called instead. This would work perfectly if `how='left'` had not just been changed to `how='outer'` a couple of lines above.
Expected Output
`d1.join(d2, how='left', ...)` should give the same result as `pd.merge(d1, d2, how='left', ...)`. The join type should not be changed.
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 44 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None