open_mfdataset: Raise if combine='by_coords' and concat_dim=None #5231

TomNicholas · 2021-04-28T19:16:19Z

Fixes a bug that allowed incorrect arguments to be passed to open_mfdataset without complaint.

The combination open_mfdataset(files, combine='by_coords', concat_dim='t') should never have been permitted, and in fact it wasn't permitted until the last part of the deprecation process from the old auto_combine. It makes no sense to pass this combination because the combine_by_coords function does not have a concat_dim argument at all!
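For readers skimming this thread, the check described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the helper name `check_combine_args` and the error wording are assumptions, and it deliberately avoids importing xarray so it runs on its own.

```python
def check_combine_args(combine, concat_dim=None):
    """Hypothetical helper illustrating the argument check discussed above.

    Not xarray's actual implementation; the name and messages are made up.
    """
    if combine not in ("by_coords", "nested"):
        raise ValueError(f"unknown combine mode: {combine!r}")
    # combine_by_coords has no concat_dim argument, so a non-None value
    # can only be a mistake on the caller's side.
    if combine == "by_coords" and concat_dim is not None:
        raise ValueError(
            "concat_dim cannot be used with combine='by_coords'; "
            "use combine='nested' to concatenate along a named dimension."
        )


check_combine_args("nested", concat_dim="t")  # accepted
check_combine_args("by_coords")               # accepted

try:
    check_combine_args("by_coords", concat_dim="t")
except ValueError as err:
    print(f"rejected: {err}")
```

Raising early like this means the caller sees the argument mistake itself rather than a less informative error from deeper in the combine step.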

The effect was pretty benign - the concat_dim arg wasn't really used for anything in that case, and the result of passing dodgy datasets would just be a less informative error. However, there were multiple tests which assumed this behaviour was okay - I had to remove that particular parametrization for a bunch of your join tests @dcherian because they now fail with a different (clearer) error.

I also noticed a related issue which I fixed - internally open_mfdataset was performing a rearrangement of the input datasets that it needs for the case combine='nested', even in the case combine='by_coords'. I hadn't previously realised that we can just skip this rearrangement without issue, so open_mfdataset(combine='by_coords') should be a little bit faster now.
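To make the idea concrete, here is a rough sketch of skipping the nested-only rearrangement when combine='by_coords'. Everything in it is an assumption for illustration (the function names `nested_positions` and `prepare_inputs`, and the use of plain strings in place of datasets); xarray's real internals are organised differently.

```python
def nested_positions(entry, prefix=()):
    """Yield (position, item) pairs for an arbitrarily nested list.

    Hypothetical stand-in for the bookkeeping that combine='nested' needs:
    each item is labelled by its position in the nested input.
    """
    if isinstance(entry, list):
        for i, item in enumerate(entry):
            yield from nested_positions(item, prefix + (i,))
    else:
        yield prefix, entry


def prepare_inputs(datasets, combine):
    """Only do the position bookkeeping when it is actually needed."""
    if combine == "nested":
        return dict(nested_positions(datasets))
    # by_coords determines the order from coordinate values,
    # so the inputs are passed through without rearrangement.
    return list(datasets)


print(prepare_inputs([["a", "b"], ["c", "d"]], combine="nested"))
print(prepare_inputs(["a", "b"], combine="by_coords"))
```

The saving is just the skipped traversal, which is why the speed-up is described as small.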

Closes Same files in open_mfdataset() unclear error message #5230
Tests added
Passes pre-commit run --all-files
User visible changes (including notable bug fixes) are documented in whats-new.rst

dcherian

Thanks @TomNicholas just one minor comment.

xarray/tests/test_backends.py

dcherian · 2021-04-30T12:40:21Z

Flaky pydap tests being flaky...

Thanks @TomNicholas

TomNicholas added 8 commits April 28, 2021 13:43

regression test for raising clearer error

fdbff97

raise error on bad args to open_mfdataset

e4633f9

correct other tests to not pass by_coords and concat_dim together

9fc29ab

refactored to remove unneeded reordering when using combine=by_coords

b940249

Try pre-commit

842588a

black

b76dd28

manual -> nested in docstrings

f1cf9f7

what's new

a957625

TomNicholas requested a review from dcherian April 28, 2021 19:16

dcherian approved these changes Apr 28, 2021


corrected docstring on join

a5e72c9

TomNicholas mentioned this pull request Apr 29, 2021

release v0.18.0 #5232

Closed

13 tasks

reinstated file reverser

07d8ec0

dcherian changed the title ~~Prevent passing combine='by_coords' with concat_dim not none to open_mfdataset~~ open_mfdataset: Raise if combine='by_coords' and concat_dim=None Apr 30, 2021

dcherian merged commit 01b6cc7 into pydata:master Apr 30, 2021

TomNicholas mentioned this pull request May 3, 2021

Warn instead of error on combine='nested' with concat_dim supplied #5255

Merged
