Remove code related to deprecated "autoclose" option #501

Merged
merged 2 commits into MPAS-Dev:develop from remove_autoclose
Dec 5, 2018

Conversation

xylar
Collaborator

@xylar xylar commented Dec 2, 2018

xarray no longer supports this option.

@xylar xylar added the clean up label Dec 2, 2018
@xylar xylar self-assigned this Dec 2, 2018
@xylar xylar requested a review from pwolfram December 2, 2018 07:34
@xylar
Collaborator Author

xylar commented Dec 2, 2018

@pwolfram, I'm getting warnings that this option will be removed in the next version of xarray, and I want to get this taken care of before we run into trouble down the road.

I'm not sure what this means for xarray functionality but by moving to NCO for several of the operations we were once using xarray for (e.g. the MOC) I don't think we need to worry about autoclose anymore.

@rabernat

rabernat commented Dec 5, 2018

I'm not sure what this means for xarray functionality but by moving to NCO for several of the operations we were once using xarray for (e.g. the MOC) I don't think we need to worry about autoclose anymore.

Just reiterating a comment that I have made in the past that the xarray developers would be very keen to hear about what motivated this change and how xarray was falling short. This will help us continue to improve xarray.

Contributor

@pwolfram pwolfram left a comment

I took a quick look at this PR. Our code has largely moved away from using xarray in favor of NCO for many applications, so this may not be a big deal. However, I'm a little cautious about completely removing this capability and would recommend replacing it with the current best-practice approach advocated by the xarray team.

I'm following up at pydata/xarray#2592 on suggestions for handling a large number of datasets in xarray in terms of "best practices" so that we may be able to maintain some of this functionality if needed.

I suspect in the future we will still want to have these hooks to use xarray as long as they aren't too difficult to maintain in the short term.

@pwolfram
Contributor

pwolfram commented Dec 5, 2018

@rabernat, a key thing was performance on things like climatological anomalies, which was noted in the past as a serious issue needing rectification. Some other issues related to complicated computation stacks also came into play, e.g., having multi-dimensional groupby operations. I'm not sure if those were implemented at the time, but a quick look suggests some degree of this functionality has been added, e.g., http://xarray.pydata.org/en/stable/groupby.html#multidimensional-grouping. Another issue was the need to get this up and running and produce stable results, so we had to go the NCO route over the last 18 months.
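
For reference, xarray's multidimensional grouping (linked above) looks roughly like the following minimal sketch; this is purely illustrative and not code from MPAS-Analysis:

import numpy as np
import xarray as xr

# a small DataArray with 2-D latitude/longitude coordinates
da = xr.DataArray(
    np.arange(4.0).reshape(2, 2),
    dims=("ny", "nx"),
    coords={
        "lat": (("ny", "nx"), [[10.0, 10.0], [20.0, 20.0]]),
        "lon": (("ny", "nx"), [[30.0, 40.0], [40.0, 50.0]]),
    },
)

# group over the multi-dimensional "lon" coordinate and reduce
print(da.groupby("lon").sum())

# or bin the 2-D coordinate into intervals before reducing
print(da.groupby_bins("lon", bins=[25, 45, 65]).mean())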

It sounds like we are due a revisiting of this conversation. I'm not sure whether GitHub or a Skype call would be better, but I am more than happy to follow up with you, @mrocklin, and @jhamman on these issues. I just haven't had adequate time to follow up over the last 3-6 months or so, but I will make the time now.

Contributor

@pwolfram pwolfram left a comment

@xylar, I'm thinking that replacing instances of autoclose=True with file_cache_maxsize=1200 would be a good default (100 years of monthly files) that could be modified via the config files we use. I would recommend that we not remove support for xarray's opening of many datasets, in favor of this more general approach.
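
As a rough sketch of what that replacement might look like (the value here is just the suggested default and would come from our config files in practice):

import xarray as xr

# suggested default: room for 100 years of monthly files; in practice this
# would be read from the analysis config file
file_cache_maxsize = 1200

# replaces the old per-call autoclose=True: xarray keeps at most this many
# files open in its global least-recently-used cache
xr.set_options(file_cache_maxsize=file_cache_maxsize)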

@milenaveneziani
Collaborator

@rabernat, there was also the "Too many open files" issue that we tried to solve, but the solution was machine/memory dependent.

@rabernat

rabernat commented Dec 5, 2018

Thanks for the feedback folks. Very helpful.

If you can boil any of these down to reproducible examples, it would be great to have xarray issues opened so we can keep track of them. But it sounds like these issues are a bit more complex and involve specific aspects of the datasets / computational environment.

The recent refactoring of xarray's file management will hopefully solve the "too many open files" issue entirely. Also, the parallel=True option on open_mfdataset can really speed up opening large filesets (http://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html#xarray.open_mfdataset).
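
For example, a minimal sketch (the file pattern here is hypothetical, and parallel=True requires dask):

import xarray as xr

# open many files as a single dataset; with parallel=True the per-file open
# and preprocessing steps are performed in parallel using dask
ds = xr.open_mfdataset("timeSeries.*.nc", parallel=True)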

Of course you have a deadline and have to make things work with the tools as they are. In the long term, I hope xarray can improve to the point that it becomes useful for your needs.

@jhamman

jhamman commented Dec 5, 2018

@pwolfram - happy to revisit this conversation in the near future.

@xylar - it looks like you'll be at AGU (link). @rabernat and I will be there as well. Maybe worth intersecting at some point.

...and use it to set the file_cache_maxsize option in xarray.
@xylar
Collaborator Author

xylar commented Dec 5, 2018

@pwolfram, let me know if my latest commit addresses your concern.

@xylar
Collaborator Author

xylar commented Dec 5, 2018

@jhamman and @rabernat, yes, it would be great to chat at AGU next week.

Contributor

@pwolfram pwolfram left a comment

I'm happy now that we have transitioned from autoclose to file_cache_maxsize=1200. Thanks @xylar for keeping the xarray door open.

# maximum number of open files to hold in xarray's global least-recently-used
# cache. This should be smaller than your system's per-process file descriptor
# limit, e.g., ulimit -n on Linux
file_cache_maxsize = 1200
Contributor

Perfect!

@xylar
Collaborator Author

xylar commented Dec 5, 2018

This worked for me when I tested with Python 3.7 (with the bug fix from #500).

@xylar xylar merged commit 9fe4b14 into MPAS-Dev:develop Dec 5, 2018
@xylar xylar deleted the remove_autoclose branch December 5, 2018 22:47
@xylar
Collaborator Author

xylar commented Dec 5, 2018

Thanks, @pwolfram

xylar added a commit that referenced this pull request Jan 10, 2019
Fix unsupported file_cache_maxsize for older xarray

Older versions of xarray are not supported after #501, which is undesirable and unnecessary. All that is needed to fix this is to not raise an exception if file_cache_maxsize is not supported in a given xarray version.
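
A minimal sketch of that approach (the value would come from the config file, as above):

import xarray as xr

file_cache_maxsize = 1200  # read from the config file in practice

try:
    xr.set_options(file_cache_maxsize=file_cache_maxsize)
except ValueError:
    # older xarray versions do not recognize this option; fall back to their
    # default file handling instead of raising an exception
    pass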