APE7 - A plan for NDData #8
I'm fairly happy with this proposal, especially the I think the Also, I think there is good reason to have nddata in place since the only natural thing to do with Cubes (given that monitors are flat) is subset them into 1D or 2D slices/projections, and it makes sense to have those subsets behave as similarly as possible to the original cube. |
Main comment: I like the decorator in principle, but feel it should not be part of this APE, as it does not depend on how On the implementation: I like the principle of just defining the properties any function dealing with N-dimensional data should be able to count on being present, but I think that in Python this most logically corresponds to not so much a super-simple implementation but rather an abstract base class [1]. I would suggest actually using an ABC, which other classes can register with. For instance, I see many advantages to basing an image class on
and, presto, all nddata functions will now work with my class, without me needing to subclass from both Note that of course this doesn't change anything for the proposal to have a simple If we go the ABC route, the items that I think do not belong in the metaclass are |
More directly on the implementation: I think for consistency what we do elsewhere in astropy ( |
Another small comment: I see no reason to have a |
Overall: very much in favour of a very simple |
I'm not sure I buy into the idea that Image or Spectrum (or all the other variants possible) should be a subclass. The WCS property has sufficient information to indicate whether or not it contains an image or spectrum, or both. We can put the relevant operations into functions that test to see whether that nddata object has the necessary axes to support the kinds of operations intended. (We could stick all these functions in as generic nddata methods that raise an exception if the object doesn't have the appropriate axis.) This would make using nddata objects a lot simpler in many respects and avoid the complications of having to deal with Image etc. and all the combinations that might be possible with these. Instead, nddata could have a method (or a function) that one can use to see if it is usable as a spectrum (of any spatial dimension), or if it supports image operations (perhaps as a time series, or IFU). So if I want to write a function that integrates a spectrum over a bandpass, it would work with 1d, 2d, and 3d spectral cases and return the appropriate result (e.g., an image for a spectral cube, still nddata, but with a modified wcs and different .data dimensionality). If I want to sum the flux in a spatial region, that should work with any nddata object that has two spatial dimensions, e.g., images, spectral cubes, polarization image sets, or image time series. If we go with the proposed subclasses, we'll end up reimplementing the same methods in many cases. |
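To make the "test the axes, not the class" idea above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `ToyWCS`, `ToyNDData`, `has_spectral_axis` and `integrate_spectrum` are hypothetical names, the WCS is reduced to a list of axis-type strings, and the "integration" is a plain sum along the spectral axis rather than a real bandpass integral.

```python
class ToyWCS:
    """Hypothetical stand-in for a WCS: only records the type of each axis."""

    def __init__(self, axis_types):
        self.axis_types = list(axis_types)


class ToyNDData:
    """Hypothetical stand-in for an nddata object (nested lists as data)."""

    def __init__(self, data, wcs=None):
        self.data = data
        self.wcs = wcs


def has_spectral_axis(nd):
    """Return True if the object can be treated as a spectrum."""
    return nd.wcs is not None and "spectral" in nd.wcs.axis_types


def _sum_axis(data, axis):
    """Sum nested lists along the given axis (toy replacement for an array sum)."""
    if axis == 0:
        if isinstance(data[0], list):
            return [_sum_axis([row[i] for row in data], 0)
                    for i in range(len(data[0]))]
        return sum(data)
    return [_sum_axis(sub, axis - 1) for sub in data]


def integrate_spectrum(nd):
    """Collapse the spectral axis; works for 1d, 2d and 3d inputs alike."""
    if not has_spectral_axis(nd):
        raise ValueError("no spectral axis to integrate over")
    axis = nd.wcs.axis_types.index("spectral")
    remaining = [t for i, t in enumerate(nd.wcs.axis_types) if i != axis]
    # The result is still a ToyNDData, with a reduced WCS and fewer dimensions.
    return ToyNDData(_sum_axis(nd.data, axis), ToyWCS(remaining))


cube = ToyNDData(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
    ToyWCS(["spectral", "spatial", "spatial"]),
)
image = integrate_spectrum(cube)  # two spatial axes remain
```

In this sketch the same function serves a 1d spectrum and a spectral cube, and an object without a spectral axis triggers an exception instead of requiring a dedicated subclass.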
I think this should be clearer about the standard attributes and whether they are optional (and if so, how their absence is handled, e.g., value of None?) |
Just a quick comment to @mhvk regarding the decorator stuff - this APE is not just about how NDData is implemented, but about how we can use it, hence why this is included here. I don't think we can decouple the discussion of implementation of NDData from how it's used. |
@astrofrog - I do think you are trying to do 2 things at the same time here. But underlying this is perhaps my impression that no real progress with Anyway, the bigger question in my mind is whether to do this through an ABC.. |
@mhvk - the decorator way of doing things is not the only solution and anyone can choose to implement functions that use e.g. only NDData. I agree we are doing two things at once in this APE, but I think that's ok given that the scope of the APE is the overall plan for NDData :) If it looks as though the decorator part is going to be controversial (it has not been in previous discussions) then we can consider taking it out. But it sounds like you want to go even more general, which is not incompatible with what is in the APE. Regarding the ABC - I originally had NDData as an ABC in this APE, and @cdeil and @embray convinced me that this isn't needed. In fact, @perrygreenfield's comments above go the other way by saying that NDData could be used directly in a lot of cases. Let's see what others think. |
What if there were some |
Also regarding what @mhvk wrote:
In addition to NDDataABC.register(MyNDData) you could also write NDData.register(MyNDData) then |
In other words, I have argued that there should be some |
@embray - yes, having both the ABC and the minimal implementation is what I had in mind. I hadn't realised though one would also be able to register with (@taldcroft - I think the above discussion on ABC usage is relevant for how we deal with column-like objects as well; effectively, astropy code would have |
With apologies if this is getting too off track from discussion of the broader points of this proposal (and if so we can take it elsewhere), but I think this minimal example demonstrates what @mhvk and I are talking about:

```python
import abc

from astropy.units import Quantity


class NDDataMeta(abc.ABCMeta):
    def __instancecheck__(cls, obj):
        # Anything that satisfies the NDDataBase interface (including
        # registered virtual subclasses) also counts as an NDData instance.
        if cls is NDData and issubclass(type(obj), NDDataBase):
            return True
        return super().__instancecheck__(obj)


class NDDataBase(metaclass=NDDataMeta):
    @property
    @abc.abstractmethod
    def data(self):
        """The raw data contained by this object."""

    @property
    @abc.abstractmethod
    def wcs(self):
        """The WCS mapping data to some coordinate system."""


class NDData(NDDataBase):
    @property
    def data(self):
        return "Hello World!"

    @property
    def wcs(self):
        raise NotImplementedError()


class Image(NDData):
    """
    Some application-specific NDData subclass that inherits useful
    functionality from the base class.
    """


@NDDataBase.register
class MyNDData(Quantity):
    """Not an NDData subclass but implements the appropriate interface."""

    @property
    def wcs(self):
        raise NotImplementedError()
```

Then

```python
>>> d = MyNDData(1)
>>> isinstance(d, NDDataBase)
True
>>> isinstance(d, NDData)
True
>>> isinstance(d, Image)
False
```

Something along those lines. |
@embray - I need to think about this more, but can you explain why:
in the example you showed? I'm very confused as to how this tests as |
Because My example is probably more complicated than necessary. The point is that it may be useful to have an interface defined by an ABC that other classes can match themselves to, while at the same time having a minimally useful |
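For comparison, a simpler variant of the same idea can be sketched without a custom metaclass, using only `abc.ABC` and `register`. This is an illustration, not what the APE specifies: `NDDataInterface`, `MinimalNDData` and `ThirdPartyArray` are hypothetical names, and in this variant a registered class is an instance of the interface only, not of the minimal implementation.

```python
import abc


class NDDataInterface(abc.ABC):
    """Hypothetical interface: the properties nddata functions may rely on."""

    @property
    @abc.abstractmethod
    def data(self):
        """The raw data contained by this object."""

    @property
    @abc.abstractmethod
    def wcs(self):
        """The WCS mapping data to some coordinate system."""


class MinimalNDData(NDDataInterface):
    """Hypothetical minimal, directly usable implementation."""

    def __init__(self, data, wcs=None):
        self._data = data
        self._wcs = wcs

    @property
    def data(self):
        return self._data

    @property
    def wcs(self):
        return self._wcs


@NDDataInterface.register
class ThirdPartyArray:
    """Unrelated class that matches the interface without subclassing it."""

    def __init__(self, data):
        self.data = data
        self.wcs = None


def describe(obj):
    """Toy nddata function that only relies on the interface."""
    if not isinstance(obj, NDDataInterface):
        raise TypeError("expected an object implementing the nddata interface")
    return f"{type(obj).__name__} with wcs={obj.wcs!r}"


native = MinimalNDData([1, 2, 3])
foreign = ThirdPartyArray([4, 5, 6])
```

Note that `register` does not verify that the registered class really provides `data` and `wcs`; that remains the responsibility of whoever registers it.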
@perrygreenfield - regarding the sub-classes, I can see the argument for not needing the sub-classes. On the other hand, a Spectrum class may make sense in that it can define methods that only apply to spectra (for example specific types of interpolation). In addition, the I/O framework needs to know which readers and writers it can use, and relying on the WCS to choose what formats are available may not be very robust. The current I/O framework relies on the class of the data to determine which readers/writers will work. There's also the issue of uncalibrated data that has no WCS - how would you distinguish a position-position image from a 2d spectrum for example? |
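As a rough illustration of why the class matters for I/O, here is a toy registry keyed on format name and data class. It is a stand-alone sketch, not the Astropy I/O registry: `register_reader`, `read`, the `"ascii-1d"` format name and the `Toy*` classes are all hypothetical.

```python
_READERS = {}  # maps (format name, class) -> reader function


def register_reader(fmt, cls, func):
    """Associate a reader function with a format name and a data class."""
    _READERS[(fmt, cls)] = func


def read(cls, source, fmt):
    """Look up a reader by class, so the class decides which formats apply."""
    # Walk the MRO so a subclass can reuse a reader registered for a parent.
    for klass in cls.__mro__:
        reader = _READERS.get((fmt, klass))
        if reader is not None:
            return reader(cls, source)
    raise ValueError(f"no {fmt!r} reader registered for {cls.__name__}")


class ToyNDData:
    def __init__(self, data):
        self.data = data


class ToySpectrum(ToyNDData):
    """Toy subclass standing in for a spectrum-specific class."""


class ToyImage(ToyNDData):
    """Toy subclass standing in for an image-specific class."""


def read_ascii_spectrum(cls, source):
    """Parse whitespace-separated flux values from a string."""
    return cls([float(value) for value in source.split()])


register_reader("ascii-1d", ToySpectrum, read_ascii_spectrum)

spectrum = read(ToySpectrum, "1 2 3", fmt="ascii-1d")
```

With this layout an uncalibrated 2d array with no WCS can still be routed to the right readers, because the lookup depends only on the class the caller asked for.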
@perrygreenfield - just to make sure I understand what you are suggesting - I take it you are not opposed to having classes such as |
@astrofrog - Isn't the I/O aspect best handled by a factory function? It could:
|
@astrofrog - Right, subclasses would be permitted, just not necessary for specifying the kinds of axes it has. |
Just to leave also here Steve Crawford's suggestion on the mailing list to include |
not accept it - for example, if ``wcs`` is set, but the function cannot support
WCS objects, an error would be raised. On the other hand, if an argument in the
function does not exist in the ``NDData`` object or is not set, it is simply
left to its default value.
I think it is problematic to let the decorator fail if an NDData property is set but the function does not support it. I imagine, e.g., a function that shifts a spectrum from heliocentric to the reference frame of a star. All this function needs is the wcs (and maybe meta to record this shift in a keyword), but with this implementation it would also be required to take data, unit and uncertainty arguments just because the decorator fails if not every attribute of the NDData is accepted by the function.
@hamogu - maybe we could make all this parametrizable in the decorator, e.g. you could say that you don't care if there is anything else that is not needed by the function, or that it can ignore specific attributes, etc.
Having said that, the way this decorator is designed, it would not work if the function takes only wcs - it would need to take a data argument too.
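A parametrizable decorator along these lines could be sketched as follows. This is an illustration under stated assumptions, not the decorator proposed in the APE: `support_nddata_sketch`, its `ignore` parameter and `ToyNDData` are hypothetical, and the attribute list is limited to the names mentioned in this thread. As noted above, the wrapped function still takes `data` as its first argument.

```python
import functools
import inspect

# Attribute names taken from the discussion above; the exact set is an assumption.
ATTRIBUTES = ("wcs", "unit", "uncertainty", "meta")


class ToyNDData:
    """Hypothetical stand-in for an NDData-like container."""

    def __init__(self, data, **attributes):
        self.data = data
        for name in ATTRIBUTES:
            setattr(self, name, attributes.get(name))


def support_nddata_sketch(ignore=()):
    """Hypothetical decorator factory; `ignore` lists attributes to drop silently."""

    def decorator(func):
        accepted = set(inspect.signature(func).parameters)

        @functools.wraps(func)
        def wrapper(data, *args, **kwargs):
            if not isinstance(data, ToyNDData):
                return func(data, *args, **kwargs)  # plain input: pass through
            for name in ATTRIBUTES:
                value = getattr(data, name)
                if value is None:
                    continue  # unset attribute: the function keeps its default
                if name in accepted:
                    kwargs.setdefault(name, value)
                elif name not in ignore:
                    raise TypeError(
                        f"{func.__name__} does not accept set attribute {name!r}"
                    )
            return func(data.data, *args, **kwargs)

        return wrapper

    return decorator


@support_nddata_sketch(ignore=("unit", "uncertainty"))
def shift_frame(data, wcs=None, meta=None):
    """Toy function that only cares about wcs and meta."""
    return data, wcs, meta


@support_nddata_sketch()
def strict_shift(data, wcs=None):
    """Same idea, but nothing is ignored: a set unit raises an error."""
    return data, wcs


spectrum = ToyNDData([1.0, 2.0], wcs="toy-wcs", unit="Jy")
result = shift_frame(spectrum)
```

Here `shift_frame` accepts the object even though `unit` is set, while `strict_shift(spectrum)` raises `TypeError`, which mirrors the two behaviours being debated.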
@perrygreenfield - I have been thinking about your comments some more, and to some extent I think we can decide on which sub-classes exist after this APE. So I removed the paragraph that I think you were disagreeing with:
and I have re-worded things in other places to avoid talking about this. With these changes, do you support this APE or do you still have concerns? |
I have implemented changes based on the comments so far: @mhvk: I have removed the setters and mentioned why we don't want to include them. I have also removed the advanced details about the slicing and @mhvk @embray: I can see where you are going with the ABC but I think that it requires additional consideration, and I'm not sure if it's really needed yet. I thought about it and if we implement things as suggested in this APE and later want to add the ABC, it will be possible to do it without breaking any backward compatibility, so I've taken the option of adding a paragraph briefly mentioning this option but saying we leave it for a future discussion beyond the APE. |
@astrofrog -- this might not belong in the APE, but the PR astropy/astropy#2905 currently contains a mixin class for arithmetic and a subclass that provides the functionality of the old NDData. I could see:
I'm fine with either one, so count this as a +1 |
I am very happy with the overall APE, but I still feel very strongly about including I would be much happier with including the following (or some variation of the following):
I am not happy with the current APE that does not include this as a base property and also with the current language in the APE. First, I don't think we should place any restrictions on what the uncertainty can be (it could be a function, a probability distribution, and possibly many things that we can't even think of right now). From my view, I think not specifying what it will be is a feature. We should never put any limitations on how In addition, we really want to prevent people from calling it |
OK, so this is the important point. Perhaps it's worth saying up front that NDDataBase & NDData are a work in progress that are liable to acquire more specific definitions as the community thrashes out what works in practice? Also, separately from the APE, maybe encourage people to work on a small number of initial subclasses for specific data types rather than doing their own thing based on NDDataBase/NDData (without intending to discourage alternative experiments for a good reason). Regarding the wording for #3, maybe just consider changing "could" to "will" or "will probably". |
I'd be in favor of adding a sentence saying that if we, the community, identify core functionality that makes sense in these classes, there isn't any reason why it couldn't be added in future, but I think that should probably require an APE or at least a strong consensus. |
Fair enough -- you certainly don't want it evolving erratically -- but I'd hope in the long run it will be expected to evolve further towards encouraging compatibility over lots of similar-but-not-quite-compatible conventions. |
Sorry about the silence -- will have time to comment/update later today. |
@jehturner -- thanks for condensing, and apologies for the delay in responding.
I think the main reason that
My experience with ccdproc is that anything beyond dict-like is problematic. In ccdproc we started by requiring the meta-data to be a FITS header; that was a hassle if your data wasn't coming from a FITS file and resulted in warning messages if one, e.g., used a long key. Then we tried using something like an This approach for |
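The "dict-like and nothing more" requirement can be sketched as below. The names are hypothetical (`ToyNDData`, `long_keys`) and this is not ccdproc's or Astropy's implementation; the only external fact relied on is that standard FITS keywords are limited to 8 characters, which is why format-specific handling is pushed into a separate helper instead of the container.

```python
from collections.abc import Mapping


class ToyNDData:
    """Hypothetical container that only requires `meta` to be dict-like."""

    def __init__(self, data, meta=None):
        if meta is None:
            meta = {}
        elif not isinstance(meta, Mapping):
            raise TypeError("meta must be dict-like (a Mapping)")
        self._data = data
        self._meta = meta

    @property
    def data(self):
        return self._data

    @property
    def meta(self):
        return self._meta


def long_keys(meta, limit=8):
    """Hypothetical helper for a FITS writer wrapper: keys needing special care."""
    return [key for key in meta if len(key) > limit]


nd = ToyNDData([1, 2, 3], meta={"EXPTIME": 30.0, "A_VERY_LONG_KEY": 1})
needs_care = long_keys(nd.meta)
```

The container stores the long key without complaint; only the I/O wrapper for a particular format has to decide what to do with it.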
I would strengthen this a bit: I hope it does lead to future APEs once the updated |
Address all remaining line comments on APE
but does only very limited input validation.

* It provides generic ``read`` and ``write`` methods that connect to the I/O
  registry, as for the ``Table`` class.
@mwcraig - damn I just noticed these two bullet points. The second is no longer true, right? Shall I re-word to say
The proposed ``NDDataBase`` simplifies the current ``NDData`` class to the extreme,
such that it essentially only defines the properties needed for ``NDData``?
yes
👍 |
👍 (Previously I argued for the alternative: remove |
👍 |
👍 |
@mwcraig Thanks for the edits. I haven't actually thought hard about whether the existing FlagCollection is the best way to achieve what it does & just need a way to propagate multiple flags, so if FlagCollection is currently not being used it might be worth revisiting for the sub classes. I've tried to do FITS → meta → FITS previously for another project and it was clear that an ordered dictionary with comments as well as values would be the minimum needed to preserve the information usably. Did you find that still insufficient? I can see you might also need to track empty space to reproduce FITS files exactly (I'm sure Erik could comment definitively but maybe it's a bit of a tangent here at this point). I was thinking that might be better than nothing (otherwise the consistency provided may still be inadequate to write interoperable code) but it wouldn't accommodate every format and for now it seems this discussion has moved onto finalizing what's already here for the base class. Sorry, I tried to nest this under your comment, out of the way of the votes, but it keeps ending up at the bottom. |
@jehturner -- no worries! The bigger problem I ran into was long keywords getting mangled (because a dict doesn't know about HIERARCH) and CONTINUE cards for long values getting mangled. At that point I was going to have to teach the metadata to understand FITS, so I went the route of being more permissive, and writing short I/O wrappers. Standardizing on calling the metadata |
I see. Yes, at least it facilitates 1 wrapper per format for I/O (maybe also for manipulating the metadata non-trivially, but whether small project developers would consistently find time to do that I'm not sure). |
@mhvk @embray @keflavich - since you all commented above, any final votes? |
@crawfordsm @mwcraig - since I'm 👍 on the APE as it is now. |
+1 |
+1 |
👍 |
I like the APE too. I have one quick textual issue, which I'll comment on in-line, and a slightly larger one: I think the APE prescribes some things for the abstract base class that I'm not sure how to enforce in practice. In particular, a very nice thing is that one can set |
on-the-fly.

* ``unit`` - the unit of the data values, which will be internally
  represented as an Astropy Unit. Sub-classes could choose to connect this to
Here, the first instance of "Sub-classes could choose to connect this to data.unit" should be deleted, as it is repeated further down.
Wasn't sure how voting worked for APE authors, but yes, 👍 |
Seeing all the votes in favor, the coordination committee has accepted this proposal. @mhvk - I'm 👍 on using |
@eteq - yes, agreed, this is an implementation detail. |
This is ready for review. Comments are solicited until Friday 14th November, after which a vote will be taken on whether to accept or reject this APE.
(note, this has been renamed to APE7 since there is already an APE6 proposed in #7)