Allow custom dataset names in 'generic_image' reader and fix nodata handling #1560

ch-k · 2021-02-22T17:34:24Z

This PR is about two issues. Sorry about mixing it, but it concerns the same file.

At the moment it is not possible to use the GenericImageFileHandler for reader configs that define multiple dataset names. This is an example:

reader:
    name: world_generic_image
    description: generic image reader
    reader: !!python/name:satpy.readers.yaml_reader.FileYAMLReader
    sensors: [world-images]
    default_channels: [image_goes16, image_goes17]

datasets:
  image_goes16:
    name: image_goes16
    file_type: graphic_goes16
  image_goes17:
    name: image_goes17
    file_type: graphic_goes17
  image_zds:
    name: image_zds
    file_type: graphic_zds
  image_iodc:
    name: image_iodc
    file_type: graphic_iodc
  image_himawari8:
    name: image_himawari8
    file_type: graphic_himawari8

file_types:
    graphic_goes16:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'goes16_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'goes16_{filename}.tif'
    graphic_goes17:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'goes17_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'goes17_{filename}.tif'
    graphic_zds:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'ZDS_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'ZDS_{filename}.tif'
    graphic_iodc:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'IODC_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'IODC_{filename}.tif'
    graphic_himawari8:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'himawari8_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'himawari8_{filename}.tif'

That doesn't work because a hardcoded name image (https://github.com/pytroll/satpy/blob/master/satpy/readers/generic_image.py#L85) is used to set the data and the name used in the reader config is used to get the data. Therefore only one dataset with name image is possible. Otherwise the reader complains that the dataset is not available.

The fill value of GeoTIFF files is not used to mask the data. With this PR, fill value pixels are set to nan as it is already the case when there are alpha channels present.

Closes #xxxx
Tests added
Fully documented
Add your name to AUTHORS.md if not there already

codecov · 2021-02-22T17:45:21Z

Codecov Report

Merging #1560 (0f93397) into master (b564e0c) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1560      +/-   ##
==========================================
+ Coverage   92.62%   92.65%   +0.02%     
==========================================
  Files         258      258              
  Lines       37847    37929      +82     
==========================================
+ Hits        35057    35142      +85     
+ Misses       2790     2787       -3

Flag	Coverage Δ
behaviourtests	`4.83% <0.00%> (-0.02%)`	⬇️
unittests	`92.78% <100.00%> (+0.02%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
satpy/readers/generic_image.py	`97.56% <100.00%> (+4.22%)`	⬆️
satpy/tests/reader_tests/test_generic_image.py	`98.90% <100.00%> (+0.57%)`	⬆️
satpy/tests/test_scene.py	`99.44% <0.00%> (+<0.01%)`	⬆️
satpy/scene.py	`92.29% <0.00%> (+0.11%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b564e0c...0f93397. Read the comment docs.

stickler-ci · 2021-02-22T17:50:27Z

satpy/tests/reader_tests/test_generic_image.py

+    def test_GenericImageFileHandler_datasetid(self):
+        """Test direct use of the reader."""
+        from satpy.readers.generic_image import GenericImageFileHandler
+        from satpy.readers.generic_image import mask_image_data


F401 'satpy.readers.generic_image.mask_image_data' imported but unused

stickler-ci · 2021-02-22T17:50:27Z

satpy/tests/reader_tests/test_generic_image.py

+        foo = make_dataid(name='image-custom')
+        self.assertTrue(reader.file_content)
+        dataset = reader.get_dataset(foo, None)
+        self.assertTrue(isinstance(dataset, xr.DataArray))


W292 no newline at end of file

mraspaud · 2021-02-22T18:25:10Z

Could you explain the motivation behind this change?

ch-k · 2021-02-22T18:50:31Z

Could you explain the motivation behind this change?

Sorry, should be a draft only. My primary objectiv was to create the DWD world composite with a new trollflow2/satpy chain. I have geotiff files of all satellites on disk and segment gatherer should be used to publish an event when everything is available. Then trollflow2/satpy should use the generic image reader to read all images together into the same scene object. I've created a custom reader.yml there I specified each satellites' image as channel. Afterwards I'll try to create a new compositor to stich the images. To read all images together, this patch is necessary. I can provide the reader.yml when I'm back in office.

djhoese · 2021-02-22T18:56:57Z

Ah I see, the main problem is:

self.file_content['image'] = data

This doesn't use the key provided in the YAML at all. Maybe it would be better to copy the file_content['image'] DataArray and reassign the .attrs['name'] to the provided key's name. Maybe there should be a key in the info (that comes from the YAML) that specifies the file_key. This could default to image and would provide flexibility in the future if we wanted to make it possible to load other things besides the image (the alpha channel maybe, I don't know).

Just some brainstorming.

ch-k · 2021-02-23T17:27:37Z

Here is an example of the reader config that I'd like to use to read the images into the same scene (GOES16 + 17 is only the first step):

reader:
    name: world_generic_image
    description: generic image reader
    reader: !!python/name:satpy.readers.yaml_reader.FileYAMLReader
    sensors: [images]
    default_channels: [image_goes16, image_goes17]

datasets:
  image_goes16:
    name: image_goes16
    file_type: graphic_goes16
  image_goes17:
    name: image_goes17
    file_type: graphic_goes17

file_types:
    graphic_goes16:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'goes16_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'goes16_{filename}.tif'
    graphic_goes17:
        file_reader: !!python/name:satpy.readers.generic_image.GenericImageFileHandler
        file_patterns:
         - 'goes17_{filename}{start_time:%Y%m%d%H%M}.tif'
         - 'goes17_{filename}.tif'

ch-k · 2021-02-23T17:38:48Z

Ah I see, the main problem is:
self.file_content['image'] = data
This doesn't use the key provided in the YAML at all. Maybe it would be better to copy the file_content['image'] DataArray and reassign the .attrs['name'] to the provided key's name. Maybe there should be a key in the info (that comes from the YAML) that specifies the file_key. This could default to image and would provide flexibility in the future if we wanted to make it possible to load other things besides the image (the alpha channel maybe, I don't know).

Just some brainstorming.
At first, I tried to change the static 'image' to something like

    def read(self):
        """Read the image."""
        dataset = rasterio.open(self.finfo['filename'])
...
        self.file_content[key['name']] = data

... until I noticed that it's not available in this stage. I've thought about moving the whole reading stuff to get_dataset but I'm unsure if that's reasonable.

djhoese · 2021-02-23T18:40:04Z

Yeah hard to say. @pnuu I think wrote this reader originally and mind have ideas for what can and can't be done on __init__ of the file handler. I don't think my idea or your idea would be bad. No reason we couldn't do both (from the little I know about this reader). We definitely want to support your use case as I think this was kind of the intention of making this functionality a reader in the first place.

pnuu · 2021-02-24T08:16:36Z

Sorry, this might not be very helpful in respect to the changes needed to do this purely in Satpy.

My primary objectiv was to create the DWD world composite with a new trollflow2/satpy chain. I have geotiff files of all satellites on disk and segment gatherer should be used to publish an event when everything is available.

This sounds very much like create_global_mosaic.py we had in pytroll-collectors still few years back, but didn't want to keep it working with the current Satpy and decided to remove it all together. If you want to have a look, see this and this and this.

It collected the images similar to segment gatherer, then cut the input images very naïvely at the GEO satellites' mid-point longitudes and put the images together. It required the input images to be in latlon grid, didn't do any blending, and had some other simplifications.

My current solution is to rely on putting the images (in what ever projection they are generated in) to WMS and let the client software do the requests so that the images are in correct projection, and let it also handle (possible) blending, arrangement and everything else.

ch-k · 2021-02-24T08:30:48Z

Sorry, this might not be very helpful in respect to the changes needed to do this purely in Satpy.

My primary objectiv was to create the DWD world composite with a new trollflow2/satpy chain. I have geotiff files of all satellites on disk and segment gatherer should be used to publish an event when everything is available.

This sounds very much like create_global_mosaic.py we had in pytroll-collectors still few years back, but didn't want to keep it working with the current Satpy and decided to remove it all together. If you want to have a look, see this and this and this.

It collected the images similar to segment gatherer, then cut the input images very naïvely at the GEO satellites' mid-point longitudes and put the images together. It required the input images to be in latlon grid, didn't do any blending, and had some other simplifications.

My current solution is to rely on putting the images (in what ever projection they are generated in) to WMS and let the client software do the requests so that the images are in correct projection, and let it also handle (possible) blending, arrangement and everything else.

@pnuu I remember that we discussed about it in the past. Currently we still have a similar solution running. It uses some reading and writing functions provided by good old mpop. It has to be migrated. I hope to use as much satpy/trollflow2 functionality as possible. It should be parametrized as all the other stuff. The core mosaic creation would be done in a new satpy compositor. Geoserver is not an option for our use case.

pnuu · 2021-02-24T09:40:36Z

So would the use be like this?

fnames = ["goes16_true_color_202102241200.tif", "goes17_true_color_202102241200.tif"]
scn = Scene(reader="world_generic_reader", filenames=fnames)
scn.load(["image_goes16", "image_goes17"])

And another question: how are you planning to handle resampling so that the intertwined pixels do not cause weird effects where the images overlap? The plain nearest neighbour resampling will just blindly use the closest pixel to each target location, so the illumination, parallax and (very slightly) calibration will cause artifacts on the overlapping areas. Maybe MultiScene (several separate scenes) and MultiScene.blend() would handle this readily in a nice way?

ch-k · 2021-02-24T09:51:05Z

So would the use be like this?

fnames = ["goes16_true_color_202102241200.tif", "goes17_true_color_202102241200.tif"]
scn = Scene(reader="world_generic_reader", filenames=fnames)
scn.load(["image_goes16", "image_goes17"])

Yes, exactly. But we start with single channel images..

And another question: how are you planning to handle resampling so that the intertwined pixels do not cause weird effects where the images overlap? The plain nearest neighbour resampling will just blindly use the closest pixel to each target location, so the illumination, parallax and (very slightly) calibration will cause artifacts on the overlapping areas. Maybe MultiScene (several separate scenes) and MultiScene.blend() would handle this readily in a nice way?
In our use case, no resampling is necessary. The input images are already resampled to the target area:

ch-k · 2021-02-24T09:52:51Z

Maybe MultiScene (several separate scenes) and MultiScene.blend() would handle this readily in a nice way?

Does trollflow2 work with MultiScene?

pnuu · 2021-02-24T10:51:15Z

Does trollflow2 work with MultiScene?

No, not at the moment. Creating a plugin that supports the usage of it should be pretty simple.

ch-k · 2021-02-26T10:50:42Z

I've added some code to handle the fill_value when reading GeoTIFFs.
Sorry for mixing it with the other issue. It's difficult to manage changes in different branches when I need all of them to get my feature running (world comp).

@pnuu I think wrote this reader originally and mind have ideas for what can and can't be done on init of the file handler. I don't think my idea or your idea would be bad. No reason we couldn't do both (from the little I know about this reader). We definitely want to support your use case as I think this was kind of the intention of making this functionality a reader in the first place.

@pnuu: What do think? I see my PR more as a fix than an enhancement.

djhoese

Couple suggestions to make this more flexible...I think.

djhoese · 2021-03-01T12:26:32Z

satpy/readers/generic_image.py

@@ -117,4 +117,9 @@ def mask_image_data(data):
                                for i in range(data.shape[0])])
        data.data = masked_data
        data = data.sel(bands=BANDS[data.bands.size - 1])
+    elif hasattr(data, 'nodatavals') and data.nodatavals:
+        data = data.astype(np.float64)


Do you have to convert to float here? If we can keep the initial data type and use a fill value dependent on that data type then it may work for more situations. What if you copied the nodatavals to .attrs['_FillValue'] for integer types, but replaced them with np.nan for floats? This would match what we do in other readers for "category" products.

I'm unsure what's better option. Unfortunately, NaN is only possible with floats and I have the feeling that most of the compositors only work with NaNs as masks. It seems that only ColormapCompositor cares about "_FillValue". That's why I've done the float conversation as it is already the case if alpha channels are present.
Maybe I should introduce a new reader parameter and let the user decide how to handle nodatavals?

I just stumbled across the following paragraph in the satpy docs (https://satpy.readthedocs.io/en/latest/dev_guide/custom_reader.html)

Single precision floats (np.float32) is a good compromise, as it has 23 significant bits (mantissa) and can thus represent 16 bit integers exactly, as well as keeping the memory footprint half of a double precision float.
One commonly used method in readers is xarray.DataArray.where() (to mask invalid data) which can be coercing the data to np.float64. To ensure for example that integer data is coerced to np.float32 when xarray.DataArray.where() is used, you can do ..
So converting to floats to mask data with NaN seems a common practice. As noted in the docs, float32 would be better. Conversion to floats is already done few lines above when alpha channels are handled.

I think that is more specifically talking about integer data that is in the file as integers and needs to be scaled (ex. data * scale_factor + add_offset) to floats, but also has a fill value. In that case you'd need to do the .where call to mask out the floats with some other value. In Satpy that is NaN for floating point data and any _FillValue for integer data. In older versions of numpy (I think newer versions have fixed this) if you do .where with np.nan it would cast 32-bit floats to 64-bit floats. Now it knows that you really mean 32-bit float -> 32-bit float NaN. For integers, they would always be cast to 64-bit floats so the note in the documentation is trying to say "there is no reason for that so force 32-bit float conversion".

The note should probably be updated to refer to integer fields that are meant to stay integers like cloud type products or in this case 8-bit unsigned integer RGB images. If an RGB is being read in by Satpy and is provided as floating point data (instead of as an integer) then it should probably be normalized to a 0-1 range as that is the assumption by the trollimage package when saving images.

I wonder how to go one here. When I have an image (i.e. 1 band, 8bit int), I do not know without prior knowledge whether the values represent categories or not. Additionally, maybe scaling (according to parameters store in the file itself) is wanted or normalization to [0..1] - or nothing at all. At the moment (without the changes in this PR) no scaling/normalization/calibration is applied. They are returned as stored in the file.
I think the behaviour of the reader should be parametrized. I'm not very familiar with this but looking at other readers, the dataset sections in the reader yaml might be a way. Stuff like calibration, units, fill_value, etc. seems to be specified there.
Maybe we can introduce a parameter like mask_type: floatNan to define what the reader should do.

I don't want to risk any problems for other users of this reader. Therefore default behaviour would be to do nothing (no type conversion and nan-masking) and the "float-nan-masking" is only active when it's enabled in the reader config.
Are you okay with that?

Things like scale factor and add offset in a geotiff are not standard so I don't think we should add any support for that. Support for that should be added in a separate reader (a custom reader created by the user).

Indeed there are a lot of attributes that are not standarized (geotiff and beyond). I do not really care about the scaling or offsets at the moment (only masks :-) ). But I think that readers in general should provide the possibility to output data that is comparable to data provided by other readers. They should intruduce a level of abstraction from the files. It's not very convenient for users of satpy when you have to adjust/modify compositors when switching between different readers. IMHO readers (that share the same API) should also return similar data structures or at least can be configured to do so.

In other readers, things like units are just metadata to fill in the space of missing information in the file. In those cases, the reader is specific to a particular implementation of a data file (VIIRS SDR in the HDF5 format versus generic HDF5 files) or are representing a standard (CF compliant NetCDF files). This generic image reader should try to use standard geotiff features.

This reader is not only used to read GeoTiff. As far as I see, there are filetypes (png, ..) that this reader can handle. But then it is even more necessary to provide more reader options to specified how to handle this data.

Maybe these kind of reader parametrizations are then highly use case dependent and do not belong to "core" satpy (aka satpy/etc/readers/xxx.yaml) but the reader implementation in satpy (satpy/readers/xxx.py) should offer the set of tools (parameters) to control the handling of the input data.

IMHO readers (that share the same API) should also return similar data structures or at least can be configured to do so.

Yeah, sure, but the question is what is the point of this pull request and how much work do you want to include in it? I am totally fine with the python code allowing for all sorts of flexibility as long as it is documented and that the defaults work. This is the "generic" image reader so none of this extra functionality should be required to get something out of the reader. Adding options to a static list of datasets in the YAML sounds fine.

I think this reader should fix the dataset naming so it can be overridden in the YAML. If it includes some minimal nodata handling for geotiffs, then great. Anything beyond that should probably go in a separate PR, right?

My original use for which I wrote this reader was to read any image as an image and use it as a background for BackgroundCompositor and to make it possible to resample them to a specific area. So in a way, I never considered it for anything else than categorical data.

Yeah, sure, but the question is what is the point of this pull request and how much work do you want to include in it? I am totally fine with the python code allowing for all sorts of flexibility as long as it is documented and that the defaults work. This is the "generic" image reader so none of this extra functionality should be required to get something out of the reader. Adding options to a static list of datasets in the YAML sounds fine.

I think this reader should fix the dataset naming so it can be overridden in the YAML. If it includes some minimal nodata handling for geotiffs, then great. Anything beyond that should probably go in a separate PR, right?

My latest commit should address this points. There's now an option to control the handling of nodata. Default is to do nothing with the data but copying the nodata value (first element) to dataset.attrs['_FillValue'].
Beside that you can add the option "nodata_handling" to the readers yaml dataset sections. If it is set to "nan_mask", the data is converted to float32 and masked with np.nan.

Anything beyond that should probably go in a separate PR, right?

Absolutely right. Anything else should be in another PR. But I need this together do get my stuff working and it's hard to handle multiple PRs addressing the same file.

If you have any proposals for better naming of this option - you're welcome :-)
Hope this commit fulfills the demands.

My original use for which I wrote this reader was to read any image as an image and use it as a background for BackgroundCompositor and to make it possible to resample them to a specific area. So in a way, I never considered it for anything else than categorical data.

It's good to have such reader. Everything starts small. I hope that your needs are also covered with this modifications and the code still doesn't get too complex. These are only small changes and it may be more confusing to implement this in a separate reader.

djhoese · 2021-03-01T12:30:18Z

satpy/readers/generic_image.py

-        logger.debug("Reading %s.", key)
-        return self.file_content[key['name']]
+        logger.debug("Reading 'image' (ignoring provided key %s).", key)
+        return self.file_content['image']


How about you either:

Copy self.file_content['image'] (with .copy() method) and update the attributes with the info dictionary. Like img_copy.attrs.update(info). I think this should update the name and any other DataID information that someone might include in the YAML.

Keep the old way of doing key['name'] but update the read method to write the file content to the right file_content key.

Option 1 is easiest. Option 2 provides the most flexibility in the future. @pnuu let us know if there is a reason not to move .read to the get_dataset call as that also seems like it could help here.

Let's wait for @pnuu 's input and I'll change it accordingly. Option 2 seems nicer to me.

The 'image' is there just to make things work with the existing way Satpy is designed to load data with a named dataset. In an image, there's only one dataset and scn.load() should probably just load the image. But yeah, when loading data with multiple readers that'd not work.

I think option 2 is the better way.

I'm going to remove the read() completely as it is only used in the ctor and move the reading stuff to get_dataset(). Then there's also no need to keep an attribute self.file_content any longer. The "How-To for custom readers" (https://satpy.readthedocs.io/en/stable/dev_guide/custom_reader.html) only demands implementing get_dataset().
And I couldn't find any other readers that use GenericImageFileHandler as base class.
Keeping self.file_content seems only necessary for caching reasons. But I think there are other mechanisms in satpy that care about this, right?
Okay?

Regarding moving the functionality: sounds good as long as the code is clean.

Regarding file_content, that is likely just a common practice used from other readers. It also makes writing tests easier, at least in other readers, because complex NetCDF/HDF files can be described by a dictionary and put in file_content. For this reader, maybe that isn't necessary. As long as the tests can be made to work with whatever you do then 👍

I've decided to lease reading in ctor as it is because otherwise get_area_def() needs special treatment. The area is set during reading. Although it seems that get_area_def() is called after get_dataset() and therefore area should be already available in my tests - it don't know if it's the case in all situations..
The latest commit introduces a class attribute dataset_name that is set to default image in the read function. In derived fake readers you can provide own file_content and leave ds_name empty to work with multiple datasets - if somebody really wants to do this..

FYI there was a semi-recent change in the base reader that changed the order of when the dataset was loaded and when the area was loaded. It used to be that the area was always loaded first (if I remember correctly), but now it is loaded after.

pnuu · 2021-03-24T07:49:50Z

Could you add a description )to the description field) of what this PR really is about? Looking at the the code, I'm also unsure about the title of the PR. Maybe clarify the title too?

ch-k · 2021-03-24T09:14:32Z

Could you add a description )to the description field) of what this PR really is about? Looking at the the code, I'm also unsure about the title of the PR. Maybe clarify the title too?

I've added a description text (sorry for that). But I don't have a better title..

ghost · 2021-04-23T10:29:31Z

DeepCode's analysis on #0f9339 found:

ℹ️ 1 minor issue. 👇
✔️ 1 issue was fixed.

Top issues

Description	Example fixes
First argument to super() should be the enclosing class FakeGenericImageFileHandler Occurrences: test_generic_image.py:222	🔧 Example fixes

👉 View analysis in DeepCode’s Dashboard | Configure the bot

satpy/readers/generic_image.py

djhoese

Looks really good. Thanks for the work. I have a couple suggestions and changes I'd like to talk about that I've made comments about.

djhoese · 2021-04-26T15:00:46Z

satpy/readers/generic_image.py



-def mask_image_data(data):
-    """Mask image data if alpha channel is present."""
+def mask_image_data(data, info=None):


When is info ever not provided?

I don't know. This function is used in test_generic_image.py. That's why I fear it is part of the public interface and used by somebody somewhere..

Ah good catch. Unless @pnuu says otherwise, I'd say update the tests to pass {} for info and make info a requirement parameter here.

If you have the time it would be nice to make this function private (_ prefix) and update the tests to not test this function specifically, but the higher level expected output of the reader instead.

What Dave said. And indeed, this ought to be private.

djhoese · 2021-04-26T15:02:12Z

satpy/readers/generic_image.py

+        try:
+            nodata_handling = info['nodata_handling'] if info else NODATA_HANDLING_FILLVALUE
+        except KeyError:
+            nodata_handling = NODATA_HANDLING_FILLVALUE


Assuming info is never None as it is always passed, then this block could be simplified to:

Suggested change

try:

nodata_handling = info['nodata_handling'] if info else NODATA_HANDLING_FILLVALUE

except KeyError:

nodata_handling = NODATA_HANDLING_FILLVALUE

nodata_handling = info.get('nodata_handling', NODATA_HANDLING_FILLVALUE)

Yes, would be much nicer. If we consider mask_image_data as private function and info as not none, I can change this.

satpy/readers/generic_image.py

djhoese · 2021-04-26T15:06:49Z

satpy/tests/reader_tests/test_generic_image.py

+        fname = os.path.join(self.base_dir, 'test_l_nan_fillvalue.tif')
+        scn = Scene(reader='generic_image', filenames=[fname])
+        scn.load(['image'])
+        self.assertEqual(scn['image'].shape, (1, self.y_size, self.x_size))
+        self.assertEqual(np.sum(scn['image'].data[0][:10, :10].compute()), 0)
+
+        fname = os.path.join(self.base_dir, 'test_l_nan_nofillvalue.tif')
+        scn = Scene(reader='generic_image', filenames=[fname])
+        scn.load(['image'])
+        self.assertEqual(scn['image'].shape, (1, self.y_size, self.x_size))
+        self.assertTrue(np.all(np.isnan(scn['image'].data[0][:10, :10].compute())))
+


Would you mind separating these out into their own test functions. I think we've generally switched to not combining a ton of test cases into one function whenever possible. It makes it easier to debug.

Co-authored-by: David Hoese <[email protected]>

….com/ch-k/satpy into fix-generic-image-reader-datasetname

djhoese · 2021-04-26T20:09:24Z

satpy/readers/generic_image.py

+
+        return data
+
+    def __mask_image_data(self, data, info):


Why did this become a class method? Couldn't/shouldn't this remain a function outside the class? Also, please use a single underscore. Double underscore has special handling in python and isn't needed here.

Sorry, too much java in the last months. I'll fix it tomorrow morning.

djhoese · 2021-04-27T14:41:13Z

satpy/tests/reader_tests/test_generic_image.py

+
+            def __init__(self, filename, filename_info, filetype_info, file_content, **kwargs):
+                """Get fake file content from 'get_test_content'."""
+                if GenericImageFileHandler is object:


When will this import ever return object? Also, does this class need to be inside the test function?

It's basically a copy of an other fake reader (from file test_hdf5_utils.py). I supposed it is best practice t odo so (without knowning why..). Of cource I can remove it. I've already noted that some the checkers marked this as issue.
It is an inner class because all the other test routines in this file do their imports in the method itself. I thought, it would be better to do the same here too.

Yeah I think now that we are using pytest for tests it isn't as important to have the various imports in the test functions themselves. @mraspaud thoughts?

Just removed if GenericImageFileHandler is object.

I tihnk it's fine without the check

satpy/readers/generic_image.py

djhoese · 2021-04-29T18:38:59Z

Last request: Could you look at fixing some of the style issues from codebeat and deepcode? It looks like one of the deepcode ones will be unfixable (the one regarding super) as I think what you've done is intended.

ch-k · 2021-04-30T07:24:36Z

Last request: Could you look at fixing some of the style issues from codebeat and deepcode? It looks like one of the deepcode ones will be unfixable (the one regarding super) as I think what you've done is intended.

Fixed some of the issues. But I don't really know how to make codebeat happy. These two remaining issues are classified as info. I can't imagine that further reducing of ACB or branch nesting makes the code more readable and maintainable.

djhoese

Looks good. Thanks.

djhoese · 2021-04-30T14:35:59Z

Thanks for making the attempt. And yeah, removing all the issues identified by these tools is just something you get used to and readability is arguable.

Fix: use same hardcoded dataset name for set and get

b876bb4

Fix: use same hardcoded dataset name for set and get (unit test)

d1be050

stickler-ci reviewed Feb 22, 2021

View reviewed changes

Handling of fill_value in GenericImageFileHandler

e8ffa2f

ch-k changed the title ~~Fix: use same hardcoded dataset name for set and get~~ Fix: GenericImageFileHandler, hardcoded dataset name for set and get + fillvalue Feb 26, 2021

Fix codefactor suggestion

ab0e885

ch-k mentioned this pull request Feb 26, 2021

New compositors: MultiFiller and LongitudeMaskingCompositor #1577

Merged

2 tasks

ch-k marked this pull request as ready for review March 1, 2021 10:51

ch-k requested review from djhoese and mraspaud as code owners March 1, 2021 10:51

djhoese requested changes Mar 1, 2021

View reviewed changes

djhoese changed the title ~~Fix: GenericImageFileHandler, hardcoded dataset name for set and get + fillvalue~~ Allow custom dataset names in 'generic_image' reader and fix nodata handling Mar 24, 2021

djhoese added component:readers enhancement code enhancements, features, improvements labels Mar 24, 2021

djhoese requested review from pnuu and djhoese March 25, 2021 17:32

ch-k added 3 commits April 22, 2021 09:51

Merge branch 'master' into fix-generic-image-reader-datasetname

579002a

Merge branch 'master' into fix-generic-image-reader-datasetname

b1ad1e0

Make handling of nodata in GenericImage reader configurable

e3ff310

ch-k added 3 commits April 23, 2021 12:06

Fix deepcode hints

1b6a928

Fix flake8 hints

081a59b

Add more doc

632d8dc

djhoese reviewed Apr 26, 2021

View reviewed changes

satpy/readers/generic_image.py Outdated Show resolved Hide resolved

djhoese reviewed Apr 26, 2021

View reviewed changes

satpy/readers/generic_image.py Show resolved Hide resolved

djhoese requested changes Apr 26, 2021

View reviewed changes

ch-k and others added 4 commits April 26, 2021 15:08

Update style of code documentation

7ee8b86

Co-authored-by: David Hoese <[email protected]>

Refactor unit test

d643a51

Merge branch 'fix-generic-image-reader-datasetname' of https://github…

4db0a0b

….com/ch-k/satpy into fix-generic-image-reader-datasetname

Make masking function private

cd61f22

djhoese reviewed Apr 26, 2021

View reviewed changes

Revert making mask_image_data a class method

b6b9f3c

djhoese reviewed Apr 27, 2021

View reviewed changes

Update dataset attributes

6bb1e8f

djhoese reviewed Apr 27, 2021

View reviewed changes

satpy/readers/generic_image.py Show resolved Hide resolved

Cleanup fake reader

b7b3a31

Fix deepcode and codebeat issues

0f93397

djhoese approved these changes Apr 30, 2021

View reviewed changes

djhoese merged commit cec1f0e into pytroll:master Apr 30, 2021

gerritholl mentioned this pull request Jun 1, 2021

Passing multiple readers fails if generic_image is among them #723

Closed

djhoese mentioned this pull request Sep 11, 2024

generic_image reader returns data as float64 for PNG images #2897

Closed

Allow custom dataset names in 'generic_image' reader and fix nodata handling #1560

Allow custom dataset names in 'generic_image' reader and fix nodata handling #1560

Conversation

ch-k commented Feb 22, 2021 • edited Loading

codecov bot commented Feb 22, 2021 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mraspaud commented Feb 22, 2021

ch-k commented Feb 22, 2021

djhoese commented Feb 22, 2021

ch-k commented Feb 23, 2021

ch-k commented Feb 23, 2021

djhoese commented Feb 23, 2021

pnuu commented Feb 24, 2021

ch-k commented Feb 24, 2021

pnuu commented Feb 24, 2021

ch-k commented Feb 24, 2021

ch-k commented Feb 24, 2021 • edited Loading

pnuu commented Feb 24, 2021

ch-k commented Feb 26, 2021

djhoese left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pnuu commented Mar 24, 2021

ch-k commented Mar 24, 2021

ghost commented Apr 23, 2021 • edited by ghost Loading

DeepCode's analysis on #0f9339 found:

Top issues

👉 View analysis in DeepCode’s Dashboard | Configure the bot

djhoese left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pnuu Apr 26, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

djhoese commented Apr 29, 2021

ch-k commented Apr 30, 2021

djhoese left a comment

Choose a reason for hiding this comment

djhoese commented Apr 30, 2021

ch-k commented Feb 22, 2021 •

edited

Loading

codecov bot commented Feb 22, 2021 •

edited

Loading

ch-k commented Feb 24, 2021 •

edited

Loading

ghost commented Apr 23, 2021 •

edited by ghost

Loading

pnuu Apr 26, 2021 •

edited

Loading