If I have a target polygon in the source image, how would I get the corresponding area in transformed coordinates, given a sequence of transformations such as:

```python
    RandomZoom([0., 0.2]),
    CenterCrop(height, width),
])
```
Does this allow injecting custom augmentation? For example, suppose I want to apply Gaussian blur or channel-wise contrast, can I inject that into it directly?
Yes, you just need to create your own layer.
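For illustration, a custom Gaussian-blur layer can be written with plain TF ops. This is only a hedged sketch; the layer name, the `sigma`/`kernel_size` parameters, and the kernel construction are illustrative and not part of the proposal. It assumes float NHWC inputs with a statically known channel count.

```python
import numpy as np
import tensorflow as tf


class GaussianBlur(tf.keras.layers.Layer):
    """Illustrative custom augmentation layer built from TF ops only."""

    def __init__(self, sigma=1.0, kernel_size=5, **kwargs):
        super().__init__(**kwargs)
        # Precompute a 2D Gaussian kernel as a constant, shape [kh, kw, 1, 1].
        ax = np.arange(kernel_size) - (kernel_size - 1) / 2.0
        xx, yy = np.meshgrid(ax, ax)
        k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
        k = (k / k.sum()).astype("float32")
        self.kernel = tf.constant(k[:, :, None, None])

    def call(self, inputs, training=None):
        if not training:
            return inputs
        channels = inputs.shape[-1]
        # Apply the same blur to every channel with a depthwise convolution.
        kernel = tf.tile(self.kernel, [1, 1, channels, 1])
        return tf.nn.depthwise_conv2d(
            inputs, kernel, strides=[1, 1, 1, 1], padding="SAME")
```

Such a layer could then be dropped into the augmentation pipeline like any built-in layer.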
And a single layer can have multiple transformations inside the call, which can be defined by other libraries like imgaug or albumentations, right?
Yes, that's right. Custom layers offer you great flexibility to implement your own transformations. But note that all transformations should be defined using TF ops if you want performance. Otherwise you'd have to step out of graph execution (it's still technically doable, though).
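For reference, "stepping out of graph execution" typically means wrapping the external library in `tf.py_function`. Here is a hedged sketch of that pattern using albumentations; the specific transform, the layer name, and the per-image mapping strategy are illustrative assumptions, and the resulting layer loses graph-level parallelism compared to pure TF ops.

```python
import tensorflow as tf
import albumentations as A  # any non-TF augmentation library, as an example

# A non-TF transform; it runs eagerly in Python on NumPy arrays.
_alb = A.Compose([A.RandomBrightnessContrast(p=1.0)])


def _np_augment(image):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() works.
    return _alb(image=image.numpy())["image"]


class PyFuncAugment(tf.keras.layers.Layer):
    """Wraps a Python/NumPy transform; correct, but slower than TF ops."""

    def call(self, inputs, training=None):
        if not training:
            return inputs

        def _augment_one(img):
            out = tf.py_function(_np_augment, [img], Tout=img.dtype)
            out.set_shape(img.shape)
            return out

        # Apply the Python transform image by image across the batch.
        return tf.map_fn(_augment_one, inputs)
```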
```python
])

input_pipeline = ImagePipeline([
    augmenter,
```
We'll need to take a lot of care with the random augmentation layers to make sure the syncing works correctly. They would need to be deterministic & 're-settable' in some fashion that is built into the API & that the `ImagePipeline`/`from_directory` APIs would take advantage of.
And to make matters trickier, now that I think about it, this would also need to work well w/ parallel processing in the dataset (& possibly w/ distribution strategies active). It may be worth taking a page out of JAX's book as inspiration: https://github.com/google/jax/blob/master/design_notes/prng.md
Or using some of the mechanisms from Peng's RFC for random numbers in TF 2.0:
https://github.com/tensorflow/community/pull/38/files?short_path=b84a5ce#diff-b84a5ce018def5de3e1396b9962feff1
Right, we should add a public API to control the seeding behavior, besides the `seed` argument in the constructor.
Would you have a specific API in mind (in terms of methods and their signatures)?
Hmm, I think the safest thing would be to make random augmentation layers only use stateless random ops and support taking a `seed` argument directly in the call methods. The constructor-time seed argument would then be renamed `initial_seed`. If no `seed` argument is provided at call time, a seed generated from `initial_seed` would be used.
Otherwise, the layer's `initial_seed` would be combined with the seed passed in, to make sure different layer objects act differently when passed the same seed (while a layer object shared between different models would act the same way in both when passed the same seed).
So, the implementation & API for `RandomFlip` would look something like:

```python
import random

import tensorflow as tf


def random_value():
    # Stand-in for "pick an arbitrary seed when none was provided".
    return random.randrange(2 ** 31)


class RandomFlip(PreprocessingLayer):  # the proposed preprocessing base class

    def __init__(self, horizontal=False, vertical=False, initial_seed=None):
        super().__init__()
        self.horizontal = horizontal
        self.vertical = vertical
        self.initial_seed = initial_seed
        self._initial_seed_or_random = initial_seed or random_value()
        self._current_seed = self._initial_seed_or_random

    def call(self, inputs, training=None, seed=None):
        if seed is None:
            # No call-time seed: consume the layer's own seed sequence.
            seed = self._current_seed
            self._current_seed += 1
        else:
            # Call-time seed: offset it by the layer's own seed so that
            # distinct layer objects behave differently for the same seed.
            seed = seed + self._initial_seed_or_random
        if training:
            if self.horizontal:
                inputs = tf.image.random_flip_left_right(inputs, seed=seed)
            if self.vertical:
                inputs = tf.image.random_flip_up_down(inputs, seed=seed)
        return inputs
```
We'll run into similar challenges as with the `training` argument of layers & models, where we have to feed it through nested models & layers to avoid bugs. We can use a similar mechanism in `__call__` to solve the problem.
This sort of deterministic randomness could be generally useful for random models & layers in Keras beyond just random augmentations & preprocessing layers (e.g. dropout layers).
The `from_directory` & input methods would continue to take an optional `seed` argument. They could either provide the dataset tuple index as a seed in the `call` method of `ImagePipeline`, or use some sort of 'tf.function & distribution strategy-friendly' version of `tf.random.experimental.Generator` to generate the seeds for the layers. We can check with Peng, who's been working on it, to see what the status there is.
The random layers should also use `tf.random.experimental.Generator` or something similar to maintain their internal seeds, rather than using raw Python in the form of `self.seed = self.seed + 1`. That way they will work correctly w/ tf.function & distribution strategies.
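A rough sketch of that idea, assuming a recent TF 2.x where the generator API is available as `tf.random.Generator`; the layer shown here is illustrative, not the proposed API:

```python
import tensorflow as tf


class StatelessRandomFlip(tf.keras.layers.Layer):
    """Sketch: a flip whose randomness comes from a tf.random.Generator
    rather than Python-level bookkeeping like `self.seed += 1`."""

    def __init__(self, seed=None, **kwargs):
        super().__init__(**kwargs)
        # The generator's state lives in a tf.Variable, so it behaves
        # correctly under tf.function and is tracked for checkpointing.
        if seed is not None:
            self._rng = tf.random.Generator.from_seed(seed)
        else:
            self._rng = tf.random.Generator.from_non_deterministic_state()

    def call(self, inputs, training=None, seed=None):
        if not training:
            return inputs
        if seed is None:
            # make_seeds(1) yields a [2, 1] tensor usable by stateless ops.
            seed = self._rng.make_seeds(1)[:, 0]
        # Stateless draw: the same seed always produces the same decision.
        batch = tf.shape(inputs)[0]
        coin = tf.random.stateless_uniform([batch], seed=seed) < 0.5
        flipped = tf.reverse(inputs, axis=[2])  # flip along the width axis (NHWC)
        return tf.where(coin[:, None, None, None], flipped, inputs)
```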
A note on retracing tf.function: if we want to wrap the `call` method in a tf.function like we do for SavedModels, we will need to take care to represent the seeds passed into `call` as scalar tensors rather than just Python objects. Otherwise the function will get retraced for each seed.
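To make the retracing point concrete, a small illustration (a sketch, not part of the proposal):

```python
import tensorflow as tf


@tf.function
def flip(images, seed):
    # `seed` must be a shape-[2] value for the stateless op.
    coin = tf.random.stateless_uniform([], seed=seed) < 0.5
    return tf.cond(coin, lambda: tf.reverse(images, axis=[2]), lambda: images)


images = tf.zeros([8, 32, 32, 3])

# Python values are baked into the trace: every new seed triggers a retrace.
flip(images, (1, 2))
flip(images, (3, 4))   # retraces

# Tensors are placeholders in the trace: one trace serves every seed value.
flip(images, tf.constant([1, 2], dtype=tf.int32))
flip(images, tf.constant([3, 4], dtype=tf.int32))   # reuses the same trace
```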
+1 to these suggestions, especially having `seed` as a `call` argument. Note that we don't need to change the name of the constructor argument; `seed` is fine.
So the workflow for an image-segmentation pipeline would be: create two identical `ImagePipeline`s with the same seed, and run them on both inputs? I guess the alternatives would be:

- Make these work on nests of images
- Or have a `.reapply` method?

So if someone wanted to use this for object detection, you could build a `BBox` equivalent of each layer, and write a converter that makes a new `Sequential` pipeline with all the `BBox` layers substituted in, using the same seed. Would that be a reasonable approach to using this for object detection?
@lobrien You would have a matching set of layers for BBoxes and they would get the same seed.
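A minimal sketch of the "same seed on both inputs" idea for segmentation, using stateless ops directly rather than the proposed `ImagePipeline` API (which does not exist yet); the helper name is hypothetical:

```python
import tensorflow as tf


def paired_random_flip(image, mask, seed):
    # One shape-[2] seed drives a single stateless draw, so the image and its
    # segmentation mask always receive the same flip decision.
    coin = tf.random.stateless_uniform([], seed=seed) < 0.5
    flip = lambda x: tf.cond(coin, lambda: tf.reverse(x, axis=[1]), lambda: x)
    return flip(image), flip(mask)


image = tf.zeros([64, 64, 3])
mask = tf.zeros([64, 64, 1])
aug_image, aug_mask = paired_random_flip(image, mask, seed=tf.constant([7, 3]))
```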
#### Constructor

`ImagePipeline` inherits from `keras.models.Sequential` and takes a list of layers as inputs. In the future it will inherit from `PreprocessingStage`.
Why not build out the basic preproc stage first? Inheriting from Sequential directly seems dangerous, as we will get a bunch of attrs/methods that people will accidentally start to depend on.
Indeed. We could simply inherit from `PreprocessingLayer` only and manually add `Sequential`-like features instead of inheriting from `Sequential`.
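A rough sketch of that alternative; the class name comes from the proposal, but the base class and implementation here are hypothetical (a plain Keras layer stands in for `PreprocessingLayer`):

```python
import tensorflow as tf


class ImagePipeline(tf.keras.layers.Layer):  # stand-in for PreprocessingLayer
    """Holds an ordered list of preprocessing layers, exposing only the layer
    API rather than the full Sequential surface (no add/pop/compile/fit)."""

    def __init__(self, layers, **kwargs):
        super().__init__(**kwargs)
        self._pipeline_layers = list(layers)  # sublayers are auto-tracked

    def call(self, inputs, training=None):
        x = inputs
        for layer in self._pipeline_layers:
            # Keras forwards `training` only to layers whose call accepts it.
            x = layer(x, training=training)
        return x
```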
`ImagePipeline` is a preprocessing layer that encapsulates a series of image transformations. Since some of these transformations may be trained (featurewise normalization), it exposes the method `adapt`, like all other preprocessing layers.
It's not a preproc layer, right? A preprocessing stage? Do layers implement `adapt`?
Had the same question; I think @fchollet is referring to a preprocessing stage here.
Layers do not implement `adapt`; this was a concept introduced in the preprocessing layers design and is similar to `fit`. `adapt` is the API used to train preprocessing layers. The name `fit` was not reused since the data used for training is different between this API and `fit`. Also, I think `adapt` was originally called `update` in the preprocessing layers design.
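To make the distinction concrete, here is a hedged sketch of what an adapt-style layer looks like. This is not the actual `PreprocessingLayer` base class; it only illustrates the idea of fitting state from data and then applying it in `call`:

```python
import tensorflow as tf


class FeaturewiseNormalization(tf.keras.layers.Layer):
    """Learns per-channel mean/std from data via adapt(), then applies them."""

    def adapt(self, data):
        # `data` is a representative batch of images in NHWC layout.
        data = tf.convert_to_tensor(data, dtype=tf.float32)
        self.mean = tf.reduce_mean(data, axis=[0, 1, 2])
        self.std = tf.math.reduce_std(data, axis=[0, 1, 2])

    def call(self, inputs):
        return (inputs - self.mean) / (self.std + 1e-7)
```

Usage would be a one-time `layer.adapt(sample_images)` up front, after which the layer is applied like any other.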
Yes, it is a preprocessing stage (so by extension it is a preprocessing layer, since preprocessing stage will subclass `PreprocessingLayer`).
I describe it as a preprocessing layer specifically because it is likely that `PreprocessingStage` will not yet exist when we ship the initial version of this API, hence `ImagePipeline` would subclass `PreprocessingLayer` in its first iteration.
The method name `adapt` was the consensus result of the preprocessing layer API design review meeting (not great, but we have to settle on something).
#### Methods

```python
def from_directory(
```
To clarify: these are now methods of an instantiated `ImagePipeline`, not standalone functions?
Yes, these are instance methods, not standalone functions.
- We are dropping support for ZCA whitening as it is no longer popular in the computer vision community.
- We don't have immediate support for random translations along only one axis.
- We only plan on implementing support for `data_format='channels_last'`. As such, this argument does not appear in the API.
Why? Does this match the expectations of accelerator users?
My understanding is that NVIDIA is moving towards native support for `channels_last`, removing the need to convert to `channels_first` for performance. I omitted the argument for the sake of simplicity, but we can always add support if the need arises.
```python
RandomContrast(amplitude=0., seed=None)
RandomSaturation(amplitude=0., seed=None)
RandomWidth(amplitude=0., seed=None)   # Expand / shrink width while distorting aspect ratio
RandomHeight(amplitude=0., seed=None)  # Expand / shrink height while distorting aspect ratio
```
nit: amplitude is a weird parameter name on some of these, e.g. what is the amplitude of a rotation, or of a width resize? As these are separate layers whose params will diverge over time, does it make sense to use the "right" words here rather than biasing towards the same words?
API consistency is important to reduce cognitive load and minimize surprises / looking up the docs. Is there a better universal word we could use in this context?
Note: this is also the reason why we use the keyword "kernel" throughout Keras even in places where it doesn't exactly apply.
This is nice and easy to use, but I am a little concerned that there's no apparent way to specify these in absolute units: pixels, radians, etc.
Exposing some way to skip the relative units would make it easier to build pipelines specified in absolute units without spending time questioning things like:

- What are the relative units for each layer? How are they interpreted?
- How does multi-scale training, or images with a different input range, affect the pipeline?

If larger than 1, it is rounded to one for the lower boundary (but not the higher boundary).
For random zoom this comes out a little strange. If I want a uniform scale in `[1/2, 2]` I can set `amplitude=[1/2, 1]`. But, IIUC, the random part here is linear, so 1/3 of images are shrunk and 2/3 are expanded. A log-scale option for random zoom would be nice to have.
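The log-scale sampling in question is essentially a one-liner in TF ops. A hedged sketch (the function name and the `[low, high]` zoom-factor range are assumptions for illustration):

```python
import tensorflow as tf


def sample_log_uniform_zoom(low=0.5, high=2.0, seed=(1, 2)):
    # Uniform in log space: zooming in by 2x is as likely as zooming out by 2x.
    log_zoom = tf.random.stateless_uniform(
        [], seed=seed, minval=tf.math.log(low), maxval=tf.math.log(high))
    return tf.exp(log_zoom)
```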
```python
Resizing(height, width)    # Resize while distorting aspect ratio
CenterCrop(height, width)  # Resize without distorting aspect ratio
```
Is there value in a RandomCrop? Or just Crop, with center vs. random parameterized? IIRC, random cropping is part of some ImageNet pipelines.
I think adding `RandomCrop` is a good idea. It is technically equivalent to a combination of `RandomTranslation` and `CenterCrop`, but it is a useful shortcut.
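For reference, a random crop is already a single TF op, so such a layer would mostly be a thin wrapper around it. A sketch, not the proposed API (the helper name is illustrative):

```python
import tensorflow as tf


def random_crop(image, height, width, seed=None):
    # image: [H, W, C] with a statically known channel count; crops a
    # (height, width) window at a uniformly random offset.
    return tf.image.random_crop(
        image, size=[height, width, image.shape[-1]], seed=seed)
```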
The only outstanding question at this time is what we should call what is currently named `amplitude`. Since there is no major issue to resolve with this design, we will skip the design review and use some time during the Keras SIG public meeting this Friday to finalize the design. Please make sure to raise any issues/questions on this doc before then.
# Keras-Preprocessing Redesign

Comment period open until September 13th, 2019.

## Context

`tf.data.Dataset` is the main API for data loading and preprocessing in TensorFlow. It has two advantages:

Meanwhile, `keras.preprocessing` is a major API for data loading and preprocessing in Keras. It is based on NumPy and SciPy, and it produces instances of the `keras.utils.Sequence` class, which are finite-length, resettable Python generators that yield batches of data.

Some features of `keras.preprocessing` are highly useful and don't have straightforward equivalents in `tf.data` (in particular image data augmentation and dynamic time series iteration).

Ideally, the utilities in `keras.preprocessing` should be made compatible with `tf.data`. This presents the opportunity to improve on the existing API. In particular, we don't have good support for image segmentation use cases today.

Some features are also being supplanted by preprocessing layers, in particular text processing. As a result we may want to move the current API to an API similar to Layers.

## Goals

- Make `keras.preprocessing` compatible with `tf.data`.
- (`tf.image`).