Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AA] Add auto augmentation wrapper #4694

Merged
merged 4 commits into from
Mar 9, 2023
Merged

Conversation

stiepan
Copy link
Member

@stiepan stiepan commented Mar 7, 2023

Category:

New feature (non-breaking change which adds functionality)

Description:

This is a prerequisite for automatic augmentation library (#4648).

The automatic augmentations in this context are just different schemes of random selection of augmentation per sample.

The common idea is that, apart from sampling the augmentation, the schemes provide parameters to the randomly selected augmentations. In order to do that in abstract way, they use the notion of magnitude: each operation has adimissible magnitude range, which is divided into a number of bins. The automatic augmentation schemere selects the magnitude bin and the augmentation's role is to understand how it maps to exact paramter. For example, for fn.rotate magnitude is an angle.

This PR adds @augmentation decorator that lets user wrap a function that uses DALI processing so that it can adhere to automatic augmentation scheme "protocol".

The user who is interested in only running a standard suite for [Auto/Rand/Trivial]Augment should not need to use any of the utils here directly.

The simplest interaction needed is changeing augmenation setup (such as magnitudes range), which is supposed to look as follows (to change the range of angles to (0, 60), randomly negated for each sample:

rotate = a.rotate.augmentation((0, 60), randomly_negate=True)

For more advanced cases, the @augmentation is the user-facing tool for wrapping custom code as an augmentation.

@augmentation(mag_range=(0, 30), randomly_negate=True)
def rotate(sample, angle, fill_value=0, interp_type=None):
    return fn.rotate(sample, angle=angle, fill_value=fill_value, interp_type=interp_type)
def warp_x_param(magnitude):
    return [magnitude, 0]

@augmentation(mag_range=(0, 0.3), randomly_negate=True, as_param=warp_x_param)
def shear_x(sample, shear, fill_value=0, interp_type=None):
    mt = fn.transforms.shear(shear=shear)
    return fn.warp_affine(sample, matrix=mt, fill_value=fill_value, interp_type=interp_type,
                          inverse_map=False)

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-3230

@awolant awolant self-assigned this Mar 7, 2023
@klecki klecki self-assigned this Mar 7, 2023
@klecki klecki added the automatic augmentations Automatic augmentations (AutoAugment, RandAugment, TrivialAugment and more) support in DALI. label Mar 7, 2023
@stiepan stiepan mentioned this pull request Mar 7, 2023
18 tasks
@stiepan
Copy link
Member Author

stiepan commented Mar 7, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7507593]: BUILD STARTED

Signed-off-by: Kamil Tokarski <[email protected]>
@stiepan stiepan changed the title Add auto augmentation wrapper [AA] Add auto augmentation wrapper Mar 7, 2023
@stiepan
Copy link
Member Author

stiepan commented Mar 7, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7507858]: BUILD STARTED

@klecki klecki added conditional execution Questions related to conditional execution and using `if` statements in DALI and removed conditional execution Questions related to conditional execution and using `if` statements in DALI labels Mar 7, 2023
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7507858]: BUILD FAILED

@stiepan
Copy link
Member Author

stiepan commented Mar 7, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7509648]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7509648]: BUILD PASSED

@stiepan stiepan mentioned this pull request Mar 8, 2023
18 tasks
Copy link
Contributor

@klecki klecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicks, some suggestions (I think we can take care of potential improvements in a follow up PR), and asking for a bit more docs.

Also, at some point we should dedicate a page in the docs for this and list the contents of the modules.


Parameter
---------
sample : DataNode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: sample if this is batch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conditionals are written in this sample-like style, so I stuck to it.

Comment on lines 282 to 290
def _remap_bins_to_signed_magnitudes(magnitudes):

def remap_bin_idx(bin_idx):
magnitude = magnitudes[bin_idx // 2]
if bin_idx % 2:
magnitude = -magnitude
return magnitude

return np.array([remap_bin_idx(bin_idx) for bin_idx in range(2 * len(magnitudes))])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this deserves a comment explaining the idea behind this for the benefit of the future readers.

lo, hi = mag_range
return np.linspace(lo, hi, num_magnitude_bins, dtype=np.float32)

def _get_param(self, magnitude_bin, num_magnitude_bins, random_sign=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a short doc here that this generates the actual parameter as a DataNode, or something similar?

Comment on lines 247 to 249
if self.randomly_negate:
magnitudes = _remap_bins_to_signed_magnitudes(magnitudes)
magnitude_bin = 2 * magnitude_bin + random_sign
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For most of the automatic augmentation schemes, this operation can be hoisted to the top of the implementation and calculated once instead of doing it for every augmentation individually.

This is something that might be considered in a follow-up PR if we measure the impact of this on the size of the graph and how well DALI copes with a lot of those operations running on small batches.

Copy link
Member Author

@stiepan stiepan Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I replaced the explicit random_sign passing with signed_bin() call. You invoke it signed_bin(magnitude_bin) and the necessary node is created at the place of the call. Then, when wrapper sees it as magnitude_bin param, it can use that node. It is meant as an optimisation - if user does not make the call but the augmentation has randomly_negate set to True, it will wrap the magnitude_bin on its own.

Comment on lines 247 to 249
if self.randomly_negate:
magnitudes = _remap_bins_to_signed_magnitudes(magnitudes)
magnitude_bin = 2 * magnitude_bin + random_sign
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you can probably get rid of one arithm op, by remapping magnitude_bin with magnitude_bin - random_sign for random_sign = [0, num_bins + 1].
Probably doesn't matter much.

Comment on lines +43 to +45
Then, the bin is used to compute the parameter as if by calling
`as_param(magnitudes[magnitude_bin] * ((-1) ** random_sign))`, where
`magnitudes=linspace(mag_range[0], mag_range[1], num_magnitude_bins)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is worth to provide an example, what is the expected definition and how the signature changes after it is decorated?

@@ -0,0 +1,87 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add docs to this module

self.unused_args = unused_args


def filter_extra_accepted_kwargs(fun, kwargs, skip_positional=0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am asking for docs as the name can be understood as removing the extra kwargs that are accepted rather than obtaining them.

Comment on lines +141 to +142
assert any(el < 0 for el in magnitudes)
assert any(el > 0 for el in magnitudes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you will get a fully positive or negative batch at some point, unless the seed will be fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spot-on, I've added the seeds

…param indices remapping to single top-level operation

Signed-off-by: Kamil Tokarski <[email protected]>
@stiepan
Copy link
Member Author

stiepan commented Mar 8, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7524829]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [7524829]: BUILD PASSED

Copy link
Contributor

@awolant awolant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, well tested, lovely job 👍

From the user perspective I struggled with one small thing. From this code sample:

@augmentation(mag_range=(0, 30), randomly_negate=True)
def rotate(sample, angle, fill_value=0, interp_type=None):
    return fn.rotate(sample, angle=angle, fill_value=fill_value, interp_type=interp_type)

it was not immediately clear to me to which param mag_range applies. After going through the doc it is clear that the second argument is being worked on here.

One thing I have seen in similar situations was to explicitly point the argument by name or by index.
Additionally, when I first saw as_param I thought that it does exactly that - tells to the decorator which argument of the function decorated is the param to be modified.

Maybe something to consider would be to rename as_param to something like map_param or transform_param and use as_param or param_id name to point to the argument that mag_range applies to.

Regardless, I think it would be ok to leave it as it is.

@stiepan stiepan merged commit e4c1ddb into NVIDIA:main Mar 9, 2023
@stiepan stiepan mentioned this pull request Mar 14, 2023
18 tasks
aderylo pushed a commit to zpp-dali-2022/DALI that referenced this pull request Mar 17, 2023
* Add auto augmentation wrapper and utils
* Enable the auto_aug tests in qa

---------

Signed-off-by: Kamil Tokarski <[email protected]>
@JanuszL JanuszL mentioned this pull request Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automatic augmentations Automatic augmentations (AutoAugment, RandAugment, TrivialAugment and more) support in DALI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants