-
Notifications
You must be signed in to change notification settings - Fork 629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AA] Add auto augmentation wrapper #4694
Conversation
Signed-off-by: Kamil Tokarski <[email protected]>
!build |
CI MESSAGE: [7507593]: BUILD STARTED |
Signed-off-by: Kamil Tokarski <[email protected]>
!build |
CI MESSAGE: [7507858]: BUILD STARTED |
CI MESSAGE: [7507858]: BUILD FAILED |
Signed-off-by: Kamil Tokarski <[email protected]>
!build |
CI MESSAGE: [7509648]: BUILD STARTED |
CI MESSAGE: [7509648]: BUILD PASSED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks, some suggestions (I think we can take care of potential improvements in a follow up PR), and asking for a bit more docs.
Also, at some point we should dedicate a page in the docs for this and list the contents of the modules.
|
||
Parameter | ||
--------- | ||
sample : DataNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: sample
if this is batch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conditionals are written in this sample-like style, so I stuck to it.
def _remap_bins_to_signed_magnitudes(magnitudes): | ||
|
||
def remap_bin_idx(bin_idx): | ||
magnitude = magnitudes[bin_idx // 2] | ||
if bin_idx % 2: | ||
magnitude = -magnitude | ||
return magnitude | ||
|
||
return np.array([remap_bin_idx(bin_idx) for bin_idx in range(2 * len(magnitudes))]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this deserves a comment explaining the idea behind this for the benefit of the future readers.
lo, hi = mag_range | ||
return np.linspace(lo, hi, num_magnitude_bins, dtype=np.float32) | ||
|
||
def _get_param(self, magnitude_bin, num_magnitude_bins, random_sign=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a short doc here that this generates the actual parameter as a DataNode, or something similar?
if self.randomly_negate: | ||
magnitudes = _remap_bins_to_signed_magnitudes(magnitudes) | ||
magnitude_bin = 2 * magnitude_bin + random_sign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For most of the automatic augmentation schemes, this operation can be hoisted to the top of the implementation and calculated once instead of doing it for every augmentation individually.
This is something that might be considered in a follow-up PR if we measure the impact of this on the size of the graph and how well DALI copes with a lot of those operations running on small batches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I replaced the explicit random_sign passing with signed_bin()
call. You invoke it signed_bin(magnitude_bin)
and the necessary node is created at the place of the call. Then, when wrapper sees it as magnitude_bin
param, it can use that node. It is meant as an optimisation - if user does not make the call but the augmentation has randomly_negate set to True, it will wrap the magnitude_bin on its own.
if self.randomly_negate: | ||
magnitudes = _remap_bins_to_signed_magnitudes(magnitudes) | ||
magnitude_bin = 2 * magnitude_bin + random_sign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, you can probably get rid of one arithm op, by remapping magnitude_bin
with magnitude_bin - random_sign
for random_sign = [0, num_bins + 1]
.
Probably doesn't matter much.
Then, the bin is used to compute the parameter as if by calling | ||
`as_param(magnitudes[magnitude_bin] * ((-1) ** random_sign))`, where | ||
`magnitudes=linspace(mag_range[0], mag_range[1], num_magnitude_bins)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is worth to provide an example, what is the expected definition and how the signature changes after it is decorated?
@@ -0,0 +1,87 @@ | |||
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add docs to this module
self.unused_args = unused_args | ||
|
||
|
||
def filter_extra_accepted_kwargs(fun, kwargs, skip_positional=0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am asking for docs as the name can be understood as removing the extra kwargs that are accepted rather than obtaining them.
assert any(el < 0 for el in magnitudes) | ||
assert any(el > 0 for el in magnitudes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you will get a fully positive or negative batch at some point, unless the seed will be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spot-on, I've added the seeds
…param indices remapping to single top-level operation Signed-off-by: Kamil Tokarski <[email protected]>
!build |
CI MESSAGE: [7524829]: BUILD STARTED |
CI MESSAGE: [7524829]: BUILD PASSED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, well tested, lovely job 👍
From the user perspective I struggled with one small thing. From this code sample:
@augmentation(mag_range=(0, 30), randomly_negate=True)
def rotate(sample, angle, fill_value=0, interp_type=None):
return fn.rotate(sample, angle=angle, fill_value=fill_value, interp_type=interp_type)
it was not immediately clear to me to which param mag_range
applies. After going through the doc it is clear that the second argument is being worked on here.
One thing I have seen in similar situations was to explicitly point the argument by name or by index.
Additionally, when I first saw as_param
I thought that it does exactly that - tells to the decorator which argument of the function decorated is the param to be modified.
Maybe something to consider would be to rename as_param
to something like map_param
or transform_param
and use as_param
or param_id
name to point to the argument that mag_range
applies to.
Regardless, I think it would be ok to leave it as it is.
* Add auto augmentation wrapper and utils * Enable the auto_aug tests in qa --------- Signed-off-by: Kamil Tokarski <[email protected]>
Category:
New feature (non-breaking change which adds functionality)
Description:
This is a prerequisite for automatic augmentation library (#4648).
The automatic augmentations in this context are just different schemes of random selection of augmentation per sample.
The common idea is that, apart from sampling the augmentation, the schemes provide parameters to the randomly selected augmentations. In order to do that in abstract way, they use the notion of magnitude: each operation has adimissible magnitude range, which is divided into a number of bins. The automatic augmentation schemere selects the magnitude bin and the augmentation's role is to understand how it maps to exact paramter. For example, for
fn.rotate
magnitude is an angle.This PR adds
@augmentation
decorator that lets user wrap a function that uses DALI processing so that it can adhere to automatic augmentation scheme "protocol".The user who is interested in only running a standard suite for [Auto/Rand/Trivial]Augment should not need to use any of the utils here directly.
The simplest interaction needed is changeing augmenation setup (such as magnitudes range), which is supposed to look as follows (to change the range of angles to (0, 60), randomly negated for each sample:
For more advanced cases, the
@augmentation
is the user-facing tool for wrapping custom code as an augmentation.Additional information:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-3230