Callback PR Rev 3 #615
Conversation
Signed-off-by: Jason <[email protected]>
This pull request introduces 8 alerts when merging b16d356 into d219483 - view on LGTM.com
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 10 alerts when merging 879fcfc into d219483 - view on LGTM.com
nemo/core/callbacks.py
Outdated
def action(self, action_obj):
    self._action = action_obj

def on_action_start(self, state):
I thought you proposed to have this set of "events" (sketched below):
on_train_start
on_epoch_start
on_optimizer_step_start
on_batch_start
on_batch_end
on_optimizer_step_stop
on_epoch_end
on_train_end
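For reference, a minimal sketch of a callback base class exposing exactly this set of events (the class name and the `state` argument are illustrative, not necessarily the PR's final API):

```python
# Illustrative only: a base class whose hooks correspond to the proposed events.
class Callback:
    def on_train_start(self, state):
        pass

    def on_epoch_start(self, state):
        pass

    def on_optimizer_step_start(self, state):
        pass

    def on_batch_start(self, state):
        pass

    def on_batch_end(self, state):
        pass

    def on_optimizer_step_stop(self, state):
        pass

    def on_epoch_end(self, state):
        pass

    def on_train_end(self, state):
        pass
```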
I am against on_train_ and on_optimizer_ - these are specific to the training action, and we already have two other types of major actions aside from training...
I would still suggest using on_iteration_* instead of on_step_*, but I can live with it - as long as we all agree on that name ;)
…tion Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 10 alerts when merging 35d6b7d into deef552 - view on LGTM.com
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 15 alerts when merging 3c7b89e into 06b04ca - view on LGTM.com
…rdLoggerCallback Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 15 alerts when merging fa6553f into 9d90a95 - view on LGTM.com
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
# =============================================================================


class NmTensorNameRegistry:
So any particular reason why those shouldn't be weak references?
if you store names only - not important anymore
See #615 (comment)
pass

# Finally, add object to the set.
self._nmtensor_uniname_dict[tensor.unique_name] = tensor
This looks like a "strong reference" to tensor object...
Removing this reference and switching _nmtensor_uniname_dict to be a set()
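For illustration, the two options discussed here could look roughly like this; only _nmtensor_uniname_dict and unique_name come from the diff above, the rest is a sketch rather than the PR's code:

```python
import weakref


class NmTensorNameRegistry:
    """Sketch contrasting the two options discussed above; not the PR's code."""

    def __init__(self):
        # Option adopted above: keep only unique names, so the registry never
        # holds a reference (strong or weak) to any tensor object.
        self._nmtensor_uniname_dict = set()
        # Alternative raised in the review: a weak-value mapping would also
        # avoid extending tensor lifetimes while still allowing lookup.
        self._weak_tensors = weakref.WeakValueDictionary()

    def register(self, tensor):
        self._nmtensor_uniname_dict.add(tensor.unique_name)
        self._weak_tensors[tensor.unique_name] = tensor
```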
self._nmtensor_name_registry = nemo.core.neural_types.NmTensorNameRegistry()

@property
def tensor_names(self):
... And here you suggest that this is only a mapping from one name to the other.
So what is TensorRegistry really storing?
Yep, NmTensorRegistry is just a mapping from users' names for nmtensors to their unique_names. It also keeps track of all unique_names so we can initialize the TrainingState object.
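In code, that description could be sketched roughly as follows (the method names and the TrainingState hand-off are illustrative, not the PR's implementation):

```python
class NmTensorNameRegistry:
    """Sketch: map user-chosen names to unique_names and track all unique_names."""

    def __init__(self):
        self._user_name_to_unique = {}  # user-facing name -> unique_name
        self._unique_names = set()      # every unique_name registered so far

    def register(self, unique_name, user_name=None):
        self._unique_names.add(unique_name)
        if user_name is not None:
            self._user_name_to_unique[user_name] = unique_name

    def unique_name_of(self, name):
        # Accept either a user-facing name or an already-unique name.
        return self._user_name_to_unique.get(name, name)

    def all_unique_names(self):
        # In this sketch, this is what something like TrainingState would consume.
        return frozenset(self._unique_names)
```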
self.nf.reset_trainer()

@pytest.fixture()
def create_tensorboard_file(self):
please use the tmpdir fixture from pytest. Example:
tmp_file_name = str(tmpdir.mkdir("export").join("nested_list_export.yml"))
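For completeness, a full test using the tmpdir fixture could look like this (the file name and contents are illustrative):

```python
import os


def test_export_to_tmpdir(tmpdir):
    # pytest provides `tmpdir` automatically; each test gets a fresh directory
    # that is cleaned up afterwards, so no manual file management is needed.
    tmp_file_name = str(tmpdir.mkdir("export").join("nested_list_export.yml"))
    with open(tmp_file_name, "w") as f:
        f.write("exported: true\n")
    assert os.path.isfile(tmp_file_name)
```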
done
loss_tensor = loss(predictions=y_pred, target=y)

# Mock up both std and stderr streams.
with logging.patch_stdout_handler(StringIO()) as std_out:
ok, now I finally understand why you added those methods to logging... but should they really be part of logging, not testing?
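For illustration, a standalone version of such a helper could be written as a context manager along these lines (a sketch of the idea only, not the PR's actual patch_stdout_handler):

```python
import logging
from contextlib import contextmanager
from io import StringIO


@contextmanager
def patch_stream_handler(logger, stream):
    """Temporarily attach a handler that writes to `stream`, then remove it."""
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        yield stream
    finally:
        logger.removeHandler(handler)


# Usage mirroring the test above (commented out since it needs a running job):
# with patch_stream_handler(logging.getLogger("nemo"), StringIO()) as std_out:
#     ...  # run training, then inspect std_out.getvalue()
```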
epoch_step_counter = [0]
epoch_batch_counter = [0]

@on_step_end
wow! Those decorators are nice! 👍
I didn't get that from the NeMoCallbacks API... what are those good for?
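One plausible reading, sketched below as a hypothesis rather than the PR's actual implementation: the decorator lifts a plain function into a callback object whose hook fires at the named event, which keeps simple test callbacks to a few lines.

```python
def on_step_end(func):
    """Hypothetical sketch: wrap `func` so it runs as an on_step_end hook."""

    class _FunctionCallback:
        def on_step_end(self, state):
            return func(state)

    return _FunctionCallback()


# Usage mirroring the test snippet above:
epoch_step_counter = [0]


@on_step_end
def count_steps(state):
    epoch_step_counter[0] += 1
```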
Major:
- Remove all "dead" (commented) code.
- Move the graph parsing out of NeuralModuleFactory.py
Minor:
- Use save/load from nemo.backends instead of torch
- Use tmpdir in tests
nemo/core/neural_factory.py
Outdated
tensor_value = self.tensor_dict[unique_name]
return tensor_value


class Actions(ABC):
Discussed with @blisc : move Actions + TrainingState + graph traversing-related code to a separate file
Moved Actions, TrainingState and topological_sort_from_leaves to a new file called nemo/core/actions.py
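As a generic illustration of the kind of routine being moved (not the code from nemo/core/actions.py), a topological sort starting from leaf tensors can be written as a depth-first traversal of the producers:

```python
def topological_sort_from_leaves(leaves, get_producers):
    """Order nodes so every node appears after everything it depends on.

    `leaves` are the output nodes to start from; `get_producers(node)` returns
    the nodes that produce its inputs. Generic sketch only.
    """
    ordered = []
    visited = set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for producer in get_producers(node):
            visit(producer)
        ordered.append(node)

    for leaf in leaves:
        visit(leaf)
    return ordered
```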
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 1 alert and fixes 3 when merging 9f4566b into 5d1527a - view on LGTM.com
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 1 alert and fixes 3 when merging 31fc556 into 5d1527a - view on LGTM.com
Signed-off-by: Jason <[email protected]>
This pull request introduces 1 alert and fixes 3 when merging b9e4441 into 5d1527a - view on LGTM.com
Signed-off-by: Jason <[email protected]>
Signed-off-by: Jason <[email protected]>
This pull request introduces 2 alerts and fixes 3 when merging 1e429af into 5d1527a - view on LGTM.com
Signed-off-by: Jason <[email protected]>
This pull request introduces 2 alerts and fixes 3 when merging fdae1f3 into 5d1527a - view on LGTM.com
@@ -0,0 +1,298 @@
# ! /usr/bin/python |
+1
This commit introduces the following changes:
1. Makes sure not to clone taxonomy in non-interactive mode when it already exists
2. Adds a message when git clone fails, informing the user to manually clone the repo
3. Adds multiple tests for both interactive and non-interactive lab init
Signed-off-by: Maciej Szulik <[email protected]>
Signed-off-by: Martin Hickey <[email protected]>
Co-authored-by: Martin Hickey <[email protected]>
Rebased on top of Master. Replaces #597
Changelog:

Major:
- `SimpleLogger` and `TensorboardLogger`, which replace `SimpleLossLoggerCallback`
- `WandBLogger` replaces `WandbCallback`
- `CheckpointCallback` has been updated to the new callback system, but usage and functionality remain the same as before.
- `__get_top_sorted_modules_and_dataloader` was split off into a new function, `topological_sort_from_leaves`, which now lives in core/neural_factory.py
- `__get_pytorch_module` from Actions.

Minor:

Tasks:
- Neural Module calls `NmTensor.init()`