Support a wider range of dynamically initialized models for MultiNodeOptimizer #148

Merged: 7 commits merged into master from test_for_dynamic_network on Dec 15, 2017

Conversation

@shu65 (Member) commented Dec 11, 2017

When a new layer is added to a model dynamically, its parameters need to be sent to all nodes. However, the current MultiNodeOptimizer sends the model parameters only once, at the beginning. Therefore, I fixed MultiNodeOptimizer to send the parameters to all nodes whenever the model is changed.
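
For illustration, a minimal sketch of the intended behavior (hypothetical class and method names; bcast_data and allreduce_grad are assumed ChainerMN-style communicator methods, and this is not the actual implementation):

    class MultiNodeOptimizerSketch(object):
        # Sketch: wraps a local optimizer and a communicator, and
        # re-broadcasts the parameters whenever the model changes.

        def __init__(self, actual_optimizer, communicator):
            self.actual_optimizer = actual_optimizer
            self.communicator = communicator
            self.target_params = []  # parameter names seen at the last update

        def is_changed(self, target):
            previous_params = self.target_params
            self.target_params = sorted(
                name for name, _ in target.namedparams())
            return previous_params != self.target_params

        def update(self, lossfun=None, *args, **kwds):
            target = self.actual_optimizer.target
            if self.is_changed(target):
                # A layer was added or removed: send the full parameter
                # set to all nodes again so they stay consistent.
                self.communicator.bcast_data(target)
            else:
                # Unchanged model: the usual data-parallel step.
                self.communicator.allreduce_grad(target)
                self.actual_optimizer.update(lossfun, *args, **kwds)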

@iwiwi self-requested a review on Dec 13, 2017
@iwiwi self-assigned this on Dec 13, 2017
Review context from the diff:

                return True
        return False
    else:
        return True

Contributor:

Let's make the case analysis as simple as possible. How about using an "early return" here, as follows (for details, see the book 'The Art of Readable Code'):

if len(previous_params) != len(self.target_params):
    return True

for param1, param2 in zip(self.target_params, previous_params):
    ...
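
Spelled out, the refactored check might look like the following sketch (the (name, initialized) pairs are an assumption about what self.target_params holds, not the actual implementation):

    def is_changed(self, target):
        previous_params = self.target_params
        # Assumption: track (name, is_initialized) for every parameter.
        self.target_params = [(name, param.data is not None)
                              for name, param in sorted(target.namedparams())]

        # Early return: different lengths already prove the model changed.
        if len(previous_params) != len(self.target_params):
            return True

        # Flat loop instead of nested case analysis; return on the first
        # mismatch, fall through to False when everything matches.
        for param1, param2 in zip(self.target_params, previous_params):
            if param1 != param2:
                return True
        return False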

Member Author:

Thanks, I will fix it.

@iwiwi (Contributor) commented Dec 13, 2017

I assume it is fine, but for safety I would like to see some empirical evidence that this change does not affect performance. I'm concerned about the cost of the many string comparisons.

@shu65 (Member Author) commented Dec 13, 2017

Ok, I will check its performance.

@shu65 (Member Author) commented Dec 14, 2017

I measured the computation time of is_change, the function that checks whether the model has been modified. I used ResNet50 as the model and repeated the measurement 100 times, using 2 nodes of MN-1 for the evaluation. The results are as follows.

                 Mean (sec.)  Median (sec.)  Min (sec.)  Max (sec.)
is_change only   0.0002       0.0002         0.0002      0.0003
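
For reference, a measurement like this can be scripted with a small harness along the following lines (a sketch; the optimizer and model objects are assumed to exist):

    import statistics
    import time

    def benchmark(fn, n_trials=100):
        # Time fn() n_trials times; report mean/median/min/max in seconds.
        times = []
        for _ in range(n_trials):
            start = time.perf_counter()
            fn()
            times.append(time.perf_counter() - start)
        return (statistics.mean(times), statistics.median(times),
                min(times), max(times))

    # Hypothetical usage with a multi-node optimizer wrapping ResNet50:
    # stats = benchmark(lambda: optimizer.is_changed(optimizer.target))
    # print('mean=%.4f median=%.4f min=%.4f max=%.4f' % stats)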

In addition, I also measured the computation time of is_change plus the allreduce of PureNcclCommunicator, again with ResNet50. The results are as follows.

Computation time of is_change + allreduce

                 Mean (sec.)  Median (sec.)  Min (sec.)  Max (sec.)
1 GPU            0.0045       0.0042         0.0041      0.0076
2 GPUs           0.0137       0.0136         0.0129      0.0171
4 GPUs           0.0174       0.0174         0.0170      0.0185
8 GPUs           0.0263       0.0258         0.0253      0.0326
16 GPUs          0.0324       0.0322         0.0315      0.0350

Computation time of allreduce only

                 Mean (sec.)  Median (sec.)  Min (sec.)  Max (sec.)
1 GPU            0.0038       0.0038         0.0038      0.0041
2 GPUs           0.0134       0.0134         0.0126      0.0163
4 GPUs           0.0172       0.0171         0.0167      0.0187
8 GPUs           0.0256       0.0254         0.0251      0.0306
16 GPUs          0.0319       0.0317         0.0313      0.0339

From these results, the impact of the function is very small.

@shu65 changed the title from "[WIP] Fix update bug in MultiNodeOptimizer" to "Fix update bug in MultiNodeOptimizer" on Dec 15, 2017
@iwiwi (Contributor) commented Dec 15, 2017

Great, thank you for the detailed evaluation! LGTM

@iwiwi merged commit 1018524 into master on Dec 15, 2017
@iwiwi changed the title from "Fix update bug in MultiNodeOptimizer" to "Support a wider range of dynamically initialized models for MultiNodeOptimizer" on Dec 15, 2017
@iwiwi deleted the test_for_dynamic_network branch on Dec 15, 2017
@iwiwi added this to the v1.1.0 milestone on Dec 15, 2017