Refine multi thread cpu parallel exe #11406

Merged

Conversation

chengduoZH
Contributor

No description provided.

@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from b5a1c35 to 606a73b on June 20, 2018 08:20
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from 606a73b to 5c3ece4 on June 20, 2018 08:26
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch 2 times, most recently from 24e1890 to 80bb2cf on June 26, 2018 13:18
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from 80bb2cf to 053ecd6 on June 26, 2018 13:19
@chengduoZH chengduoZH requested a review from reyoung on June 28, 2018 12:33
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch 2 times, most recently from 04798dc to 5343ac2 on July 4, 2018 14:55
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from 5343ac2 to 985461b on July 4, 2018 15:00
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch 7 times, most recently from 3c8a759 to 623f412 on July 9, 2018 12:39
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from 623f412 to 962e8c9 on July 10, 2018 10:03
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from 043864e to 2af3613 on July 12, 2018 08:15
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch 2 times, most recently from b9241e1 to e223397 on July 12, 2018 12:27
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from e223397 to dcce1ff on July 12, 2018 12:43
@@ -32,6 +32,8 @@ struct BuildStrategy {
ReduceStrategy reduce_{ReduceStrategy::kAllReduce};
GradientScaleStrategy gradient_scale_{GradientScaleStrategy::kCoeffNumDevice};

bool share_parameter_between_cards_{false};
Collaborator

How do we share parameters between cards when using CUDA?

Collaborator

Why do we need another flag instead of ReduceStrategy?

Contributor Author

How do we share parameters between cards when using CUDA?

If share_parameter_between_cards_ is true, use_cuda_ must be false and build_strategy.reduce_ must be ReduceStrategy::kReduce. There is a check here:
https://github.com/PaddlePaddle/Paddle/pull/11406/files#diff-564dec854cf4f37015001783f71e06cbR76
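For readers who cannot follow the link, here is a minimal sketch of the constraint described above. It is not the actual Paddle check; ValidateStrategy is a hypothetical helper, and only the BuildStrategy fields named in this PR are reproduced.

```cpp
// Minimal sketch, not Paddle's actual validation code.
// share_parameter_between_cards_ is CPU-only and requires ReduceStrategy::kReduce.
#include <stdexcept>

enum class ReduceStrategy { kAllReduce, kReduce };

struct BuildStrategy {
  ReduceStrategy reduce_{ReduceStrategy::kAllReduce};
  bool share_parameter_between_cards_{false};
};

// Hypothetical helper illustrating the check referenced above.
void ValidateStrategy(const BuildStrategy &strategy, bool use_cuda) {
  if (strategy.share_parameter_between_cards_) {
    if (use_cuda) {
      throw std::invalid_argument(
          "share_parameter_between_cards_ only works on CPU (use_cuda must be false)");
    }
    if (strategy.reduce_ != ReduceStrategy::kReduce) {
      throw std::invalid_argument(
          "share_parameter_between_cards_ requires ReduceStrategy::kReduce");
    }
  }
}
```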

Collaborator

So... why do we need the new flag rather than just using build_strategy.reduce_?

The data fields should be ORTHOGONAL.

>>> p = p - 0.001 * g
"""
OpRole = core.op_proto_and_checker_maker.OpRole
self._current_role = OpRole.Optimize
self._op_role_var = [var.name if isinstance(var, Variable) else var]
self._op_role_var = [
Collaborator

Why do we need to store both parameters and gradients?

Contributor Author

[screenshot omitted]

In this case, fc_0.b_0's gradient name has been changed, so if we still use the gradient name fc_0.b_0@GRAD to decide which device this sgd op belongs to, we will get errors.
So, in this PR, I use grad rather than GradVarName(params[0]) to determine which device it belongs to.
https://github.com/PaddlePaddle/Paddle/pull/11406/files/7cf836f4ba4353d8ba4247ca09e904098a43edf5#diff-06c27dc69562c2f50b53409969b0a9b5R435
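To make the placement argument concrete, here is a minimal sketch, not Paddle's actual multi-device graph builder. It assumes the op-role-var attribute is a flat [param0, grad0, param1, grad1, ...] list, and DeviceForOptimizeOp and var_device are hypothetical stand-ins for the builder's lookup. Using the stored gradient name still resolves correctly after the gradient variable has been renamed, whereas reconstructing the name as param + "@GRAD" would not.

```cpp
// Minimal sketch only: decide which device an optimize op (e.g. sgd) belongs
// to by looking up the gradient name recorded in the (param, grad) pairs.
#include <string>
#include <unordered_map>
#include <vector>

int DeviceForOptimizeOp(
    const std::unordered_map<std::string, int> &var_device,  // var name -> device id
    const std::vector<std::string> &op_role_var) {            // [p0, g0, p1, g1, ...]
  for (size_t i = 0; i + 1 < op_role_var.size(); i += 2) {
    const std::string &grad = op_role_var[i + 1];  // stored grad name, not param + "@GRAD"
    auto it = var_device.find(grad);
    if (it != var_device.end()) return it->second;
  }
  return -1;  // unknown; caller decides the fallback
}
```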

Collaborator

I see, cool

@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch 4 times, most recently from 21a39eb to 1cfdd10 on July 13, 2018 05:38
@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch 5 times, most recently from 5ec119a to 55cf9ee on July 13, 2018 05:54
}
return -1;
auto param_grad = boost::get<std::vector<std::string>>(
op.GetNullableAttr(OpProtoAndCheckerMaker::OpRoleVarAttrName()));
Collaborator

Why do we need GetNullableAttr() here rather than GetAttr()?

Contributor Author

It could be GetAttr() here as well.
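For context, a rough sketch of the assumed difference between the two calls follows; the semantics are an assumption here, not taken from the Paddle source. A "nullable" lookup tolerates a missing attribute by returning a default value, while a strict lookup treats a missing attribute as an error; since every op on this code path carries the op-role-var attribute, either call works.

```cpp
// Illustrative sketch only, not Paddle's AttributeMap API.
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Attribute = std::vector<std::string>;
using AttributeMap = std::map<std::string, Attribute>;

// Assumed "nullable" semantics: empty value when the attribute is absent.
Attribute GetNullableAttr(const AttributeMap &attrs, const std::string &name) {
  auto it = attrs.find(name);
  return it == attrs.end() ? Attribute{} : it->second;
}

// Assumed strict semantics: the attribute must exist.
Attribute GetAttr(const AttributeMap &attrs, const std::string &name) {
  auto it = attrs.find(name);
  if (it == attrs.end()) throw std::out_of_range("missing attribute: " + name);
  return it->second;
}
```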

@chengduoZH chengduoZH force-pushed the refine_multi_thread_CPU_Parallel_exe branch from 55cf9ee to 7c19f38 on July 13, 2018 06:53
Collaborator
@reyoung reyoung left a comment

Cool

@chengduoZH chengduoZH merged commit 86b0a72 into PaddlePaddle:develop Jul 13, 2018
kuke pushed a commit to kuke/Paddle that referenced this pull request Aug 25, 2018
* refine multi-thread CPU Parallel exe

* refine multi thread CPU Parallel exe

* Refine CPU version for ParallelExecutor

* add share_parameter_between_cards_

* Fix ParallelExecutor bug

* Fix unit test

* Fix parameter opt balance

* Fix with opti (param->grad)

* Add grad to op var

* Remove shard_param_between_cards