Add GPU Profiler in PaddlePaddle #482

gangliao · 2016-11-15T16:14:23Z

Add GPU profiler object API
Add performance tuning docs
Add unit test
Check code style with the pre-commit hook

gangliao · 2016-11-15T16:15:05Z

gangliao · 2016-11-15T16:17:09Z

.travis.yml

@@ -42,7 +42,7 @@ addons:
 before_install:
  - |
    if [ ${JOB} == "BUILD_AND_TEST" ]; then
-      if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)'
+      if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)|(\.rst$)|(\.jpg$)|(\.png$)'


If a change only modifies rst or md files, Travis CI only builds the docs.

gangliao · 2016-11-15T16:18:08Z

doc/build/build_from_source.md

@@ -95,7 +95,7 @@ As a simple example, consider the following:
    ```bash
    # necessary
    sudo apt-get update
-    sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
+    sudo apt-get install -y g++ make cmake swig build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git


Add swig so that the demo does not fail.

coveralls · 2016-11-15T17:43:54Z

Coverage increased (+0.009%) to 62.97% when pulling ff6205d on gangliao:profiler into 9bce328 on baidu:develop.

reyoung · 2016-11-16T01:16:17Z

paddle/math/tests/CMakeLists.txt

@@ -14,3 +14,4 @@ add_simple_unittest(test_perturbation)
 add_simple_unittest(test_CpuGpuVector)
 add_simple_unittest(test_Allocator)
 add_simple_unittest(test_FPException)
+add_simple_unittest(test_GpuProfiler)


Please update to the latest develop code and use the pre-commit hooks. This file has a bad EOF.

reyoung · 2016-11-16T01:25:09Z

paddle/math/tests/test_GpuProfiler.cpp

+}
+
+TEST(Profiler, BilinearFwdBwd) {
+  hl_profiler_start();


Maybe we could write an RAII object for GPU profiling, just like a lock guard or a set-device guard. We should also add a global variable that holds the profiler reference count, so that re-entrant function calls are handled correctly.

But that does not have to be in this PR.
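A minimal sketch of the RAII idea suggested above. `hl_profiler_start()` and `hl_profiler_end()` are the names used in this PR, but here they are replaced by counting stand-ins so the example compiles without CUDA; `ProfilerGuard` and `profiledRegion` are hypothetical names chosen for illustration. Reference counting for re-entrant calls is deliberately left out.

```cpp
#include <cassert>

// Stand-ins for Paddle's hl_profiler_start()/hl_profiler_end().
// Assumption: the real functions toggle the CUDA profiler; these only count calls.
static int g_startCalls = 0;
static int g_endCalls = 0;
void hl_profiler_start() { ++g_startCalls; }
void hl_profiler_end() { ++g_endCalls; }

// Hypothetical guard: profiling starts on construction and stops on
// destruction, in the same spirit as std::lock_guard.
class ProfilerGuard {
public:
  ProfilerGuard() { hl_profiler_start(); }
  ~ProfilerGuard() { hl_profiler_end(); }

  // A guard owns one start/end pair, so copying is disabled.
  ProfilerGuard(const ProfilerGuard&) = delete;
  ProfilerGuard& operator=(const ProfilerGuard&) = delete;
};

// Example region: hl_profiler_end() runs when `guard` goes out of scope,
// including on early return or exception.
int profiledRegion() {
  ProfilerGuard guard;
  return 42;
}
```

The benefit over paired start/end calls is that the end call cannot be forgotten on any exit path from the scope.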

reyoung · 2016-11-16T01:29:01Z

doc/optimization/gpu_profiling.rst

+
+.. code-block:: c++
+
+    TEST(Profiler, BilinearFwdBwd) {


An rst file can reference other files locally, so there is no need to copy and paste your code. Please try to use `literalinclude` (or something similar; I cannot remember the directive name exactly).

reyoung · 2016-11-16T01:30:12Z

.travis.yml

@@ -42,7 +42,7 @@ addons:
 before_install:
  - |
    if [ ${JOB} == "BUILD_AND_TEST" ]; then
-      if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)'
+      if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)|(\.rst$)|(\.jpg$)|(\.png$)'


reyoung · 2016-11-16T01:30:29Z

doc/build/build_from_source.md

@@ -95,7 +95,7 @@ As a simple example, consider the following:
    ```bash
    # necessary
    sudo apt-get update
-    sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
+    sudo apt-get install -y g++ make cmake swig build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git


luotao1 · 2016-11-16T02:09:06Z

Since none of these commits merges in the main branch, could you squash them into a single commit?

coveralls · 2016-11-17T06:41:36Z

Coverage increased (+0.09%) to 62.899% when pulling 9670b9a on gangliao:profiler into d0a908d on baidu:develop.

coveralls · 2016-11-17T07:05:36Z

Coverage increased (+0.1%) to 62.911% when pulling 9670b9a on gangliao:profiler into d0a908d on baidu:develop.

reyoung · 2016-11-18T02:56:08Z

paddle/utils/Stat.h

@@ -280,4 +281,23 @@ inline StatSet& registerTimerArg2(uint64_t threshold = -1,

 #endif  // DISABLE_TIMER

+class GpuProfiler final {


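The `final` keyword seems to raise Paddle's minimum GCC version to 4.7+. As I recall, GCC 4.6 does not support `final`.

Explicit virtual overrides
https://gcc.gnu.org/projects/cxx-status.html#cxx11

That said, raising GCC to 4.8+ would also be fine, since Baidu's toolchain is already at 4.8.2.

I also think we can move to 4.8+.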

reyoung · 2016-11-18T03:05:47Z

paddle/utils/Stat.h

+class GpuProfiler final {
+public:
+  GpuProfiler() { hl_profiler_start(); }
+  ~GpuProfiler() { hl_profiler_end(); }


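One problem here: re-entrant calls would break this, right? For example:

void funcA() { GpuProfiler p; ... }
void funcB() { GpuProfiler p; funcA(); ... }

So GpuProfiler also needs to maintain a global reference count, and that count needs to be protected by a http://en.cppreference.com/w/cpp/thread/recursive_mutex. In other words, GPU profiling can only be switched on and off from within a single thread.

That makes sense.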
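The re-entrancy concern above can be sketched as follows. This is one reading of the suggestion, not the code that was merged: `ReentrantGpuProfiler` is a hypothetical name, the `hl_profiler_*` functions are counting stand-ins so the example builds without CUDA, and holding the recursive mutex for the whole lifetime of the guard is an assumption about what "controlled from a single thread" means.

```cpp
#include <cassert>
#include <mutex>

// Stand-ins for Paddle's hl_profiler_start()/hl_profiler_end().
// Assumption: the real functions toggle the CUDA profiler; these only count calls.
static int g_startCalls = 0;
static int g_endCalls = 0;
void hl_profiler_start() { ++g_startCalls; }
void hl_profiler_end() { ++g_endCalls; }

// Hypothetical re-entrant guard. Only the outermost guard starts and stops
// the profiler; nested guards just adjust the depth counter.
class ReentrantGpuProfiler {
public:
  // The recursive mutex lets the owning thread nest guards, while any
  // other thread blocks until the outermost guard is destroyed.
  ReentrantGpuProfiler() : lock_(mutex()) {
    if (depth()++ == 0) hl_profiler_start();
  }
  ~ReentrantGpuProfiler() {
    if (--depth() == 0) hl_profiler_end();
  }  // lock_ is released after the destructor body has run

  ReentrantGpuProfiler(const ReentrantGpuProfiler&) = delete;
  ReentrantGpuProfiler& operator=(const ReentrantGpuProfiler&) = delete;

private:
  static std::recursive_mutex& mutex() {
    static std::recursive_mutex m;
    return m;
  }
  static int& depth() {
    static int d = 0;  // global nesting depth, guarded by mutex()
    return d;
  }
  std::unique_lock<std::recursive_mutex> lock_;
};

// The call pattern from the review comment.
void funcA() { ReentrantGpuProfiler p; }
void funcB() {
  ReentrantGpuProfiler p;
  funcA();  // nested guard: must not stop the profiler early
}
```

With this shape, calling `funcB()` produces exactly one start and one end call, even though two guards were constructed.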

backyes · 2016-11-18T04:44:50Z

doc/optimization/gpu_profiling.rst

+======================
+Since training deep neural network typically take a very long time to get over, performance is gradually becoming
+the most important thing in deep learning field. The first step to improve performance is to understand what parts
+are slow. No point in improving performance of a region which doesn’t take much time!


No point in improving performance of a region which doesn’t take much time!
I don't understand what this sentence means.

I'll add "There is" in front.
There is no point in improving performance of a region which doesn’t take much time!
This is probably the idea behind Amdahl's law.
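To make the Amdahl's law remark concrete, here is a small sketch of the standard formula: if a fraction `p` of the runtime is sped up by a factor `s`, the overall speedup is `1 / ((1 - p) + p / s)`. The function name `amdahlSpeedup` and the sample fractions are illustrative assumptions, not taken from the PR.

```cpp
#include <cassert>
#include <cmath>

// Amdahl's law: overall speedup when a fraction `p` (0..1) of the total
// runtime is accelerated by a factor `s` and the remainder is unchanged.
double amdahlSpeedup(double p, double s) {
  return 1.0 / ((1.0 - p) + p / s);
}
```

For example, making a region that accounts for 5% of the runtime 10x faster gives an overall speedup of only about 1.05x, while the same 10x on a region that accounts for 90% gives about 5.26x. That is why profiling to find the slow regions comes first.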

backyes · 2016-11-18T04:48:02Z

doc/optimization/gpu_profiling.rst

+The above code snippet includes two methods, you can use any of them to profile the regions of interest.
+
+1. :code:`REGISTER_TIMER_INFO` is a built-in timer wrapper which can calculate the time overhead of both cpu functions and cuda kernels.
+2. :code:`REGISTER_GPU_PROFILER` is a general purpose wrapper object of :code:`cudaProfilerStart` and :code:`cudaProfilerStop` to avoid


Add a blank line between items 1 and 2.

Wow, thanks.

coveralls · 2016-11-21T06:05:33Z

Coverage decreased (-1.08%) to 61.725% when pulling 8393c19 on gangliao:profiler into d0a908d on PaddlePaddle:develop.

reyoung · 2016-11-23T07:55:26Z

@backyes Please review this PR.
@gangliao Please resolve the conflicts.

luotao1 · 2016-11-23T08:20:00Z

paddle/math/tests/test_GpuProfiler.cpp

+using namespace paddle;  // NOLINT
+using namespace std;     // NOLINT
+
+void MatrixCheckErr(const Matrix& matrix1, const Matrix& matrix2) {


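MatrixCheckErr and testBilinearFwdBwd both already exist in test_MatrixCompare, so there is no need to write them again.

Let's do that in the next PR: move the check functions in test_matrixCompare.cpp into test_matrixUtil.h. Some of the check functions are duplicated there as well.

OK, but will that conflict with @hedaoyuan's #385?

It should not conflict; his change is a higher-level abstraction. With his change in place, there is indeed no rush to fix this.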


gangliao added 10 commits November 14, 2016 17:47

Revise build docs

9ec91b6

Travis ci does not build when only docs changed

20f853b

Change png/jpg files do not trigger Travis ci

d7d14ce

Keep libcudart.so link in binary

76a41f3

Add Gpu profiler interface

2e9ea1c

Add GPU Profiler unit test

e8c0fb9

Add gpu profiling docs

23bce47

Update gpu profiling docs

978a6e5

Merge conflict with develop branch

84cab2c

Replace md to rst for doc index

ff6205d

gangliao commented Nov 15, 2016


gangliao assigned luotao1 Nov 15, 2016

gangliao added this to the 0.10.0 milestone Nov 15, 2016

gangliao added enhancement labels Nov 15, 2016

reyoung requested changes Nov 16, 2016


gangliao assigned reyoung Nov 16, 2016

gangliao added 3 commits November 17, 2016 14:18

Add profiler object and update docs

2c84c1e

Merge branch 'develop' of https://github.com/baidu/Paddle into profiler

f28f2e0

pre-commit hook check python style

9670b9a

reyoung requested changes Nov 18, 2016


backyes requested changes Nov 18, 2016


Add recursive mutex and counter for gpu profiler

8393c19

reyoung approved these changes Nov 23, 2016


Merge conflict with hl_cuda_device.cc

e488001

luotao1 reviewed Nov 23, 2016


backyes approved these changes Nov 25, 2016


gangliao merged commit 29b6a75 into PaddlePaddle:develop Nov 25, 2016

gangliao deleted the profiler branch November 25, 2016 07:02

zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this pull request Sep 25, 2019

Merge pull request PaddlePaddle#482 from haowang101779990/1211-update…

ccb1bb5

…-api-cn-layers 1211 update api cn layers

Meiyim added a commit to Meiyim/Paddle that referenced this pull request May 21, 2021

Create python-publish.yml (PaddlePaddle#482)

a7d51e7

* Create python-publish.yml + action * Update python-publish.yml

wangxicoding pushed a commit to wangxicoding/Paddle that referenced this pull request Dec 9, 2021

update chnsenticorp and lcqmc to qianyan format (PaddlePaddle#482)

07d414a

* update chnsenticorp and lcqmc to qianyan format * update md5 check
