Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GPU Profiler in PaddlePaddle #482

Merged
merged 15 commits into from
Nov 25, 2016
Merged

Conversation

gangliao
Copy link
Contributor

@gangliao gangliao commented Nov 15, 2016

  • Add GPU profiler object API
  • Add Performance tuning docs
  • Add unit test
  • pre-commit hook check code style

@gangliao
Copy link
Contributor Author

#289

@@ -42,7 +42,7 @@ addons:
before_install:
- |
if [ ${JOB} == "BUILD_AND_TEST" ]; then
if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)'
if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)|(\.rst$)|(\.jpg$)|(\.png$)'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only modify rst, md, Travis ci only build Docs

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool

@@ -95,7 +95,7 @@ As a simple example, consider the following:
```bash
# necessary
sudo apt-get update
sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
sudo apt-get install -y g++ make cmake swig build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add swig to avoid demo failed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool

@gangliao gangliao added this to the 0.10.0 milestone Nov 15, 2016
@coveralls
Copy link

Coverage Status

Coverage increased (+0.009%) to 62.97% when pulling ff6205d on gangliao:profiler into 9bce328 on baidu:develop.

@@ -14,3 +14,4 @@ add_simple_unittest(test_perturbation)
add_simple_unittest(test_CpuGpuVector)
add_simple_unittest(test_Allocator)
add_simple_unittest(test_FPException)
add_simple_unittest(test_GpuProfiler)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update to the newest develop code, and use the pre commit hooks. This file has a bad eof

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

TEST(Profiler, BilinearFwdBwd) {
hl_profiler_start();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could write a raii object for gpu profile, just like lock guard or set device. And we should add a global variable for profile refference count for maintaining the reentrant function call.

But it could be not in this PR.


.. code-block:: c++

TEST(Profiler, BilinearFwdBwd) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rst file could reference other files locally. it is not need to copy and paste your codes. Please try to use literal_include(or something similar, i cannot remember the command exactly).

@@ -42,7 +42,7 @@ addons:
before_install:
- |
if [ ${JOB} == "BUILD_AND_TEST" ]; then
if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)'
if ! git diff --name-only $TRAVIS_COMMIT_RANGE | grep -qvE '(\.md$)|(\.rst$)|(\.jpg$)|(\.png$)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool

@@ -95,7 +95,7 @@ As a simple example, consider the following:
```bash
# necessary
sudo apt-get update
sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
sudo apt-get install -y g++ make cmake swig build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool

@luotao1
Copy link
Contributor

luotao1 commented Nov 16, 2016

因为这些commit中没有merge主干的部分,所以能将commit合并成一个提交么?

@coveralls
Copy link

Coverage Status

Coverage increased (+0.09%) to 62.899% when pulling 9670b9a on gangliao:profiler into d0a908d on baidu:develop.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.1%) to 62.911% when pulling 9670b9a on gangliao:profiler into d0a908d on baidu:develop.

@@ -280,4 +281,23 @@ inline StatSet& registerTimerArg2(uint64_t threshold = -1,

#endif // DISABLE_TIMER

class GpuProfiler final {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final关键词似乎会让Paddle的GCC版本提升到4.7+。我有印象4.6是不支持final的。。

Explicit virtual overrides
https://gcc.gnu.org/projects/cxx-status.html#cxx11

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不过我们把gcc升到4.8+也未尝不可。因为百度的toolchain已经是4.8.2了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我也觉得可以升到4.8+

class GpuProfiler final {
public:
GpuProfiler() { hl_profiler_start(); }
~GpuProfiler() { hl_profiler_end(); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里有一个问题是,如果有重入的调用就会有问题吧。例如

void funcA() {
   GpuProfiler p;
   ...
}

void funcB() {
   GpuProfiler p;
   funcA();
   ...
}

所以,GpuProfiler还需要在全局维护一个引用计数,并且需要是使用 http://en.cppreference.com/w/cpp/thread/recursive_mutex 来保护的。即Gpu测试也只能在一个线程里进行开关控制。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

有道理啊

======================
Since training deep neural network typically take a very long time to get over, performance is gradually becoming
the most important thing in deep learning field. The first step to improve performance is to understand what parts
are slow. No point in improving performance of a region which doesn’t take much time!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No point in improving performance of a region which doesn’t take much time!
没看懂这句话啥意思?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我把 there is 加上吧。
There is no point in improving performance of a region which doesn’t take much time!
这个应该是amdahl's law理念吧。。

The above code snippet includes two methods, you can use any of them to profile the regions of interest.

1. :code:`REGISTER_TIMER_INFO` is a built-in timer wrapper which can calculate the time overhead of both cpu functions and cuda kernels.
2. :code:`REGISTER_GPU_PROFILER` is a general purpose wrapper object of :code:`cudaProfilerStart` and :code:`cudaProfilerStop` to avoid
Copy link
Contributor

@backyes backyes Nov 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add blank line between 1. 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.08%) to 61.725% when pulling 8393c19 on gangliao:profiler into d0a908d on PaddlePaddle:develop.

@reyoung
Copy link
Collaborator

reyoung commented Nov 23, 2016

@backyes Please review this PR.
@gangliao Please resolve the conflicts.

using namespace paddle; // NOLINT
using namespace std; // NOLINT

void MatrixCheckErr(const Matrix& matrix1, const Matrix& matrix2) {
Copy link
Contributor

@luotao1 luotao1 Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatrixCheckErr和testBilinearFwdBwd在test_MatrixCompare中都有,不用再写一遍了。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

下一个PR吧 把test_matrixCompare.cpp里面几个check函数 转移到test_matrixUtil.h, 有些check函数也重复了。。。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,不过会和@hedaoyuan #385 的冲突了么

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不至于冲突,他那个是更高一层abstraction, 有了他那个,确实也不急着改了。。。

@gangliao gangliao merged commit 29b6a75 into PaddlePaddle:develop Nov 25, 2016
@gangliao gangliao deleted the profiler branch November 25, 2016 07:02
zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this pull request Sep 25, 2019
…-api-cn-layers

1211 update api cn layers
Meiyim added a commit to Meiyim/Paddle that referenced this pull request May 21, 2021
* Create python-publish.yml

+ action

* Update python-publish.yml
wangxicoding pushed a commit to wangxicoding/Paddle that referenced this pull request Dec 9, 2021
* update chnsenticorp and lcqmc to qianyan format

* update md5 check
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants