
Reduce memory copy when communicating between trainer and pserver. #9271

Merged
merged 29 commits into PaddlePaddle:develop from optsend
Mar 22, 2018

Conversation

gongweibao
Contributor

No description provided.

@@ -237,6 +242,8 @@ def train_loop(exe, trainer_prog):
"TRAINING_ROLE",
"TRAINER") # get the training role: trainer/pserver

#print(debuger.pprint_program_codes(fluid.default_main_program().desc))
Contributor

Remove comments.

Contributor Author

Done.

@@ -251,6 +258,7 @@ def train_loop(exe, trainer_prog):
if not current_endpoint:
    print("need env SERVER_ENDPOINT")
    exit(1)
print("get_pserver_program")
Contributor

remove print

Contributor Author

Done.

Contributor

Not removed yet

@@ -0,0 +1,315 @@
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Contributor

Why not use an argument to decide whether to run tf locally or not?

Contributor Author

Done.

@@ -32,7 +32,8 @@ void ThreadPool::Init() {
// TODO(Yancey1989): specify the max threads number
int num_threads = std::thread::hardware_concurrency();
PADDLE_ENFORCE_GT(num_threads, 0);
threadpool_.reset(new ThreadPool(num_threads));
// threadpool_.reset(new ThreadPool(num_threads));
threadpool_.reset(new ThreadPool(1));
Contributor

should revert this change.

Contributor Author

Done.
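For context on the guard in the hunk above: `PADDLE_ENFORCE_GT(num_threads, 0)` exists because `std::thread::hardware_concurrency()` is only a hint and is permitted to return 0 when the value is not computable. A minimal standalone sketch of that behavior (illustrative only, not the PR's code, which aborts via the enforce macro instead of falling back):

```cpp
#include <thread>

// hardware_concurrency() may return 0 when the hardware concurrency is not
// computable, so a pool-sizing helper needs either a check or a fallback.
int SafeNumThreads() {
  unsigned n = std::thread::hardware_concurrency();
  return n > 0 ? static_cast<int>(n) : 1;  // fall back to a single thread
}
```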

::google::protobuf::io::ZeroCopyInputStream* contents() override {
DeleteStream();
stream_ = new (&space_) Reader(buffer_);
return stream_;
Contributor

Why do we need so many wrappers? Just creating the ::grpc::GrpcBufferReader as a ZeroCopyInputStream could be simpler.

Contributor

So make it not use any abstract type. The interface is simple enough to understand without this abstraction.

Contributor Author

I don't follow...
The Parse function needs to support two parameter types, ByteBuffer and grpc_byte_buffer. Both can be converted to a ZeroCopyInputStream, but ZeroCopyInputStream itself cannot be used as the parameter type.
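To illustrate the pattern being described, here is a hedged, self-contained sketch (not the PR's actual code): each concrete buffer type gets its own overload, each overload adapts its buffer to a ZeroCopyInputStream on the stack, and one shared routine does the parsing. Protobuf's ArrayInputStream stands in for the GrpcBufferReader-style adapters that the real code builds over ByteBuffer and grpc_byte_buffer:

```cpp
#include <google/protobuf/io/zero_copy_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
#include <google/protobuf/message_lite.h>

using ::google::protobuf::MessageLite;
using ::google::protobuf::io::ArrayInputStream;
using ::google::protobuf::io::ZeroCopyInputStream;

// Shared implementation: every buffer type funnels into this once it has
// been adapted to a ZeroCopyInputStream.
bool ParseFromStream(MessageLite* msg, ZeroCopyInputStream* input) {
  return msg->ParseFromZeroCopyStream(input);
}

// Overload for one concrete buffer type (a contiguous array here; the PR
// instead adapts ByteBuffer and grpc_byte_buffer with reader wrappers).
bool Parse(MessageLite* msg, const void* data, int size) {
  ArrayInputStream stream(data, size);  // stack-owned adapter
  return ParseFromStream(msg, &stream);
}
```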

struct timeval t0_wait, t1_wait;
gettimeofday(&t0_wait, 0);
std::thread::id this_id = std::this_thread::get_id();
*/
Contributor

remove comments

Contributor Author

Done.

@gongweibao gongweibao changed the title [WIP] Reduce memory copy when communicating between trainer and pserver. Reduce memory copy when communicating between trainer and pserver. Mar 21, 2018
typhoonzero
typhoonzero previously approved these changes Mar 21, 2018
Contributor

@typhoonzero typhoonzero left a comment

LGTM, a lot of work to finish the SerialiseTraits
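For readers unfamiliar with the mechanism the approval refers to: gRPC dispatches message (de)serialization through a grpc::SerializationTraits specialization, so providing one for the variable message lets the trainer/pserver channel hand buffers to gRPC without an intermediate copy. A rough sketch of the shape such a specialization takes; signatures vary across gRPC versions, VariableMessage is a hypothetical stand-in, and this is illustrative rather than the PR's code:

```cpp
#include <grpc++/grpc++.h>

struct VariableMessage;  // hypothetical stand-in for the PR's message type

namespace grpc {

template <>
class SerializationTraits<VariableMessage, void> {
 public:
  // Write msg into a ByteBuffer, ideally sharing slices instead of copying.
  static Status Serialize(const VariableMessage& msg, ByteBuffer* bb,
                          bool* own_buffer);
  // Read msg back out of the received buffer, again avoiding a full copy.
  static Status Deserialize(ByteBuffer* buffer, VariableMessage* msg);
};

}  // namespace grpc
```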

@@ -251,6 +258,7 @@ def train_loop(exe, trainer_prog):
if not current_endpoint:
    print("need env SERVER_ENDPOINT")
    exit(1)
print("get_pserver_program")
Contributor

Not removed yet

@gongweibao
Contributor Author

Done.

@gongweibao gongweibao merged commit 990d639 into PaddlePaddle:develop Mar 22, 2018
@gongweibao gongweibao deleted the optsend branch March 22, 2018 02:47
mikeseven added a commit to mikeseven/Paddle that referenced this pull request Mar 22, 2018
* commit '9c35b0dc1ba0ace5acf721685802a21045ea1249': (36 commits)
  Fix dist compile error (PaddlePaddle#9320)
  Fix bug for backward transpiler when using parallel_do operator. (PaddlePaddle#9282)
  update
  fix transpiler bug
  Update index_en.rst (PaddlePaddle#9286)
  "fix mixed_vector bug" (PaddlePaddle#9319)
  Update index_en.rst (PaddlePaddle#9280)
  Adjust some contents in write_docs_en.rst for Contribue Documentation (PaddlePaddle#9147)
  CMake refine for HIP support. Fix CI.
  Reduce memory copy when communicating between trainer and pserver. (PaddlePaddle#9271)
  Modified build.sh and remove build_doc.sh
  fix doc
  Enhance device context pool (PaddlePaddle#9293)
  Device blobs are created only in training. Added testing attribute
  Shrink batch_norm_grad's inputs
  updates
  prepare and create op before run
  wip
  small fix
  initial commit
  ...

# Conflicts:
#	cmake/external/eigen.cmake
blacksheep-Aristotle pushed a commit to blacksheep-Aristotle/Paddle that referenced this pull request Nov 22, 2024
…ddle#9271)

* chatglm2 support block_attn and fix some bugs
* fix ci
* fix more ut error
* update