text_classification infer performance test #11080

Merged: 22 commits into PaddlePaddle:develop on Jun 4, 2018

Conversation

@tensor-tang (Contributor) commented on May 31, 2018:

This PR enables the text_classification model inference test in both single-thread and multi-thread modes.

Measured results on an Intel Xeon E5-2620 v2 CPU:

Threads | Latency per thread (ms) | Throughput (QPS = samples / total time)
--------|-------------------------|----------------------------------------
1       | 2.29                    | 436
12      | 6.56                    | 1808
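As a sanity check, throughput here is roughly threads × 1000 / latency: 1000 / 2.29 ≈ 437 QPS on one thread, and 12 × 1000 / 6.56 ≈ 1829 QPS on twelve, matching the measured 436 and 1808 within a few percent.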

@tensor-tang changed the title from "[DO NOT MERGE] nlp performance test" to "nlp performance test" on Jun 1, 2018
@tensor-tang changed the title from "nlp performance test" to "text_classification infer performance test" on Jun 1, 2018
@tensor-tang (Contributor Author) commented:

The last CI failure is:

[Step 1/1] In file included from /paddle/paddle/fluid/operators/tensorrt_engine_op.cc:19:0:
[08:12:28][Step 1/1] /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h: In member function 'void paddle::inference::tensorrt::OpConverter::ConvertOp(const paddle::framework::proto::OpDesc&, const std::unordered_set<std::__cxx11::basic_string<char> >&, const paddle::framework::Scope&, paddle::inference::tensorrt::TensorRTEngine*)':
[08:12:28][Step 1/1] /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:43:51: error: no matching function for call to 'paddle::framework::OpDesc::OpDesc(const paddle::framework::proto::OpDesc&, std::nullptr_t, std::nullptr_t)'
[08:12:28][Step 1/1]      framework::OpDesc op_desc(op, nullptr, nullptr);
[08:12:28][Step 1/1]                                                    ^
[08:12:28][Step 1/1] In file included from /paddle/paddle/fluid/framework/block_desc.h:24:0,
[08:12:28][Step 1/1]                  from /paddle/paddle/fluid/framework/operator.h:26,
[08:12:28][Step 1/1]                  from /paddle/paddle/fluid/operators/tensorrt_engine_op.h:19,
[08:12:28][Step 1/1]                  from /paddle/paddle/fluid/operators/tensorrt_engine_op.cc:17:
[08:12:28][Step 1/1] /paddle/paddle/fluid/framework/op_desc.h:40:3: note: candidate: paddle::framework::OpDesc::OpDesc(const paddle::framework::OpDesc&, paddle::framework::BlockDesc*)
[08:12:28][Step 1/1]    OpDesc(const OpDesc &other, BlockDesc *block) {
[08:12:28][Step 1/1]    ^
[08:12:28][Step 1/1] /paddle/paddle/fluid/framework/op_desc.h:40:3: note:   candidate expects 2 arguments, 3 provided
[08:12:28][Step 1/1] /paddle/paddle/fluid/framework/op_desc.h:38:12: note: candidate: paddle::framework::OpDesc::OpDesc(paddle::framework::BlockDesc*)
[08:12:28][Step 1/1]    explicit OpDesc(BlockDesc *block) : block_(block) {}
[08:12:28][Step 1/1] paddle/fluid/operators/CMakeFiles/tensorrt_engine_op.dir/build.make:62: recipe for target 'paddle/fluid/operators/CMakeFiles/tensorrt_engine_op.dir/tensorrt_engine_op.cc.o' failed

https://paddleci.ngrok.io/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=37591&_focus=4127

auto start_ms = GetCurrentMs();
for (size_t i = 0; i < inputs.size(); ++i) {
  feed_targets[feed_target_names[0]] = inputs[i];
  executor->Run(*copy_program, scope, &feed_targets, &fetch_targets, true,
A reviewer (Member) commented:

Create a sub-scope to run the executor:

auto sub_scope = scope->NewScope();
executor->Run(sub_scope);
scope->DeleteScope(sub_scope);

@jacquesqiao (Member) commented on Jun 1, 2018:

Feed the data into the sub-scope if needed.
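A minimal sketch combining both suggestions (the Run overload and the reference/pointer handling of the sub-scope are assumptions based on the calls quoted in this thread):

// Per-iteration private scope: feed into it, run, then delete it so
// intermediate variables do not accumulate in the parent scope.
auto& sub_scope = scope->NewScope();
feed_targets[feed_target_names[0]] = inputs[i];
executor->Run(*copy_program, &sub_scope, &feed_targets, &fetch_targets,
              true, true, feed_holder_name, fetch_holder_name);
scope->DeleteScope(&sub_scope);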

@tensor-tang (Contributor Author) replied:
Thanks. I will update.

// 1. Define place, executor, scope
auto place = paddle::platform::CPUPlace();
auto executor = paddle::framework::Executor(place);
auto* scope = new paddle::framework::Scope();
A reviewer (Contributor) commented:

std::unique_ptr<paddle::framework::Scope> scope(new paddle::framework::Scope);
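This swaps the raw new above for RAII ownership, so the scope is released automatically when it goes out of scope instead of leaking.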

@tensor-tang (Contributor Author) replied:
OK, sure~

for (size_t i = 0; i < inputs.size(); ++i) {
  feed_targets[feed_target_names[0]] = inputs[i];
  executor->Run(*copy_program, &sub_scope, &feed_targets, &fetch_targets,
                true, true, feed_holder_name, fetch_holder_name);
A reviewer (Contributor) commented:

Add comments to the boolean arguments, such as:

true /*use_gpu*/, true /*with_monitor*/

so the call reads more clearly.
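Applied to the quoted call, that would look like the following (the reviewer's names above are illustrative; whether the two booleans are actually create_local_scope and create_vars in this Executor::Run overload is an assumption, not verified here):

executor->Run(*copy_program, &sub_scope, &feed_targets, &fetch_targets,
              true /*create_local_scope*/, true /*create_vars*/,
              feed_holder_name, fetch_holder_name);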

@tensor-tang (Contributor Author) replied:
OK, thanks~

return 1e+3 * time.tv_sec + 1e-3 * time.tv_usec;
}
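For context, a self-contained reconstruction of the timing helper this return belongs to (the name GetCurrentMs appears in the snippet at the top of this thread; the body around the return is an assumption):

#include <sys/time.h>

// Wall-clock time in milliseconds: gettimeofday() reports seconds and
// microseconds, scaled here by 1e+3 and 1e-3 respectively.
inline double GetCurrentMs() {
  struct timeval time;
  gettimeofday(&time, NULL);
  return 1e+3 * time.tv_sec + 1e-3 * time.tv_usec;
}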

// return size of total words
A reviewer (Contributor) commented:
Add some comment to describe the function.

@tensor-tang (Contributor Author) replied:
OK, sure~

return sz;
}

void SplitData(
A reviewer (Contributor) commented:
Add some comment to describe the function.

@tensor-tang (Contributor Author) replied:
OK, sure~
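Since SplitData's body is not part of the quoted hunk, here is a hypothetical sketch of what such a helper does in this kind of test: distribute the loaded samples round-robin into one job list per thread (the template signature and names are assumptions):

#include <vector>

// Distribute N samples over num_threads jobs; round-robin keeps the
// per-thread counts within one sample of each other.
template <typename T>
void SplitData(const std::vector<T>& datasets,
               std::vector<std::vector<const T*>>* jobs, int num_threads) {
  jobs->clear();
  jobs->resize(num_threads);
  for (size_t i = 0; i < datasets.size(); ++i) {
    (*jobs)[i % num_threads].push_back(&datasets[i]);
  }
}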

@tensor-tang (Contributor Author) left a review:

will update

@tensor-tang (Contributor Author) commented:
Done @Superjomn

#include <omp.h>
#endif

DEFINE_string(modelpath, "", "Directory of the inference model.");
A reviewer (Contributor) commented:

model_path

@tensor-tang (Contributor Author) replied:

Sure. I had been following the existing dirname naming before.

#endif

DEFINE_string(modelpath, "", "Directory of the inference model.");
DEFINE_string(datafile, "", "File of input index data.");
A reviewer (Contributor) commented:

data_file

@tensor-tang (Contributor Author) replied:

Sure

size_t LoadData(std::vector<paddle::framework::LoDTensor>* out,
                const std::string& filename) {
  if (filename.empty()) {
    return DummyData(out);
A reviewer (Contributor) commented:

Add a log, such as:

LOG(WARNING) << "pass empty filename, use some dummy data instead";


@panyx0718 previously approved these changes on Jun 3, 2018.

@panyx0718 (Contributor) left a review:
Please address those small comments

@@ -38,3 +38,10 @@ inference_test(recommender_system)
#inference_test(rnn_encoder_decoder)
#inference_test(understand_sentiment ARGS conv)
inference_test(word2vec)

# This is an ugly workaround to make this test run
A reviewer (Contributor) commented:

Add a TODO to clean up?

@tensor-tang (Contributor Author) replied:

Sure~

while (getline(iss, field, ' ')) {
  ids.push_back(stoi(field));
}
if (ids.size() >= 1024) {
A reviewer (Contributor) commented:

why?

@tensor-tang (Contributor Author) replied:

After syncing with the NLP team: they ignore inputs longer than 1024 words, so this test skips those samples as well.
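A hedged sketch of that parsing path with the cap in place (kMaxLen, the function name ParseLine, and the skip-on-overflow return are assumptions drawn from this thread, not the PR's exact code):

#include <sstream>
#include <string>
#include <vector>

constexpr size_t kMaxLen = 1024;  // samples at or above this length are skipped

// Parse one line of space-separated word ids; return false when the
// sample exceeds the cap and should be ignored by the loader.
bool ParseLine(const std::string& line, std::vector<int64_t>* ids) {
  std::istringstream iss(line);
  std::string field;
  while (std::getline(iss, field, ' ')) {
    ids->push_back(std::stoi(field));
  }
  return ids->size() < kMaxLen;
}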

@tensor-tang (Contributor Author) left a review:

Thanks for your input, I will update.

@tensor-tang merged commit 07c48db into PaddlePaddle:develop on Jun 4, 2018.
@tensor-tang deleted the nlp branch on Jun 4, 2018 at 03:53.