feature/trt engine op test #11182
Conversation
```diff
@@ -14,7 +14,7 @@
 #pragma once

-#ifdef PADDLE_WITH_CUDA
+#if PADDLE_WITH_CUDA
```
This should be `#ifdef` here.
```diff
 #include "paddle/fluid/inference/tensorrt/convert/op_converter.h"
 #include "paddle/fluid/inference/tensorrt/convert/ut_helper.h"

+USE_CPU_ONLY_OP(tensorrt_engine);
```
Why is tensorrt_engine_op CPU-only? Shouldn't it be GPU-only?
The tensorrt engine op's kernel logic runs on the CPU, and it then triggers execution on the GPU.
```diff
@@ -34,12 +35,15 @@ class OpConverter {

   // Converter logic for an op.
   virtual void operator()(const framework::proto::OpDesc& op,
-                          const framework::Scope& scope) {}
+                          const framework::Scope& scope,
+                          bool test_mode = false) {}

   // Convert a single fluid operaotr and add the corresponding layer to TRT.
```
Typo: `operaotr` → `operator`.
```diff
@@ -37,12 +36,18 @@ class MulOpConverter : public OpConverter {
       engine_, MatrixMultiply, *const_cast<nvinfer1::ITensor*>(input1), false,
       *const_cast<nvinfer1::ITensor*>(input2), false);

-  engine_->DeclareOutput(layer, 0, op_desc.Output("Out")[0]);
+  auto output_name = op_desc.Output("Out")[0];
+  engine_->SetITensor(output_name, layer->getOutput(0));
```
- Why is the output name unknown during unit testing (test_mode)? Can this be removed in a later PR?
- If it is only needed for unit tests, could it be renamed to unittest_mode? Otherwise it reads like an inference test phase, and since TRT only runs the forward pass, that would be confusing.
This avoids hard-coding the output name here.
LGTM
NEXT STEP:
Write a tool that executes a larger model and outputs benchmark results.