bugfix/trt op with kernel #11408
Conversation
Superjomn commented Jun 12, 2018 • edited
- add one FC layer for benchmark
- TODO: add more layers
- fixed a bug where the TRT engine was not shared across all the TRT engine op kernels (see the sketch after this list)
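As a rough illustration of the fix, here is a minimal sketch of the sharing pattern: all engines live in one process-wide, name-keyed registry, and every TRT engine op kernel resolves its engine by name instead of caching it as a member. This is not Paddle's actual code; `EngineRegistry`, `Global`, and `GetOrCreate` are made-up names for illustration.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Stand-in for inference::tensorrt::TensorRTEngine (hypothetical).
class Engine {};

// Hypothetical process-wide registry shared by all TRT engine op kernels;
// each op looks up its own engine by name instead of holding a member.
class EngineRegistry {
 public:
  static EngineRegistry& Global() {
    static EngineRegistry registry;  // single instance for the whole process
    return registry;
  }

  // Create or get an engine called `name`.
  Engine* GetOrCreate(const std::string& name) {
    auto it = engines_.find(name);
    if (it == engines_.end()) {
      it = engines_.emplace(name, std::make_unique<Engine>()).first;
    }
    return it->second.get();
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<Engine>> engines_;
};
```

A kernel's Compute would then call something like `EngineRegistry::Global().GetOrCreate(engine_name)` on every invocation, so the kernel object itself stays stateless.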
return engines_.at(name).get();
}

// Create or get an engine called `key`
Should this say "Create or get an engine called `name`"? The parameter is `name`, not `key`.
mutable inference::tensorrt::TensorRTEngine* engine_{nullptr};
mutable int max_batch_{0};
// TODO(Superjomn) replace this stream with context's stream.
// mutable cudaStream_t stream_;
Are the previously declared stream_, engine_, and max_batch_ all removed?
Yes. Those caused a bug: globally there is only one kernel instance per op type, so the kernel cannot hold member state. (A toy illustration follows.)
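To make the failure mode concrete, here is a self-contained toy model (an assumed simplification, not the framework's real registration machinery): since one kernel object serves every op of its type, a mutable member engine would be overwritten by whichever op ran last, whereas a name-keyed lookup keeps each op's engine distinct.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Engine { std::string id; };

// Shared registry: safe, because lookups are keyed by the op's engine name.
std::unordered_map<std::string, Engine> g_engines;

struct TRTEngineKernel {  // one instance exists for the whole op type
  void Run(const std::string& engine_name) const {
    // Look the engine up by name on every call; no member state is kept.
    Engine& e =
        g_engines.try_emplace(engine_name, Engine{engine_name}).first->second;
    assert(e.id == engine_name);  // each op keeps its own engine
  }
};

int main() {
  TRTEngineKernel kernel;  // the single, type-wide kernel object
  kernel.Run("op_a");      // builds/uses engine "op_a"
  kernel.Run("op_b");      // builds/uses engine "op_b"; "op_a" is untouched
  assert(g_engines.size() == 2);
}
```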
… feature/trt_manul_benchmark
…rjomn/Paddle into feature/trt_manul_benchmark