
Add multi-class non-maximum suppression operator. #7953

Merged: 8 commits, Feb 2, 2018

qingqing01 (Contributor) commented Jan 29, 2018

Fix #7773

  • Clipping and scaling of the output BBoxes will be done in the next PR.
  • This PR only supports CPU. GPU support will be added later.
  • If no BBoxes are detected in any image, the output is -1 and all elements of the output LoD are zero.
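
To make the LoD remark concrete, here is a minimal sketch of how per-image detection counts could be turned into level-of-detail offsets. The helper name BuildBatchStarts and its input are hypothetical; only the batch_starts vector starting at {0} appears in the diff. When every image has zero detections, all offsets come out as zero, which is the case where the PR writes -1 as the output.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: converts the number of kept detections per image
// into cumulative offsets, starting from 0 as batch_starts does in the diff.
std::vector<std::size_t> BuildBatchStarts(
    const std::vector<std::size_t>& num_kept_per_image) {
  std::vector<std::size_t> batch_starts = {0};
  for (std::size_t n : num_kept_per_image) {
    // Each entry is the end offset of the current image's detections.
    batch_starts.push_back(batch_starts.back() + n);
  }
  return batch_starts;
}
```

With counts {2, 0, 3} the offsets are {0, 2, 2, 5}; with {0, 0} they are {0, 0, 0}.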

auto score_dims = ctx->GetInputDim("Scores");

PADDLE_ENFORCE_EQ(box_dims.size(), 2,
"The rank of Input(Bboxes) must be 3.");
Contributor

The rank of Input(Bboxes) must be 2.

Contributor Author

Done.

"The rank of Input(Bboxes) must be 3.");
PADDLE_ENFORCE_EQ(score_dims.size(), 3,
"The rank of Input(Scores) must be 3.");
PADDLE_ENFORCE_EQ(box_dims[1], 4);
Contributor

The shape of Input(Bboxes) must be [N, 4]

Contributor Author

Done.

PADDLE_ENFORCE_EQ(score_dims.size(), 3,
"The rank of Input(Scores) must be 3.");
PADDLE_ENFORCE_EQ(box_dims[1], 4);
PADDLE_ENFORCE_EQ(box_dims[0], score_dims[2]);
Contributor

The predictions number of Input(Bboxes) and Input(Scores) must be the same.

Contributor Author

Done. Thanks!
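
The three shape checks discussed in these threads can be sketched outside the framework as follows. This is an illustration only: ValidateShapes is a hypothetical name, plain vectors stand in for the framework's dimension type, and the messages follow the reviewers' suggested wording rather than the merged code.

```cpp
#include <cstdint>
#include <string>
#include <vector>

constexpr int64_t kBBoxSize = 4;

// Hypothetical stand-in for the InferShape checks. Returns an empty string
// when the shapes are consistent, otherwise the first error message.
std::string ValidateShapes(const std::vector<int64_t>& box_dims,
                           const std::vector<int64_t>& score_dims) {
  if (box_dims.size() != 2) {
    return "The rank of Input(BBoxes) must be 2.";
  }
  if (score_dims.size() != 3) {
    return "The rank of Input(Scores) must be 3.";
  }
  if (box_dims[1] != kBBoxSize) {
    return "The shape of Input(BBoxes) must be [M, 4].";
  }
  if (box_dims[0] != score_dims[2]) {
    return "The predictions number of Input(BBoxes) and Input(Scores) "
           "must be the same.";
  }
  return "";
}
```

For example, boxes of shape [10, 4] with scores of shape [2, 21, 10] pass, while scores of shape [2, 21, 8] fail the last check.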

T BBoxArea(const T* box, const bool normalized) {
if (box[2] < box[0] || box[3] < box[1]) {
// If bbox is invalid (e.g. xmax < xmin or ymax < ymin), return 0.
return T(0.);
Contributor

Use static_cast<T>(0.) or brace initialization?

Contributor Author

Done. Thanks!
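
For reference, a sketch of what BBoxArea might look like with the static_cast suggestion applied. The signature and the invalid-box check come from the diff; the body of the non-normalized branch is not shown on this page, so the +1 pixel offset used there is an assumption.

```cpp
// Box layout: [xmin, ymin, xmax, ymax].
template <class T>
inline T BBoxArea(const T* box, const bool normalized) {
  if (box[2] < box[0] || box[3] < box[1]) {
    // If bbox is invalid (e.g. xmax < xmin or ymax < ymin), return 0.
    return static_cast<T>(0.);
  }
  const T w = box[2] - box[0];
  const T h = box[3] - box[1];
  if (normalized) {
    return w * h;
  }
  // Assumption: for pixel coordinates outside [0, 1], both edges are
  // inclusive, so one pixel is added to each side length.
  return (w + 1) * (h + 1);
}
```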

for (const auto& it : selected_indices) {
int label = it.first;
const T* sdata = scores_data + label * predict_dim;
std::vector<int> indices = it.second;
Contributor

std::vector<int>& indices = it.second;

Contributor Author

Done.
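
The point of this suggestion is to avoid copying each index vector on every loop iteration. A small sketch, with a hypothetical function name and assuming selected_indices maps a label to its kept indices: because the loop variable is a const reference, the binding has to be a const reference as well.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper: counts all selected detections across classes.
std::size_t CountSelected(
    const std::map<int, std::vector<int>>& selected_indices) {
  std::size_t count = 0;
  for (const auto& it : selected_indices) {
    // Binding by const reference avoids copying the vector per class.
    const std::vector<int>& indices = it.second;
    count += indices.size();
  }
  return count;
}
```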

@@ -0,0 +1,375 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Contributor

Please fix the year.

Contributor Author

Done.


void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Bboxes"),
"Input(Bboxes) of MulticlassNMS should not be null.");
pkuyym (Contributor), Jan 31, 2018

Bboxes or BBoxes?

MulticlassNMS --> MultiClassNMSOp

Contributor Author

Done. Change to BBoxes.

PADDLE_ENFORCE(ctx->HasInput("Bboxes"),
"Input(Bboxes) of MulticlassNMS should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Scores"),
"Input(Scores) of MulticlassNMS should not be null.");
Contributor

MulticlassNMS --> MulticlassNMSOp

Contributor Author

Done.

constexpr int64_t kOutputDim = 6;
constexpr int64_t kBBoxSize = 4;

class MulticlassNMSOp : public framework::OperatorWithKernel {
Contributor

MulticlassNMSOp --> MultiClassNMSOp

Contributor Author

Done.

auto score_dims = ctx->GetInputDim("Scores");

PADDLE_ENFORCE_EQ(box_dims.size(), 2,
"The rank of Input(Bboxes) must be 3.");
Contributor

must be 3 --> must be 2

Contributor Author

Done.

AddInput("Bboxes",
"(Tensor) A 2-D Tensor with shape [M, 4] represents the location "
"predictions with M bboxes. 4 is the number of "
"each location coordinates.");
Contributor

4 is the number of each location coordinates --> Each bounding box has four coordinate values and the layout is [xmin, ymin, xmax, ymax]

Contributor Author

Done. Thanks!

"each location coordinates.");
AddInput("Scores",
"(Tensor) A 3-D Tensor with shape [N, C, M] represents the "
"confidence predictions. N is the batch size, C is the class "
Contributor

confidence predictions --> predicted confidence scores.

Contributor Author

Done. Thanks!

AddInput("Scores",
"(Tensor) A 3-D Tensor with shape [N, C, M] represents the "
"confidence predictions. N is the batch size, C is the class "
"number, M is number of predictions for each class, which is "
Contributor

", M is number of predictions for each class" --> "and M is the number of bounding boxes. For each category there are M scores in total, which correspond to the M bounding boxes. Note that M equals the first dimension of Bboxes."

Contributor Author

Done. Thanks!

AddAttr<int>(
"background_label",
"(int64_t, defalut: 0) "
"The index of background label, the background label will be ignored.")
Contributor

Please add: "If set to -1, all categories will be considered."

Contributor Author

Done.

.SetDefault(0);
AddAttr<float>("score_threshold",
"(float) "
"Only consider detections whose confidences are larger than "
Contributor

Threshold to filter out bounding boxes with low confidence score.

Contributor Author

Done.
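
To show how the background_label and score_threshold attributes interact, here is a minimal, framework-free sketch of per-class greedy NMS. It is not the code from this PR: the Box struct, the function names, the nms_threshold parameter, and the IoU formula without a pixel offset are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Hypothetical box type with layout [xmin, ymin, xmax, ymax].
struct Box { float xmin, ymin, xmax, ymax; };

// Intersection over union, assuming normalized coordinates (no +1 offset).
inline float IoU(const Box& a, const Box& b) {
  const float iw = std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin);
  const float ih = std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin);
  if (iw <= 0.f || ih <= 0.f) return 0.f;
  const float inter = iw * ih;
  const float area_a = (a.xmax - a.xmin) * (a.ymax - a.ymin);
  const float area_b = (b.xmax - b.xmin) * (b.ymax - b.ymin);
  return inter / (area_a + area_b - inter);
}

// Greedy NMS for one class; returns kept indices, highest score first.
std::vector<int> NmsOneClass(const std::vector<Box>& boxes,
                             const std::vector<float>& scores,
                             float score_threshold, float nms_threshold) {
  std::vector<int> order;
  for (int i = 0; i < static_cast<int>(scores.size()); ++i) {
    // Filter out bounding boxes with a low confidence score.
    if (scores[i] > score_threshold) order.push_back(i);
  }
  std::stable_sort(order.begin(), order.end(),
                   [&](int l, int r) { return scores[l] > scores[r]; });
  std::vector<int> kept;
  for (int idx : order) {
    bool keep = true;
    for (int k : kept) {
      if (IoU(boxes[idx], boxes[k]) > nms_threshold) { keep = false; break; }
    }
    if (keep) kept.push_back(idx);
  }
  return kept;
}

// scores is indexed [class][box]. A background_label of -1 matches no
// class index, so every category is considered.
std::map<int, std::vector<int>> MultiClassNms(
    const std::vector<Box>& boxes,
    const std::vector<std::vector<float>>& scores, int background_label,
    float score_threshold, float nms_threshold) {
  std::map<int, std::vector<int>> selected;
  for (int c = 0; c < static_cast<int>(scores.size()); ++c) {
    if (c == background_label) continue;  // the background is ignored
    std::vector<int> kept =
        NmsOneClass(boxes, scores[c], score_threshold, nms_threshold);
    if (!kept.empty()) selected[c] = std::move(kept);
  }
  return selected;
}
```

The returned map has the same label-to-indices shape as the selected_indices loop quoted earlier in the review.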

qingqing01 (Contributor Author) left a comment

@wanghaox @pkuyym

Thank you for your careful review.


const T* bdata = bboxes_data + idx * kBBoxSize;
odata[count * kOutputDim] = label; // label
odata[count * kOutputDim + 1] = sdata[idx]; // score
odata[count * kOutputDim + 2] = bdata[0]; // xmin
Contributor Author

Done, use std::memcpy
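
A sketch of what the std::memcpy version of this output write could look like. The row layout [label, score, xmin, ymin, xmax, ymax] and the two constants come from the diff; the WriteDetection helper and its parameter list are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

constexpr int64_t kOutputDim = 6;
constexpr int64_t kBBoxSize = 4;

// Hypothetical helper: writes detection number `count` into odata.
template <class T>
void WriteDetection(T* odata, int64_t count, int label, T score,
                    const T* bdata) {
  T* row = odata + count * kOutputDim;
  row[0] = static_cast<T>(label);  // label
  row[1] = score;                  // score
  // Copy xmin, ymin, xmax, ymax in one call instead of four assignments.
  std::memcpy(row + 2, bdata, kBBoxSize * sizeof(T));
}
```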

std::vector<size_t> batch_starts = {0};
for (int64_t i = 0; i < batch_size; ++i) {
Tensor ins_score = scores->Slice(i, i + 1);
ins_score.Resize({class_num, predict_dim});
Contributor Author

The Slice in Tensor doesn't reduce dimension. The shape of ins_score is [1, C, M], resize to [C, M] here.

if (normalized) {
return w * h;
} else {
// If bbox is not within range [0, 1].
Contributor Author

Done.


wanghaox previously approved these changes Feb 2, 2018
limitations under the License. */

#include "paddle/framework/op_registry.h"
#include "paddle/operators/math/math_function.h"
Contributor

Is this header necessary?

Contributor Author

Done.

}

template <class T>
T BBoxArea(const T* box, const bool normalized) {
Contributor

add inline?

Contributor Author

Done.

pkuyym previously approved these changes Feb 2, 2018

pkuyym (Contributor) left a comment

LGTM

std::vector<size_t> batch_starts = {0};
for (int64_t i = 0; i < batch_size; ++i) {
Tensor ins_score = scores->Slice(i, i + 1);
ins_score.Resize({class_num, predict_dim});
Contributor

I see, thx.

@qingqing01 qingqing01 dismissed stale reviews from pkuyym and wanghaox via a6f3846 February 2, 2018 08:33
qingqing01 (Contributor Author) left a comment

@wanghaox Thanks!


@qingqing01 qingqing01 merged commit c9ef69b into PaddlePaddle:develop Feb 2, 2018
@qingqing01 qingqing01 deleted the multiclass_nms_op branch March 7, 2018 12:02