-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[MXNET-380] count_include_pad argument for Avg Pooling #11021
Conversation
Also need fix it in mkldnn_pooling: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_pooling.cc#L124 |
@@ -50,6 +50,7 @@ struct PoolingParam : public dmlc::Parameter<PoolingParam> { | |||
bool global_pool; | |||
bool cudnn_off; | |||
dmlc::optional<int> p_value; | |||
dmlc::optional<bool> count_include_pad; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't need to be optional. Just set to the previous default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to ensure forward compatibility here, if this is not an optional field, json file generated for the symbol will have an extra field, which we think may cause confusions for users of earlier versions.
src/operator/nn/pooling-inl.h
Outdated
.describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling."); | ||
|
||
DMLC_DECLARE_FIELD(count_include_pad).set_default(dmlc::optional<bool>()) | ||
.describe("Whether to count padding elements for average calculation."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be more descriptive.
Only used for average pooling. Whether to count padded elements for normalization at edge and corners.
For example, blah blah
Please add tests |
.describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling"); | ||
.describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling."); | ||
|
||
DMLC_DECLARE_FIELD(count_include_pad).set_default(dmlc::optional<bool>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argument name sounds weird
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the same name as Pytorch: https://pytorch.org/docs/master/nn.html?highlight=pool2d#torch.nn.AvgPool2d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be better to take an ignore_pad_value here instead of assuming the padding value?
Please add this option to Gluon's avgpool |
f2d2e6e
to
845b944
Compare
FYI @hetong007 |
845b944
to
3f37699
Compare
3f37699
to
5eeae15
Compare
2b3a1df
to
1b59e20
Compare
@piiswrong @reminisce @szha Please give this a review if you have time, thanks! |
a89f9cd
to
59ec7ce
Compare
src/operator/nn/pool.cuh
Outdated
@@ -311,7 +322,9 @@ __global__ void pool_sum_3d_gpu_kernel(const int nthreads, const DType* in_data, | |||
} | |||
} | |||
} | |||
out_data[index] = a_root_p<DType, p>::Map(sum); | |||
out_data[index] = (pool_size == 0) ? | |||
DType(0.0f / pool_size) : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
division by 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to create an artificial NaN here to ensure consistency with cudnn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be better to use nan constants in math/limits
? performing this division produces nan through interrupt which causes slow down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot use the std thing in gpu kernels, as they are host functions while this kernel here is a global function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would pool_size be 0? Shouldn't that be invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When your pad is wider than kernel size, you could get that pool_size when count_include_pad is False.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example in the 3d test cases, there's one with (20, 20, 20) input, (4,5,3) kernel, and (2,3,3) pad. So on the last dim you could get 0 valid width and your whole pool_size will be 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DType cast should have already handled it for half_t and half2_t, or did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@szha okay then I'll just use those functions to generate NaN.
ce912fa
to
4115730
Compare
@piiswrong Addressed all the reviews so far, please take another look when you’ve got time, thanks! |
@@ -879,13 +881,13 @@ class AvgPool1D(_Pooling): | |||
equation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add documentation for count_include_pad for gluon blocks
@@ -926,13 +928,13 @@ class AvgPool2D(_Pooling): | |||
equation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here and and for AvgPool3D
.describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling."); | ||
|
||
DMLC_DECLARE_FIELD(count_include_pad).set_default(dmlc::optional<bool>()) | ||
.describe("Only used for AvgPool, specify whether to count padding elements for average" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument is optional. Is the default value True or False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default behavior is True, which was the behavior before the change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls include this in the doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
src/operator/nn/pool.cuh
Outdated
@@ -580,7 +607,8 @@ __global__ void unpool_sum_3d_gpu_kernel(const int nthreads, const DType* out_gr | |||
const int kernel_d, const int kernel_h, | |||
const int kernel_w, const int stride_d, const int stride_h, | |||
const int stride_w, const int pad_d, const int pad_h, | |||
const int pad_w, DType* in_grad, const bool isAvg = false) { | |||
const int pad_w, DType* in_grad, const bool isAvg = false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when was isAvg introduced? Should be is_avg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was here a long time ago. see #5519
e380c21
to
11e9604
Compare
11e9604
to
1dede63
Compare
1dede63
to
5a07a43
Compare
@piiswrong Is this PR good to be merged? |
I feel like adding new tests for this change would be better, compared with changing the existing ones. |
@TaoLv All previous tests were still performed even with those changes. If I do separate the tests, wouldn't it be a bit confusing if the tests for |
* add count_include_pad argument * add cound_include_pad in cudnn pooling and corresponding tests * add gluon support for the new option * switch to built-in functions for artificial NaN * add doc for gluon * change isAvg/getAvg to is_avg/get_avg
|
Ah sorry, I just thought it might be this PR since it was the last one that modified the overall test. I've created an issue to track further investigation. |
* add count_include_pad argument * add cound_include_pad in cudnn pooling and corresponding tests * add gluon support for the new option * switch to built-in functions for artificial NaN * add doc for gluon * change isAvg/getAvg to is_avg/get_avg
Description
As title.
Checklist
Essentials
Changes
Comments
#10194