[MXNET-380] count_include_pad argument for Avg Pooling #11021

haojin2 · 2018-05-22T07:29:30Z

Description

As title.

Checklist

Essentials

Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Add count_include_pad argument for AvgPool
Unit tests

Comments

#10194

TaoLv · 2018-05-22T07:44:45Z

This also needs to be fixed in mkldnn_pooling: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_pooling.cc#L124
If count_include_pad=False, it should use pooling_avg_exclude_padding here.

piiswrong · 2018-05-22T17:16:18Z

src/operator/nn/pooling-inl.h

@@ -50,6 +50,7 @@ struct PoolingParam : public dmlc::Parameter<PoolingParam> {
  bool global_pool;
  bool cudnn_off;
  dmlc::optional<int> p_value;
+  dmlc::optional<bool> count_include_pad;


doesn't need to be optional. Just set to the previous default

We want to ensure forward compatibility here. If this is not an optional field, the JSON file generated for the symbol will have an extra field, which we think may confuse users of earlier versions.
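The forward-compatibility argument above can be illustrated with a small sketch. This is not MXNet's serializer: the attribute dictionary layout and the helper names `serialize_attrs` and `resolve_count_include_pad` are hypothetical, and the fallback to True assumes the pre-change behavior reported elsewhere in this thread.

```python
import json


def serialize_attrs(attrs):
    """Dump operator attributes, skipping optionals that were never set (None).

    Hypothetical helper: leaving unset fields out keeps the JSON identical to
    what an older version without the new field would have produced.
    """
    return json.dumps({k: v for k, v in attrs.items() if v is not None},
                      sort_keys=True)


def resolve_count_include_pad(attrs):
    """Treat a missing field as True, i.e. the behavior before this change."""
    value = attrs.get("count_include_pad")
    return True if value is None else value


# Field left unset: nothing extra appears in the serialized symbol.
unset = {"kernel": "(2, 2)", "pool_type": "avg", "count_include_pad": None}
unset_json = serialize_attrs(unset)

# Field set explicitly: it is written out and read back unchanged.
explicit = {"kernel": "(2, 2)", "pool_type": "avg", "count_include_pad": False}
explicit_json = serialize_attrs(explicit)

print(unset_json)
print(explicit_json)
```

With a plain bool field, the first JSON string would also carry `count_include_pad`, which is the extra field the reply is concerned about.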

piiswrong · 2018-05-22T17:19:54Z

src/operator/nn/pooling-inl.h

+    .describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling.");
+
+    DMLC_DECLARE_FIELD(count_include_pad).set_default(dmlc::optional<bool>())
+    .describe("Whether to count padding elements for average calculation.");


Be more descriptive.
Only used for average pooling. Whether to count padded elements for normalization at edge and corners.
For example, blah blah
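As one possible concrete example for the doc string, the sketch below computes the divisor used at each output position of a 2D average pool. It follows the common convention of clamping the window to the padded input and is an illustration only, not the operator's code; `window_extent` and `divisor_map` are hypothetical names.

```python
def window_extent(out_idx, size, kernel, stride, pad, count_include_pad):
    """Number of elements counted along one axis for a given output index."""
    start = out_idx * stride - pad
    end = min(start + kernel, size + pad)
    if not count_include_pad:
        # Only elements that lie inside the real input are counted.
        start, end = max(start, 0), min(end, size)
    return max(end - start, 0)


def divisor_map(height, width, kernel, stride, pad, count_include_pad):
    """Divisor applied to the window sum at every output position."""
    out_h = (height + 2 * pad - kernel) // stride + 1
    out_w = (width + 2 * pad - kernel) // stride + 1
    return [
        [
            window_extent(r, height, kernel, stride, pad, count_include_pad)
            * window_extent(c, width, kernel, stride, pad, count_include_pad)
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 4x4 input, 3x3 kernel, stride 1, padding 1.
included = divisor_map(4, 4, kernel=3, stride=1, pad=1, count_include_pad=True)
excluded = divisor_map(4, 4, kernel=3, stride=1, pad=1, count_include_pad=False)

for row in excluded:
    print(row)
```

With padding counted, every position divides by 9. With padding excluded, corners divide by 4 and edges by 6, while interior positions still divide by 9.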

piiswrong · 2018-05-22T17:20:11Z

Please add tests

piiswrong · 2018-05-22T17:20:55Z

src/operator/nn/pooling-inl.h

-    .describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling");
+    .describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling.");
+
+    DMLC_DECLARE_FIELD(count_include_pad).set_default(dmlc::optional<bool>())


Argument name sounds weird

We're using the same name as Pytorch: https://pytorch.org/docs/master/nn.html?highlight=pool2d#torch.nn.AvgPool2d

would it be better to take an ignore_pad_value here instead of assuming the padding value?

piiswrong · 2018-05-22T17:21:16Z

Please add this option to Gluon's avgpool

zhanghang1989 · 2018-05-22T18:56:41Z

FYI @hetong007

haojin2 · 2018-06-02T01:35:02Z

@piiswrong @reminisce @szha Please give this a review if you have time, thanks!
@hetong007 @zhanghang1989 this should be ready soon.

szha · 2018-06-04T00:44:51Z

src/operator/nn/pool.cuh

@@ -311,7 +322,9 @@ __global__ void pool_sum_3d_gpu_kernel(const int nthreads, const DType* in_data,
        }
      }
    }
-    out_data[index] = a_root_p<DType, p>::Map(sum);
+    out_data[index] = (pool_size == 0) ?
+                      DType(0.0f / pool_size) :


division by 0?

We want to create an artificial NaN here to ensure consistency with cudnn.

Would it be better to use the NaN constants in math/limits? Performing this division produces NaN through an interrupt, which causes a slowdown.

You cannot use the std thing in gpu kernels, as they are host functions while this kernel here is a global function.

how about nan and nanf?

Why would pool_size be 0? Shouldn't that be invalid?

When your pad is wider than kernel size, you could get that pool_size when count_include_pad is False.

For example in the 3d test cases, there's one with (20, 20, 20) input, (4,5,3) kernel, and (2,3,3) pad. So on the last dim you could get 0 valid width and your whole pool_size will be 0.

DType cast should have already handled it for half_t and half2_t, or did I miss something?

@szha okay then I'll just use those functions to generate NaN.
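The zero pool_size case discussed in this thread can be reproduced with a few lines of arithmetic. The sketch below uses the shapes quoted from the 3D test case; the stride of 1 is an assumption, since the thread does not state it, and `valid_extent`, `pool_size_excluding_pad`, and `average` are hypothetical helper names rather than functions from the kernel.

```python
import math


def valid_extent(out_idx, size, kernel, stride, pad):
    """Count of real (non-padding) input elements covered along one axis."""
    start = out_idx * stride - pad
    end = min(start + kernel, size + pad)
    return max(min(end, size) - max(start, 0), 0)


def pool_size_excluding_pad(out_index, shape, kernel, stride, pad):
    """Product of the valid extents, i.e. pool_size with count_include_pad=False."""
    size = 1
    for idx, n, k, s, p in zip(out_index, shape, kernel, stride, pad):
        size *= valid_extent(idx, n, k, s, p)
    return size


def average(window_sum, pool_size):
    """Return NaN explicitly for an empty window instead of dividing by zero."""
    return window_sum / pool_size if pool_size else math.nan


shape, kernel, pad = (20, 20, 20), (4, 5, 3), (2, 3, 3)
stride = (1, 1, 1)  # assumption: stride is not given in the thread

corner = pool_size_excluding_pad((0, 0, 0), shape, kernel, stride, pad)
interior = pool_size_excluding_pad((5, 5, 5), shape, kernel, stride, pad)

print(corner, interior, average(0.0, corner))
```

On the last axis the first window spans only padding (kernel 3, pad 3), so its valid extent is 0 and the whole pool_size collapses to 0, while an interior window still covers the full 4 x 5 x 3 elements.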

haojin2 · 2018-06-05T21:27:08Z

@piiswrong Addressed all the reviews so far, please take another look when you’ve got time, thanks!

haojin2 · 2018-06-05T21:33:57Z

@szha

eric-haibin-lin · 2018-06-08T22:19:23Z

python/mxnet/gluon/nn/conv_layers.py

@@ -879,13 +881,13 @@ class AvgPool1D(_Pooling):
          equation.


Pls add documentation for count_include_pad for gluon blocks

eric-haibin-lin · 2018-06-08T22:19:50Z

python/mxnet/gluon/nn/conv_layers.py

@@ -926,13 +928,13 @@ class AvgPool2D(_Pooling):
          equation.


Also here and for AvgPool3D

eric-haibin-lin · 2018-06-08T22:22:42Z

src/operator/nn/pooling-inl.h

+    .describe("Value of p for Lp pooling, can be 1 or 2, required for Lp Pooling.");
+
+    DMLC_DECLARE_FIELD(count_include_pad).set_default(dmlc::optional<bool>())
+    .describe("Only used for AvgPool, specify whether to count padding elements for average"


The argument is optional. Is the default value True or False?

The default behavior is True, which was the behavior before the change

pls include this in the doc

eric-haibin-lin · 2018-06-08T22:24:02Z

src/operator/nn/pool.cuh

@@ -580,7 +607,8 @@ __global__ void unpool_sum_3d_gpu_kernel(const int nthreads, const DType* out_gr
                                         const int kernel_d, const int kernel_h,
                                         const int kernel_w, const int stride_d, const int stride_h,
                                         const int stride_w, const int pad_d, const int pad_h,
-                                         const int pad_w, DType* in_grad, const bool isAvg = false) {
+                                         const int pad_w, DType* in_grad, const bool isAvg = false,


when was isAvg introduced? Should be is_avg

It was here a long time ago. see #5519

haojin2 · 2018-06-12T18:36:21Z

@piiswrong Is this PR good to be merged?

TaoLv · 2018-06-17T02:27:25Z

I feel like adding new tests for this change would be better, compared with changing the existing ones.

haojin2 · 2018-06-17T02:38:23Z

@TaoLv All previous tests were still performed even with those changes. If I do separate the tests, wouldn't it be a bit confusing if the tests for count_include_pad=True and count_include_pad=False are in 2 different unit tests?

* add count_include_pad argument
* add cound_include_pad in cudnn pooling and corresponding tests
* add gluon support for the new option
* switch to built-in functions for artificial NaN
* add doc for gluon
* change isAvg/getAvg to is_avg/get_avg

marcoabreu · 2018-07-01T22:18:24Z

Hi @haojin2 , I think your PR has made test_pooling_versions flaky. See details at #11517

haojin2 · 2018-07-01T22:21:07Z

The failing test is test_2d_pooling('max'): it fails in max pooling, which does not even hit my code paths.

marcoabreu · 2018-07-01T22:27:45Z

Ah sorry, I just thought it might be this PR since it was the last one that modified the overall test. I've created an issue to track further investigation.

piiswrong reviewed May 22, 2018

haojin2 force-pushed the count_include_pad branch 2 times, most recently from f2d2e6e to 845b944 Compare May 22, 2018 17:41

haojin2 force-pushed the count_include_pad branch from 845b944 to 3f37699 Compare May 22, 2018 19:43

haojin2 requested a review from yzhliu as a code owner May 22, 2018 19:43

haojin2 force-pushed the count_include_pad branch from 3f37699 to 5eeae15 Compare May 22, 2018 20:54

This was referenced May 23, 2018

[MXNET-386] API Docs Generation #10991

Merged

[MXNET-357] New Scala API Design (NDArray) #10787

Merged

hetong007 mentioned this pull request May 23, 2018

Add Nasnet Implementation dmlc/gluon-cv#136

Merged

haojin2 force-pushed the count_include_pad branch 4 times, most recently from 2b3a1df to 1b59e20 Compare June 2, 2018 00:45

haojin2 requested a review from szha as a code owner June 2, 2018 01:28

haojin2 changed the title from "[MXNET-380] [WIP] count_include_pad argument for Avg Pooling" to "[MXNET-380] count_include_pad argument for Avg Pooling" Jun 2, 2018

haojin2 force-pushed the count_include_pad branch from a89f9cd to 59ec7ce Compare June 2, 2018 04:29

szha reviewed Jun 4, 2018

haojin2 force-pushed the count_include_pad branch 2 times, most recently from ce912fa to 4115730 Compare June 5, 2018 04:19

eric-haibin-lin self-assigned this Jun 8, 2018

eric-haibin-lin reviewed Jun 8, 2018

haojin2 force-pushed the count_include_pad branch 4 times, most recently from e380c21 to 11e9604 Compare June 11, 2018 05:59

Hao Jin added 5 commits June 11, 2018 17:58

add count_include_pad argument

b3fdf26

add cound_include_pad in cudnn pooling and corresponding tests

5b5b702

add gluon support for the new option

dec157a

switch to built-in functions for artificial NaN

7935e8e

add doc for gluon

db6279d

haojin2 force-pushed the count_include_pad branch from 11e9604 to 1dede63 Compare June 11, 2018 17:58

eric-haibin-lin removed their assignment Jun 11, 2018

change isAvg/getAvg to is_avg/get_avg

5a07a43

haojin2 force-pushed the count_include_pad branch from 1dede63 to 5a07a43 Compare June 11, 2018 21:00

eric-haibin-lin approved these changes Jun 17, 2018

szha approved these changes Jun 18, 2018

eric-haibin-lin merged commit f34dafe into apache:master Jun 18, 2018

haojin2 deleted the count_include_pad branch June 18, 2018 18:13

jmerkow mentioned this pull request May 20, 2019

Updating mxnet from 1.0.0, networks give different outputs #14421

Closed
