
Added batchnorm layer with tests #1891

Closed

Conversation

ducha-aiki (Contributor)

Based on the fixed PRs from Russell91 (base test) and ChenglongChen (implementation).

@weiliu89

I think the code is missing the second part of the derivative with respect to the mean during back-propagation.

Review comment on:

    bottom_diff);
    caffe_mul(buffer_blob_.count(), x_norm_.cpu_data()/*top_data*/, bottom_diff, bottom_diff);


// I think we should add the following code between lines 244 and 246. Correct me if I am wrong.

// EX of bottom_diff across the spatial dimensions
caffe_cpu_gemv(CblasNoTrans, N_ * C_, H_ * W_, Dtype(1), bottom_diff, spatial_multiplier_.cpu_data(), Dtype(0), spatial_mean_.mutable_cpu_data());
// EX of bottom_diff across the batch
caffe_cpu_gemv(CblasNoTrans, N_, C_, Dtype(1), spatial_mean_.cpu_data(), batch_sum_multiplier_.cpu_data(), Dtype(0), batch_mean_.mutable_cpu_data());

// broadcast the per-channel mean back over the batch ...
caffe_cpu_gemm(CblasNoTrans, CblasNoTrans, N_, C_, 1, Dtype(1), batch_sum_multiplier_.cpu_data(), batch_mean_.cpu_data(), Dtype(0), spatial_mean_.mutable_cpu_data());
// ... and over the spatial dimensions, subtracting it from bottom_diff (alpha = -1, beta = 1)
caffe_cpu_gemm(CblasNoTrans, CblasNoTrans, N_ * C_, H_ * W_, 1, Dtype(-1), spatial_mean_.cpu_data(), spatial_multiplier_.cpu_data(), Dtype(1), bottom_diff);
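A sketch of what this suggestion computes, assuming spatial_multiplier_ and batch_sum_multiplier_ carry the averaging weights as elsewhere in this layer (my reading of the code, not the author's derivation): the four BLAS calls center the gradient per channel,

    d_{n,c,h,w} \leftarrow d_{n,c,h,w} - \mathbb{E}_{n,h,w}[d_{n,c,h,w}],

which is the "subtract the mean of the gradient" term of the batch-norm backward pass.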

@weiliu89

During the backward pass, it misses the "Average 2" step, as shown in the image I attached.
Correct me if I am wrong.

[Image: bn_eq — the batch-normalization backward-pass equations]

@ChenglongChen

@weiliu89,
The second part in Eq. (3) can be dropped because x_i - \mu has zero mean over the mini-batch; as a result, "Average 2" is not necessary.
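A minimal sketch of that argument, in the same notation: the derivative over the mean has two parts,

    \frac{\partial \ell}{\partial \mu} = \sum_i \frac{\partial \ell}{\partial \hat{x}_i} \cdot \frac{-1}{\sqrt{\sigma^2 + \epsilon}} + \frac{\partial \ell}{\partial \sigma^2} \cdot \frac{-2}{m} \sum_i (x_i - \mu),

and since \sum_i (x_i - \mu) = m\mu - m\mu = 0, the second part (the one "Average 2" would compute) vanishes.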

@weiliu89

@ChenglongChen
I think you are correct. Do you have any idea how to compute the derivative with respect to the input x? Do you have any reference on how to do it?
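For reference (my sketch from the chain rule in the Ioffe & Szegedy batch-normalization paper, not code from this PR), with \hat{x}_i = (x_i - \mu) / \sqrt{\sigma^2 + \epsilon} and m the number of elements averaged over:

    \frac{\partial \ell}{\partial x_i} = \frac{\partial \ell}{\partial \hat{x}_i} \cdot \frac{1}{\sqrt{\sigma^2 + \epsilon}} + \frac{\partial \ell}{\partial \sigma^2} \cdot \frac{2 (x_i - \mu)}{m} + \frac{\partial \ell}{\partial \mu} \cdot \frac{1}{m},

which, using the zero-mean argument above, collapses to

    \frac{\partial \ell}{\partial x_i} = \frac{1}{\sqrt{\sigma^2 + \epsilon}} \left( \frac{\partial \ell}{\partial \hat{x}_i} - \frac{1}{m} \sum_j \frac{\partial \ell}{\partial \hat{x}_j} - \frac{\hat{x}_i}{m} \sum_j \frac{\partial \ell}{\partial \hat{x}_j} \hat{x}_j \right).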

ducha-aiki (Contributor, Author)

Replaced by #1965

ducha-aiki closed this on Feb 25, 2015.
ducha-aiki deleted the batch-norm-layer branch on February 25, 2015.