Skip to content

Reproduction of the model from the paper “Dynamic Routing Between Capsules”.

Notifications You must be signed in to change notification settings

manuelsh/capsule-networks-pytorch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

Capsules Network Implementation in PyTorch

Work done with Pau Rué (https://github.com/paurue).

Reproduction of the model from the paper “Dynamic Routing Between Capsules” (https://arxiv.org/abs/1710.09829). This model achieves the best performance in the CapsNet implementations available in GitHub that publishes their results (as of 31 March 2018).

Why another model?

After reviewing a number of implementations of the model, we found several bugs on them. Once these bugs were corrected, we still were not able to arrive close to the performance of the paper. The main reason for that is in the way the optimisation is performed, which is not fully explained in the original paper. The authors employ an exponential decay on the learning rate, with a decay rate of 0.96 which is updated every 2000 steps.

The model results are very sensitive to the correct selection of these parameters. We were able to arrive to 0.28% error rate, which is very close to the performance of the paper (0.25%). We believe that with further tuning a 0.25% error rate is attainable.

Other critical parts that needs to be correctly addressed to make the model work are:

  • There should not be backpropagation through each of the steps of the routing mechanism, only on the last step (use .detach method).
  • The individual capsules of the second layer (PrimaryCaps) should be built with the .view method in the correct way, and for that one needs to set the dimensions in the correct order (in our case we use .permute). Many implementations were reshaping the tensor incorrectly, and hence the resulting capsules were not transversal to the filters, but contained in the filters.

We hope this implementation may be helpful for other researchers.

Other Implementations

(Credit to XifengGuo for compiling the list)

About

Reproduction of the model from the paper “Dynamic Routing Between Capsules”.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published