AlexNet backward shape mismatch + ReLU returns a tuple #681
Comments
Hi, as pointed out by @chrishkchris, the convention is to use ReLU as a stateless layer. For the shape mismatch, you might need to check the shapes of the layers again. Let me know if further info is required.
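For illustration, a minimal sketch of the stateless convention mentioned here (`autograd.ReLU` is the layer used in the code below; the lowercase functional wrapper `autograd.relu` follows the cnn.py example linked further down and is an assumption here):

```python
from singa import autograd

# stateful layer object: as reported in this issue, calling it returns a 1-tuple
y = autograd.ReLU()(x)[0]

# stateless/functional convention used in the SINGA examples: returns a Tensor directly
y = autograd.relu(x)
```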
OK, I'll try, but why provide a stateful ReLU layer? Is it for a specific purpose?
I compared my implementation to other frameworks and the shapes are the same.
@dcslin Did you try to run the code pasted by @Belegkarnil?
I am still checking the code
Hi @Belegkarnil, you might need to change `256 * 6 * 6, 4096` to `256, 4096` to make it work. Also, you are recommended to use relu/dropout/flatten like this: https://github.com/apache/singa/blob/master/examples/cnn/model/cnn.py#L40
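For illustration, a minimal sketch of that suggested fix applied to the classifier in the code below (only the first `Linear` changes; everything else stays as posted):

```python
# sketch of the suggested change: the first Linear takes 256 inputs, because
# SINGA's AvgPool2d(6, stride=1) followed by Flatten yields a (batch, 256)
# tensor here (see the shape walk-through below)
self.classifier = [
    autograd.Dropout(),
    autograd.Linear(256, 4096),   # was: autograd.Linear(256 * 6 * 6, 4096)
    autograd.ReLU(),
    autograd.Dropout(),
    autograd.Linear(4096, 4096),
    autograd.ReLU(),
    autograd.Linear(4096, num_classes)
]
```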
OK, thanks a lot! I assumed it works like in other frameworks, but the result of AvgPool has a different shape.
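For context, a short walk-through of where the 256 vs. 9216 difference comes from, assuming a 224×224 input as in the reference AlexNet and the usual output-size formula floor((n + 2*pad - kernel)/stride) + 1:

```python
def out_size(n, kernel, stride=1, pad=0):
    # standard conv/pool output-size formula
    return (n + 2 * pad - kernel) // stride + 1

n = 224                      # assumed input resolution
n = out_size(n, 11, 4, 2)    # Conv2d(3, 64, 11, stride=4, pad=2) -> 55
n = out_size(n, 3, 2)        # MaxPool2d(3, stride=2)             -> 27
n = out_size(n, 5, 1, 2)     # Conv2d(64, 192, 5, pad=2)          -> 27
n = out_size(n, 3, 2)        # MaxPool2d(3, stride=2)             -> 13
# the three 3x3, padding=1 convolutions keep the 13x13 spatial size
n = out_size(n, 3, 2)        # MaxPool2d(3, stride=2)             -> 6
n = out_size(n, 6, 1)        # AvgPool2d(6, stride=1)             -> 1
print(256 * n * n)           # 256, so the first Linear must be (256, 4096)

# PyTorch's AlexNet instead uses AdaptiveAvgPool2d((6, 6)), which keeps the
# 6x6 map and gives 256 * 6 * 6 = 9216 flattened features.
```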
Hi,
I have implemented AlexNet in SINGA, but I get an error during the backward_and_update call. I am using SINGA 3.0.0.rc1 on CPU.
This is my AlexNet implementation:
```python
from singa import autograd
from singa import module
from singa import opt

__all__ = ['AlexNet', 'alexnet']


class AlexNet(module.Module):

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        # 12 on GPU, so 6 & 6
        self.features1 = [
            autograd.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            autograd.ReLU(),
            autograd.MaxPool2d(kernel_size=3, stride=2),
            autograd.Conv2d(64, 192, kernel_size=5, padding=2),
            autograd.ReLU(),
            autograd.MaxPool2d(kernel_size=3, stride=2),
            autograd.Conv2d(192, 384, kernel_size=3, padding=1),
            autograd.ReLU(),
            autograd.Conv2d(384, 256, kernel_size=3, padding=1),
            autograd.ReLU()
        ]
        self.features2 = [
            autograd.Conv2d(256, 256, kernel_size=3, padding=1),
            autograd.ReLU(),
            autograd.MaxPool2d(kernel_size=3, stride=2)
        ]
        self.avgpool = autograd.AvgPool2d(6, stride=1)
        self.flatten = autograd.Flatten()
        self.classifier = [
            autograd.Dropout(),
            autograd.Linear(256 * 6 * 6, 4096),
            autograd.ReLU(),
            autograd.Dropout(),
            autograd.Linear(4096, 4096),
            autograd.ReLU(),
            autograd.Linear(4096, num_classes)
        ]
        self.optimizer = opt.SGD(lr=0.001, momentum=0.9)

    def loss(self, out, ty):
        return autograd.softmax_cross_entropy(out, ty)

    def optim(self, loss, dist_option, spars):
        if dist_option == 'fp32':
            self.optimizer.backward_and_update(loss)
        elif dist_option == 'fp16':
            self.optimizer.backward_and_update_half(loss)
        elif dist_option == 'partialUpdate':
            self.optimizer.backward_and_partial_update(loss)
        elif dist_option == 'sparseTopK':
            self.optimizer.backward_and_sparse_update(loss, topK=True, spars=spars)
        elif dist_option == 'sparseThreshold':
            self.optimizer.backward_and_sparse_update(loss, topK=False, spars=spars)

    def forward(self, x):
        for (i, layers) in enumerate([self.features1, self.features2,
                                      [self.avgpool, self.flatten], self.classifier]):
            for (j, fn) in enumerate(layers):
                x = fn(x)
                if type(x) is tuple:  # FIXME I have to do that because of a bug in Singa? (ReLU)
                    x = x[0]
        return x


def alexnet(**kwargs):
    return AlexNet(**kwargs)
```
And I get: `AssertionError: ('shape mismatch', (9216, 4096), (256, 4096))`,
which corresponds to my first linear layer (`256 * 6 * 6, 4096`).
When I use my VGG16 implementation, I get a similar error:
`AssertionError: ('shape mismatch', (25088, 4096), (512, 4096))`
It seems that the backward operation does not map the correct shape to the corresponding layer.
Moreover, the ReLU class returns a 1-tuple containing a Tensor. Is this intended, or is it a bug?