
ValueError: cannot reshape array of size 10752 into shape (64,64,3) #8

Open
ghost opened this issue Apr 24, 2019 · 17 comments
@ghost commented Apr 24, 2019

I'm trying to test on faces to get the demo.gif result with a 64x64 input image, but I get this error:

File "demo.py", line 62, in
source_image = VideoToTensor()(read_video(opt.source_image, opt.image_shape + (3,)))['video'][:, :1]
File "/content/monkey-net/frames_dataset.py", line 28, in read_video
video_array = video_array.reshape((-1,) + image_shape)
ValueError: cannot reshape array of size 10752 into shape (64,64,3)

using this command:
!python demo.py --config config/nemo.yaml --driving_video sup-mat/driving.png --source_image source2.png --checkpoint /content/nemo-ckp.pth.tar --image_shape 64,64
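For context, 10752 factors exactly as 56 × 64 × 3, which suggests the file on disk is not truly 64x64. A quick way to check the shape demo.py will see (a minimal sketch; assumes Pillow and NumPy, both preinstalled on Colab):

import numpy as np
from PIL import Image

# The reshape in frames_dataset.py only succeeds if this prints (64, 64, 3).
arr = np.asarray(Image.open("source2.png"))
print(arr.shape)  # here: something like (56, 64, 3)
print(arr.size)   # 56 * 64 * 3 = 10752, matching the ValueError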

@AliaksandrSiarohin (Owner) commented

What is the size of source2.png?

@ghost (Author) commented Apr 24, 2019

The original size is 534x471; I resized it to 64x64 using !convert source2.png -resize 64x64 source2.png

@AliaksandrSiarohin (Owner) commented

The error tells you the image actually has a shape like (56, 64, 3): ImageMagick's -resize 64x64 keeps the aspect ratio, so 534x471 comes out as 64x56, not 64x64.
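Appending ! to the geometry makes ImageMagick ignore the aspect ratio and emit exactly 64x64 (i.e. -resize '64x64!'). The same fix with Pillow, as a minimal sketch:

from PIL import Image

# Force exactly 64x64, discarding the original 534x471 aspect ratio
# (equivalent to ImageMagick's -resize '64x64!').
Image.open("source2.png").resize((64, 64)).save("source2.png")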

@ghost (Author) commented Apr 24, 2019

I am trying to turn this into a Google Colab notebook; I can share it if you want to take a look and see the problem.

@AliaksandrSiarohin (Owner) commented

Can you send me the resized image and the driving video?

@ghost (Author) commented Apr 24, 2019

[image and video attachments]

@ghost (Author) commented Apr 24, 2019

I tried it with a different image:

Traceback (most recent call last):
File "demo.py", line 62, in
source_image = VideoToTensor()(read_video(opt.source_image, opt.image_shape + (3,)))['video'][:, :1]
File "/content/monkey-net/frames_dataset.py", line 28, in read_video
video_array = video_array.reshape((-1,) + image_shape)
ValueError: cannot reshape array of size 2964000 into shape (64,64,3)
Link to the image:
https://upload.wikimedia.org/wikipedia/commons/thumb/5/51/Brad_Pitt_Fury_2014.jpg/800px-Brad_Pitt_Fury_2014.jpg
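(The array size again factors cleanly: 2964000 = 1235 × 800 × 3, so this 800px-wide image appears to have gone in at its full resolution; it still needs to be resized to 64x64 first.)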

@AliaksandrSiarohin (Owner) commented

Try this:

[image attachment]

@ghost (Author) commented Apr 24, 2019

@AliaksandrSiarohin I'm now getting this error with that image:

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=663 error=11 : invalid argument
Traceback (most recent call last):
File "demo.py", line 67, in
out = transfer_one(generator, kp_detector, source_image, driving_video, config['transfer_params'])
File "/content/monkey-net/transfer.py", line 68, in transfer_one
kp_driving = cat_dict([kp_detector(driving_video[:, :, i:(i + 1)]) for i in range(d)], dim=1)
File "/content/monkey-net/transfer.py", line 68, in
kp_driving = cat_dict([kp_detector(driving_video[:, :, i:(i + 1)]) for i in range(d)], dim=1)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/content/monkey-net/modules/keypoint_detector.py", line 107, in forward
out = gaussian2kp(heatmap, self.kp_variance, self.clip_variance)
File "/content/monkey-net/modules/keypoint_detector.py", line 58, in gaussian2kp
var = torch.matmul(mean_sub.unsqueeze(-1), mean_sub.unsqueeze(-2))
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:411

@AliaksandrSiarohin (Owner) commented

Seems like a problem with the CUDA library; the pinned PyTorch version probably does not match it. Try removing pytorch from requirements.txt (so the runtime's own build is used) and run again.
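Before reinstalling anything, it can help to see what the Colab runtime actually ships (a minimal sketch; these attributes exist in current PyTorch builds):

import torch

print(torch.__version__)          # PyTorch version in the runtime
print(torch.version.cuda)         # CUDA version this build was compiled for
print(torch.cuda.is_available())  # sanity-check that the GPU is visible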

@ghost (Author) commented Apr 24, 2019

I removed both torch==0.4.1 and torchvision==0.2.1 from requirements.txt; will try again now.

@ghost (Author) commented Apr 24, 2019

@AliaksandrSiarohin it seems to have worked, but the demo.gif result is weird:

[demo.gif attachment]

@AliaksandrSiarohin (Owner) commented

This can happen because the model is trained on the nemo dataset; it most likely does not generalize outside of it. It expects a black background and a proper face crop. To validate this, check whether it works on an image from the test part of the nemo dataset. If you want a model that works on arbitrary faces, a dataset like VoxCeleb should be used.

@ghost (Author) commented Apr 24, 2019

@AliaksandrSiarohin I tried it with vox.yaml and vox-full.yaml and got this error:

Traceback (most recent call last):
File "demo.py", line 52, in
Logger.load_cpk(opt.checkpoint, generator=generator, kp_detector=kp_detector)
File "/content/monkey-net/logger.py", line 54, in load_cpk
generator.load_state_dict(checkpoint['generator'])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for MotionTransferGenerator:
Missing key(s) in state_dict: "appearance_encoder.down_blocks.5.conv.weight", "appearance_encoder.down_blocks.5.conv.bias", "appearance_encoder.down_blocks.5.norm.weight", "appearance_encoder.down_blocks.5.norm.bias", "appearance_encoder.down_blocks.5.norm.running_mean", "appearance_encoder.down_blocks.5.norm.running_var", "appearance_encoder.down_blocks.6.conv.weight", "appearance_encoder.down_blocks.6.conv.bias", "appearance_encoder.down_blocks.6.norm.weight", "appearance_encoder.down_blocks.6.norm.bias", "appearance_encoder.down_blocks.6.norm.running_mean", "appearance_encoder.down_blocks.6.norm.running_var", "video_decoder.up_blocks.5.conv.weight", "video_decoder.up_blocks.5.conv.bias", "video_decoder.up_blocks.5.norm.weight", "video_decoder.up_blocks.5.norm.bias", "video_decoder.up_blocks.5.norm.running_mean", "video_decoder.up_blocks.5.norm.running_var", "video_decoder.up_blocks.6.conv.weight", "video_decoder.up_blocks.6.conv.bias", "video_decoder.up_blocks.6.norm.weight", "video_decoder.up_blocks.6.norm.bias", "video_decoder.up_blocks.6.norm.running_mean", "video_decoder.up_blocks.6.norm.running_var".
size mismatch for appearance_encoder.down_blocks.4.conv.weight: copying a param with shape torch.Size([512, 512, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 3, 3]).
size mismatch for appearance_encoder.down_blocks.4.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.conv.weight: copying a param with shape torch.Size([512, 512, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 3, 3]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.decoder.up_blocks.0.conv.weight: copying a param with shape torch.Size([512, 512, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.0.conv.weight: copying a param with shape torch.Size([512, 522, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1034, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.0.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.conv.weight: copying a param with shape torch.Size([256, 1034, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 2058, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.1.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.2.conv.weight: copying a param with shape torch.Size([128, 522, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2058, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.2.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.3.conv.weight: copying a param with shape torch.Size([64, 266, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1034, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.3.conv.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.4.conv.weight: copying a param with shape torch.Size([32, 138, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 522, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.4.conv.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).

@AliaksandrSiarohin (Owner) commented

I did not publish checkpoints for vox, so you cannot try it right now. You can try to retrain the network on 64x64 vox; this should work, though larger resolutions are not great with the current code. I'm working on an improved version of this work and will publish all the checkpoints when it is ready.
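To see which architecture (and hence which yaml config) a given checkpoint matches before loading it, you can dump the stored generator shapes. A minimal sketch; the 'generator' key is taken from the load_cpk call in the traceback above:

import torch

# Inspect the layer names and shapes saved in the checkpoint.
ckpt = torch.load("/content/nemo-ckp.pth.tar", map_location="cpu")
for name, tensor in ckpt["generator"].items():
    print(name, tuple(tensor.shape))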

@AliaksandrSiarohin (Owner) commented

If it is urgent and you do not have access to a GPU, I can try to train a 64x64 vox model for you.

@ghost (Author) commented Apr 24, 2019

@AliaksandrSiarohin I am trying to set up a Colab notebook that uses a T4 GPU. If I can get training running on it as well, I can start training different models and share them. I'd also suggest adding a Google Colab notebook to the repo, since it makes the results much easier to reproduce.
