I am trying to train MSRVTT-QA, but I get the error below.

Command:
bash scripts/train_vqa.sh msrvtt msrvtt 1 local pretrained_path=/singularity/ckpts_and_logs/ft_msrvtt_qa_singularity_17m.pth

Did I miss something?

Logs:
2022-11-28T08:23:48 | main: config:
{'dataset_name': 'msrvtt', 'data_root': '${oc.env:SL_DATA_DIR}/videos', 'anno_root_downstream': '${oc.env:SL_DATA_DIR}/anno_downstream', 'train_file': [['${anno_root_downstream}/msrvtt_qa_train.json', '${data_root}/msrvtt_2fps_224', 'video']], 'test_types': ['val'], 'test_file': {'val': ['${anno_root_downstream}/msrvtt_qa_val.json', '${data_root}/msrvtt_2fps_224', 'video'], 'test': ['${anno_root_downstream}/msrvtt_qa_test.json', '${data_root}/msrvtt_2fps_224', 'video']}, 'stop_key': 'val', 'answer_list': '${anno_root_downstream}/msrvtt_qa_answer_list.json', 'text_encoder': 'bert-base-uncased', 'text_decoder': 'bert-base-uncased', 'bert_config': 'configs/config_bert.json', 'vit_type': 'beit', 'vit_zoo': {'beit': 'microsoft/beit-base-patch16-224-pt22k-ft22k'}, 'vit_name_or_pretrained_path': '${vit_zoo[${vit_type}]}', 'temporal_vision_encoder': {'enable': False, 'num_layers': 2, 'update_pooler_embed': False}, 'add_temporal_embed': False, 'image_res': 224, 'embed_dim': 256, 'video_input': {'num_frames': 1, 'reader': 'decord', 'sample_type': 'rand', 'num_frames_test': 4, 'sample_type_test': 'middle'}, 'max_q_len': 25, 'max_a_len': 5, 'batch_size': {'image': 128, 'video': 32}, 'batch_size_test': {'image': 64, 'video': 64}, 'k_test': 128, 'temp': 0.07, 'eos': '[SEP]', 'optimizer': {'opt': 'adamW', 'lr': 1e-05, 'opt_betas': [0.9, 0.999], 'weight_decay': 0.02, 'max_grad_norm': -1, 'different_lr': {'enable': False, 'module_names': [], 'lr': 0.001}}, 'scheduler': {'sched': 'cosine', 'epochs': 10, 'min_lr_multi': 0.1, 'warmup_epochs': 0.5}, 'output_dir': '/singularity/ckpts_and_logs/qa_msrvtt/msrvtt', 'pretrained_path': '/singularity/ckpts_and_logs/ft_msrvtt_qa_singularity_17m.pth', 'resume': False, 'evaluate': False, 'eval_frame_ensemble': 'concat', 'device': 'cuda', 'seed': 42, 'log_freq': 100, 'dist_url': 'env://', 'distributed': True, 'fp16': True, 'debug': False, 'num_workers': 16, 'wandb': {'enable': True, 'entity': None, 'project': 'sb_qa_msrvtt'}, 'rank': 0, 
'world_size': 1, 'gpu': 0, 'dist_backend': 'nccl', 'result_dir': '/singularity/ckpts_and_logs/qa_msrvtt/msrvtt'}
2022-11-28T08:23:48 | main: train_file: [['${anno_root_downstream}/msrvtt_qa_train.json', '${data_root}/msrvtt_2fps_224', 'video']]
2022-11-28T08:23:48 | main: Creating vqa QA datasets
Loading /singularity/data/anno_downstream/msrvtt_qa_train.json: 100%|█| 158581/158581 [
Loading /singularity/data/anno_downstream/msrvtt_qa_val.json: 100%|█| 12278/12278 [00:0
Loading /singularity/data/anno_downstream/msrvtt_qa_test.json: 100%|█| 72821/72821 [00:
2022-11-28T08:23:49 | tasks.shared_utils: Creating model
2022-11-28T08:23:56 | models.model_retrieval_base: Loading vit pre-trained weights from huggingface microsoft/beit-base-patch16-224-pt22k-ft22k.
WARNING 2022-11-28T08:24:01 | py.warnings: /miniconda3/envs/sl/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180594101/work/aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
WARNING 2022-11-28T08:24:01 | py.warnings: /miniconda3/envs/sl/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180594101/work/aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2022-11-28T08:24:02 | models.model_retrieval_base: Init new model with new image size 224, and load weights.
2022-11-28T08:24:05 | models.model_retrieval_base: _IncompatibleKeys(missing_keys=['encoder.layer.0.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.1.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.2.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.3.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.4.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.5.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.6.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.7.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.8.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.9.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.10.attention.attention.relative_position_bias.relative_position_index', 'encoder.layer.11.attention.attention.relative_position_bias.relative_position_index'], unexpected_keys=[])
2022-11-28T08:24:05 | models.model_retrieval_base: Build text_encoder bert-base-uncased
2022-11-28T08:24:10 | models.model_retrieval_base: Build text_encoder bert-base-uncased, done!
2022-11-28T08:24:10 | models.model_vqa: Build text_decoder bert-base-uncased
2022-11-28T08:24:14 | models.model_vqa: Build text_decoder bert-base-uncased, done!
2022-11-28T08:24:14 | utils.optimizer: optimizer -- lr=1e-05 wd=0.02 len(p)=208
2022-11-28T08:24:14 | utils.optimizer: optimizer -- lr=1e-05 wd=0 len(p)=329
2022-11-28T08:24:14 | tasks.shared_utils: Loading checkpoint from /singularity/ckpts_and_logs/ft_msrvtt_qa_singularity_17m.pth
2022-11-28T08:24:21 | models.utils: Load temporal_embeddings, lengths: 64-->1
Traceback (most recent call last):
  File "tasks/vqa.py", line 295, in
    main(cfg)
  File "tasks/vqa.py", line 188, in main
    find_unused_parameters=True
  File "/singularity/ckpts_and_logs/qa_msrvtt/msrvtt/code/singularity/tasks/shared_utils.py", line 85, in setup_model
    layer_num = int(encoder_keys[4])
ValueError: invalid literal for int() with base 10: 'attention'
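In case it helps with debugging: the crash can be reproduced in isolation. `setup_model` apparently expects a digit at position 4 of each dot-separated checkpoint key, but for the BEiT relative-position-bias keys listed in the `_IncompatibleKeys` message above, position 4 is the word 'attention'. A minimal sketch (the key name is copied from the log; the regex-based extraction is only a suggested workaround, not the repo's actual code):

```python
import re

# One of the keys reported in the log above.
key = "encoder.layer.0.attention.attention.relative_position_bias.relative_position_index"

parts = key.split(".")
# parts[4] is 'attention', so int(parts[4]) raises the ValueError seen
# at shared_utils.py line 85:
print(parts[4])  # → attention

# A prefix-depth-independent way to extract the layer index instead:
match = re.search(r"layer\.(\d+)\.", key)
layer_num = int(match.group(1))
print(layer_num)  # → 0
```

This suggests the checkpoint's key layout (or prefix depth) differs from what `setup_model` was written against, which is why the hard-coded index lands on 'attention' instead of the layer number.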