This is my config file.
I am getting CUDA out of memory when using model_dtype=fp16, but the code works fine when I use model_dtype=fp32.
I reduced the batch size down to 128 and still get the same error.
If it helps:
1) I am using a 2080 Ti GPU.
2) I installed apex from source, not from requirements.txt.
3) My CUDA version is 11.1.
4) Since apex only installs against CUDA 10.2, I commented out the version-check error message (an option apex itself suggests).
Have you tried the filtertoolong transform? #2040 (comment)
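For reference, enabling that transform in your config would look roughly like this (the 200-token length caps are illustrative values, not taken from this thread):

data:
    corpus_1:
        path_src: /home/test/Train.en
        path_tgt: /home/test/Train.de
        transforms: [filtertoolong]
src_seq_length: 200
tgt_seq_length: 200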
Though it's strange that it would fail in fp16 but work in fp32.
Maybe there is an issue with your apex setup. Did you add these flags when building? --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--deprecated_fused_adam"
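For context, with those flags the full build command, run from the apex source checkout, would be along these lines (the surrounding pip invocation is assumed from apex's install instructions; only the flags come from this thread):

pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--deprecated_fused_adam" ./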
Yes, I used that same command when installing apex. I also downgraded my CUDA version to 10.2 and tried the whole setup again, but I am still getting the same CUDA out of memory error.
data:
    corpus_1:
        path_src: /home/test/Train.en
        path_tgt: /home/test/Train.de
    valid:
        path_src: /mnt/hdd/vignesh/final/training/newstest2013.en
        path_tgt: /mnt/hdd/vignesh/final/training/newstest2013.de
save_model: /home/test/OpenNMT-py/fusedadam
share_vocab: True
valid_steps: 5000
save_checkpoint_steps: 10000
train_steps: 1300000
warmup_steps: 8000
report_every: 1000
seed: 2
src_vocab: /home/test/Opennmt_en_de.vocab.src
#model
#train_from : /home/test/OpenNMT-py/checkpoints_step_270000.pt
encoder_type: transformer
decoder_type: transformer
enc_layers: 12
dec_layers: 1
rnn_size: 512
word_vec_size: 512
share_decoder_embeddings: True
share_embeddings : True
heads: 8
transformer_ff: 2048
model_dtype: "fp16"
#loss function
accum_count: 8
optim: "fusedadam"
adam_beta1: 0.9
adam_beta2: 0.998
decay_method: noam
learning_rate: 2.0
max_grad_norm: 0.0
batch_size: 2048
batch_type: tokens
normalization: tokens
dropout: 0.1
label_smoothing: 0.1
max_generator_batches: 2
param_init: 0
param_init_glorot: True
position_encoding: True
world_size: 4
gpu_ranks: [0, 1, 2, 3]
#log_file: log/en_de.log
#exp: transformer_student_deep_shallow_en_de
#train_from: checkpoints/combined_wikimatrix_zoho_student/_step_295000.pt
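For completeness, a config like this is normally launched with OpenNMT-py's training entry point, e.g. (the config filename here is a placeholder):

onmt_train -config config.yaml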