Failure to create the BPE dataset for custom dataset #22
Comments
Hi @guanqun-yang, thanks for reporting this issue. Could you provide a full stack trace pointing to the Python line where the error arises? Also, with your installation setup, are you able to run the scripts for tasks like Shakespeare transfer?
@martiansideofthemoon Thank you for your prompt reply! I reconfigured the whole environment using the Here are the full error traces
Hi @guanqun-yang, this is almost certainly a
@martiansideofthemoon Thanks for your reply! I removed all
virtualenv -p python3 style-venv
source style-venv/bin/activate
pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install --editable ./
cd fairseq
pip install --editable ./
The following are the full stack traces after executing
Using cache found in /home/yang/.cache/torch/hub/pytorch_fairseq_master
/home/yang/style-transfer-paraphrase/style-venv/lib/python3.6/site-packages/hydra/experimental/initialize.py:37: UserWarning: hydra.experimental.initialize() is no longer experimental. Use hydra.initialize()
message="hydra.experimental.initialize() is no longer experimental."
Error when composing. Overrides: ['common.no_progress_bar=False', 'common.log_interval=25', "common.log_format='json'", 'common.log_file=null', 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=True', 'common.memory_efficient_fp16=True', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=4', 'common.fp16_scale_window=128', 'common.fp16_scale_tolerance=0.0', 'common.on_cpu_convert_precision=False', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=1.0', 'common.amp=False', 'common.amp_batch_retries=2', 'common.amp_init_scale=128', 'common.amp_scale_window=null', 'common.user_dir=null', 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common.use_plasma_view=False', "common.plasma_path='/tmp/plasma'", 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=512', 'distributed_training.distributed_num_procs=1', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", 'distributed_training.distributed_init_method=null', 'distributed_training.distributed_port=19812', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='c10d'", "distributed_training.ddp_comm_hook='none'", 'distributed_training.bucket_cap_mb=200', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=True', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 
'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_algorithm='LocalSGD'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.fp16=True', 'distributed_training.memory_efficient_fp16=True', 'distributed_training.tpu=True', 'distributed_training.no_reshard_after_forward=False', 'distributed_training.fp32_reduce_scatter=False', 'distributed_training.cpu_offload=False', 'distributed_training.use_sharded_state=False', 'dataset.num_workers=2', 'dataset.skip_invalid_size_inputs_valid_test=True', 'dataset.max_tokens=999999', 'dataset.batch_size=null', 'dataset.required_batch_size_multiple=1', 'dataset.required_seq_len_multiple=1', "dataset.dataset_impl='mmap'", 'dataset.data_buffer_size=10', "dataset.train_subset='train'", "dataset.valid_subset='valid'", 'dataset.combine_valid_subsets=null', 'dataset.ignore_unused_valid_subsets=False', 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 'dataset.disable_validation=False', "dataset.max_tokens_valid='${dataset.max_tokens}'", "dataset.batch_size_valid='${dataset.batch_size}'", 'dataset.max_valid_steps=null', 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'optimization.max_epoch=0', 'optimization.max_update=500000', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 
'optimization.update_freq=[1]', 'optimization.lr=[0.0006]', 'optimization.stop_min_lr=-1.0', 'optimization.use_bmuf=False', "checkpoint.save_dir='checkpoints'", "checkpoint.restore_file='checkpoint_last.pt'", 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=True', 'checkpoint.reset_lr_scheduler=False', 'checkpoint.reset_meters=False', 'checkpoint.reset_optimizer=False', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=2000', 'checkpoint.keep_interval_updates=-1', 'checkpoint.keep_interval_updates_pattern=-1', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=True', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='loss'", 'checkpoint.maximize_best_checkpoint_metric=False', 'checkpoint.patience=-1', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', 'checkpoint.write_checkpoints_asynchronously=False', "checkpoint.model_parallel_size='${common.model_parallel_size}'", 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=10', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=512', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 
'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'eval_lm.output_word_probs=False', 'eval_lm.output_word_stats=False', 'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'task=masked_lm', 'task._name=masked_lm', "task.data='/home/yang/.cache/torch/hub/pytorch_fairseq/37d2bc14cf6332d61ed5abeb579948e6054e46cc724c7d23426382d11a31b2d6.ae5852b4abc6bf762e0b6b30f19e741aa05562471e9eb8f4a6ae261f04f9b350'", "task.sample_break_mode='complete'", 'task.tokens_per_sample=512', 'task.mask_prob=0.15', 'task.leave_unmasked_prob=0.1', 'task.random_token_prob=0.1', 'task.freq_weighted_replacement=False', 'task.mask_whole_words=False', 'task.mask_multiple_length=1', 'task.mask_stdev=0.0', "task.shorten_method='none'", "task.shorten_data_split_list=''", 'task.seed=1', 'criterion=masked_lm', 'criterion._name=masked_lm', 'criterion.tpu=True', 'bpe=gpt2', 'bpe._name=gpt2', "bpe.gpt2_encoder_json='https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json'", "bpe.gpt2_vocab_bpe='https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe'", 'optimizer=adam', 'optimizer._name=adam', "optimizer.adam_betas='(0.9, 0.98)'", 'optimizer.adam_eps=1e-06', 'optimizer.weight_decay=0.01', 'optimizer.use_old_adam=False', 'optimizer.fp16_adam_stats=False', 'optimizer.tpu=True', 'optimizer.lr=[0.0006]', 'lr_scheduler=polynomial_decay', 
'lr_scheduler._name=polynomial_decay', 'lr_scheduler.warmup_updates=24000', 'lr_scheduler.force_anneal=null', 'lr_scheduler.end_learning_rate=0.0', 'lr_scheduler.power=1.0', 'lr_scheduler.total_num_update=500000.0', 'lr_scheduler.lr=[0.0006]']
Traceback (most recent call last):
  File "dataset2bpe.py", line 10, in <module>
    roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
  File "/home/yang/style-transfer-paraphrase/style-venv/lib/python3.6/site-packages/torch/hub.py", line 369, in load
    model = entry(*args, **kwargs)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/models/roberta/model.py", line 284, in from_pretrained
    **kwargs,
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/hub_utils.py", line 75, in from_pretrained
    arg_overrides=kwargs,
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 421, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 339, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 643, in _upgrade_state_dict
    state["cfg"] = convert_namespace_to_omegaconf(state["args"])
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/utils.py", line 389, in convert_namespace_to_omegaconf
    composed_cfg = compose("config", overrides=overrides, strict=False)
TypeError: compose() got an unexpected keyword argument 'strict'
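The final frame shows fairseq calling `compose(..., strict=False)` while the `compose` that was actually imported rejects the `strict` keyword. That pattern usually points to a mismatch between the calling code and the installed library version, though the trace alone does not prove it. Below is a minimal sketch, using only the standard library, for checking whether a callable accepts a given keyword. `accepts_keyword`, `compose_old`, and `compose_new` are hypothetical names made up for illustration; the two stand-in functions are not hydra's real code.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword as well.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for two generations of an API (NOT hydra's real code).
def compose_old(config_name=None, overrides=(), strict=None):
    return config_name


def compose_new(config_name=None, overrides=()):
    return config_name


print(accepts_keyword(compose_old, "strict"))
print(accepts_keyword(compose_new, "strict"))
```

In the affected environment, the same helper could be pointed at whichever `compose` fairseq imports (possibly `hydra.experimental.compose`, assuming the installed hydra still exposes it there) to confirm whether `strict` is accepted before deciding which package to pin.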
@martiansideofthemoon I managed to find a workaround after some attempts. I will post my solution after my experiments.
Great, good to know! Do post your solution here whenever you get a chance.
Hi,
I am trying to train a style transfer model for a style (i.e., profane vs. civil) that is not supported in the paper. However, when I tried to run the first step as instructed in the repository
where datasets/golbeck is a dataset of toxicity comments with the required directory structure, a series of errors on some dependencies is reported.
It seems to me that this should be related to the installation. Here are the full installation commands I used.
I am wondering how I could resolve this issue. In order to reproduce this error, the data is provided here.
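When an error looks installation-related, it can help to record the versions that were actually installed alongside the commands. The following is a minimal sketch for doing that; `installed_versions` is a hypothetical helper name, and the package list is only a guess at what is relevant here. The trace above shows Python 3.6, where `importlib.metadata` is not in the standard library, so the sketch falls back to the `importlib_metadata` backport, which would need to be installed separately.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for older Pythons


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages involved in this issue; adjust as needed.
    report = installed_versions(["torch", "fairseq", "hydra-core", "omegaconf"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into the issue lets maintainers compare the installed versions against what the repository expects without guessing from the stack trace.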