Request for help resolving a vocab issue #1

Open
SOSONAGI opened this issue Oct 4, 2023 · 0 comments

SOSONAGI commented Oct 4, 2023

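Hello.
Thank you for the great learning material.
I installed everything in requirements from the Platypus GitHub repo and tried to train with the parameters from the notebook you uploaded, but training repeatedly fails with the following traceback: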

Traceback (most recent call last):
  File "/home/sosohaja/anaconda3/envs/playllama/Platypus/finetune.py", line 312, in <module>
    fire.Fire(train)
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/sosohaja/anaconda3/envs/playllama/Platypus/finetune.py", line 150, in train
    tokenizer = LlamaTokenizer.from_pretrained("beomi/llama-2-ko-7b")
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1825, in from_pretrained
    return cls._from_pretrained(
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1988, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 96, in __init__
    self.sp_model.Load(vocab_file)
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/home/sosohaja/anaconda3/envs/playllama/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string

After researching this from several angles, I concluded that it fails because the base model, beomi/llama-2-ko-7b, does not include a tokenizer.model file.
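If that diagnosis is right, `LlamaTokenizer` (the slow, SentencePiece-backed class) would receive `vocab_file=None` and pass it to `sp_model.Load`, which would explain `TypeError: not a string`. One workaround I am considering, as a rough sketch and not the notebook's actual method, is to pick the tokenizer class from the files the repo really ships. The helper names `choose_tokenizer_class` and `load_tokenizer` are my own, and I am assuming the repo provides a `tokenizer.json` that the fast tokenizer can read.

```python
from pathlib import Path


def choose_tokenizer_class(repo_files):
    """Return the name of a transformers tokenizer class suited to these files.

    Hypothetical helper: `repo_files` is a list of file names in the model repo.
    """
    names = {Path(f).name for f in repo_files}
    if "tokenizer.model" in names:
        # SentencePiece model present: the slow tokenizer can load it.
        return "LlamaTokenizer"
    if "tokenizer.json" in names:
        # Only the serialized fast tokenizer is present.
        return "LlamaTokenizerFast"
    raise FileNotFoundError("no tokenizer.model or tokenizer.json in repo files")


def load_tokenizer(model_id, repo_files):
    """Load the tokenizer with the class chosen above (assumes transformers is installed)."""
    import transformers  # imported lazily so the helper above stays dependency-free

    cls = getattr(transformers, choose_tokenizer_class(repo_files))
    return cls.from_pretrained(model_id)
```

The file list could come from `huggingface_hub.list_repo_files("beomi/llama-2-ko-7b")`. Replacing the `LlamaTokenizer.from_pretrained(...)` call at line 150 of finetune.py with `AutoTokenizer.from_pretrained(...)` might have the same effect, though I have not verified either change against this repo.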

I would like to know how you resolved this before training with the notebook you uploaded.

Thank you.
