About model conversion
I had some issues using the Hugging Face APIs to load models, so I decided to download the model and config files from huggingface.co manually (into `~/llama2.c/tiny/`).
Then I tried to export the model with `python export.py ./tiny/tiny.bin --hf ./tiny/`, and the following error occurred:
```
Traceback (most recent call last):
  File "/home/mrmat/llama2.c/export.py", line 565, in <module>
    model = load_hf_model(args.hf)
  File "/home/mrmat/llama2.c/export.py", line 479, in load_hf_model
    layer.attention.wk.weight = nn.Parameter(permute_reverse(hf_dict[f'model.layers.{i}.self_attn.k_proj.weight']))
  File "/home/mrmat/llama2.c/export.py", line 473, in permute_reverse
    return w.view(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)
RuntimeError: shape '[32, 2, 32, 2048]' is invalid for input of size 524288
```
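If I read the numbers right, the `view` call expects 32 × 2 × 32 × 2048 = 4,194,304 elements (i.e. all 32 query heads), but the `k_proj` weight only has 524,288 = 256 × 2048 elements, and 256 = 4 × 64, i.e. 4 key/value heads with a `head_dim` of 64, which points to grouped-query attention.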
Then I checked the model config with `print(config)`, and it seems that `hf_model = AutoModelForCausalLM.from_pretrained(model_path)` is not loading the model hyperparameters the way `export.py` expects.
Since config.json has `"num_key_value_heads": 4`, I modified `export.py` to load the hyperparameters from config.json manually and changed the arguments of `permute_reverse`, and it finally works.
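For reference, here is a minimal, self-contained sketch of the change. `permute_reverse` is copied from `export.py`; the hard-coded config values and the `n_kv_heads` handling are my additions, and the exact wiring into `load_hf_model` may differ:

```python
import torch

# Values from this model's config.json
dim = 2048                 # "hidden_size"
n_heads = 32               # "num_attention_heads"
n_kv_heads = 4             # "num_key_value_heads"
head_dim = dim // n_heads  # 64

def permute_reverse(w, n_heads, dim1, dim2):
    # Undoes the HF rotary-embedding permutation; dim1 must be the
    # projection's real output size, i.e. (number of heads in w) * head_dim.
    return w.view(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)

# q_proj has all 32 heads -> weight shape [2048, 2048]
wq = torch.randn(n_heads * head_dim, dim)
permute_reverse(wq, n_heads, dim, dim)  # fine

# k_proj has only 4 KV heads -> weight shape [256, 2048]
wk = torch.randn(n_kv_heads * head_dim, dim)
permute_reverse(wk, n_kv_heads, n_kv_heads * head_dim, dim)  # fine with adjusted args
# permute_reverse(wk, n_heads, dim, dim) would raise the RuntimeError shown above
```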
About custom tokenizer
Then I tried another LLaMA2 model created by the same contributor. While the model can be exported correctly, `./run` failed to load the tokenizer due to a different vocab size. It turns out that their vocabulary contains three additional tokens compared to the default one used by LLaMA2. Some LLaMA2 models on Hugging Face are even trained with a GPT-2 tokenizer.
I looked at `tokenizer.py`, and it only supports generating `tokenizer.bin` from a `tokenizer.model` produced by the SentencePiece library.
So my question is: how can I convert a custom tokenizer (not created by SentencePiece, whether downloaded manually or saved by `tokenizer.save_pretrained(path)`) into `tokenizer.bin`, given that the `encode` function in `run.c` needs the `vocab_scores` that custom tokenizers don't have?
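In case it helps frame the question: if I am reading `tokenizer.py` correctly, `tokenizer.bin` is just a `max_token_length` header followed by one (score, byte length, bytes) record per token. So one workaround I can imagine is writing the file myself with synthetic scores. Everything below (loading via `AutoTokenizer`, the `▁`-to-space replacement, and using the negated token id as a fake score) is my guess, not something the repo documents:

```python
import struct
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./tiny/")  # the custom tokenizer
vocab = sorted(tok.get_vocab().items(), key=lambda kv: kv[1])  # (piece, id) pairs sorted by id

# SentencePiece marks word boundaries with '▁' (U+2581); run.c expects raw spaces.
tokens = [piece.replace("\u2581", " ").encode("utf-8") for piece, _ in vocab]
# Fake scores: lower id -> higher score, mimicking SentencePiece's ordering.
scores = [-float(i) for _, i in vocab]

with open("tokenizer.bin", "wb") as f:
    f.write(struct.pack("I", max(len(t) for t in tokens)))  # max_token_length header
    for b, s in zip(tokens, scores):
        f.write(struct.pack("fI", s, len(b)))  # score, then byte length
        f.write(b)
```

But I doubt that the greedy merge loop in `run.c`'s `encode` would reproduce the original tokenizer's segmentation with fake scores, which is really the crux of my question.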
I am new to LLaMA2 and this project; thank you so much for your help!
clebert added a commit to clebert/llama2.zig that referenced this issue (Oct 19, 2023).