
Questions about converting models and tokenizers downloaded from huggingface #431

zxh0916 opened this issue Oct 19, 2023 · 1 comment


zxh0916 commented Oct 19, 2023

About model conversion

I had some issues using the huggingface APIs to load models, so I decided to download the model and config files from huggingface.co manually (into ~/llama2.c/tiny/).
Then I tried to export the model with python export.py ./tiny/tiny.bin --hf ./tiny/, and the following error occurred:

Traceback (most recent call last):
  File "/home/mrmat/llama2.c/export.py", line 565, in <module>
    model = load_hf_model(args.hf)
  File "/home/mrmat/llama2.c/export.py", line 479, in load_hf_model
    layer.attention.wk.weight = nn.Parameter(permute_reverse(hf_dict[f'model.layers.{i}.self_attn.k_proj.weight']))
  File "/home/mrmat/llama2.c/export.py", line 473, in permute_reverse
    return w.view(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)
RuntimeError: shape '[32, 2, 32, 2048]' is invalid for input of size 524288

Then I checked the model config with print(config), and it seems that hf_model = AutoModelForCausalLM.from_pretrained(model_path) is not picking up the model hyperparameters correctly:

ModelArgs(dim=2048, n_layers=22, n_heads=32, n_kv_heads=32, vocab_size=32000,
          hidden_dim=5632, multiple_of=256, norm_eps=1e-05, max_seq_len=2048, dropout=0.0)
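
Doing the arithmetic on the error, the mismatch makes sense if the checkpoint actually uses grouped-query attention with fewer KV heads than attention heads (a quick check with the numbers from the traceback):

dim, n_heads = 2048, 32
head_dim = dim // n_heads                          # 64
print(524288 / (head_dim * dim))                   # 4.0 -> k_proj only has 4 KV heads' worth of rows
print(n_heads * 2 * (dim // n_heads // 2) * dim)   # 4194304, what view(32, 2, 32, 2048) expects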

Since config.json has "num_key_value_heads": 4, I modified export.py to load the hyperparameters from config.json manually and to change the arguments of permute_reverse:

# if config.json was downloaded alongside the weights, trust it over
# hf_model.config, which here reports n_kv_heads = n_heads
if 'config.json' in os.listdir(model_path):
    with open(os.path.join(model_path, 'config.json'), 'r') as f:
        config_json = json.load(f)
    config.dim = config_json["hidden_size"]
    config.n_layers = config_json["num_hidden_layers"]
    config.n_heads = config_json["num_attention_heads"]
    config.n_kv_heads = config_json["num_key_value_heads"]
    config.vocab_size = config_json["vocab_size"]
    config.hidden_dim = config_json["intermediate_size"]
    config.norm_eps = config_json["rms_norm_eps"]
    config.max_seq_len = config_json["max_position_embeddings"]
else:
    config.dim = hf_model.config.hidden_size
    config.n_layers = hf_model.config.num_hidden_layers
    config.n_heads = hf_model.config.num_attention_heads
    config.n_kv_heads = hf_model.config.num_attention_heads
    config.vocab_size = hf_model.config.vocab_size
    config.hidden_dim = hf_model.config.intermediate_size
    config.norm_eps = hf_model.config.rms_norm_eps
    config.max_seq_len = hf_model.config.max_position_embeddings
# for k_proj, permute_reverse has to use n_kv_heads and the matching
# row count (n_kv_heads * head_dim) instead of the n_heads defaults
layer.attention.wk.weight = nn.Parameter(permute_reverse(
    hf_dict[f'model.layers.{i}.self_attn.k_proj.weight'],
    n_heads=config.n_kv_heads,
    dim1=config.dim // config.n_heads * config.n_kv_heads))

and it finally works.
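
As a sanity check (a rough sketch; hf_dict and config as in export.py, and assuming head_dim = dim // n_heads), the k_proj shapes can be asserted before exporting:

head_dim = config.dim // config.n_heads
for i in range(config.n_layers):
    k = hf_dict[f'model.layers.{i}.self_attn.k_proj.weight']
    # with grouped-query attention, k_proj has n_kv_heads * head_dim rows
    assert k.shape == (config.n_kv_heads * head_dim, config.dim), k.shape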

About custom tokenizers

Then I tried another LLaMA2 model created by the same contributor. While the model exported correctly, ./run failed to load the tokenizer due to a mismatched vocab size. It turns out their vocabulary has three additional tokens compared to the default one used by LLaMA2; some LLaMA2 models on huggingface are even trained with the GPT-2 tokenizer.
I looked at tokenizer.py, and it only supports generating tokenizer.bin from a tokenizer.model produced by the SentencePiece library.
So my question is: how can I convert a custom tokenizer (not created by SentencePiece, whether downloaded manually or saved with tokenizer.save_pretrained(path)) into tokenizer.bin, given that the encode function in run.c needs the vocab_scores, which custom tokenizers don't have?
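The closest workaround I can think of (untested; it assumes the tokenizer.bin layout that tokenizer.py writes, i.e. a uint32 max_token_length followed by, per token, a float32 score, a uint32 byte length, and the raw UTF-8 bytes) would be to dump the Hugging Face vocabulary with dummy scores, but I suspect encode in run.c relies on the real SentencePiece scores to pick merges, so tokenization could come out wrong:

import struct
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained('./tiny/')  # or any tokenizer.save_pretrained dir
vocab = sorted(tok.get_vocab().items(), key=lambda kv: kv[1])  # (token, id) pairs ordered by id
tokens = [t.replace('▁', ' ').encode('utf-8') for t, _ in vocab]
scores = [0.0] * len(tokens)  # dummy scores: custom tokenizers have no vocab_scores

with open('tokenizer.bin', 'wb') as f:
    f.write(struct.pack('I', max(len(t) for t in tokens)))
    for b, s in zip(tokens, scores):
        f.write(struct.pack('fI', s, len(b)))
        f.write(b)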
I am new to LLaMA2 and this project, so thank you so much for the help!

@umama-rahman1

Are you also having a failed read error when doing a run?
