In this PR, I provide full support for format conversion of Mixtral-MoE architectures.
Converting a Mixtral-MoE checkpoint (from either the magnet source or the Hugging Face source) to the internal split and split-sparse consolidated checkpoint formats: `accessory/tools/mixtral_moe_split_from_hf.py`. Usage: `python mixtral_moe_split_from_hf.py in-ckpt-dir output-ckpt-dir [--in_ckpt_source hf_or_magnet (default: hf)] [--convert_sparse (whether to convert to sparse format)]`. This functionality is a refactoring of https://huggingface.co/Alpha-VLLM/MoE-Mixtral-7B-8Expert/blob/main/converted/split.py and https://huggingface.co/Alpha-VLLM/MoE-Mixtral-7B-8Expert/blob/main/converted_sparse/split_sparse.py, but it now unifies the two scripts and additionally supports the Hugging Face checkpoint format.
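For reference, the core of this direction is slicing each expert's feed-forward weights across model-parallel ranks. The sketch below is illustrative only: the input key names follow the standard Hugging Face `MixtralForCausalLM` layout, while the output file names, sharding dimensions, and the choice to keep non-expert weights replicated are placeholders rather than the exact convention used by `mixtral_moe_split_from_hf.py` (which also remaps keys to the internal naming scheme).

```python
# Illustrative sketch only -- output names and sharding choices are placeholders,
# not the exact convention used by mixtral_moe_split_from_hf.py.
import os
import torch
from safetensors.torch import load_file  # HF Mixtral ships *.safetensors shards

def split_expert_weights(hf_dir: str, out_dir: str, num_shards: int = 8):
    """Load a Hugging Face Mixtral state dict and write one consolidated
    checkpoint per model-parallel rank, slicing expert FFN weights."""
    state = {}
    for name in sorted(os.listdir(hf_dir)):
        if name.endswith(".safetensors"):
            state.update(load_file(os.path.join(hf_dir, name)))

    shards = [dict() for _ in range(num_shards)]
    for key, tensor in state.items():
        if ".block_sparse_moe.experts." in key and key.endswith("w2.weight"):
            # w2 projects back to the model dim -> slice along the input dim
            pieces = tensor.chunk(num_shards, dim=1)
        elif ".block_sparse_moe.experts." in key:
            # w1 / w3 expand to the FFN dim -> slice along the output dim
            pieces = tensor.chunk(num_shards, dim=0)
        else:
            # attention, norm, and router weights are simply replicated here;
            # a real conversion would also rename keys to the internal scheme
            pieces = [tensor] * num_shards
        for rank, piece in enumerate(pieces):
            shards[rank][key] = piece.clone()

    os.makedirs(out_dir, exist_ok=True)
    for rank, shard in enumerate(shards):
        torch.save(shard, os.path.join(out_dir, f"consolidated.{rank:02d}.pth"))
```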
Converting a Mixtral-MoE consolidated checkpoint to the Hugging Face format: `accessory/tools/convert_weights_to_hf.py`. The usage for Mixtral-MoE is the same as for LLaMA, except that `--mixtral` must be passed to indicate that the checkpoint uses the Mixtral-MoE architecture. Note: only consolidated checkpoints in the split format are supported so far; support for the split_sparse format is future work.
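A converted Hugging Face checkpoint can be sanity-checked by loading it with `transformers` (4.36 or later, which includes the Mixtral architecture). The snippet below is a minimal check, assuming the tokenizer files are also present in (or copied into) the output directory; the directory path is a placeholder.

```python
# Quick sanity check that the converted HF checkpoint loads and generates.
# "path/to/converted_hf_ckpt" is a placeholder for the conversion output dir.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_dir = "path/to/converted_hf_ckpt"
tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
model = AutoModelForCausalLM.from_pretrained(
    ckpt_dir,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires `accelerate`; drop to load on CPU
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```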