guide to run the code #11

Open
Abolfazl-kr opened this issue Feb 12, 2024 · 2 comments

Comments

@Abolfazl-kr

Thanks for your effort. I'm a little confused about the process, so correct me if I'm wrong. First, we run block_expansion.py to create the expanded model. Then we clone the repository at https://github.com/hills-code/open-instruct.git@7c2b14d and run finetune_codealpaca.sh. Is this correct?

I also have some problems with this process in your repo:
1- After running block_expansion.py, a single 14.5 GB pytorch_model.bin file is created, without a pytorch_model.bin.index.json or any other files. The Hugging Face model, however, has two shards plus all the extra files it needs, such as pytorch_model.bin.index.json, special_tokens_map.json, generation_config.json, and config.json. How can we create them?

2- I want to pretrain the model on my own raw text, which is not one of the datasets you mention, such as SlimOrca. How can I transform my dataset to work with your code?

@hills-code
Collaborator

  1. You do not need pytorch_model.bin.index.json. The other necessary files can simply be copied from the original base model.
  2. The code can load a dataset directly from Hugging Face using datasets.load_dataset('YOUR_DATASET'). However, if you want to do pretraining, you may need to revise the tokenize function, because the current one is written for SFT and masks the instruction labels during processing.
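
For point 2, a revised tokenize function for raw-text pretraining might look roughly like the sketch below. It is not the repository's code: the function name `encode_for_pretraining`, the `text` column name, and the `max_seq_length` default are all assumptions made for illustration. The only idea it shows is that, unlike the SFT version, no label is masked (masking is commonly done by setting labels to -100), so every token contributes to the loss.

```python
def encode_for_pretraining(example, tokenizer, max_seq_length=2048, text_key="text"):
    """Tokenize one raw-text example for causal-LM pretraining.

    Hypothetical helper, not part of the repository. `tokenizer` is assumed
    to behave like a Hugging Face tokenizer: calling it on a string returns
    a mapping with "input_ids" and "attention_mask".
    """
    encoded = tokenizer(
        example[text_key],          # assumed column name for the raw text
        truncation=True,
        max_length=max_seq_length,  # assumed context length, adjust as needed
    )
    input_ids = list(encoded["input_ids"])
    return {
        "input_ids": input_ids,
        "attention_mask": list(encoded["attention_mask"]),
        # Pretraining: labels are a plain copy of the inputs. Nothing is
        # masked out, in contrast to SFT where instruction tokens are ignored.
        "labels": list(input_ids),
    }
```

Assuming the dataset has a `text` column, this could be applied with something like `dataset.map(lambda ex: encode_for_pretraining(ex, tokenizer), remove_columns=dataset.column_names)`. Packing several short documents into one sequence is left out to keep the sketch short.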

@kiran-coditation

Hi @Abolfazl-kr, were you able to pretrain after block expansion? If so, could you please guide me through it?

3 participants