Commit
Merge branch 'curated' of https://github.com/gururise/alpaca-lora into curated
gururise committed Mar 17, 2023
2 parents c652b65 + 62f17e2 commit ececcc0
Showing 2 changed files with 15 additions and 16 deletions.
14 changes: 5 additions & 9 deletions README.md
@@ -40,15 +40,11 @@ PRs adapting this code to multi-GPU setups and larger models are always welcome.
This file contains a script to convert the LoRA back into a standard PyTorch model checkpoint,
which should help users who want to use the model with projects like [llama.cpp](https://github.com/ggerganov/llama.cpp).
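The export script itself is not shown in this diff, but the merge it performs is simple: fold the low-rank update back into the dense base weight, W' = W + (alpha / r) * (B @ A). A minimal sketch in plain Python (all names, shapes, and hyperparameter values here are illustrative, not taken from the repository):

```python
# Sketch of merging a LoRA update into a base weight matrix:
#   W' = W + (alpha / r) * (B @ A)
# Matrices are plain lists of rows; names and shapes are illustrative.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
        for row in x
    ]

def merge_lora(W, A, B, alpha, r):
    """Fold the scaled low-rank update B @ A into the dense weight W."""
    scale = alpha / r
    delta = matmul(B, A)  # (out_dim x r) @ (r x in_dim) -> (out_dim x in_dim)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]

# 2x2 base weight with a rank-1 adapter
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]           # r x in_dim
B = [[0.5], [0.25]]        # out_dim x r
merged = merge_lora(W, A, B, alpha=2, r=1)
# merged == [[2.0, 1.0], [0.5, 1.5]]
```

After this merge the adapter is no longer needed at inference time, which is what removes the PEFT dependency mentioned in the checklist below.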

### To do

- [x] Merge LoRA weights into LLaMA weights to remove inference dependency on PEFT
- [x] Train/val split
- [ ] Hyperparameter tuning code
- [ ] Support for `13b`, `30b`, `65b`
- [ ] Train a version that doesn't waste tokens on the prompt header
- [ ] Inference CLI and evaluation
- [ ] Better disclaimers about why using LLaMA without permission is very bad!
### Notes

- Before we try to tune the weights on 13B+ models, we should note (sorry, Tatsu) that [the quality of the Stanford Alpaca dataset is not very good](https://github.com/tloen/alpaca-lora/pull/32). We could likely improve model performance significantly by combing through the data and fixing bad examples; in fact, dataset quality might be our bottleneck. _The most impactful contribution anyone can make to this project is to provide a way to systematically iterate on the training data._
- We're continually fixing bugs and conducting training runs, and the weights on the Huggingface Hub are being updated accordingly. In particular, those facing issues with response lengths should make sure that they have the latest version of the weights and code.


### Example outputs

17 changes: 10 additions & 7 deletions generate.py
@@ -45,13 +45,13 @@ def generate_prompt(instruction, input=None):


def evaluate(
instruction,
temperature=0.1,
top_p=0.75,
top_k=40,
num_beams=4,
input=None,
**kwargs,
instruction,
input=None,
temperature=0.1,
top_p=0.75,
top_k=40,
num_beams=4,
**kwargs,
):
prompt = generate_prompt(instruction, input)
inputs = tokenizer(prompt, return_tensors="pt")
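The body of `evaluate` (largely unchanged by this commit) follows the usual tokenize → generate → decode pipeline. A minimal mock of that flow, with toy stand-ins in place of the real Hugging Face tokenizer and model (none of the classes below are the transformers API):

```python
# Mock of the tokenize -> generate -> decode flow inside evaluate().
# ToyTokenizer and ToyModel are stand-ins, not the transformers API.

class ToyTokenizer:
    def __call__(self, text):
        # Real code returns tensors; here, a list of code points.
        return {"input_ids": [ord(c) for c in text]}

    def decode(self, ids):
        return "".join(chr(i) for i in ids)

class ToyModel:
    def generate(self, input_ids, max_new_tokens=4):
        # Echo the prompt and append a fixed "response".
        return input_ids + [ord(c) for c in " ok!"][:max_new_tokens]

def evaluate(instruction):
    prompt = f"### Instruction:\n{instruction}\n### Response:\n"
    tokenizer, model = ToyTokenizer(), ToyModel()
    inputs = tokenizer(prompt)
    output_ids = model.generate(inputs["input_ids"])
    return tokenizer.decode(output_ids)

text = evaluate("Tell me about alpacas.")
# text is the prompt with " ok!" appended
```

The real function additionally wraps generation in a `GenerationConfig` built from the sampling parameters above and strips the prompt from the decoded output.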
@@ -82,6 +82,9 @@ def evaluate(
gr.components.Textbox(
lines=2, label="Instruction", placeholder="Tell me about alpacas."
),
gr.components.Textbox(
lines=2, label="Input", placeholder="none"
),
gr.components.Slider(minimum=0, maximum=1, value=0.1, label="Temperature"),
gr.components.Slider(minimum=0, maximum=1, value=0.75, label="Top p"),
gr.components.Slider(minimum=0, maximum=100, step=1, value=40, label="Top k"),
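`gr.Interface` passes component values to the wrapped function positionally, which is why the new Input textbox must sit immediately after the Instruction box, and why `evaluate`'s signature is reordered to put `input` second. A sketch of that positional mapping in plain Python, with no Gradio dependency (the handler and values are illustrative):

```python
# Sketch of how a Gradio-style interface maps an ordered list of UI
# component values onto a handler's positional parameters. The handler
# mirrors the reordered evaluate() signature; values are illustrative.

def evaluate(instruction, input=None, temperature=0.1, top_p=0.75,
             top_k=40, num_beams=4):
    # Stand-in for the real generation call: echo the settings.
    return {
        "instruction": instruction,
        "input": input,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "num_beams": num_beams,
    }

# Component values in display order: Instruction, Input, then sliders.
component_values = ["Tell me about alpacas.", "none", 0.1, 0.75, 40, 4]

# The interface splats the values into the handler positionally, so
# component order must match parameter order exactly.
result = evaluate(*component_values)
# result["input"] == "none"
```

If the Input textbox were appended after the sliders instead, its value would land in `temperature`, which is the bug the reordering avoids.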
