stanfordnlp · frankaging · May 29, 2024 · May 29, 2024
diff --git a/README.md b/README.md
@@ -11,7 +11,8 @@ Want to try a fine-tuning method that uses a fraction of the parameter count of
 - Finetuning any pretrained LMs on HuggingFace with ReFT
 - Setting ReFT hyperparameters via configs
 - Sharing the fine-tuned results easily to HuggingFace
-- 🔥 Customizable trainer such as [DPO with ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/dpo)
+- 🔥 [DPO+ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/dpo)
+- 🔥 [LoRA+ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/peft)
 
 > [!TIP]
 > **Getting Started:** [<img align="center" src="https://colab.research.google.com/assets/colab-badge.svg" />](https://colab.research.google.com/github/stanfordnlp/pyreft/blob/main/main_demo.ipynb) [**ReFT with TinyLlama**]  
@@ -74,6 +75,33 @@ model params: 6,738,415,616 || trainable%: 0.00048634578018881287
 """
 ```
 
+Alternatively, you can train ReFT together with LoRA by using [the `peft` library](https://github.com/huggingface/peft):
+
+```py
+peft_config = LoraConfig(
+    r=4, lora_alpha=32, target_modules=["o_proj"], layers_to_transform=[15],
+    use_rslora=True, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM"
+)
+model = get_peft_model(model, peft_config)
+
+reft_config = pyreft.ReftConfig(representations=[{
+    # string component access is required for customized models such as a peft model!
+    "layer": l, "component": f"base_model.model.model.layers[{l}].output",
+    "low_rank_dimension": 4,
+    "intervention": pyreft.LoreftIntervention(embed_dim=model.config.hidden_size,
+    low_rank_dimension=4)} for l in [15]])
+
+reft_model = pyreft.get_reft_model(model, reft_config)
+# you need to call this to re-enable lora grads!
+reft_model.model.enable_adapter_layers()
+reft_model.print_trainable_parameters()
+
+"""
+trainable intervention params: 32,772 || trainable model params: 32,768
+model params: 6,738,448,384 || trainable%: 0.0009726274694871952
+"""
+```
+
 ### Step 3: a few demonstrations of the behavior you want.
 Quick adaptation or personalization requires very limited training data. Here, we play the same rule for ReFT. In this example, we want the Llama-2-chat model to **only return Emoji**. We create 10 examples:
 ```py

diff --git a/examples/peft/README.md b/examples/peft/README.md
@@ -0,0 +1,7 @@
+# Combining LoRA with ReFT in "one click"
+
+Based on the notebook [`reft_with_lora.ipynb`](https://github.com/stanfordnlp/pyreft/blob/main/examples/peft/reft_with_lora.ipynb).
+
+You can wrap any `peft` model (from the 🤗 [PEFT: State-of-the-art Parameter-Efficient Fine-Tuning library](https://github.com/huggingface/peft)) as a ReFT model with a single line of code! Then, you can co-train your LoRA weights along with interventions.
+
+Feel free to explore how to trade some heavy LoRA weights for some lightweight interventions!
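
Reviewer note: the parameter counts printed in the main README snippet can be sanity-checked by hand, which also makes the LoRA-versus-intervention trade-off concrete. The sketch below is a back-of-the-envelope calculation, not code from `pyreft`. It assumes a hidden size of 4096, a square `o_proj`, and that a LoReFT intervention holds one `d x r` rotation plus a learned-source linear layer with `d x r` weights and `r` biases. The helper names `lora_params` and `loreft_params` are hypothetical.

```python
# Sanity check of the trainable-parameter counts quoted in the README diff.
# Assumptions (not stated in the diff): hidden size 4096, square o_proj,
# LoReFT = rotation (d x r) + learned-source linear (d x r weights + r biases).

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) to each targeted linear layer."""
    return r * d_in + d_out * r


def loreft_params(embed_dim: int, r: int) -> int:
    """Rotation matrix plus learned-source linear layer with bias (assumed layout)."""
    rotation = embed_dim * r
    learned_source = embed_dim * r + r
    return rotation + learned_source


HIDDEN = 4096                       # assumed hidden size
BASE_MODEL_PARAMS = 6_738_415_616   # from the ReFT-only printout in the README

lora = lora_params(HIDDEN, HIDDEN, r=4)    # one o_proj in one layer
reft = loreft_params(HIDDEN, r=4)          # one intervention at one layer
total = BASE_MODEL_PARAMS + lora           # LoRA weights count as model params
trainable_pct = 100 * (lora + reft) / total

print(f"trainable intervention params: {reft:,} || trainable model params: {lora:,}")
print(f"model params: {total:,} || trainable%: {trainable_pct}")
```

Under these assumptions the numbers line up with the printout in the diff: the single rank-4 LoRA adapter and the single rank-4 intervention cost almost the same number of parameters, so swapping one for the other is roughly parameter-neutral at this setting.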