
Commit d4656da

Apply suggestions from code review
Co-authored-by: Agnieszka Ciborowska <[email protected]>
Kevin Musgrave and aciborowska authored Feb 26, 2024
1 parent 1d0c924 commit d4656da
Showing 2 changed files with 2 additions and 14 deletions.

blog/llm-finetuning-2/README.md (4 changes: 2 additions & 2 deletions)
@@ -1,6 +1,6 @@
 # Finetuning Mistral-7B using LoRA and DeepSpeed
 
-In this demo, we finetune the [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) using [LoRA](https://arxiv.org/abs/2106.09685) and [DeepSpeed](https://github.com/microsoft/DeepSpeed). We ran LoRA on two 80 GB A100 GPUs, and DeepSpeed on two, four, and eight 80 GB A100 GPUs.
+In this demo, we finetune [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) using [LoRA](https://arxiv.org/abs/2106.09685) and [DeepSpeed](https://github.com/microsoft/DeepSpeed). We ran LoRA on two 80 GB A100 GPUs, and DeepSpeed on two, four, and eight 80 GB A100 GPUs.
 
 To get started, first install Determined on your local machine:
 ```bash
@@ -25,7 +25,7 @@ Change configuration options in `distributed.yaml`. Some important options are:
 - `per_device_train_batch_size`: the batch size per GPU.
 
 
-DeepSpeed configuration options are in the `ds_configs` folder.
+DeepSpeed configuration files are in the `ds_configs` folder.
 
 ## Testing
 
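
The README change above points readers at the DeepSpeed config files in the `ds_configs` folder. As a minimal sketch of how such a file is typically handed to a Hugging Face `Trainer` run (the file name `ds_config.json` and the argument values are assumptions for illustration, not taken from this repository):

```python
# Minimal sketch: pointing Hugging Face TrainingArguments at a DeepSpeed
# JSON config. The path and batch size below are illustrative assumptions,
# not values from this repository.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=1,          # batch size per GPU, as in the README
    deepspeed="ds_configs/ds_config.json",  # hand the Trainer a DeepSpeed config
)
```

Passing `deepspeed=` to `TrainingArguments` lets the `Trainer` own engine initialization, which is what the `trainer.evaluate()` ordering issue removed from `finetune.py` below hinges on.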

blog/llm-finetuning-2/finetune.py (12 changes: 0 additions & 12 deletions)
@@ -93,10 +93,6 @@ def compute_metrics(eval_preds):
     decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
     decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
 
-    for l, p in zip(decoded_labels, decoded_preds):
-        if l != p:
-            logging.error(f"decoded_label:{l}")
-            logging.error(f"decoded_pred:{p}")
 
     bleu_score = bleu.compute(predictions=decoded_preds, references=decoded_labels)
     accuracy = acc.compute(predictions=preds[~mask], references=labels[~mask])
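
For context on the two metric calls kept at the end of this hunk: `bleu` and `acc` are presumably metric objects loaded by name from the Hugging Face `evaluate` library. A minimal, self-contained sketch of that usage (the sample predictions and references are invented for illustration; this repo's actual setup may differ):

```python
# Sketch of the evaluate-library metrics used in compute_metrics above;
# the example predictions/references are made up for illustration.
import evaluate

bleu = evaluate.load("bleu")
acc = evaluate.load("accuracy")

bleu_score = bleu.compute(
    predictions=["the cat sat on the mat"],
    references=[["the cat sat on the mat"]],
)
accuracy = acc.compute(predictions=[1, 0, 1], references=[1, 1, 1])
print(bleu_score["bleu"], accuracy["accuracy"])
```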
@@ -114,7 +110,6 @@ def compute_metrics(eval_preds):
 
     model = get_peft_model(model, peft_config)
 
-    logging.error(f"dataset={dataset['train'][0]}")
 
     trainer = Trainer(
         args=training_args,
@@ -128,13 +123,6 @@ def compute_metrics(eval_preds):
     )
 
     trainer.add_callback(det_callback)
-    # we need to comment this one out, since it will lead to the following error:
-    # [parameter_offload.py:86:_apply_to_tensors_only] A module has unknown inputs or outputs type (<class 'transformers.cache_utils.DynamicCache'>)
-    # and the tensors embedded in it cannot be detected. The ZeRO-3 hooks designed to trigger before or after backward pass of the module relies on
-    # knowing the input and output tensors and therefore may not get triggered properly.
-    # The error happens due to deepspeed initialization happening in the trainer.train(), hence call on eval fails.
-
-    # trainer.evaluate()
 
     trainer.train()
 
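
The deleted comment block describes a real ordering constraint: with ZeRO-3, DeepSpeed initialization happens inside `trainer.train()`, so a standalone `trainer.evaluate()` before training trips the ZeRO-3 hooks on non-tensor module outputs such as `transformers.cache_utils.DynamicCache`. A sketch of the ordering this implies (`trainer` and `det_callback` are the objects from the diff; treating post-training evaluation as safe is an inference from the deleted comments, not something this commit states):

```python
# Continuation of the script in the diff above; `trainer` and `det_callback`
# come from the diff, everything else is assumed context rather than repo code.
trainer.add_callback(det_callback)

# Calling trainer.evaluate() here would fail: per the deleted comments, the
# DeepSpeed ZeRO-3 engine is only initialized inside trainer.train().
trainer.train()

# Presumably safe once train() has initialized the DeepSpeed engine.
metrics = trainer.evaluate()
```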
