fix: better warmup error
OlivierDehaene committed Oct 25, 2023
1 parent f9910d1 · commit 96a982a
Showing 1 changed file with 1 addition and 1 deletion.
server/text_generation_server/models/flash_causal_lm.py (1 addition, 1 deletion)
@@ -670,7 +670,7 @@ def warmup(self, batch: FlashCausalLMBatch):
                 self.device,
             )
             _, batch = self.generate_token(batch)
-        except Exception as e:
+        except torch.cuda.OutOfMemoryError as e:
             raise RuntimeError(
                 f"Not enough memory to handle {len(batch.input_ids)} prefill tokens. "
                 f"You need to decrease `--max-batch-prefill-tokens`"