How to Reproduce

Just keep the model generating new tokens non-stop until the generated sequence length exceeds the default `seq_len`.
For example, change the prompt to:

```python
prompt = 'a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a'
```
and it will crash after generating 1022 tokens:

```
local_cache = val_cache.select(0, l).narrow(0, pos, 3)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: start (1022) + length (3) exceeds dimension size (1024).
```
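The crash is just the bounds check that `Tensor.narrow(dim, start, length)` enforces: it requires `start + length <= size` along the narrowed dimension. A minimal pure-Python sketch of the same arithmetic (the helper name here is illustrative, not from the repo):

```python
def narrow_in_bounds(size: int, start: int, length: int) -> bool:
    # torch.Tensor.narrow(dim, start, length) raises a RuntimeError
    # unless 0 <= start and start + length <= size along `dim`.
    return 0 <= start and start + length <= size

seq_len = 1024                             # default cache length
print(narrow_in_bounds(seq_len, 1021, 3))  # True: 1021 + 3 == 1024 still fits
print(narrow_in_bounds(seq_len, 1022, 3))  # False: 1022 + 3 > 1024, the crash above
```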
How to Fix

The bug is due to the construction of `local_cache`:

```python
local_cache = val_cache.select(0, l).narrow(0, pos, 3)
```

When `pos = seq_len - 2`, this in-place construction of `local_cache` from `val_cache` runs past the end of the cache and raises the error above.

For a quick (but perhaps not "beautiful") fix, just change line 74 to

```python
val_cache = torch.zeros([n_layers, seq_len + 3, dim], dtype=data_type, device=device).clone()
```

to reserve more room for `local_cache`.
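As a sanity check on the fix: at the worst-case position `pos = seq_len - 2`, the length-3 slice ends at index `seq_len + 1`, which fits inside the enlarged `seq_len + 3` rows (a quick arithmetic sketch, not code from the repo):

```python
seq_len = 1024
cache_rows = seq_len + 3   # enlarged second dimension of val_cache
worst_pos = seq_len - 2    # position where the original code crashed
# narrow(0, worst_pos, 3) needs indices up to worst_pos + 3 == seq_len + 1
assert worst_pos + 3 <= cache_rows   # 1025 <= 1027: the slice now fits
```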
Thank you. The KV cache is by default set up for 1024 - 2 words. To support a longer context, you can also change `seq_len` from 1024 to 4096 here.