I am using the standard example:

python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:" -n 6 -temp 0
which gives the expected output. Changing "Where is Mary?" to "Where is John?" also gives the right output. But changing n to 26 already gives a wrong output:
Answer: John is in the bedroom.- Mary went to the garden. John went to the bedroom. Where is John?Answer: John
When asking "How long is an average airplane?" at n = 26, it outputs:
Answer: The average flight time of an airplane is 2 hours and 15 minutes.What is the average flight time of a 737
So it seems it was luck that the answer to the former prompt was accurate. But is this an issue with the quantization, with the underlying model itself, or does the problem lie elsewhere?
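One thing worth noting: with a base/completion model and greedy decoding (-temp 0), a larger -n simply makes the model keep continuing the text past the answer, which is why the extra tokens look like an invented follow-up Q&A rather than a "wrong" answer per se. A common client-side workaround (sketched below; this is not a feature of run_inference.py, just generic post-processing) is to truncate the completion at the first sentence or at a stop string:

```python
def first_sentence(text: str) -> str:
    """Keep only the completion up to and including the first period.

    This is a crude stop-condition substitute for models that keep
    generating follow-up questions after answering.
    """
    i = text.find(".")
    return text[: i + 1] if i != -1 else text


# The runaway completion reported above:
raw = ("John is in the bedroom.- Mary went to the garden. "
       "John went to the bedroom. Where is John?Answer: John")
print(first_sentence(raw))  # -> John is in the bedroom.
```

Whether the continuation itself should be this incoherent is a separate question, but truncating at a stop condition at least recovers the useful part of the answer.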
There might be some issues with the larger models.
I downloaded and quantized the 700M and 3B parameter models, and the 3B one doesn't work at all.
Passing Lorem ipsum into bitnet_b1_58-large --> Lorem ipsum dolor sit amet, consectetur adipisicing elit. <continues with some random text>
Passing Lorem ipsum into bitnet_b1_58-3B --> Lorem ipsum b p p b b p p p p p p p p p p p p p p p p p p p p p. p p.