
[WIP] Add Fill-In-Middle example #2934

Closed · wants to merge 11 commits
Conversation

apaz-cli (Contributor)

Fixes #2818

I'm done implementing the FIM example, looking for code review. Currently untested.

// CodeLlama FIM special token IDs, hard-coded from the reference tokenizer
llama_token special_prefix_id = 32007;
llama_token special_middle_id = 32009;
llama_token special_suffix_id = 32008;
llama_token special_eot_id    = 32010;
ggerganov (Owner)

Curious, does Code Llama 34B have these special tokens?
If it does not, then how would FIM work with it?

apaz-cli (Contributor, Author)

It does, yeah. These are new, and I think only in codellama. I don't think they're in llama2. To get the token IDs themselves, the codellama people ran the tokenizer, and these are the values that came out.

https://github.com/facebookresearch/codellama/blob/cb51c14ec761370ba2e2bc351374a79265d0465e/llama/tokenizer.py#L28-L31

It should work. But I've been busy with my day job, and haven't gotten a chance to test it yet. Definitely not going to suggest merging until I'm certain.

ggerganov (Owner) left a comment

I think this is a good example. Would be nice to add a README.md with instructions to run a simple test. Maybe include sample prefix and suffix files to use as input.

apaz-cli (Contributor, Author) commented Sep 1, 2023

@ggerganov Those things are also forthcoming; I should be able to do all my testing and polishing over the weekend.

What I'm particularly worried about as far as code review is concerned is whether I'm calling llama_eval() right, and sampling right.

Also, I'm confused about the tokenizer. Is the tokenizer included with gguf models? Facebook's code reads three or four files, but creating a struct llama_context* seems to only require one.

ggerganov (Owner)

> What I'm particularly worried about as far as code review is concerned is whether I'm calling llama_eval() right, and sampling right.

Looking at the code it seems OK.

> Also, I'm confused about the tokenizer. Is the tokenizer included with gguf models? Facebook's code reads three or four files, but creating a struct llama_context* seems to only require one.

Yes, the llama.cpp lib provides built-in tokenization functionality. The vocab of the model is embedded inside the .gguf model file and is automatically loaded.
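
For reference, a minimal sketch of what loading from a single .gguf looks like, assuming the llama.cpp API of that time (llama_load_model_from_file() still took llama_context_params; later versions split model and context params). No separate tokenizer files are involved:

// Sketch only, based on the September 2023 llama.cpp API; signatures have changed since.
#include <stdbool.h>
#include <stdio.h>
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    llama_backend_init(false); // numa = false

    struct llama_context_params params = llama_context_default_params();

    // the single .gguf file carries both the weights and the vocab/tokenizer data
    struct llama_model * model = llama_load_model_from_file(argv[1], params);
    if (model == NULL) {
        fprintf(stderr, "failed to load %s\n", argv[1]);
        return 1;
    }

    struct llama_context * ctx = llama_new_context_with_model(model, params);
    if (ctx == NULL) {
        fprintf(stderr, "failed to create context\n");
        return 1;
    }

    fprintf(stderr, "vocab size: %d\n", llama_n_vocab(ctx));

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}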

The usage seems correct, though I would double-check everything with extra logging. It's easy to make a mistake when concatenating tokenizer results.
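
As a concrete illustration of that concatenation, here is a sketch under the same API assumptions, not the PR's actual code. It assumes the <PRE> prefix <SUF> suffix <MID> ordering from the CodeLlama tokenizer linked above, hard-coded special IDs, and greedy sampling until <EOT>; the logging shows exactly which token sequence gets evaluated:

// Sketch only, not the PR's code: fixed buffers, no error handling, greedy sampling.
#include <stdbool.h>
#include <stdio.h>
#include "llama.h"

static void fim_generate(struct llama_context * ctx, const char * prefix, const char * suffix) {
    const llama_token PRE = 32007, SUF = 32008, MID = 32009, EOT = 32010; // CodeLlama FIM IDs
    const int n_threads = 4;

    llama_token prompt[2048];
    int n = 0;

    // assemble <BOS> <PRE> prefix-tokens <SUF> suffix-tokens <MID>
    prompt[n++] = llama_token_bos(ctx);
    prompt[n++] = PRE;
    n += llama_tokenize(ctx, prefix, prompt + n, 512, false); // add_bos = false
    prompt[n++] = SUF;
    n += llama_tokenize(ctx, suffix, prompt + n, 512, false);
    prompt[n++] = MID;

    // log the assembled sequence to catch concatenation mistakes
    fprintf(stderr, "prompt tokens:");
    for (int i = 0; i < n; i++) fprintf(stderr, " %d", prompt[i]);
    fprintf(stderr, "\n");

    // evaluate the prompt, then greedily pick the most likely token until <EOT>
    llama_eval(ctx, prompt, n, 0, n_threads);
    int n_past = n;

    const int n_vocab = llama_n_vocab(ctx);
    for (int i = 0; i < 256; i++) {
        const float * logits = llama_get_logits(ctx);
        llama_token best = 0;
        for (llama_token t = 1; t < n_vocab; t++) {
            if (logits[t] > logits[best]) {
                best = t;
            }
        }
        if (best == EOT) {
            break;
        }
        fprintf(stderr, "sampled token %d\n", best); // detokenize here for readable output
        llama_eval(ctx, &best, 1, n_past++, n_threads);
    }
}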

ggerganov added the "need feedback" (Testing and feedback with results are needed) label on Sep 4, 2023
apaz-cli (Contributor, Author) commented Sep 5, 2023

@ggerganov I'm getting a segfault on a null pointer from deep inside ggml. Do you have any idea what this means? It's crashing on the very first token, which is the prefix token. I've been debugging for a while, but I'm at a loss. Is it an unrelated bug with Q2 models, or is this what happens when you try to eval an invalid token?

Here is the test script that I ran; the latest commit is the one I'm testing.

#!/bin/bash

# run.sh
# Execute with `sudo ./run.sh` so that `ulimit` works and `mlock()` doesn't fail.

ulimit -l 2000000
./fill-in-middle \
    models/CodeLlama-34B-GGUF/codellama-34b.Q2_K.gguf \
    $'def add(a, b):\n' \
    $'\n' \
    40 \
    1
=================================================================
==99666==ERROR: AddressSanitizer: SEGV on unknown address 0x7c50a91cb2d0 (pc 0x564cd24ad03c bp 0x000000000000 sp 0x7fffd3d85aa0 T0)
==99666==The signal is caused by a READ memory access.
    #0 0x564cd24ad03c in dequantize_row_q2_K /home/apaz/git/llama.cpp/k_quants.c:418
    #1 0x564cd2319765 in ggml_compute_forward_get_rows_q /home/apaz/git/llama.cpp/ggml.c:11796
    #2 0x564cd2319765 in ggml_compute_forward_get_rows /home/apaz/git/llama.cpp/ggml.c:11875
    #3 0x564cd234087a in ggml_compute_forward /home/apaz/git/llama.cpp/ggml.c:15805
    #4 0x564cd23444d4 in ggml_graph_compute_thread /home/apaz/git/llama.cpp/ggml.c:17241
    #5 0x564cd236d3e1 in ggml_graph_compute /home/apaz/git/llama.cpp/ggml.c:17751
    #6 0x564cd23a2ee0 in ggml_graph_compute_helper /home/apaz/git/llama.cpp/llama.cpp:402
    #7 0x564cd23a9408 in llama_eval_internal /home/apaz/git/llama.cpp/llama.cpp:2935
    #8 0x564cd23a9ba9 in llama_eval /home/apaz/git/llama.cpp/llama.cpp:6060
    #9 0x564cd22f2040 in codellama_fill_in_middle examples/fill-in-middle/FIM.c:92
    #10 0x564cd22f2040 in main examples/fill-in-middle/FIM.c:181
    #11 0x7f01284456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #12 0x7f0128445784 in __libc_start_main_impl ../csu/libc-start.c:360
    #13 0x564cd22f4f90 in _start (/home/apaz/git/llama.cpp/fill-in-middle+0x32f90) (BuildId: 1d0a09041fd8fc40f04d4c50dbd084bfffe605f9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/apaz/git/llama.cpp/k_quants.c:418 in dequantize_row_q2_K
==99666==ABORTING

ggerganov (Owner)

I thought only 7B and 13B Code Llama have the special tokens. 34B's vocab is only 32000 so I assume no FIM support.
I'll take a detailed look and run some tests a bit later

apaz-cli (Contributor, Author) commented Sep 5, 2023

@ggerganov You may be right. But, I would expect it in that case just to give garbage output, rather than segfault. When I use CodeLlama-7B-Python-GGUF/codellama-7b-python.Q2_K.gguf, the result is the same.
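
A plausible explanation for a segfault rather than garbage output: ggml's get_rows indexes the token-embedding tensor directly by token ID, so an ID at or above a 32000-entry vocab reads past the end of the quantized data, which is exactly where the ASan trace points (ggml_compute_forward_get_rows_q, then dequantize_row_q2_K). If the 7B Python variant also lacks the extra FIM entries in its vocab, the same out-of-range read would explain the identical crash. A hypothetical guard, not part of the PR, that the example could run before the first llama_eval() call (llama_n_vocab() existed in the API of the time):

// Hypothetical guard, not from the PR: refuse to run FIM if the model's vocab
// cannot contain the CodeLlama special tokens (the highest ID is 32010).
#include <stdbool.h>
#include <stdio.h>
#include "llama.h"

static bool model_has_fim_tokens(struct llama_context * ctx) {
    const llama_token special_eot_id = 32010;
    return llama_n_vocab(ctx) > special_eot_id;
}

// usage, before evaluating any FIM special token:
//     if (!model_has_fim_tokens(ctx)) {
//         fprintf(stderr, "error: this model's vocab has no FIM special tokens\n");
//         return 1;
//     }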

kurnevsky (Contributor)

Can it be added to the flake output?

apaz-cli (Contributor, Author)

@alitariq4589 I got an email that you commented on this PR, did it somehow get deleted? Why do you need this PR merged? It's just an example.

@ggerganov Sorry I let the PR go stale, going to have more time to work on it over the weekend. Could you clone this branch and tell me what you see? Still having trouble debugging the segfault, since it happens deep inside ggml. Presumably, something at a higher level is not right, but I'm not sure what.

Still using the weights from CodeLlama-7B-Python-GGUF/codellama-7b-python.Q2_K.gguf, same as above.

alitariq4589 (Contributor)

@apaz-cli Sorry about that message. It was generated by CI and was sent to all the PRs accidentally.

apaz-cli (Contributor, Author)

@alitariq4589 Gotcha, no problem. It reminded me this still exists :)

ggerganov (Owner)

@apaz-cli No worries - this example is still on my radar, but haven't had time to come back to it. Will do so eventually

apaz-cli (Contributor, Author)

Sounds good, thanks @ggerganov ❤️

ggerganov (Owner)

We have recently introduced the infill example to demonstrate this functionality. It should cover what has been proposed here.

There is still some work left to make the tokenization correct: #3503

ggerganov closed this on Oct 8, 2023
Labels: need feedback (Testing and feedback with results are needed)
Projects: none yet
Successfully merging this pull request may close: Enhancement: Codellama FIM tokenization
4 participants