Remove unprintable characters from vocab list #25

Closed
wants to merge 2 commits into from

Conversation

beiller

@beiller beiller commented Mar 11, 2023

Fixes #11

This fixes the output for a Japanese prompt I was attempting to run.

E.g.:

./main -m ./models/13B/ggml-model-q4_0.bin -t 8 -n 128 -n 512 -p $'人生の意味は'

Output before change:

人生の意���、フロントカードに���いてる。 2019年3月 © All Rights Reserved. [end of text]

So it outputs some characters correctly, but others come out as � (the Unicode replacement character).
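For readers wondering where the � comes from, here is a minimal Python sketch. It assumes (this is not confirmed by the PR text) that some vocab entries hold only part of a multi-byte UTF-8 sequence and that each piece is decoded to text on its own. The three-way split of 味 below is a hypothetical tokenization chosen for illustration.

```python
# Sketch only: the split into single-byte pieces is an assumption,
# not the model's actual tokenization.
REPLACEMENT = "\ufffd"  # the character rendered as �

def decode_each(pieces):
    """Decode every byte piece independently, as a per-token printer would."""
    return "".join(p.decode("utf-8", errors="replace") for p in pieces)

raw = "味".encode("utf-8")               # three bytes: e5 91 b3
pieces = [raw[0:1], raw[1:2], raw[2:3]]  # hypothetical one-byte tokens

per_piece = decode_each(pieces)                              # each byte alone is invalid UTF-8
joined = b"".join(pieces).decode("utf-8", errors="replace")  # bytes reassembled first

print(repr(per_piece))  # three replacement characters
print(joined)           # 味
```

Decoding piece by piece yields three replacement characters, which matches the `意���` seen in the output above, while joining the bytes before decoding recovers the original character.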

Output after change:

人生の意は、一人が一人ということであります。は安部が立していたので、去からは一人の人にれるのはにとどまったのですが、そう
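Note that the output after the change no longer contains �, but some characters are missing (for example 意は instead of 意味は). A rough sketch of why that could happen, assuming the change simply strips undecodable bytes from each vocab entry; the helper name `strip_unprintable` and the tokenization are hypothetical and not taken from the PR diff:

```python
# Sketch only: strip_unprintable and the piece boundaries are assumptions
# made for illustration, not the code in this PR.
REPLACEMENT = "\ufffd"

def strip_unprintable(piece: bytes) -> str:
    """Decode a piece and drop anything that could not be decoded."""
    text = piece.decode("utf-8", errors="replace")
    return text.replace(REPLACEMENT, "")

raw = "意味は".encode("utf-8")  # 9 bytes, 3 per character
# Hypothetical tokenization: 意 and は are whole tokens, 味 is split into single bytes.
pieces = [raw[0:3], raw[3:4], raw[4:5], raw[5:6], raw[6:9]]

cleaned = "".join(strip_unprintable(p) for p in pieces)
print(cleaned)  # 意は
```

Under these assumptions the partial pieces become empty strings, so the garbage disappears but the character they encoded is lost as well.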

@beiller beiller closed this Mar 11, 2023
@beiller beiller deleted the feature/remove_unprintable branch March 11, 2023 22:09
flowgrad pushed a commit to flowgrad/llama.cpp that referenced this pull request Jun 27, 2023
* Added ggml_tensor_printf() to debug tensors.
Not sure if all cases work, it was only tested a bit.

Example for the ggml_repeat2 dst tensor after it was computed:
+======================+======================+======================+======================+
| ggml_compute_forward_repeat2_f32:9497
| node_1233
+----------------------+----------------------+----------------------+----------------------+
| Dimensions           | Quantization         | Layer id             | Backend              |
| 3                    | f32                  | 31                   | CPU                  |
+----------------------+----------------------+----------------------+----------------------+
| Elements             | Src0                 | Src1                 | Operation            |
| 64 x 2 x 71          | 64 x 2 x 1           | 64 x 2 x 71          | REPEAT2              |
+----------------------+----------------------+----------------------+----------------------+
| Src0 name:           | node_1232                                                          |
| Src1 name:           | leaf_17                                                            |
+----------------------+----------------------+----------------------+----------------------+

+-------------------------------------------------------------------------------------------+
| Content of src0 "node_1232" (3 dim)
Layer 0
| -0.019758            0.772589             0.000000             |
| 0.772589             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
+-------------------------------------------------------------------------------------------+

Layer 1
| 0.001423             -1.063233            0.000000             |
| -1.063233            0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
+-------------------------------------------------------------------------------------------+

Layer 2
| -0.042461            -0.936166            0.000000             |
| -0.936166            0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
+-------------------------------------------------------------------------------------------+

+-------------------------------------------------------------------------------------------+
| Content of src1 "leaf_17" (3 dim)
Layer 0
| 0.000000             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
+-------------------------------------------------------------------------------------------+

Layer 1
| 0.000000             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
+-------------------------------------------------------------------------------------------+

Layer 2
| 0.000000             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
| 0.000000             0.000000             0.000000             |
+-------------------------------------------------------------------------------------------+

+-------------------------------------------------------------------------------------------+
| Content of dst "node_1233" (3 dim)
Layer 0
| -0.019758            -0.019758            -0.019758            |
| 0.772589             0.772589             0.772589             |
| -0.019758            -0.019758            -0.019758            |
+-------------------------------------------------------------------------------------------+

Layer 1
| 0.001423             0.001423             0.001423             |
| -1.063233            -1.063233            -1.063233            |
| 0.001423             0.001423             0.001423             |
+-------------------------------------------------------------------------------------------+

Layer 2
| -0.042461            -0.042461            -0.042461            |
| -0.936166            -0.936166            -0.936166            |
| -0.042461            -0.042461            -0.042461            |
+-------------------------------------------------------------------------------------------+

+======================+======================+======================+======================+

* typo stride>n_elem - sample print is probably still bugged

* added strides and boolean info flags

---------

Co-authored-by: John <[email protected]>
rooprob pushed a commit to rooprob/llama.cpp that referenced this pull request Aug 2, 2023
Add information on compiler flags
Successfully merging this pull request may close these issues.

Unicode support