Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync : ggml (conv 1d + 2d updates) #3468

Merged
merged 3 commits into from
Oct 4, 2023
Merged

sync : ggml (conv 1d + 2d updates) #3468

merged 3 commits into from
Oct 4, 2023

Conversation

ggerganov
Copy link
Owner

@ggerganov ggerganov commented Oct 4, 2023

While doing this sync, I noticed that the ./bin/test-quantize-perf example produces undefined behaviour when the Addr and UB sanitizers are enabled:

  • resolved:

43bbfb7

  • unresolved:
q4_1
  quantize_row_q_reference
    4096 values (0.02 MB)
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__functional/invoke.h:478:16: runtime error: -1.20472 is outside the range of representable values of type 'unsigned long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__functional/invoke.h:478:16 in 
      min cycles/32 vals   :      0.00
      avg cycles/32 vals   :      0.00
      float32 throughput   :      0.17 GB/s
      quantized throughput :      0.03 GB/s

  quantize_row_q
    4096 values (0.02 MB)
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__functional/invoke.h:478:16: runtime error: -1.20472 is outside the range of representable values of type 'unsigned long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__functional/invoke.h:478:16 in 
      min cycles/32 vals   :      0.00
      avg cycles/32 vals   :      0.00
      float32 throughput   :      0.17 GB/s
      quantized throughput :      0.03 GB/s

  dequantize_row_q
    4096 values (0.02 MB)
      min cycles/32 vals   :      0.00
      avg cycles/32 vals   :      0.00
      float32 throughput   :      0.30 GB/s
      quantized throughput :      0.05 GB/s

  quantize_row_q_dot
    4096 values (0.02 MB)
      min cycles/32 vals   :      0.00
      avg cycles/32 vals   :      0.00
      float32 throughput   :      0.19 GB/s
      quantized throughput :      0.03 GB/s

  vec_dot_q
    4096 values (0.02 MB)
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__functional/invoke.h:478:16: runtime error: -7.84407e+35 is outside the range of representable values of type 'unsigned long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/c++/v1/__functional/invoke.h:478:16 in 

Also, ./bin/test-quantize-perf crashes in Release on latest Mac Sonoma 14:

q2_K
  quantize_row_q_reference
    4096 values (0.02 MB)
Illegal instruction: 4

Bisecting, this starts crashing at 2db94d9, which does not make much sense. I'm wondering if something is wrong in my build environment after updating the OS

ggml.c:1033:39: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior

ggml.c:1081:39: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior

ggml-ci
@ggerganov ggerganov added the need feedback Testing and feedback with results are needed label Oct 4, 2023
@slaren
Copy link
Collaborator

slaren commented Oct 4, 2023

I tried running the tests with the sanitizers enabled, but I didn't get any errors. Other people reported issues with illegal instructions on Macs (#3279), and I think that the conclusion was that it might be a compiler bug.

@ggerganov
Copy link
Owner Author

On M2 Ultra, I have Sonoma 14 with Xcode 15 and running ./bin/test-quantize-perf on master crashes with illegal instruction: 4 in nearest_int():

$ /usr/bin/xcodebuild -version
Xcode 15.0
Build version 15A240d

$ g++ --version
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: arm64-apple-darwin23.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

On M1 Pro, I also have Sonoma 14, but I haven't updated Xcode yet and it does not crash:

$ /usr/bin/xcodebuild -version
Xcode 14.3.1
Build version 14E300c

$ g++ --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin23.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

However, both setups produce sanitizer errors in Q4_1, Q5_0 and Q5_1.

The errors in Q5_0 and Q5_1 are fixed with 2db94d9, but the others I cannot figure out what is causing them yet

@slaren
Copy link
Collaborator

slaren commented Oct 4, 2023

I tried building with clang 15, but I still don't get any sanitizer errors. Maybe this check is something exclusive to the Apple clang version.

Looking at the errors and the code, I think that the cause might be in the lambdas such as this one:

                    auto quantize_fn = [&](void ) {
                        qfns.from_float_reference(test_data1, test_q1, size);
                        return test_q1[0];
                    };

This returns a float, but benchmark_function takes a function that returns a size_t, so there is probably a conversion happening inside std::function, which causes that error. Maybe changing the std::function argument in benchmark_function to return float would fix this.

@ggerganov
Copy link
Owner Author

ggerganov commented Oct 4, 2023

Nice! That fixed all UBs

The Illegal instruction remains, but it's very suspicious - if I put a print or try to build in Debug or RelWithDebInfo, it stops crashing. Running in Release with lldb I only get this:

q2_K
  quantize_row_q_reference
    4096 values (0.02 MB)
Process 39004 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x1e011222)
    frame #0: 0x0000000100099afc test-quantize-perf`quantize_row_q2_K_reference + 520
test-quantize-perf`quantize_row_q2_K_reference:
->  0x100099afc <+520>: .long  0x1e011222                ; unknown opcode
    0x100099b00 <+524>: ldp    s4, s26, [x9, #0x1c]
    0x100099b04 <+528>: fcmp   s18, s1
    0x100099b08 <+532>: fcsel  s1, s18, s1, mi
Target 0: (test-quantize-perf) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x1e011222)
  * frame #0: 0x0000000100099afc test-quantize-perf`quantize_row_q2_K_reference + 520
    frame #1: 0x0000000100008260 test-quantize-perf`std::__1::__function::__func<main::$_0, std::__1::allocator<main::$_0>, float ()>::operator()() + 44
    frame #2: 0x0000000100007c8c test-quantize-perf`benchmark_function(unsigned long, unsigned long, long long, std::__1::function<float ()> const&) + 72
    frame #3: 0x0000000100007034 test-quantize-perf`main + 7252
    frame #4: 0x0000000189139058 dyld`start + 2224
(lldb) 

Replacing nearest_int with roundf still crashes:

int l = nearest_int(iscale*(x[i] - min));

Replacing with int l = 0; does not crash.

I guess I'll just merge as it is and blame clang for now.

@ggerganov ggerganov merged commit f93af02 into master Oct 4, 2023
33 checks passed
@ggerganov ggerganov deleted the sync branch October 4, 2023 12:30
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 5, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299)
  readme : add project status link
  ggml : fix build after ggerganov#3329
  llm : add Refact model (ggerganov#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468)
  finetune : readme fix typo (ggerganov#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453)
  main : consistent prefix/suffix coloring (ggerganov#3425)
  llama : fix session saving/loading (ggerganov#3400)
  llama : expose model's rope_freq_scale in the API (ggerganov#3418)
  metal : alibi for arbitrary number of heads (ggerganov#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273)
  Work on the BPE tokenizer (ggerganov#3252)
  convert : fix vocab size when not defined in hparams (ggerganov#3421)
  cmake : increase minimum version for add_link_options (ggerganov#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408)
  gguf : general usability improvements (ggerganov#3409)
  cmake : make CUDA flags more similar to the Makefile (ggerganov#3420)
  finetune : fix ggerganov#3404 (ggerganov#3437)
  ...
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
* sync : ggml (conv 1d + 2d updates)

ggml-ci

* ggml : fix UB in q5_0 and q5_1 quantize code

ggml.c:1033:39: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior

ggml.c:1081:39: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior

ggml-ci

* tests : fix UB in test-quantize-perf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need feedback Testing and feedback with results are needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants