Shadow memory range interleaves with an existing memory mapping #1630

jason-infra · 2023-03-08T15:35:11Z

I have an ASAN test suite that gives me the following error:

==25==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==25==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==25==This might be related to ELF_ET_DYN_BASE change in Linux 4.12.
==25==See https://github.com/google/sanitizers/issues/856 for possible workarounds.
==25==Process memory map follows:
	0x00007fff7000-0x00008fff7000	
	0x000091ff6000-0x004091ff7000	
	0x02008fff7000-0x10007fff8000	
         ...
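
One way to read this report is to compare the expected shadow range with each listed mapping. The following is a minimal sketch, not ASan code: `Range`, `overlaps`, and `count_overlapping` are illustrative names, and it assumes the map entries are half-open ranges (as in /proc/self/maps) while the shadow range is printed inclusive.

```cpp
#include <cstddef>
#include <cstdint>

// Half-open address range [begin, end).
struct Range {
  std::uint64_t begin;
  std::uint64_t end;
};

// Two half-open ranges overlap when each one starts before the other ends.
constexpr bool overlaps(Range a, Range b) {
  return a.begin < b.end && b.begin < a.end;
}

// Shadow range from the report; printed inclusive, so add 1 to the end.
constexpr Range kShadow{0x00007fff7000ULL, 0x10007fff7fffULL + 1};

// The three mappings listed in the report (assumed half-open).
constexpr Range kMaps[] = {
    {0x00007fff7000ULL, 0x00008fff7000ULL},
    {0x000091ff6000ULL, 0x004091ff7000ULL},
    {0x02008fff7000ULL, 0x10007fff8000ULL},
};

// Counts how many of the listed mappings intersect the shadow range.
int count_overlapping() {
  int n = 0;
  for (std::size_t i = 0; i < sizeof(kMaps) / sizeof(kMaps[0]); ++i) {
    if (overlaps(kShadow, kMaps[i])) ++n;
  }
  return n;
}
```

Under these assumptions, all three listed mappings fall inside the range ASan wanted for its shadow memory.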

I am running

Nvidia 525.60.13 drivers
Ubuntu 20.04.5
Linux kernel 5.15.0-1027-gcp

Importantly, all ASan tests pass with no errors when I run them with the NVIDIA 470 drivers:

Nvidia 470.82.01 drivers
Ubuntu 20.04.5
Linux kernel 5.15.0-1027-gcp

Other Info

I have kept all other variables the same, including the ASan test suite, OS, kernel, and GPU (NVIDIA T4). The only difference is the NVIDIA driver version.
I do not believe my -fsanitize=address failure is related to #856, because the test suite runs flawlessly with the NVIDIA 470 drivers.
All my other tests pass, including TSan.

Why am I still getting the "Shadow memory range interleaves" error?


mjj48 · 2023-03-29T22:12:14Z

Standalone script to reproduce this:

#!/bin/bash
set -e 

# Things you must change:
LIB_ASAN=/usr/lib/x86_64-linux-gnu/libasan.so.5
LIB_CUDART=cuda-11.2/targets/x86_64-linux/lib/libcudart.so.11.0
NVCC_PATH=/home/michael.johnson/cuda-11.2/bin/nvcc
CLANG10_PATH=/home/michael.johnson/clang/bin/clang
# End of things you must change.

# Create .h
cat <<'EOT' > hello.h
void call_hello();
EOT
# Create .cu
cat <<'EOT' > hello.cu
#include <cstdio>
__global__ void cuda_hello(){
    printf("Hello World from GPU!\n");
}
void call_hello() {
    cuda_hello<<<1,1>>>(); 
}
EOT
# Create .cpp
cat <<'EOT' > hello_bin.cpp
#include "hello.h"
int main() {
    call_hello(); 
    return 1;
}
EOT
mkdir -p needed_libs/
cp $LIB_ASAN needed_libs/
cp $LIB_CUDART needed_libs/

$NVCC_PATH  --objdir-as-tempdir  --compiler-options "-fPIC" --compiler-bindir=$CLANG10_PATH  -x cu  -O2 -c hello.cu -o hello.pic.o

$CLANG10_PATH -shared -o libhello.so hello.pic.o -fsanitize=address -stdlib=libstdc++ -Lneeded_libs -lstdc++

cp libhello.so needed_libs/

$CLANG10_PATH  -fPIC  -nostdinc '-std=c++17' -nostdinc++ -c hello_bin.cpp -o hello_bin.pic.o

$CLANG10_PATH -o hello_bin -Lneeded_libs hello_bin.pic.o -lhello -l:libcudart.so.11.0 -l:libstdc++.so.6 -pie -fsanitize=address -fuse-ld=gold  -stdlib=libstdc++ -lstdc++

LD_LIBRARY_PATH=needed_libs/ ./hello_bin 2>&1 | head

noaxp · 2024-03-06T09:27:08Z

It seems the problem only happens when you use both clang and the gold linker.
I also ran into this issue with gcc, but oddly I can no longer reproduce it.
As a temporary workaround, you could try gcc or a different linker.

To reproduce:
clang demo.cc -fsanitize=address -I/usr/local/cuda/include -lcuda -o demo -fuse-ld=gold
If you remove -fuse-ld=gold, the program works.

// demo.cc
#include <cuda.h>
int main() {
  cuInit(0);
  return 0;
}

noaxp · 2024-03-14T12:04:46Z

It's caused by InitializeShadowMemory being invoked twice.
The first invocation is before main(); the second is before cuInit(0).

The address of the variable asan_inited differs between the two invocations, so the program considers ASan uninitialized and calls InitializeShadowMemory a second time. It then tries to allocate shadow memory at the same address range, i.e. 0x00007fff7000-0x10007fff7fff, which causes the error.
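
The effect of reserving the same fixed range twice can be mimicked outside ASan. This is a minimal sketch and not ASan's actual implementation: `reserve_fixed` is a hypothetical helper, and it assumes Linux with `MAP_FIXED_NOREPLACE` available (kernel 4.17 or later for the strict behavior).

```cpp
#include <sys/mman.h>

#include <cerrno>
#include <cstddef>

// Tries to reserve `len` bytes exactly at `addr` without replacing an
// existing mapping. Returns 0 on success, or an errno value on failure.
int reserve_fixed(void* addr, std::size_t len) {
  void* p = mmap(addr, len, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE |
                     MAP_FIXED_NOREPLACE,
                 -1, 0);
  if (p == MAP_FAILED) return errno;
  if (p != addr) {
    // Older kernels ignore the flag and treat `addr` as a hint only;
    // report that as a conflict and undo the stray mapping.
    munmap(p, len);
    return EEXIST;
  }
  return 0;
}
```

Calling `reserve_fixed` twice on the same free range succeeds the first time and returns `EEXIST` the second time, which is analogous to a second shadow setup finding the range already occupied by the first.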

But I'm still confused about why there are two asan_inited objects, and why the CUDA driver and the linker could affect this.

noaxp · 2024-03-19T08:18:06Z

I found the root cause: libcuda.so contains the following code:

Dl_info attr[2];
dladdr((void*)&pthread_join, attr);
dlopen(attr[0].dli_fname, 1);

So the program tries to dlopen itself, and calling dlopen on a PIE executable is undefined behavior.
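
To check which object `&pthread_join` resolves to in a given build, something like the following might help. It is a minimal diagnostic sketch, not the libcuda.so code: `object_containing` and `pthread_join_owner` are hypothetical helper names, and it assumes glibc's `dladdr` extension.

```cpp
#include <dlfcn.h>    // dladdr, Dl_info (glibc extension)
#include <pthread.h>  // pthread_join

#include <string>

// Returns the path of the loaded object that contains `addr`,
// or an empty string if dladdr cannot match the address.
std::string object_containing(const void* addr) {
  Dl_info info{};
  if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) return "";
  return info.dli_fname;
}

// Path of the object that the address of pthread_join resolves to
// when taken from this translation unit.
std::string pthread_join_owner() {
  return object_containing(reinterpret_cast<const void*>(&pthread_join));
}
```

Printing `pthread_join_owner()` from main() in binaries linked with and without -fuse-ld=gold would show whether the address points into the executable itself or into a shared library. On older glibc versions, linking may also need -ldl and -pthread.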

adam-smnk mentioned this issue Aug 15, 2023

Issue #669: updated CUDA API to v12.2 libxsmm/tpp-mlir#670

Merged

swkim101 mentioned this issue Apr 10, 2024

LLVM build fails with ASAN=y for native_sim zephyrproject-rtos/zephyr#71288

Closed
