Run HTTP3 stress with Address sanitization #100023

Merged on Mar 27, 2024 (7 commits)

rzikm
Member

@rzikm rzikm commented Mar 20, 2024

This PR enables address sanitization (ASAN) for MsQuic in the Linux HTTP/3 stress runs.

@rzikm rzikm added the NO-REVIEW Experimental/testing PR, do NOT review it label Mar 20, 2024
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@rzikm
Member Author

rzikm commented Mar 20, 2024

/azp list

CI/CD Pipelines for this repository:

@rzikm
Member Author

rzikm commented Mar 20, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@jkoritzinsky
Member

Since we have support for building the runtime with ASAN enabled, could we use that instead of only sanitizing MsQuic? That way you'll get default settings that will definitely work and you won't have to use LD_PRELOAD.

@rzikm
Member Author

rzikm commented Mar 20, 2024

> Since we have support for building the runtime with ASAN enabled, could we use that instead of only sanitizing MsQuic? That way you'll get default settings that will definitely work and you won't have to use LD_PRELOAD.

When I tried this locally, it complained at runtime about some version mismatch (MsQuic also adds a lot more options to -fsanitize).

I wonder, is there a way to get the managed call stacks to show up in the ASAN output as well?
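
For readers unfamiliar with the LD_PRELOAD approach mentioned above: when only a native library (here MsQuic) is built with sanitizers and the host process is not, the ASAN runtime generally has to be loaded first, which is typically done by preloading it. The following is a minimal sketch of building such an environment, not the setup used in this PR. The helper name `with_asan_preload` and the library path in the usage note are hypothetical; on GCC toolchains the real path can usually be found with `gcc -print-file-name=libasan.so`.

```python
import os


def with_asan_preload(libasan_path, base_env=None):
    """Return a copy of the environment with the ASAN runtime preloaded.

    libasan_path is an assumption supplied by the caller; this helper does
    not locate the library itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The ASAN runtime should come first in the preload list, so prepend it
    # and keep any libraries that were already being preloaded.
    env["LD_PRELOAD"] = f"{libasan_path}:{existing}" if existing else libasan_path
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the unsanitized host process, for example `with_asan_preload("/usr/lib/x86_64-linux-gnu/libasan.so.8")` (path is illustrative only).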

@rzikm
Member Author

rzikm commented Mar 20, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@jkoritzinsky
Member

> Since we have support for building the runtime with ASAN enabled, could we use that instead of only sanitizing MsQuic? That way you'll get default settings that will definitely work and you won't have to use LD_PRELOAD.

> When I tried this locally, it complained at runtime about some version mismatch (MsQuic also adds a lot more options to -fsanitize).

Yeah, in that case you're doing the right thing for now.

> I wonder, is there a way to get the managed call stacks to show up in the ASAN output as well?

The regular xunit tests have some logic today to dump managed callstacks from dumps. We could use similar logic in the container to symbolize.

@rzikm
Member Author

rzikm commented Mar 21, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 21, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 21, 2024

Got a hit

2024-03-21T10:51:18.9817476Z client_1  | =================================================================
2024-03-21T10:51:18.9818312Z client_1  | ==6==ERROR: AddressSanitizer: heap-use-after-free on address 0x602011c36590 at pc 0x7ec010329955 bp 0x7ebfff168fe0 sp 0x7ebfff168fd8
2024-03-21T10:51:18.9818649Z client_1  | READ of size 4 at 0x602011c36590 thread T23
2024-03-21T10:51:19.8220972Z client_1  |     #0 0x7ec010329954 in MsQuicStreamSend /msquic/msquic/src/msquic/src/core/api.c:1045
2024-03-21T10:51:19.8221338Z client_1  |     #1 0x7f00b115bf36  (/memfd:doublemapper (deleted)+0x101cf36)
2024-03-21T10:51:19.8221626Z client_1  |     #2 0x7f00b115ab50  (/memfd:doublemapper (deleted)+0x101bb50)
2024-03-21T10:51:19.8221900Z client_1  |     #3 0x7f00b115b624  (/memfd:doublemapper (deleted)+0x101c624)
2024-03-21T10:51:19.8222180Z client_1  |     #4 0x7f00b115ca93  (/memfd:doublemapper (deleted)+0x101da93)
2024-03-21T10:51:19.8222428Z client_1  |     #5 0x7f00b115fc74  (/memfd:doublemapper (deleted)+0x1020c74)
2024-03-21T10:51:19.8222698Z client_1  |     #6 0x7f00b115f20c  (/memfd:doublemapper (deleted)+0x102020c)
2024-03-21T10:51:19.8222945Z client_1  |     #7 0x7f00b101c6b3  (/memfd:doublemapper (deleted)+0xedd6b3)
2024-03-21T10:51:19.8223196Z client_1  |     #8 0x7f00b115e81b  (/memfd:doublemapper (deleted)+0x101f81b)
2024-03-21T10:51:19.8223451Z client_1  |     #9 0x7f00b101c9b0  (/memfd:doublemapper (deleted)+0xedd9b0)
2024-03-21T10:51:19.8223693Z client_1  |     #10 0x7f00b101c119  (/memfd:doublemapper (deleted)+0xedd119)
2024-03-21T10:51:19.8223946Z client_1  |     #11 0x7f00b115d589  (/memfd:doublemapper (deleted)+0x101e589)
2024-03-21T10:51:19.8224193Z client_1  |     #12 0x7f00b101c6b3  (/memfd:doublemapper (deleted)+0xedd6b3)
2024-03-21T10:51:19.8224453Z client_1  |     #13 0x7f00b101c907  (/memfd:doublemapper (deleted)+0xedd907)
2024-03-21T10:51:19.8224704Z client_1  |     #14 0x7f00b101c119  (/memfd:doublemapper (deleted)+0xedd119)
2024-03-21T10:51:19.8224959Z client_1  |     #15 0x7f00b115b17a  (/memfd:doublemapper (deleted)+0x101c17a)
2024-03-21T10:51:19.8225219Z client_1  |     #16 0x7f00b1143895  (/memfd:doublemapper (deleted)+0x1004895)
2024-03-21T10:51:19.8225481Z client_1  |     #17 0x7f00b115c791  (/memfd:doublemapper (deleted)+0x101d791)
2024-03-21T10:51:19.8225722Z client_1  |     #18 0x7f00b1016f53  (/memfd:doublemapper (deleted)+0xed7f53)
2024-03-21T10:51:19.8226517Z client_1  |     #19 0x7f00af5c7c41  (/live-runtime-artifacts/testhost/net9.0-linux-Release-x64/shared/Microsoft.NETCore.App/9.0.0/System.Private.CoreLib.dll+0x227c41)
2024-03-21T10:51:19.8227135Z client_1  |     #20 0x7f012e7ad726  (/live-runtime-artifacts/testhost/net9.0-linux-Release-x64/shared/Microsoft.NETCore.App/9.0.0/libcoreclr.so+0x49a726)
2024-03-21T10:51:19.8227517Z client_1  |     #21 0x7f012e5e8fc5 in CallDescrWorkerWithHandler(CallDescrData*, int) /repo/src/coreclr/vm/callhelpers.cpp:67
2024-03-21T10:51:19.8227880Z client_1  |     #22 0x7f012e5e8fc5 in DispatchCallSimple(unsigned long*, unsigned int, unsigned long, unsigned int) /repo/src/coreclr/vm/callhelpers.cpp:218
2024-03-21T10:51:19.8228462Z client_1  |     #23 0x7f012e5fe451 in ThreadNative::KickOffThread_Worker(void*) /repo/src/coreclr/vm/comsynchronizable.cpp:157
2024-03-21T10:51:19.8228832Z client_1  |     #24 0x7f012e5b8144 in ManagedThreadBase_DispatchInner(ManagedThreadCallState*) /repo/src/coreclr/vm/threads.cpp:7276
2024-03-21T10:51:19.8229202Z client_1  |     #25 0x7f012e5b8144 in ManagedThreadBase_DispatchMiddle(ManagedThreadCallState*) /repo/src/coreclr/vm/threads.cpp:7320
2024-03-21T10:51:19.8229756Z client_1  |     #26 0x7f012e5b8144 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::$_0::operator()(ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::TryArgs*) const::{lambda(Param*)#1}::operator()(Param*) const /repo/src/coreclr/vm/threads.cpp:7478
2024-03-21T10:51:19.8230460Z client_1  |     #27 0x7f012e5b8144 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::$_0::operator()(ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::TryArgs*) const /repo/src/coreclr/vm/threads.cpp:7480
2024-03-21T10:51:19.8230968Z client_1  |     #28 0x7f012e5b8144 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) /repo/src/coreclr/vm/threads.cpp:7504
2024-03-21T10:51:19.8231345Z client_1  |     #29 0x7f012e5b871c in ManagedThreadBase_FullTransition(void (*)(void*), void*, UnhandledExceptionLocation) /repo/src/coreclr/vm/threads.cpp:7524
2024-03-21T10:51:19.8231857Z client_1  |     #30 0x7f012e5b871c in ManagedThreadBase::KickOff(void (*)(void*), void*) /repo/src/coreclr/vm/threads.cpp:7559
2024-03-21T10:51:19.8232219Z client_1  |     #31 0x7f012e5fe527 in ThreadNative::KickOffThread(void*) /repo/src/coreclr/vm/comsynchronizable.cpp:228
2024-03-21T10:51:19.8232591Z client_1  |     #32 0x7f012e91c5dd in CorUnix::CPalThread::ThreadEntry(void*) /repo/src/coreclr/pal/src/thread/thread.cpp:1760
2024-03-21T10:51:19.8233064Z client_1  |     #33 0x7f012ff70133  (/lib/x86_64-linux-gnu/libc.so.6+0x89133)
2024-03-21T10:51:19.8233265Z client_1  | 
2024-03-21T10:51:19.8233651Z client_1  | 0x602011c36590 is located 0 bytes inside of 16-byte region [0x602011c36590,0x602011c365a0)
2024-03-21T10:51:19.8233896Z client_1  | freed by thread T19 here:
2024-03-21T10:51:19.8430779Z client_1  |     #0 0x7f01304ac6a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
2024-03-21T10:51:19.8431552Z client_1  |     #1 0x7f00af4a2ce3  (/live-runtime-artifacts/testhost/net9.0-linux-Release-x64/shared/Microsoft.NETCore.App/9.0.0/System.Private.CoreLib.dll+0x102ce3)
2024-03-21T10:51:19.8431932Z client_1  |     #2 0x7f00b1198871  (/memfd:doublemapper (deleted)+0x1059871)
2024-03-21T10:51:19.8432222Z client_1  |     #3 0x7f00b1143895  (/memfd:doublemapper (deleted)+0x1004895)
2024-03-21T10:51:19.8432500Z client_1  |     #4 0x7f00b1258425  (/memfd:doublemapper (deleted)+0x1119425)
2024-03-21T10:51:19.8432790Z client_1  |     #5 0x7f00b1016f53  (/memfd:doublemapper (deleted)+0xed7f53)
2024-03-21T10:51:19.8433418Z client_1  |     #6 0x7f00af5c7c41  (/live-runtime-artifacts/testhost/net9.0-linux-Release-x64/shared/Microsoft.NETCore.App/9.0.0/System.Private.CoreLib.dll+0x227c41)
2024-03-21T10:51:19.8434092Z client_1  |     #7 0x7f012e7ad726  (/live-runtime-artifacts/testhost/net9.0-linux-Release-x64/shared/Microsoft.NETCore.App/9.0.0/libcoreclr.so+0x49a726)
2024-03-21T10:51:19.8434516Z client_1  |     #8 0x7f012e5e8fc5 in CallDescrWorkerWithHandler(CallDescrData*, int) /repo/src/coreclr/vm/callhelpers.cpp:67
2024-03-21T10:51:19.8434936Z client_1  |     #9 0x7f012e5e8fc5 in DispatchCallSimple(unsigned long*, unsigned int, unsigned long, unsigned int) /repo/src/coreclr/vm/callhelpers.cpp:218
2024-03-21T10:51:19.8435362Z client_1  |     #10 0x7f012e5fe451 in ThreadNative::KickOffThread_Worker(void*) /repo/src/coreclr/vm/comsynchronizable.cpp:157
2024-03-21T10:51:19.8435777Z client_1  |     #11 0x7f012e5b8144 in ManagedThreadBase_DispatchInner(ManagedThreadCallState*) /repo/src/coreclr/vm/threads.cpp:7276
2024-03-21T10:51:19.8436179Z client_1  |     #12 0x7f012e5b8144 in ManagedThreadBase_DispatchMiddle(ManagedThreadCallState*) /repo/src/coreclr/vm/threads.cpp:7320
2024-03-21T10:51:19.8436948Z client_1  |     #13 0x7f012e5b8144 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::$_0::operator()(ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::TryArgs*) const::{lambda(Param*)#1}::operator()(Param*) const /repo/src/coreclr/vm/threads.cpp:7478
2024-03-21T10:51:19.8437659Z client_1  |     #14 0x7f012e5b8144 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::$_0::operator()(ManagedThreadBase_DispatchOuter(ManagedThreadCallState*)::TryArgs*) const /repo/src/coreclr/vm/threads.cpp:7480
2024-03-21T10:51:19.8438166Z client_1  |     #15 0x7f012e5b8144 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) /repo/src/coreclr/vm/threads.cpp:7504
2024-03-21T10:51:19.8438560Z client_1  |     #16 0x7f012e5b871c in ManagedThreadBase_FullTransition(void (*)(void*), void*, UnhandledExceptionLocation) /repo/src/coreclr/vm/threads.cpp:7524
2024-03-21T10:51:19.8438950Z client_1  |     #17 0x7f012e5b871c in ManagedThreadBase::KickOff(void (*)(void*), void*) /repo/src/coreclr/vm/threads.cpp:7559
2024-03-21T10:51:19.8439330Z client_1  |     #18 0x7f012e5fe527 in ThreadNative::KickOffThread(void*) /repo/src/coreclr/vm/comsynchronizable.cpp:228
2024-03-21T10:51:19.8439810Z client_1  |     #19 0x7f012e91c5dd in CorUnix::CPalThread::ThreadEntry(void*) /repo/src/coreclr/pal/src/thread/thread.cpp:1760
2024-03-21T10:51:19.8440285Z client_1  |     #20 0x7f012ff70133  (/lib/x86_64-linux-gnu/libc.so.6+0x89133)
2024-03-21T10:51:19.8440471Z client_1  | 
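
A report like the one above mixes symbolized native frames with raw `module+offset` frames. As a rough aid for triage, the sketch below strips the CI timestamp and container prefix and separates the two kinds of frames, counting unsymbolized frames per module. The regular expressions assume the exact line layout seen in this log and are not a general ASAN parser; the function names are made up for illustration. The `/memfd:doublemapper (deleted)` frames appear to be JIT-compiled managed code, which is why they carry no symbols.

```python
import re
from collections import Counter

# Assumed layout: "<timestamp>Z <container>  | <ASAN text>"
PREFIX = re.compile(r"^\S+Z\s+\S+\s+\|\s?")
# Symbolized frame: "#0 0x... in Function /path/file.c:123"
SYMBOLIZED = re.compile(r"^\s*#(\d+) (0x[0-9a-f]+) in (.+) (\S+:\d+)$")
# Unsymbolized frame: "#1 0x...  (/path/module+0x1234)"
RAW = re.compile(r"^\s*#(\d+) (0x[0-9a-f]+)\s+\((.+)\+(0x[0-9a-f]+)\)$")


def parse_frames(log_text):
    """Extract stack frames from ASAN output, ignoring non-frame lines."""
    frames = []
    for line in log_text.splitlines():
        text = PREFIX.sub("", line)
        m = SYMBOLIZED.match(text)
        if m:
            frames.append({"index": int(m.group(1)), "function": m.group(3),
                           "location": m.group(4), "module": None})
            continue
        m = RAW.match(text)
        if m:
            frames.append({"index": int(m.group(1)), "function": None,
                           "location": None, "module": m.group(3)})
    return frames


def unsymbolized_by_module(frames):
    """Count frames that have no function name, grouped by module path."""
    return Counter(f["module"] for f in frames if f["function"] is None)


# A few lines copied from the report above.
SAMPLE = """\
2024-03-21T10:51:19.8220972Z client_1  |     #0 0x7ec010329954 in MsQuicStreamSend /msquic/msquic/src/msquic/src/core/api.c:1045
2024-03-21T10:51:19.8221338Z client_1  |     #1 0x7f00b115bf36  (/memfd:doublemapper (deleted)+0x101cf36)
2024-03-21T10:51:19.8227517Z client_1  |     #21 0x7f012e5e8fc5 in CallDescrWorkerWithHandler(CallDescrData*, int) /repo/src/coreclr/vm/callhelpers.cpp:67
2024-03-21T10:51:19.8233064Z client_1  |     #33 0x7f012ff70133  (/lib/x86_64-linux-gnu/libc.so.6+0x89133)
"""

frames = parse_frames(SAMPLE)
counts = unsymbolized_by_module(frames)
```

Running the same functions over the full report would show how many frames in each stack come from the double-mapped JIT region versus native modules.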

@rzikm
Member Author

rzikm commented Mar 21, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 25, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 25, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 25, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 25, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 26, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm rzikm changed the title from "[NOREVIEW] Run HTTP3 stress with Address sanitization" to "Run HTTP3 stress with Address sanitization" Mar 26, 2024
@rzikm rzikm requested a review from a team March 26, 2024 18:39
@rzikm rzikm removed the NO-REVIEW Experimental/testing PR, do NOT review it label Mar 26, 2024
Member

@wfurt wfurt left a comment

LGTM.
Should we also use a debug build of MsQuic, as it has more asserts and sanity checks?

@rzikm
Member Author

rzikm commented Mar 26, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

  mkdir build && \
- cmake -B build -DCMAKE_BUILD_TYPE=Release -DQUIC_ENABLE_LOGGING=false -DQUIC_USE_SYSTEM_LIBCRYPTO=true -DQUIC_BUILD_TOOLS=off -DQUIC_BUILD_TEST=off -DQUIC_BUILD_PERF=off -DQUIC_TLS=openssl3 && \
+ cmake -B build -DCMAKE_BUILD_TYPE=Debug -DQUIC_ENABLE_LOGGING=false -DQUIC_USE_SYSTEM_LIBCRYPTO=true -DQUIC_BUILD_TOOLS=off -DQUIC_BUILD_TEST=off -DQUIC_BUILD_PERF=off -DQUIC_TLS=openssl3 -DQUIC_ENABLE_SANITIZERS=on && \
Member

@antonfirsov antonfirsov Mar 26, 2024

> Should we also use a debug build of MsQuic, as it has more asserts and sanity checks?

Doesn't this change the behavior way too much? Can we compare the total stress RPS of main vs. the PR in a stable environment to get a rough feel for the difference, and to see how important it is to set up multi-config runs?

If we think that running msquic in debug is justified because of additional checks, the same reasoning might also justify running stress against debug builds of libraries code.

Member

I would think that's what perf tests are for. In my mind, stress is just to shake out the tree. I'm certainly fine if we don't do it right now.

Member

> running stress against debug builds of libraries code.

That won't fly until #93713 is fixed.

Member

@ManickaP ManickaP left a comment

LGTM modulo nits.

@rzikm
Member Author

rzikm commented Mar 27, 2024

/azp run runtime-libraries stress-http

Azure Pipelines successfully started running 1 pipeline(s).

@rzikm
Member Author

rzikm commented Mar 27, 2024

HTTP stress builds and is running.
