[wasm-ep] Implement DiagnosticServer and startup sessions for WebAssembly #72482

lambdageek · 2022-07-19T19:11:14Z

Add a diagnostic server for WebAssembly. Enable by building the runtime with /p:WasmEnablePerfTracing=true or /p:WasmEnableThreads=true.

To configure a project to start the diagnostic server, add this to the .csproj:

      <WasmExtraConfig Include="diagnostic_options" Value='
{
  "server": { "suspend": false, "connect_url": "ws://localhost:8088/diagnostics" }
}' />

The connect_url should be a WebSocket URL serviced by dotnet-dsrouter server-websocket from this branch: https://github.com/lambdageek/diagnostics/tree/wasm-server

Note that setting "suspend": true will hang the browser tab until a diagnostic tool such as dotnet-trace collect connects to the dsrouter.
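For reference, the shape of the diagnostic_options value can be sketched as TypeScript types. This is only an illustration: the type names and the toConfigValue helper are hypothetical, the fields are limited to the ones that appear in the JSON examples in this description, and the definitions in the runtime may differ.

```typescript
// Hypothetical type names; the fields mirror the JSON examples in this description.
interface DiagnosticServerOptions {
    suspend: boolean;    // true blocks startup until a diagnostic tool connects
    connect_url: string; // WebSocket URL serviced by dotnet-dsrouter
}

interface StartupSessionOptions {
    collectRundownEvents?: string; // the sessions example passes the string "true"
    providers: string;             // e.g. "WasmHello::5:EventCounterIntervalSec=1"
}

interface DiagnosticOptions {
    server?: DiagnosticServerOptions;
    sessions?: StartupSessionOptions[];
}

// Hypothetical helper: serialize the options into the JSON text
// that goes into the WasmExtraConfig Value attribute.
function toConfigValue(options: DiagnosticOptions): string {
    return JSON.stringify(options, null, 2);
}

const serverOnly: DiagnosticOptions = {
    server: { suspend: false, connect_url: "ws://localhost:8088/diagnostics" },
};
const configValue = toConfigValue(serverOnly);
```

Both top-level keys are optional in this sketch because the description configures the server and the startup sessions independently.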

Implement creating VFS file based sessions at runtime startup. Add the following to a .csproj:

    <WasmExtraConfig Include="diagnostic_options" Value='
{
  "sessions": [ { "collectRundownEvents": "true", "providers": "WasmHello::5:EventCounterIntervalSec=1" } ]
}' />

That will create and start one or more EventPipe sessions that will store their results into the VFS.

The startup sessions can be retrieved via MONO.diagnostics.getStartupSessions(). Each session s should be stopped via s.stop(), and the data can then be extracted as a Blob using s.getTraceBlob().

This is orthogonal to the diagnostic server support. You don't need dotnet-dsrouter running on the host. But you do need access to JavaScript on the main thread.
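A minimal sketch of the stop-then-extract sequence described above. The StartupSessionLike interface and the collectStartupTraces helper are hypothetical; only stop() and getTraceBlob() are named in this description, and their exact signatures (and whether the session list can contain null entries) are assumptions.

```typescript
// Hypothetical minimal shape of a startup session. Only stop() and
// getTraceBlob() are named in the description; TBlob would be a Blob in a browser.
interface StartupSessionLike<TBlob> {
    stop(): void;
    getTraceBlob(): TBlob;
}

// Stop every startup session and collect its trace data.
// Null entries are skipped in case a session failed to start (an assumption).
function collectStartupTraces<TBlob>(
    sessions: ReadonlyArray<StartupSessionLike<TBlob> | null>,
): TBlob[] {
    const traces: TBlob[] = [];
    for (const session of sessions) {
        if (session === null) continue;
        session.stop(); // stop first, then read the data
        traces.push(session.getTraceBlob());
    }
    return traces;
}

// In a browser this might be called as (not executed here):
//   const blobs = collectStartupTraces(MONO.diagnostics.getStartupSessions());
```

The helper is generic over the blob type so that it can be exercised with fake sessions outside a browser.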

Notes/Issues:

Tree shaking: I verified that if threads are not available, all the TypeScript diagnostics code is removed.
Right now the server is not very robust to dotnet-dsrouter stopping, or starting after the runtime starts. The ideal order is to start dotnet-dsrouter first, and then open the browser.
Unrelated housekeeping fixes:
- Tell wasm.proj about all the subdirectories with .ts files - makes incremental builds notice changes in subdirectories.
- Add a rollup dependencies property to quiet a warning about node/buffer
- There's a mock implementation of a "websocket" that was used for protocol testing. I verified that tree-shaking removes this in thread-enabled Release builds.
- Bump PTHREAD_POOL_SIZE to 4 and set PTHREAD_POOL_SIZE_STRICT=2 (returns EAGAIN from pthread_create if the pool needs to grow). The previous setting PTHREAD_POOL_SIZE_STRICT=1 (warn and try to grow the pool) seemed to lead to hangs. Usually that means the main thread is creating a thread and immediately calling pthread_join without returning to JS. We should investigate separately.
- The only implemented diagnostic server commands are CollectTracing2, StopCollecting and ResumeRuntime. None of the Dump, Process and Profiler commands are implemented and the server will crash if it receives them. It should be relatively straightforward to return a "command unsupported" reply (which would allow clients to gracefully disconnect), but it's not done yet.
In some error states the runtime kills the browser tab with the following in a terminal window (if Chrome is started from a terminal): FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory. This probably means we're hitting a loop somewhere that rapidly exhausts JIT memory, but it's difficult to investigate when the JS console dies, too (happens with Chrome stable v103 and Chrome beta v104).
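The "command unsupported" reply mentioned in the notes could take roughly the following shape. This is a sketch under stated assumptions, not the server code in this PR: the DiagnosticCommand and CommandReply types and the dispatchCommand function are hypothetical, the real protocol is binary, and no wire format or error code is modeled here.

```typescript
// Hypothetical model of an incoming command, identified by name only.
interface DiagnosticCommand {
    command: string;
}

type CommandReply =
    | { kind: "ok"; command: string }
    | { kind: "unsupported"; command: string };

// The commands this description lists as implemented.
const implementedCommands = new Set<string>([
    "CollectTracing2",
    "StopCollecting",
    "ResumeRuntime",
]);

// Reply to every command instead of crashing on an unknown one,
// so that a client can disconnect gracefully.
function dispatchCommand(cmd: DiagnosticCommand): CommandReply {
    if (implementedCommands.has(cmd.command)) {
        return { kind: "ok", command: cmd.command };
    }
    return { kind: "unsupported", command: cmd.command };
}
```

The point of the design is that the fallback branch is total: any command outside the implemented set produces a reply rather than an exception.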

Fixes #69674, contributes to #72481

The issue is that once we're setting up streaming sessions, we will need to send back a DS IPC reply with the session id before we start streaming. So it's better to just call back to JS when we start up and setup all the EP sessions from JS so that when we return to C everything is all ready.

src/mono/wasm/runtime/startup.ts

lambdageek · 2022-07-22T13:10:49Z

It would be good to have JavaScript-only unit tests for complex components at some point. https://jestjs.io/ is a good choice, I think.

I'll play around with that this afternoon. Probably won't be part of this PR unless it turns out to be completely trivial to add the unit tests to our setup.

lambdageek · 2022-07-22T19:44:29Z

verified one more time that with threading or perftracing disabled, all of the related JS is removed by tree-shaking

pavelsavara · 2022-07-25T08:53:30Z

src/mono/wasm/runtime/diagnostics/README.md

+- `shared/` type definitions to be shared between the worker and browser main thread
+- `mock/` a utility to fake WebSocket connectings by playing back a script.  Used for prototyping the diagnostic server without hooking up to a real WebSocket.
+
+## Mocking


What is it we are mocking, and why? (high-level intro)

Do I understand correctly that the diagnosed process is a "server" and the tooling outside of the browser is a "client"? And so we are mocking the client and its requests here?

pavelsavara · 2022-07-25T09:10:44Z

src/mono/wasm/runtime/diagnostics/server_pthread/mock-remote.ts

+    if (monoDiagnosticsMock) {
+        const mockPrefix = "mock:";
+        const scriptURL = mockURL.substring(mockPrefix.length);
+        return import(scriptURL).then((mockModule) => {


The CJS version of the runtime is out of scope, right?

Yea, if there's a need for writing the mocking script using CJS, we can add it here. But since the goal here is just to have a tool to improve the server's robustness (for example: make scripts that send garbage and verify that the DS can recover), I don't think there's a need for CJS. This isn't something we're ever going to ship. It's just for better testing.
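The prefix handling in the quoted diff can be sketched in isolation as follows. The function name mockScriptURL is hypothetical; the "mock:" prefix and the substring logic come from the diff above, and the dynamic import is shown only as a comment because it needs an ES module script to load.

```typescript
const mockPrefix = "mock:";

// Returns the script URL when the address uses the mock prefix, otherwise null.
function mockScriptURL(address: string): string | null {
    return address.startsWith(mockPrefix)
        ? address.substring(mockPrefix.length)
        : null;
}

// With an ES module mock script, the result could then be loaded as
// (not executed here):
//   const scriptURL = mockScriptURL(mockURL);
//   if (scriptURL !== null) { const mockModule = await import(scriptURL); }
```

Because the script is loaded with a dynamic import(), it has to be an ES module, which is why CJS mock scripts are not covered.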

pavelsavara · 2022-07-25T09:13:44Z

src/mono/wasm/runtime/diagnostics/shared/create-session.ts

+    if (!cwraps.mono_wasm_event_pipe_enable(tracePath, ipcStreamAddr, options.bufferSizeInMB, options.providers, options.rundownRequested, sessionIdOutPtr)) {
+        return false;
+    } else {
+        return memory.getI32(sessionIdOutPtr);


Use getU32 for pointers.
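The difference between a signed and an unsigned 32-bit read can be illustrated with a plain DataView. This sketch deliberately uses the standard DataView API rather than the runtime's memory helpers, and the stored value is an arbitrary example.

```typescript
// Store a 32-bit value with the high bit set (little-endian).
const buffer = new ArrayBuffer(4);
const view = new DataView(buffer);
view.setUint32(0, 0xfffffff0, true);

// The same four bytes read back as signed and as unsigned.
const asSigned = view.getInt32(0, true);    // -16
const asUnsigned = view.getUint32(0, true); // 4294967280
```

A signed read turns any value at or above 2^31 into a negative number, which is not meaningful for a pointer or an unsigned ID.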

pavelsavara · 2022-07-25T09:15:44Z

src/mono/wasm/runtime/polyfills.ts

+                for (const sub of subscribers) {
+                    const listener = sub.listener;
+                    if (sub.oneShot) {
+                        this.removeEventListener(event.type, listener);


Do we need a copy of subscribers, since the list changes inside the loop?
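To illustrate the concern: removing an entry from an array while a for...of loop is iterating over it makes the loop skip the entry that follows. A minimal sketch of a dispatcher that iterates over a snapshot instead is shown below. MiniEventTarget is a hypothetical stand-in, not the polyfill in polyfills.ts, and it passes only the event type to listeners to stay self-contained.

```typescript
type Listener = (type: string) => void;

interface Subscription {
    listener: Listener;
    oneShot: boolean;
}

class MiniEventTarget {
    private subscribers = new Map<string, Subscription[]>();

    addEventListener(type: string, listener: Listener, oneShot = false): void {
        const list = this.subscribers.get(type) ?? [];
        list.push({ listener, oneShot });
        this.subscribers.set(type, list);
    }

    removeEventListener(type: string, listener: Listener): void {
        const list = this.subscribers.get(type);
        if (!list) return;
        const index = list.findIndex((sub) => sub.listener === listener);
        if (index >= 0) list.splice(index, 1);
    }

    dispatchEvent(type: string): void {
        const list = this.subscribers.get(type);
        if (!list) return;
        // Iterate over a snapshot so that removing a one-shot listener
        // does not cause the next subscriber to be skipped.
        for (const sub of [...list]) {
            if (sub.oneShot) this.removeEventListener(type, sub.listener);
            sub.listener(type);
        }
    }
}
```

Iterating over `list` directly would skip the listener registered right after a one-shot listener on the first dispatch.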

pavelsavara · 2022-07-25T09:20:12Z

src/mono/wasm/runtime/pthreads/worker/index.ts

 ///    });
 export const currentWorkerThreadEvents: WorkerThreadEventTarget =
    MonoWasmThreads ? new EventTarget() : null as any as WorkerThreadEventTarget; // treeshake if threads are disabled

 function monoDedicatedChannelMessageFromMainToWorker(event: MessageEvent<string>): void {
-    console.debug("got message from main on the dedicated channel", event.data);
+    console.debug("MONO_WASM: got message from main on the dedicated channel", event.data);


After we merge both this and my PR, we could prefix this with if (runtimeHelpers.diagnostic_tracing) so that the runtime is not noisy by default. It matters in consoles like Node.js. @maraf suggested that it's about time we introduce a logging component to centralize all this logic in one place.
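One possible shape for such a centralized, gated logger is sketched below. The createDebugLogger function and its parameters are hypothetical; only the "MONO_WASM: " prefix and the idea of gating on a tracing flag come from this thread.

```typescript
type LogSink = (message: string) => void;

// Returns a logging function that prefixes messages and forwards them
// to the sink only while the enabled() check returns true.
function createDebugLogger(enabled: () => boolean, sink: LogSink): LogSink {
    const prefix = "MONO_WASM: ";
    return (message: string): void => {
        if (enabled()) sink(prefix + message);
    };
}

// In the runtime the pieces might be wired up as (not executed here):
//   const debug = createDebugLogger(() => runtimeHelpers.diagnostic_tracing, console.debug);
```

Taking the flag as a function means the check happens on every call, so tracing can be switched on after the logger is created.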

pavelsavara

LGTM, none of my other comments are blocking merge.

lambdageek · 2022-07-25T17:09:35Z

https://helix.dot.net/api/2019-06-17/jobs/d29bfb53-4790-47f4-8a1e-aa868ec7afd5/workitems/System.Net.Mail.Functional.Tests/console failure is unrelated

/__w/1/s/src/native/libs/Common/pal_utilities.h:86: int ToFileDescriptor(intptr_t): Assertion `0 <= fd && fd < sysconf(_SC_OPEN_MAX)' failed.

pavelsavara · 2022-07-27T13:26:06Z

/azp run runtime-wasm

lambdageek and others added 30 commits July 16, 2022 13:02

[wasm] Enable the tracing component if threading is supported

e73537f

WIP: add a way to specify EP sessions in the MonoConfig

8c3360a

Currently not wired up to the runtime

Add a mechanism to copy startup configs into the runtime and session

6334ee7

IDs out

WIP: C side startup provider copying

4897241

checkpoint: starting a session at startup works

b232adb

WIP: checkpoint add a controller and a webworker for DS

3d22fe1

WIP checkpoint EventPipeIPCSession class skeleton

ee53f05

WIP checkpoint: runtime crashes; but WS from JS works

6ddb829

WIP: diagnostic server

5c4aa81

XXX PrintfDebuggingHacks

dfcc613

WIP some bits of the websocket worker

0daa0e0

WIP some notes on diagnostics and JS workers

377a50a

fix eslint

4a17d5d

debug printfs etc

fd8df98

more printfs

WIP: start moving the diagnostic server to a JS pthread

f235f55

WIP: move things around

431e2c6

cleanup

563b31a

notes

5e999a3

[diagnostic_server] wasm-specific fn_table

2906e1e

We won't be using the native C version

[wasm-ep] disable DS connect ports in C, too

6c249f0

we will implement DS in JS

asyncify finalize_startup; make 1 diagnostics init function

567484c

Combine mono_wasm_init_diagnostics and configure_diagnostics. Now that finalize_startup is async, we can pause to wait for the diagnostic server to startup at this point

(not implemented) set browser-eventpipe sample to start a DS server

199cc80

ping in the DS server (not functional yet)

c3e7fe1

once we're able to start the DS server, make it log a ping to the console

Start diagnostic server pthread

e64e539

Clean up some of the old WIP code - we will probably not send configuration strings from the diagnostic server back to the main thread.

WIP try to start the server

00f73ba

It doesn't work right now because the MessagePort is not created until the server thread attaches to Mono, which doesn't happen because it's started before Mono. Also it doesn't yet send a resume event, so the main thread just blocks forever

WIP diagnostic server server

f803c3c

Add a mock WebSocket connection to simulate the remote end

58851b4

Start the diagnostic server and have it perform the open/advertise steps with the mock.

cleanup diagnostics.ts

c497c59

we're not going to need a JS version of the DS EP streaming sessions. And file-based auto-stop sessions are not going in

wasm-mt: use a PThreadSelf struct instead of a raw MessagePort

74d8443

lambdageek added 2 commits July 20, 2022 16:12

merge fixup

b8b2148

fix bug in queue_push_sync main thread detection

0de9719

lambdageek mentioned this pull request Jul 21, 2022

[wasm] .NET WebAssembly profiling and memory diagnostics #69268

Closed

runfoapp bot mentioned this pull request Jul 21, 2022

Methodical_others test JIT/Methodical/Coverage/copy_prop_byref_to_native_int crashing #69832

Open

lambdageek and others added 4 commits July 21, 2022 16:49

fix typo

17d12e8

Merge remote-tracking branch 'origin/main' into wasm-ep-on-startup

6abe463

merge fixup

d2f0acd

fix rollup warning when making the crypto worker

2e135e7

pavelsavara reviewed Jul 22, 2022

src/mono/wasm/runtime/startup.ts (outdated)

lambdageek and others added 3 commits July 22, 2022 09:22

add MONO_WASM: prefix to logging

7da9d41

make diagnostic server mocking friendlier

f4219a6

Allow each test project to specify its own mock script. Also provide TypeScript declarations for the mocking interfaces. Also always use binary protocol commands - don't send json for mocks.

disable mocking in the sample project by default

18e199a

lambdageek added 2 commits July 24, 2022 09:19

Merge remote-tracking branch 'origin/main' into wasm-ep-on-startup

4bdbbfa

fixup after merge

cbd4691

pavelsavara reviewed Jul 25, 2022

pavelsavara approved these changes Jul 25, 2022

review feedback

2fe9a3e

- improve diagnostics mock README
- note that mocking just uses ES6 modules, testing with CJS is not supported right now.
- fix iteration over listeners when dispatching a one-shot event in the EventTarget polyfill
- use U32 getter in EP session creation

lambdageek merged commit 7aaa279 into dotnet:main Jul 25, 2022

runfoapp bot mentioned this pull request Jul 25, 2022

System.Net.* cancellation tests fail in CI #72818

Closed

ghost locked as resolved and limited conversation to collaborators Aug 26, 2022
