Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Targetting wasm-32-wasmrt-webgpu seems to require native webgpu too? #8063

Closed
ziriax opened this issue Feb 5, 2024 · 19 comments
Closed

Targetting wasm-32-wasmrt-webgpu seems to require native webgpu too? #8063

ziriax opened this issue Feb 5, 2024 · 19 comments
Labels
backport me This change should be backported to release versions

Comments

@ziriax
Copy link

ziriax commented Feb 5, 2024

I was trying to modify the HelloWasm app to use WebGPU for running in a browser.

But CMake fails:

Could NOT find Halide_WebGPU (missing: Halide_WebGPU_NATIVE_LIB)

Why is Halide_WebGPU_NATIVE_LIB required when targetting the web browser?

The command that fails is:

set(target "wasm-32-wasmrt-webgpu")
set(config "wasm")

foreach (gen IN ITEMS init update render)
    add_halide_library(${config}_${gen} FROM reaction_diffusion_generator
                        GENERATOR reaction_diffusion_${gen}
                        FUNCTION_NAME reaction_diffusion_${gen}
                        PARAMS threads=false
                        TARGETS ${target})
endforeach ()

After the following fix in HalideGeneratorHelpers.cmake, it worked:

if ("${ARGN}" MATCHES "webgpu")
if ("${ARGN}" MATCHES "host-webgpu")

PS: However, the wasn't enough to actually make the app use WebGPU (how can I know?), so more help is appreciated 😉

@steven-johnson
Copy link
Contributor

This should be covered in README_webgpu.md -- please take a look and let me know if that doesn't explain.

@ziriax
Copy link
Author

ziriax commented Feb 5, 2024

Thanks.

Yes, I was following that guide, for Emscripten, but this line always requires the presence of a native webgpu module it seems?

When HL_TARGET=wasm-32-wasmrt-webgpu, the predicate if ("${ARGN}" MATCHES "webgpu") will be true, and a native webgpu module is searched by cmake through FindHalide_WebGPU.cmake? But that should not be required when targetting the web browser IMO.

The project builds fine without it (although HelloWasm doesn't contain a GPU scheduler, so I wasn't able to verify if it all worked)

@steven-johnson
Copy link
Contributor

if ("${ARGN}" MATCHES "host-webgpu")

In general, this is not a good fix, because host is just shortcut for whatever the current compiler generates, so there are an arbitrary number of possible matches. A better fix (I suspect) is to check for not-webassembly, eg

    if ("${ARGN}" MATCHES "webgpu" AND NOT "${ARGN}" MATCHES "wasm")

Please try this and let me know if it solves your problem.

@ziriax
Copy link
Author

ziriax commented Feb 6, 2024

This seemed to work, but when I actually schedule for the GPU, I get the error:

FAILED: reaction_diffusion_render.h reaction_diffusion_render.o reaction_diffusion_render.registration.cpp /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_render.h /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_render.o /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_render.registration.cpp
cd /home/Halide/HelloWasmWebGPU/build && /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_generator -n reaction_diffusion_render -d 0 -g reaction_diffusion_render -f reaction_diffusion_render -e c_header,object,registration -o . target=host-no_runtime threads=true
Unhandled exception: Error: Schedule for Func render requires <Default_GPU> but no compatible target feature is enabled in target x86-64-linux-avx-avx2-avx512-avx512_cannonlake-avx512_skylake-f16c-fma-no_runtime-sse41

target=host-no_runtime seems wrong here.

Somehow I need to tell cmake that target=wasm-32-wasmrt-webgpu

I've create a fork: https://github.com/StronglyTypedSolutions/Halide/tree/pv/hello-webgpu/apps/HelloWasmWebGPU

@steven-johnson
Copy link
Contributor

This seemed to work, but when I actually schedule for the GPU, I get the error:

FAILED: reaction_diffusion_render.h reaction_diffusion_render.o reaction_diffusion_render.registration.cpp /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_render.h /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_render.o /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_render.registration.cpp
cd /home/Halide/HelloWasmWebGPU/build && /home/Halide/HelloWasmWebGPU/build/reaction_diffusion_generator -n reaction_diffusion_render -d 0 -g reaction_diffusion_render -f reaction_diffusion_render -e c_header,object,registration -o . target=host-no_runtime threads=true
Unhandled exception: Error: Schedule for Func render requires <Default_GPU> but no compatible target feature is enabled in target x86-64-linux-avx-avx2-avx512-avx512_cannonlake-avx512_skylake-f16c-fma-no_runtime-sse41

target=host-no_runtime seems wrong here.

Somehow I need to tell cmake that target=wasm-32-wasmrt-webgpu

I've create a fork: https://github.com/StronglyTypedSolutions/Halide/tree/pv/hello-webgpu/apps/HelloWasmWebGPU

When configuring CMake, use -DHalide_TARGET=wasm-32-wasmrt-webgpu to the cmake commandline

steven-johnson added a commit that referenced this issue Feb 6, 2024
* Don't require Halide_WebGPU when using wasm (#8063)

* trigger buildbots
@ziriax
Copy link
Author

ziriax commented Feb 7, 2024

Thanks! Sorry for these silly questions, I am completely new to Halide.

Now indeed that step passes, but it fails further down the line with:

FAILED: run 
: && /usr/bin/c++ -O3 -DNDEBUG  CMakeFiles/run.dir/reaction_diffusion_init.registration.cpp.o CMakeFiles/run.dir/reaction_diffusion_update.registration.cpp.o CMakeFiles/run.dir/reaction_diffusion_render.registration.cpp.o CMakeFiles/run.dir/home/install/halide/share/tools/RunGenMain.cpp.o -o run  reaction_diffusion_init.a  reaction_diffusion_update.a  reaction_diffusion_render.a  /usr/lib/x86_64-linux-gnu/libpng.so  /usr/lib/x86_64-linux-gnu/libz.so  /usr/lib/x86_64-linux-gnu/libjpeg.so  reaction_diffusion_init.runtime.a  reaction_diffusion_update.runtime.a  reaction_diffusion_render.runtime.a  -ldl && :
/usr/bin/ld: reaction_diffusion_init.a: error adding symbols: file format not recognized
collect2: error: ld returned 1 exit status

That seems logical, because it is caused by the run target that is for the unit tests, and that requires the native target.

So when disabling


# add_executable(run ${init_cpp} ${update_cpp} ${render_cpp})
# target_link_libraries(run PRIVATE Halide::RunGenMain reaction_diffusion_init reaction_diffusion_update reaction_diffusion_render)

# add_test(NAME init_bench COMMAND run --benchmarks=all --benchmark_min_time=1 --name=reaction_diffusion_init --output_extents=[1024,1024] --parsable_output)
# add_test(NAME update_bench COMMAND run --benchmarks=all --benchmark_min_time=1 --name=reaction_diffusion_update --output_extents=[1024,1024] frame=0 mouse_x=0 mouse_y=0 state=random:0:[1024,1024,3] --parsable_output)
# add_test(NAME render_bench COMMAND run --benchmarks=all --benchmark_min_time=1 --name=reaction_diffusion_render --output_extents=[1024,1024] state=random:0:[1024,1024,3] --parsable_output)

it builds fine!

But, it doesn't run, Chrome reports the following error:

Aborted(Assertion failed: descriptor == nullptr, at: system/lib/webgpu/webgpu.cpp,22,wgpuCreateInstance)

I can't find any example of Halide using WebGPU in the browser, so I'm stuck a bit 😉

@steven-johnson
Copy link
Contributor

Adding @jrprice to see if he has any insight on why wgpuCreateInstance might fail.

Dumb q: is WebGPU enabled in your browser?

@jrprice
Copy link
Contributor

jrprice commented Feb 7, 2024

It looks like the descriptor == nullptr assertion is relatively new in Emscripten, added here:
https://github.com/emscripten-core/emscripten/pull/21113/files

I'll talk to the author of that PR and see if we can assert the contents of the descriptor instead of the whole descriptor itself. In the meantime you could either rollback Emscripten to before that change, or hack Halide to pass nullptr to wgpuCreateInstance().

@ziriax
Copy link
Author

ziriax commented Feb 8, 2024

This indeed gets me one step in the right direction, but then the Javascript raises an exception because the ErrorFilter enums in the header does not match the one in Emscripten, so an exception is thrown.

wgpuDevicePushErrorScope is called with a value Validation of 0

typedef enum WGPUErrorFilter {
    WGPUErrorFilter_Validation = 0x00000000,
    WGPUErrorFilter_OutOfMemory = 0x00000001,
    WGPUErrorFilter_Internal = 0x00000002,
    WGPUErrorFilter_Force32 = 0x7FFFFFFF
} WGPUErrorFilter WGPU_ENUM_ATTRIBUTE;

However, in Emscripten, the enum is different (off by one):

    ErrorFilter: {
      1: 'validation',
      2: 'out-of-memory',
      3: 'internal',
    }

Emscripten however assumes the filter argument matches the enums, so for 0, undefined is passed to pushErrorScope. This strangely enough hangs the browser, the code goes into an animation loop anyway, but that's another issue.

  wgpuDevicePushErrorScope: (deviceId, filter) => {
    var device = WebGPU.mgrDevice.get(deviceId);
    device["pushErrorScope"](WebGPU.ErrorFilter[filter]);
  },

The question is, who is right?

In the webgpu-native repo, the enums are also 0 based.

In Emscripten, this pull request removed None=0 from the enum...

If will change the code by using the HALIDE_RUNTIME_WEBGPU_NATIVE_API macro, and see what goes wrong next

@jrprice
Copy link
Contributor

jrprice commented Feb 8, 2024

Emscripten is right and matches Dawn; webgpu-native hasn't caught up yet.

I'll work on getting Halide's WebGPU APIs updated to match Dawn+Emscripten now, which will hopefully resolve all of these issues.

Apologies for the problems - the WebGPU native API is still in flux but I'm hearing signals that it might be starting to stabilize soon.

@steven-johnson
Copy link
Contributor

Let me know if you need assistance -- I can takepoint on upgrading emcc/dawn on all the buildbots if necessary.

@ziriax
Copy link
Author

ziriax commented Feb 8, 2024

Apologies for the problems - the WebGPU native API is still in flux but I'm hearing signals that it might be starting to stabilize soon.

I am glad I could be of assistance, and thanks for the help!

PS: My hobby project is https://vikid.net, and I was thinking about using Halide for high performance reactive systems, like particle systems. Just having fun.

@jrprice
Copy link
Contributor

jrprice commented Feb 8, 2024

OK, with this Halide PR and this Emscripten PR, Halide+WASM+WebGPU should be working again. I've tested this with a modified version of the apps/blur sample, which works for me in Chrome with these changes.

@steven-johnson
Copy link
Contributor

I've updated and restarted all our buildbots that test WebGPU, so the tests after this point should be valid.

@steven-johnson
Copy link
Contributor

oh wait, do I need to wait for that EMCC change to land?

@steven-johnson steven-johnson added the backport me This change should be backported to release versions label Feb 8, 2024
@steven-johnson
Copy link
Contributor

(We may want to backport this fix to the recent Halide 17 release)

@jrprice
Copy link
Contributor

jrprice commented Feb 8, 2024

oh wait, do I need to wait for that EMCC change to land?

Not sure. The API difference fixed by that change has been there a while so if the bots have been passing then maybe it's not necessary? Although that would suggest that the Emscripten path isn't being tested on the bots.

In any case, the Emscripten PR has landed.

@ziriax
Copy link
Author

ziriax commented Feb 8, 2024

You guys rock!

@ziriax
Copy link
Author

ziriax commented Feb 9, 2024

Closing this, it works now. I still need to figure out how to do WebGPU interop between Javascript and wasm (e.g. "passing" a WebGPU buffer from JS to WASM), but that is emscripten stuff I guess, not related to Halide. Thanks!

@ziriax ziriax closed this as completed Feb 9, 2024
steven-johnson added a commit that referenced this issue Feb 19, 2024
* Don't require Halide_WebGPU when using wasm (#8063)

* trigger buildbots
steven-johnson added a commit that referenced this issue Feb 19, 2024
…elease (#8106)

* Don't require Halide_WebGPU when using wasm (#8063) (#8065)

* Don't require Halide_WebGPU when using wasm (#8063)

* trigger buildbots

* [WebGPU] Update to latest native headers (#8081)

* [WebGPU] Update to latest native headers

* Remove #ifdef for `requiredFeature[s]Count`

* Pass nullptr to wgpuCreateInstance
  * Emscripten currently requires this
  * Dawn accepts it too

* Use nullptr for another wgpuCreateInstance call

---------

Co-authored-by: James Price <[email protected]>
ardier pushed a commit to ardier/Halide-mutation that referenced this issue Mar 3, 2024
* Don't require Halide_WebGPU when using wasm (halide#8063)

* trigger buildbots
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport me This change should be backported to release versions
Projects
None yet
Development

No branches or pull requests

3 participants