What about the GPU? #273

Closed
BrendanEich opened this issue Jul 20, 2015 · 21 comments

@BrendanEich

https://github.com/WebAssembly/design/FutureFeatures.md has long-SIMD; you can see other uses of SIMD by searching the design repository. Searching for GPU finds nothing; if you search closed issues, the GPU is invoked to broaden thinking about subnormals and similar edge cases.

My question for everyone: should we consider lifting WebGL primitives -- or really OpenGL/ES3.1 and beyond -- into WebAssembly? Treating WebGL as a black box API has the following drawbacks:

  1. WebGL lags OpenGL/ES3.1 and common extensions such as ARB_compute_shader, never mind OpenGL 4.5 (non-ES) or Vulkan.
  2. Separating concerns loses out compared to exposing GPU primitives via WebAssembly, which would enable compilers to move if/else logic to the GPU via a predication layer (see the sketch below) and to do other optimizations that span the CPU and GPU.
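
A minimal C++ sketch of that predication idea (illustrative only, not tied to any particular GPU ISA): both arms of the branch are evaluated, and a select picks the result, so every lane follows a single control path.

```cpp
// Branchy form: on a GPU, lanes that take different sides of the `if`
// serialize (divergence).
float shade(float x) {
    if (x > 0.0f) return x * 2.0f;
    return x - 1.0f;
}

// Predicated form: both sides are computed unconditionally, then a
// select (not a branch) picks the result per lane.
float shade_predicated(float x) {
    float then_v = x * 2.0f;
    float else_v = x - 1.0f;
    return (x > 0.0f) ? then_v : else_v;  // maps to a hardware select
}
```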

See some work by Khronos Group that started from LLVM IR and diverged into SPIR-V:

https://www.khronos.org/spir

Of course the WebAssembly community group can say "not in scope, use WebGL or SPIR-V or whatever the embedding provides." That's a possible answer, but I suspect we should not "default" into it for want of asking the question.

I know folks at OTOY would be interested in this approach. It does not need to slow anything down in WebAssembly as scoped so far. I'm really asking whether the GPU is in-bounds as a hardware unit to program via WebAssembly in the same way, as directly and with full optimization wins, as the SIMD units and the CPU are. Thanks,

/be

@pizlonator
Contributor

We already have tensions due to bikeshed differences between ARM and x86. I think that adding another set of hardware targets would create more tension: either more operations would have to be slow, paying emulation costs to get uniform semantics on all targets, or more operations would have to have undefined behavior to let everyone run fast. I think that makes it unprofitable to consider the GPU at this time (or ever).
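
One concrete instance of that kind of divergence, as a minimal C++ sketch: out-of-range float-to-int conversion, where x86 and ARM hardware already disagree, so any uniform semantics costs someone extra instructions.

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    volatile float f = 3e9f;              // well above INT32_MAX
    int32_t i = static_cast<int32_t>(f);  // UB in C++, so compilers just emit the native instruction
    // x86's cvttss2si yields INT32_MIN (-2147483648); AArch64's fcvtzs
    // saturates to INT32_MAX (2147483647). WebAssembly settled the same
    // conflict by making i32.trunc_f32_s trap on out-of-range input.
    std::printf("%d\n", i);
    return 0;
}
```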

-Fil

@BrendanEich
Author

Hi Fil -- thanks, fair answer -- except the "(or ever)" bit :-P.

There's nothing particularly magical about Khronos that gives them a perpetual lock on the GPU. Of course GPU vendors feel tension between standardization/interoperation and new, special, divergent features, but the extension model handles most of this. Radically different GPU architectures will require new thinking, for sure -- out of scope for WebAssembly and WebGL (and even Vulkan).

But CPUs (and memory models, with such things as store order), including floating point and SIMD, with all the speciation we still see, have converged enough to be included in WebAssembly. Up to at least the very useful OpenGL/ES3.1 level of interop, I claim that GPUs have converged just as much.

More comments welcome. Not looking for more work, believe me. I'm looking for the equivalent of Occam's razor, or even an empirical law, that lets us defer the GPU for now, or even forever. You may have a good case for "not now" -- the community group is a consensus thing at best, so the group should decide. I don't think you've made a case for "not ever".

/be

@jfbastien
Member

It's a (mostly silent) goal of mine for WebAssembly to be able to eventually target GPUs, but I don't think it's an MVP feature at all. I think it's doable but difficult, as discussed in #41: the C++ standards committee has been looking at standardizing fixed-width and variable-width SIMD for a while, and I think WebAssembly will probably want to adopt a similar approach. We can engage with the same vendors (NVIDIA and Intel), but I suspect that we'll get to the same discussions as those the C++ standards committee is having (and which I attend).
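
For reference, a minimal C++ sketch of the fixed-width vs. variable-width distinction, in the shape that committee work later took in the Parallelism TS v2 (std::experimental::simd, shipped in libstdc++ since GCC 11):

```cpp
#include <experimental/simd>
namespace stdx = std::experimental;

// Fixed-width: the lane count is baked into the type, like a 128-bit
// wasm SIMD value holding four f32 lanes.
using f32x4 = stdx::fixed_size_simd<float, 4>;

// Variable-width: the lane count is whatever the target prefers
// (4 on SSE, 8 on AVX2, wider on AVX-512 or SVE).
using f32xN = stdx::native_simd<float>;

f32xN scale(f32xN v, float k) { return v * k; }  // one expression, N lanes
```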

I'm hoping that we can define a "fast" subset of WebAssembly that'll work well for such targets, and that other operations Just Work™ but may be slow (heavy divergence, exceptions, ...).

@kg
Contributor

kg commented Jul 21, 2015

SIMD isn't particularly important for targeting GPUs, at least at a basic level. IIRC, AMD's and NVIDIA's GPU architectures have been scalar (not SIMD) for a long while now, and PowerVR is VLIW. I'm not sure about the other Android architectures or Intel, but I wouldn't be shocked if 4-element/etc. SIMD is falling out of fashion entirely on GPUs. Scalar GPU compute provides better scalability and parallelism, at least for workloads that can parallelize. See ftp://download.nvidia.com/developer/cuda/seminar/TDCI_Arch.pdf for some (outdated, to be fair) context.
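
A minimal C++ sketch of the contrast: the explicit-lane SIMD style a CPU favors versus the scalar per-element body a SIMT GPU runs, with the hardware supplying the parallelism.

```cpp
// CPU SIMD style: one instruction stream, explicit 4-wide lanes.
void scale_simd(float* v, int n, float k) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {   // a vectorizer maps these four ops to one instruction
        v[i]     *= k;
        v[i + 1] *= k;
        v[i + 2] *= k;
        v[i + 3] *= k;
    }
    for (; i < n; ++i) v[i] *= k;  // scalar tail
}

// GPU SIMT style: the kernel body is scalar; the hardware launches one
// instance per element, so parallelism scales with thread count rather
// than an explicit vector width.
void scale_simt_body(float* v, int i, float k) {
    v[i] *= k;
}
```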

@BrendanEich
Author

@jfbastien: right, way post-MVP, which is why I mentioned FutureFeatures.md only.

@kg: GPUs should give 40x or better parallelism; agreed that SIMD is not the right model.

/be

@sunfishcode
Member

If the problem is just the pace of WebGL standardization, then it's out of scope here, because that's an independent API concern.

Other than that, what we'd need more than anything else to make progress in this space is people stepping up. The first step is for someone to step up with a pull request for what we might add to FutureFeatures.md to attract the kinds of ideas that would be useful to consider :-).

@lukewagner
Member

My main question (for including a mention of GPUs in future features) is what problems we'd be trying to address. Since the overall memory/processing models of CPUs/GPUs are still intentionally quite distinct (even if the low-level instruction sets are converging), it seems like we wouldn't get any sorts of magic "run it on either a CPU or GPU" portability. Rather, it seems like the wins would be more around a unification of tooling/code at the different levels of the pipeline. Anything else?

@sunfishcode sunfishcode added this to the Future Features milestone Jul 21, 2015
@sunfishcode
Member

Another question is whether GPU ISA convergence (including mobile GPUs) is really approaching CPU levels or not. For example, floating point division by zero produces an undefined result in SPIR-V. We won't be willing to take such things lightly in WebAssembly.
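
A minimal C++ sketch of the gap: under the IEEE 754 semantics WebAssembly mandates, the division below has exactly one correct answer on every conforming target; under SPIR-V's rules the same operation is undefined.

```cpp
#include <cstdio>

int main() {
    volatile float zero = 0.0f;  // volatile keeps the compiler from folding the division
    float q = 1.0f / zero;       // IEEE 754: exactly +infinity, deterministically
    std::printf("%f\n", q);      // prints "inf"
    return 0;
}
```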

@kg
Contributor

kg commented Jul 22, 2015

The computational model is probably always going to have differences, since the architectures have to be different to solve their specific concerns. Intel had to face this reality with Larrabee.

It's quite reasonable to aim for "basic compute/logic algorithms written in WebAssembly can be cross-compiled to OpenCL or GL shaders" and make that work (see the sketch below). For some use cases that will be much better than nothing, because it provides a sensible fallback for both JS-only and wasm-capable browser runtimes. Incremental performance improvements like that provide good results in many cases.
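
A minimal C++ sketch of the kind of "basic compute" that cross-compiles cleanly: straight-line arithmetic over flat arrays, with no exceptions, recursion, or pointer chasing, where the loop index becomes the work-item index in the translated kernel.

```cpp
// Each iteration is independent, so a translator can map the loop body
// to an OpenCL kernel or GLSL compute shader, with `i` becoming
// get_global_id(0) / gl_GlobalInvocationID.x.
void saxpy(float* y, const float* x, float a, int n) {
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```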

I'd argue that nobody will ever write all their shaders/GPU compute exclusively in WebAssembly (nor would they want to if they care about performance), but we can provide a nice middle ground.

@titzer

titzer commented Jul 23, 2015

I agree with @lukewagner that the wins would mostly be around unifying tooling, making WebAssembly a useful static IR for interchange and storage, but not necessarily settling on a computational model for GPU programs.

@sunfishcode
Member

I'm going to close this issue since we appear to have answered the questions and don't have anything actionable remaining here. If someone wants us to do something more on this topic, they're still welcome to re-open or file new issues or pull requests.

@milkowski

I just want to point out that CPU+GPU unification is already happening (AMD APUs, Intel Iris HD, mobile SoCs) and is getting ever more tightly integrated at the system/hardware level (cache coherency, MMU and address-space sharing, heterogeneous IRs; the RISC-V project is potentially also a heterogeneous ISA). So the question is not IF but HOW to embrace this, because it has already been widely adopted.
Fortunately this is mainly a vendor issue, so vendors need to work hard to provide the right solutions, and they already have. I would suggest considering the HSA specification and its ideas while developing the current spec -- not necessarily adopting it right now, but at least keeping it in mind during design, so that wasm doesn't shoot itself in the foot with features or concepts that would prevent such things, or make them harder, in future releases.

@sunfishcode
Member

The most helpful way to get things started here would be to file issues pointing out specific ideas, features, or concepts in HSA, SPIR-V, RISC-V, or others, that WebAssembly should consider.

@bhack

bhack commented Aug 29, 2015

As pointed out by @keryell, there were already some common issues discussed on the LLVM mailing list that impact both the LLVM RFC for SPIR-V and WebAssembly.

@bhack

bhack commented Aug 29, 2015

@keryell Do you know if there is someone at AMD already active in WebAssembly design?

@keryell

keryell commented Sep 4, 2015

@bhack I do not know. Actually I no longer work at AMD, so I cannot even talk about it anyway. But I am still working on some similar subjects at Xilinx. So s/GPU/FPGA/g for now on my side. :-)
So what about having WebAssembly on FPGA too? :-)

@bhack

bhack commented Sep 4, 2015

@keryell Nice. Are you still working on SPIR-V in LLVM? Do you see any overlap with the WebAssembly effort?

@Darelbi

Darelbi commented Jun 1, 2016

It would be useful to put Vulkan into WebAssembly, or at least to have it use Vulkan without all the usual initialization boilerplate. While I think Vulkan is great, I also think that in some places it adds too much complication: I fully agree with preparing a "Pipeline" object up front, but I don't see the utility in the fuss of managing video cards, extensions, and a few other things directly. Note that GPU vendors may have no incentive to design a slightly simpler version of Vulkan for WebAssembly (for example, how would all the intermediate gaming software layers from NVIDIA and Intel integrate with WebAssembly? I can see only AMD benefiting from bringing low-level graphics into WebAssembly).

My opinion is that there should first be some kind of API that simplifies interfacing with Vulkan; such an interface could then become a de facto standard for WebAssembly. While this would not unlock the maximum possible GPU power, it would already be far better than the current status quo (too much CPU overhead). That would still leave a market for desktop and console games while allowing stunning graphics even in the browser (note that heavier GPU usage requires very heavy assets, which unfortunately must be downloaded).

@bhack

bhack commented Feb 18, 2017

People still interested in this topic could follow gpuweb/admin#1 (comment)

@ghost

ghost commented Jun 26, 2017

Please make GPU support for WebAssembly; you promised it around 2 years ago... Khronos...

@skhameneh

I would like some standardized extensions; WASI might be a good place to gather information on similar community initiatives.

https://github.com/bytecodealliance/wasmtime/blob/master/docs/WASI-api.md

Alternatively, a SharedArrayBuffer could be used with a JavaScript rendering library in the interim as a PoC.
