DataFusion does not support wasm32-unknown-unknown target #177
Comments
Polars proof of concept (shows that arrow-rs and a DataFusion-like API can work): https://github.com/ritchie46/polars/blob/master/js-polars/app.js
#218 is a great step toward this.
@alamb @Dandandan @jorgecarleitao
It would be relatively easy for me to do the two feature flags unless someone has an objection?
Note that apache/arrow-rs#656 from @PsiACE has removed the
I think lz4 is an optional dependency of parquet (see https://github.com/apache/arrow-rs/blob/master/parquet/Cargo.toml#L40), so perhaps we could just have an lz4 feature flag for DataFusion?
Thanks @alamb.
I think the biggest hurdle is using
@ivanceras I am not sure if we are using tokio specific It might be worth trying to replace tokio specific structs with things from the
@ivanceras what are you experiencing? I have managed to compile to WASM with very slight code modifications.
@seddonm1 compile and run? I experimented with that yesterday.
I tried wasm32-wasi first, and a simple sample works in single-threaded mode after disabling some parquet features. See this gist for the example: https://gist.github.com/roee88/91f2b67c3e180fa0dfb688ba8d923dae
For wasm32-unknown-unknown, adding getrandom with the js feature as a dependency of the sample makes it compile IIRC, but actually running it is a different story. I tried to get a sample working with wasm-pack, and it stops execution at the DataFusion context creation. I suspect that it uses some sync primitives that are unsupported in wasm32-unknown-unknown, but I didn't investigate further.
I haven't tried wasm32-unknown-emscripten yet, since my local Rust version is incompatible with my installed emcc version (both latest at the time of writing).
Edit: re tokio, the sample above worked on wasm32-wasi with other executors in single-threaded mode, including futures 0.3, https://github.com/richardanaya/executor, and async-global-executor.
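To illustrate the single-threaded point: a future can be driven to completion on one thread without any tokio runtime. Below is a minimal sketch using only std. `block_on`, `NoopWake` and `row_count` are hypothetical names, not part of DataFusion or of any executor mentioned above. The loop busy-polls, which is acceptable for futures that never really wait but would block the browser event loop on wasm32-unknown-unknown, so treat it as a demonstration rather than a usable executor.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: the loop below re-polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: drive `fut` to completion on the current thread.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the heap so it can be polled repeatedly.
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Not ready yet: hint the CPU and poll again.
            Poll::Pending => std::hint::spin_loop(),
        }
    }
}

/// Stand-in for an async query; a real sample would call into DataFusion here.
pub async fn row_count() -> usize {
    let batch_sizes = vec![3usize, 4, 5];
    batch_sizes.iter().sum()
}
```

Calling `block_on(row_count())` runs the async function on the calling thread with no worker threads involved, which is the property the wasm32 targets need.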
I got the basic sample from the gist in the previous message working with wasm-pack (wasm32-unknown-unknown) on single threaded tokio after:
Tested in Chrome and it seems to work. Again, there are definitely some code paths that lead to a panic, as not all of std is supported on wasm32 targets and I only tested something basic. Also, for multi-threading to work in the browser, some parts of tokio can't be used directly from the DataFusion codebase (this is a more complex topic).
Good news, fellow WebAssembly enthusiasts! It looks like the stars are finally aligning, and with relatively minimal patching, I successfully compiled the code from the gist (create, insert and query a
I pushed the proof-of-concept to a public repository at
In the near future, I intend to clean up these changes and submit a PR to DataFusion feature-flagging WebAssembly support. In general, the summary of requirements for
for
To get it to run (without a runtime error related to
This is all very messy. I will clean it up and submit a PR to DataFusion once I have a better sense of the most minimal changes required and the proper way to feature flag them. Also, general disclaimer that I'm new to Rust and YMMV, especially on the
This sounds very cool @milesrichardson - DataFusion should be upgraded to arrow 26.0.0 shortly: #4039. I think @jimexist is in the process of making bzip support optional: #3993
In terms of being messy / submitting a PR: if it is possible, I suggest trying to do it incrementally. For example, we can probably sort out the calls to
But all in all this is pretty exciting.
Hello, folks. I'm trying to add WASM support to DataFusion's dependencies. Started with bzip2-rs: trifectatechfoundation/bzip2-rs#93
Posting an update on trifectatechfoundation/bzip2-rs#93, had a discussion with @alexcrichton
Not sure how to go from here...
@REASY In my experiment (the one linked above), I put bzip behind a configuration flag and disabled it for the wasm targets. DataFusion still compiled. I don't know enough about DF to say how important bzip is, or which parts of DF would be broken without it, however. The impact seemed limited in scope, since it should only affect files that are encoded with bzip.
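For readers following along, disabling a codec on wasm targets can be sketched roughly as below. `Codec`, `decode` and `decode_bzip2` are hypothetical names, not DataFusion's actual types, and the native branch is only a stub where a real build would call the bzip2 crate.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    Uncompressed,
    Bzip2,
}

// Native targets: stub standing in for real decompression; it just copies the input.
#[cfg(not(target_arch = "wasm32"))]
fn decode_bzip2(input: &[u8]) -> Result<Vec<u8>, String> {
    Ok(input.to_vec())
}

// wasm32 targets: the codec is compiled out and reports a clear error instead.
#[cfg(target_arch = "wasm32")]
fn decode_bzip2(_input: &[u8]) -> Result<Vec<u8>, String> {
    Err("bzip2 support is disabled on wasm32 targets".to_string())
}

/// Dispatch on the codec; only the bzip2 path differs between targets.
pub fn decode(codec: Codec, input: &[u8]) -> Result<Vec<u8>, String> {
    match codec {
        Codec::Uncompressed => Ok(input.to_vec()),
        Codec::Bzip2 => decode_bzip2(input),
    }
}
```

With this shape, everything except bzip-encoded input behaves the same on every target, which matches the limited scope described above.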
Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11615
The Arrow crate successfully compiles to WebAssembly (e.g. https://github.com/domoritz/arrow-wasm) but the DataFusion crate currently does not support the wasm32-unknown-unknown target.
Try out the repository at https://github.com/domoritz/datafusion-wasm/tree/73105fd1b2e3ca6c32ec4652c271fb741bda419a.
{code}
error[E0433]: failed to resolve: could not find `unix` in `os`
  --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/dirs-1.0.5/src/lin.rs:41:18
   |
41 |     use std::os::unix::ffi::OsStringExt;
   |                  ^^^^ could not find `unix` in `os`

error[E0432]: unresolved import `unix`
 --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/dirs-1.0.5/src/lin.rs:6:5
  |
6 | use unix;
  |     ^^^^ no `unix` in the root

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
  --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:98:9
   |
98 |         sys::duplicate(self)
   |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:101:9
    |
101 |         sys::allocated_size(self)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:104:9
    |
104 |         sys::allocate(self, len)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:107:9
    |
107 |         sys::lock_shared(self)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:110:9
    |
110 |         sys::lock_exclusive(self)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:113:9
    |
113 |         sys::try_lock_shared(self)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:116:9
    |
116 |         sys::try_lock_exclusive(self)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:119:9
    |
119 |         sys::unlock(self)
    |         ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:126:5
    |
126 |     sys::lock_error()
    |     ^^^ use of undeclared crate or module `sys`

error[E0433]: failed to resolve: use of undeclared crate or module `sys`
   --> /Users/dominik/.cargo/registry/src/github.com-1ecc6299db9ec823/fs2-0.4.3/src/lib.rs:169:5
    |
169 |     sys::statvfs(path.as_ref())
    |     ^^^ use of undeclared crate or module `sys`

   Compiling num-rational v0.3.2
error: aborting due to 10 previous errors
{code}
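The `sys` errors are what one would expect if a crate declares its platform module only for Unix and Windows: on wasm32-unknown-unknown neither `cfg` matches, so the module never exists. That reading is an assumption drawn from the error messages, not checked against the fs2 source here. A minimal sketch of the pattern with a fallback branch added follows; the module contents and `platform_name` are hypothetical.

```rust
// Declared only when compiling for Unix-like targets.
#[cfg(unix)]
mod sys {
    pub fn name() -> &'static str {
        "unix"
    }
}

// Declared only when compiling for Windows.
#[cfg(windows)]
mod sys {
    pub fn name() -> &'static str {
        "windows"
    }
}

// Fallback for every other target, e.g. wasm32-unknown-unknown.
// Without this branch, `sys::name()` below would fail with
// "use of undeclared crate or module `sys`" on such targets.
#[cfg(not(any(unix, windows)))]
mod sys {
    pub fn name() -> &'static str {
        "unsupported"
    }
}

/// Resolves to whichever `sys` module was compiled in for the current target.
pub fn platform_name() -> &'static str {
    sys::name()
}
```

The `dirs` errors have the same shape: `std::os::unix` is only available when the target is Unix-like, so an import of it needs a matching `#[cfg(unix)]` guard.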