scylladb/scylla-rust-udf

Rust helper library for Scylla UDFs

This crate allows writing pure Rust functions that can be used as Scylla UDFs.

Note: this crate is officially supported and ready to use. However, UDFs are still an experimental feature in ScyllaDB, and the crate has not been widely used, which is why it's still in beta and its API is subject to change. We appreciate bug reports and pull requests!

Usage

Note

In Rust versions 1.77 and below, the WASI target was called wasm32-wasi instead of wasm32-wasip1.

For more information, see https://blog.rust-lang.org/2024/04/09/updates-to-rusts-wasi-targets.html.

Prerequisites

To use this helper library in Scylla you'll need:

  • cargo
    • Generating Wasm is also possible with just the rustc compiler, but the guide will assume that cargo is installed
  • The Rust standard library for the wasm32-wasip1 target
  • The wasm2wat tool
    • Available in many distributions in the wabt package, which also provides the wasm-strip tool
  • (Optionally) wasm-opt tool for optimizing the Wasm binary
    • Available in many distributions in the binaryen package

Compilation

We recommend a setup with cargo.

  1. Start with a library package:
     cargo new --lib my_udf_library
  2. Add the scylla-udf dependency:
     cargo add scylla-udf
  3. Add the following lines to Cargo.toml to set the crate-type to cdylib:
     [lib]
     crate-type = ["cdylib"]
  4. Implement your package, exporting Scylla UDFs using the scylla_udf::export_udf macro.
  5. Build the package using the wasm32-wasip1 target:
     RUSTFLAGS="-C link-args=-zstack-size=131072" cargo build --target=wasm32-wasip1

     NOTE: The default stack size in WASI (1 MB) causes warnings about oversized allocations in Scylla, so we recommend setting the stack size to a lower value. The RUSTFLAGS environment variable in the command above sets it to 128 KB (131072 bytes), which should be enough for most use cases.

  6. Find the compiled .wasm binary. Let's assume it's target/wasm32-wasip1/debug/my_udf_library.wasm.
  7. (optional) Optimize the binary using wasm-opt -O3 target/wasm32-wasip1/debug/my_udf_library.wasm -o target/wasm32-wasip1/debug/my_udf_library.wasm (this can be combined with building in the release profile using cargo build --release)
  8. (optional) Reduce the size of the binary using wasm-strip target/wasm32-wasip1/debug/my_udf_library.wasm
  9. Translate the binary into wat:
     wasm2wat target/wasm32-wasip1/debug/my_udf_library.wasm > target/wasm32-wasip1/debug/my_udf_library.wat
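
As an illustration of the "implement your package" step, here is a minimal sketch of what the body of a UDF in src/lib.rs might look like. The function name commas and the ", " separator are assumptions chosen for illustration. The #[export_udf] attribute is left as a comment so the snippet compiles without the scylla-udf crate; in a real UDF crate you would import scylla_udf::export_udf and place the attribute directly above the function.

```rust
// Sketch only: in a real UDF crate, add `use scylla_udf::export_udf;`
// and uncomment the attribute below so the function is exported to Scylla.

// #[export_udf]
fn commas(words: Option<Vec<String>>) -> Option<String> {
    // A null list yields a null result; otherwise the words are joined.
    words.map(|words| words.join(", "))
}
```

Both the parameter and the return value are wrapped in Option here, on the assumption that the function will be declared CALLED ON NULL INPUT on the CQL side.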

CQL Statement

The resulting target/wasm32-wasip1/debug/my_udf_library.wat code can now be used directly in a CREATE FUNCTION statement. The code will most likely contain ' characters, so you may first need to replace each of them with '' to make the code usable inside a CQL string.

For example, if you have a Rust UDF that joins a list of words using commas, you can create a Scylla UDF using the following statement:

CREATE FUNCTION commas(string list<text>) CALLED ON NULL INPUT RETURNS text LANGUAGE wasm AS ' (module ...) '

NOTE: The LANGUAGE used for Wasm UDFs is xwasm instead of wasm in Scylla versions 5.1 and 5.2.
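
The quote doubling described above can be scripted. The following is a rough sketch of one way to do it in Rust. Both helper names (escape_cql_string and create_function_statement) are hypothetical and not part of scylla-udf, and the language is hard-coded to wasm, so it would need adjusting for Scylla versions that expect xwasm.

```rust
/// Doubles every single quote so the text can sit inside a CQL string literal.
/// (Hypothetical helper, not part of the scylla-udf crate.)
fn escape_cql_string(text: &str) -> String {
    text.replace('\'', "''")
}

/// Wraps the WAT module text in a CREATE FUNCTION statement.
/// `signature` is the CQL function name with its parameter list,
/// for example "commas(string list<text>)".
fn create_function_statement(signature: &str, return_type: &str, wat: &str) -> String {
    format!(
        "CREATE FUNCTION {} CALLED ON NULL INPUT RETURNS {} LANGUAGE wasm AS '{}'",
        signature,
        return_type,
        escape_cql_string(wat)
    )
}
```

A caller would typically read the .wat file into a string (for instance with std::fs::read_to_string) and pass the contents as the wat argument.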

CQL Type Mapping

The argument and return value types of functions annotated with #[export_udf] must match the CQL types declared in the corresponding CREATE FUNCTION statement in Scylla, according to the tables below.

If the Scylla function is created with types that do not match the types used in the Rust function, calling the UDF will fail or produce arbitrary results.

Native types

| CQL Type | Rust type |
|----------|-----------|
| ASCII | String |
| BIGINT | i64 |
| BLOB | Vec<u8> |
| BOOLEAN | bool |
| COUNTER | scylla_udf::Counter |
| DATE | chrono::NaiveDate |
| DECIMAL | bigdecimal::BigDecimal |
| DOUBLE | f64 |
| DURATION | scylla_udf::CqlDuration |
| FLOAT | f32 |
| INET | std::net::IpAddr |
| INT | i32 |
| SMALLINT | i16 |
| TEXT | String |
| TIME | scylla_udf::Time |
| TIMESTAMP | scylla_udf::Timestamp |
| TIMEUUID | uuid::Uuid |
| TINYINT | i8 |
| UUID | uuid::Uuid |
| VARCHAR | String |
| VARINT | num_bigint::BigInt |

Collections

If CQL types T, K and V map to Rust types RustT, RustK and RustV, you can use them as collection parameters:

| CQL Type | Rust type |
|----------|-----------|
| LIST<T> | Vec<RustT> |
| MAP<K, V> | std::collections::BTreeMap<RustK, RustV>, std::collections::HashMap<RustK, RustV> |
| SET<T> | Vec<RustT>, std::collections::BTreeSet<RustT>, std::collections::HashSet<RustT> |
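
To make the collection mapping concrete, here is a small sketch of a function whose parameters would pair with a CQL map<text, int> and set<text>, returning a bigint. The function name total_allowed and its logic are made up for illustration. The sketch assumes a UDF declared RETURNS NULL ON NULL INPUT, so the parameters are never null, and it leaves the #[export_udf] attribute as a comment so that it compiles without the scylla-udf crate.

```rust
use std::collections::{BTreeMap, BTreeSet};

// #[export_udf]  (left as a comment; uncomment in a real UDF crate)
// Sums the scores of the names that appear in the `allowed` set.
fn total_allowed(scores: BTreeMap<String, i32>, allowed: BTreeSet<String>) -> i64 {
    scores
        .iter()
        .filter(|(name, _)| allowed.contains(*name))
        // Widen to i64 so that summing many i32 values cannot overflow easily.
        .map(|(_, score)| i64::from(*score))
        .sum()
}
```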

Tuples

If CQL types T1, T2, ... map to Rust types RustT1, RustT2, ..., you can use them in tuples:

| CQL Type | Rust type |
|----------|-----------|
| TUPLE<T1, T2, ...> | (RustT1, RustT2, ...) |

Nulls

If a CQL value of type T, mapped to Rust type RustT, may be null (as is the case for every parameter and return type of a CALLED ON NULL INPUT UDF), the Rust function should use Option<RustT> for it.
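
As a sketch of how null handling might look in practice, the function below takes two nullable INT arguments and returns a nullable INT. The name add_or_null is hypothetical, and the #[export_udf] attribute is left as a comment so the snippet compiles without the scylla-udf crate.

```rust
// #[export_udf]  (left as a comment; uncomment in a real UDF crate)
fn add_or_null(a: Option<i32>, b: Option<i32>) -> Option<i32> {
    // `?` returns None (a CQL null) as soon as either argument is null.
    // checked_add also returns None on overflow instead of wrapping.
    a?.checked_add(b?)
}
```

Returning None on overflow is a design choice for this example only; a real UDF might prefer to saturate or widen the result type.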

Contributing

In general, try to follow the same rules as in https://github.com/scylladb/scylla-rust-driver/blob/main/CONTRIBUTING.md

Testing

This crate is meant to be compiled to the wasm32-wasip1 target and run in a Wasm runtime. Tests that use Wasm-specific code will most likely fail when executed any other way (in particular, with a plain cargo test command).

For example, if you have the wasmtime runtime installed and in PATH, you can use the following command to run tests:

CARGO_TARGET_WASM32_WASIP1_RUNNER="wasmtime --allow-unknown-exports" cargo test --target=wasm32-wasip1