4edbdd7 · Apr 24, 2024

DataFusion Examples

This crate includes several examples that show how to use various DataFusion APIs and help you get started.

Prerequisites:

Run git submodule update --init to initialize the test files.

Running Examples

To run the examples, use the cargo run command, such as:

git clone https://github.com/apache/datafusion
cd datafusion
# Download test data
git submodule update --init

# Run the `csv_sql` example:
# ... use the equivalent for other examples
cargo run --example csv_sql

Single Process

  • advanced_udaf.rs: Define and invoke a more complicated User Defined Aggregate Function (UDAF)
  • advanced_udf.rs: Define and invoke a more complicated User Defined Scalar Function (UDF)
  • advanced_udwf.rs: Define and invoke a more complicated User Defined Window Function (UDWF)
  • avro_sql.rs: Build and run a query plan from a SQL statement against a local Avro file
  • catalog.rs: Register a table in a custom catalog
  • csv_sql.rs: Build and run a query plan from a SQL statement against a local CSV file
  • csv_sql_streaming.rs: Build and run a streaming query plan from a SQL statement against a local CSV file
  • custom_datasource.rs: Run queries against a custom datasource (TableProvider)
  • dataframe-to-s3.rs: Run a query using a DataFrame against a Parquet file from S3 and write the results back to S3
  • dataframe.rs: Run a query using a DataFrame against a local Parquet file
  • dataframe_in_memory.rs: Run a query using a DataFrame against data in memory
  • dataframe_output.rs: Examples of methods which write data out from a DataFrame
  • deserialize_to_struct.rs: Convert query results into Rust structs using serde
  • expr_api.rs: Create, execute, simplify and analyze Exprs
  • flight_sql_server.rs: Run DataFusion as a standalone process and execute SQL queries from JDBC clients
  • function_factory.rs: Register CREATE FUNCTION handler to implement SQL macros
  • make_date.rs: Examples of using the make_date function
  • memtable.rs: Create and query data in memory using SQL and RecordBatches
  • parquet_sql.rs: Build and run a query plan from a SQL statement against a local Parquet file
  • parquet_sql_multiple_files.rs: Build and run a query plan from a SQL statement against multiple local Parquet files
  • pruning.rs: Use pruning to rule out files based on statistics
  • query-aws-s3.rs: Configure object_store and run a query against files stored in AWS S3
  • query-http-csv.rs: Configure object_store and run a query against files via HTTP
  • regexp.rs: Examples of using regular expression functions
  • rewrite_expr.rs: Define and invoke a custom Query Optimizer pass
  • simple_udaf.rs: Define and invoke a User Defined Aggregate Function (UDAF)
  • simple_udf.rs: Define and invoke a User Defined Scalar Function (UDF)
  • simple_udwf.rs: Define and invoke a User Defined Window Function (UDWF)
  • sql_dialect.rs: Example of implementing a custom SQL dialect on top of DFParser
  • to_char.rs: Examples of using the to_char function
  • to_timestamp.rs: Examples of using to_timestamp functions

Distributed