SteveLauC/arrow-datafusion

This branch is 2432 commits behind apache/datafusion:main.

DataFusion

Website | Guides | API Docs | Chat

DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in Rust, using the Apache Arrow in-memory format. Python bindings are also available. DataFusion offers SQL and DataFrame APIs, excellent performance, built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.

Here are links to some important information

What can you do with this crate?

DataFusion is great for building projects such as domain-specific query engines, new database platforms and data pipelines, query languages, and more. It lets you start quickly from a fully working engine and then customize the features specific to your use case. Click here to see a list of known users.

Contributing to DataFusion

Please see the contributor guide and communication pages for more information.

Crate features

This crate has several features which can be specified in your Cargo.toml.

Default features:

  • array_expressions: functions for working with arrays such as array_to_string
  • compression: reading files compressed with xz2, bzip2, flate2, and zstd
  • crypto_expressions: cryptographic functions such as md5 and sha256
  • datetime_expressions: date and time functions such as to_timestamp
  • encoding_expressions: encode and decode functions
  • parquet: support for reading the Apache Parquet format
  • regex_expressions: regular expression functions, such as regexp_match
  • unicode_expressions: Unicode-aware functions such as character_length

Optional features:

  • avro: support for reading the Apache Avro format
  • backtrace: include backtrace information in error messages
  • pyarrow: conversions between PyArrow and DataFusion types
  • serde: enable arrow-schema's serde feature
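
As a rough illustration of how the features listed above might be selected, here is a minimal Cargo.toml sketch. The version number is a placeholder assumption, not something stated in this README, so pin it to the release you actually use. The feature names are taken from the lists above; the particular combination chosen is only an example.

```toml
[dependencies]
# Assumption: "36" is a placeholder version; replace it with the release you depend on.
# Turning off default features drops everything in the "Default features" list,
# so each feature you still need is named explicitly.
datafusion = { version = "36", default-features = false, features = [
    "parquet",           # default feature, re-enabled explicitly
    "regex_expressions", # default feature, re-enabled explicitly
    "avro",              # optional feature, off unless requested
] }
```

If you only want to add an optional feature on top of the defaults, it should be enough to leave default-features unset and list just that feature, for example features = ["avro"].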

Rust Version Compatibility

The DataFusion crate is tested with the minimum required stable Rust version.
