I will work on adding a simple benchmark to start us off, with the aim of giving visibility into whether refactors improve or regress performance.
To make performance improvements more measurable, include benchmarks that can be run.
Requires benchmark programs (see https://github.com/apache/arrow-rs/tree/master/parquet/benches)
Also requires large data files, ideally covering all supported data types
Note that for the data files, completely random data may not be sufficient, as some encodings take advantage of patterns in the data (e.g. int v2 RLE), so keep that in mind if generating data for the benchmarks
Could also use something like TPC-H or TPC-DS data, or the NYC taxi dataset, for more variety in the data
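
To illustrate the point about random versus patterned data, here is a rough sketch of what generated benchmark inputs could look like. Everything in it is hypothetical and not taken from any existing crate: the `XorShift64` helper and the function names `random_values`, `repeated_runs`, `fixed_delta` and `count_runs` are made up for this example, and it uses only the standard library so it does not assume a particular RNG or benchmark dependency.

```rust
/// Small deterministic xorshift64 generator (hypothetical helper),
/// used so this sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Uniformly random values: very little for a run-length or
/// delta-based encoder to exploit.
fn random_values(n: usize, seed: u64) -> Vec<i64> {
    let mut rng = XorShift64::new(seed);
    (0..n).map(|_| rng.next_u64() as i64).collect()
}

/// Runs of repeated values, each run between 1 and `max_run` long.
fn repeated_runs(n: usize, max_run: usize, seed: u64) -> Vec<i64> {
    assert!(max_run > 0, "max_run must be at least 1");
    let mut rng = XorShift64::new(seed);
    let mut out = Vec::with_capacity(n);
    while out.len() < n {
        let value = (rng.next_u64() % 1000) as i64;
        let run = 1 + (rng.next_u64() as usize % max_run);
        // Truncate the last run so the output has exactly `n` values.
        let take = run.min(n - out.len());
        out.extend(std::iter::repeat(value).take(take));
    }
    out
}

/// Fixed-stride sequence, similar in shape to ids or regular timestamps.
fn fixed_delta(n: usize, start: i64, step: i64) -> Vec<i64> {
    (0..n as i64).map(|i| start + i * step).collect()
}

/// Number of maximal runs of equal adjacent values; a crude indicator
/// of how much repetition an encoder could take advantage of.
fn count_runs(values: &[i64]) -> usize {
    if values.is_empty() {
        return 0;
    }
    1 + values.windows(2).filter(|w| w[0] != w[1]).count()
}
```

Comparing `count_runs` on the output of `random_values` and `repeated_runs` gives a quick sanity check that the generated inputs really differ in structure. A benchmark suite would probably want to cover each shape separately, alongside real-world datasets.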