I tested on my local M1, and there seems to be no regression in q16 (where the page index prunes no data). I set `DATAFUSION_EXECUTION_PARQUET_ENABLE_PAGE_INDEX=true target/release/datafusion-cli`.
From `bytes_scanned` you can see that the second one scans more bytes (reasonable with the page index).
So IMO, since you ran the test on a Google Cloud machine, fetching bytes will have more latency there than on a local machine.
From the plan, I found one place that needs improvement: `ParquetExec: file_groups={10 groups: [[Users/yangjiang/tpch-parquet/partsup`, which has no filter pushdown, still reads the page index bytes 🤣
So I want to improve this and retest it in a cloud env. @alamb, is this reasonable? 🤔 PTAL
New edit: I found that even in the cloud env, your test data files are still on local disk 🤦, but I think this still needs improvement.
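To make the proposed improvement concrete, here is a minimal sketch of the idea, not DataFusion's actual code: only fetch the page index when it is enabled and there is a pushed-down predicate that could use it. `ScanOptions` and `should_load_page_index` are hypothetical names made up for illustration.

```rust
/// Hypothetical, simplified stand-in for the options of one Parquet scan.
/// These names are illustrative only and are not DataFusion types.
#[derive(Debug, Clone, Copy)]
struct ScanOptions {
    /// Mirrors the `enable_page_index` setting discussed above.
    enable_page_index: bool,
    /// True when a filter was pushed down to this scan.
    has_pushed_down_filter: bool,
}

/// Decide whether the page index bytes need to be read at all.
/// Without a pushed-down filter the page index cannot prune anything,
/// so reading it only adds to `bytes_scanned`.
fn should_load_page_index(opts: ScanOptions) -> bool {
    opts.enable_page_index && opts.has_pushed_down_filter
}
```

With a check like this, a scan without filter pushdown would skip the extra page index reads even when the setting is on.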
Sorry, now I see that I didn't notice the different name of the timings, because of the changed order (time_elapsed_processing vs time_elapsed_scanning_total). Now it makes sense
Originally posted by @Ted-Jiang in #5099 (comment)