
Attempt to subtract with overflow panic in max_distinct_count #9006

Closed
gruuya opened this issue Jan 26, 2024 · 0 comments · Fixed by #9007
Labels
bug Something isn't working

Comments


gruuya commented Jan 26, 2024

Describe the bug

There's an edge case in the present max_distinct_count algorithm whereby an exact (and potentially larger) total null count can be subtracted from an inexact, smaller total row count when estimating the number of distinct values:
https://github.com/apache/arrow-datafusion/blob/bee7136a04c60a2c06caa630cf1b72f32f7dc574/datafusion/physical-plan/src/joins/utils.rs#L957-L959

This leads to a panic with attempt to subtract with overflow.
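For reference, the failing pattern boils down to an unchecked usize subtraction, which panics on underflow in debug builds. A minimal sketch with illustrative values (not DataFusion's actual code):

```rust
fn main() {
    // Illustrative values: an inexact row count of 0 after a filter
    // refutes all rows, and a non-zero exact null count inherited
    // from the other join side.
    let num_rows: usize = 0;
    let null_count: usize = 3;

    // `num_rows - null_count` would panic in debug builds with
    // "attempt to subtract with overflow". A guarded subtraction
    // avoids the underflow:
    let distinct = num_rows.checked_sub(null_count).unwrap_or(0);
    println!("max distinct count estimate: {distinct}");
}
```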

To Reproduce

Extract the three Parquet files from files.zip needed for the repro. These were generated using DuckDB with SF=0.01 for the TPC-DS benchmark. The example below is a minimal repro for an issue observed with query 24 from that benchmark.

use std::sync::Arc;

use datafusion::common::FileType;
use datafusion::datasource::file_format::parquet::ParquetFormat;
use datafusion::datasource::listing::ListingOptions;
use datafusion::error::Result;
use datafusion::prelude::SessionContext;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();

    let file_format = ParquetFormat::default().with_enable_pruning(Some(true));
    let listing_options = ListingOptions::new(Arc::new(file_format))
        .with_file_extension(FileType::PARQUET.get_ext());

    ctx.register_listing_table(
        "store",
        "/path/to/store.parquet",
        listing_options.clone(),
        None,
        None,
    )
    .await?;
    ctx.register_listing_table(
        "store_sales",
        "/path/to/store_sales.parquet",
        listing_options.clone(),
        None,
        None,
    )
    .await?;
    ctx.register_listing_table(
        "customer",
        "/path/to/customer.parquet",
        listing_options,
        None,
        None,
    )
    .await?;

    let df = ctx
        .sql(
            "SELECT c_last_name,
       c_first_name,
       s_store_name,
       s_state
FROM store_sales,
     store,
     customer
WHERE ss_customer_sk = c_customer_sk
  AND ss_store_sk = s_store_sk
  AND s_market_id = 8",
        )
        .await?;

    // print the results
    df.show().await?;

    Ok(())
}

The above code panics with:

thread 'main' panicked at datafusion/physical-plan/src/joins/utils.rs:958:40:
attempt to subtract with overflow

Note that you can also get a repro with the CLI by prepending DATAFUSION_EXECUTION_COLLECT_STATISTICS=true DATAFUSION_EXECUTION_TARGET_PARTITIONS=1 to cargo run.

Expected behavior

The example shouldn't panic, but instead return an empty result.

Additional context

As for the question of how this situation even occurs in the first place, from my brief investigation I'm seeing that:

  1. The FilterExec for the predicate on the store table returns Inexact(0) as the number of rows in its output statistics, since the predicate refutes all the input rows (in the case of store above there's only a single row, whose s_market_id equals 2).
  2. When joining store and store_sales, the join cardinality estimate is 0 due to the above filtering, but the column statistics are nonetheless merged as-is (meaning an exact null count for the store_sales columns is inherited)
    https://github.com/apache/arrow-datafusion/blob/bee7136a04c60a2c06caa630cf1b72f32f7dc574/datafusion/physical-plan/src/joins/utils.rs#L848-L859
  3. Finally, during the merge with customer, the statistics from step 2 come into play, and when max_distinct_count is reached it hits the edge case of num_rows being Inexact(0) but stats.null_count being exact and greater than zero.
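To make step 3 concrete, here is a hypothetical mini-model of the precision-tracked statistics involved. The Precision variants loosely mirror DataFusion's, but max_distinct_count below is an illustrative sketch of a guarded version, not the actual implementation:

```rust
// Hypothetical mini-model of precision-tracked statistics (names are
// illustrative, not DataFusion's actual API).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

// Sketch of a guarded distinct-count estimate: rows minus nulls,
// saturating at zero so that Inexact(0) rows with an exact, non-zero
// null count cannot underflow.
fn max_distinct_count(num_rows: Precision, null_count: Precision) -> Precision {
    let (rows, rows_exact) = match num_rows {
        Precision::Exact(n) => (n, true),
        Precision::Inexact(n) => (n, false),
        Precision::Absent => return Precision::Absent,
    };
    let nulls = match null_count {
        Precision::Exact(n) | Precision::Inexact(n) => n,
        Precision::Absent => 0,
    };
    let distinct = rows.saturating_sub(nulls);
    if rows_exact {
        Precision::Exact(distinct)
    } else {
        Precision::Inexact(distinct)
    }
}

fn main() {
    // The step-3 edge case: rows Inexact(0), null count Exact(5)
    // no longer panics and degrades gracefully to Inexact(0).
    let d = max_distinct_count(Precision::Inexact(0), Precision::Exact(5));
    println!("{d:?}");
}
```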