Out of bounds error when inserting into MemTable with zero partitions #14010

Closed
tobixdev opened this issue Jan 5, 2025 · 0 comments · Fixed by #14011
Labels
bug Something isn't working

Describe the bug

Inserting into a MemTable with zero partitions causes an index out of bounds panic during execution.

Backtrace:

index out of bounds: the len is 0 but the index is 0
thread 'datasource::memory::tests::test_insert_into_zero_partition' panicked at datafusion/core/src/datasource/memory.rs:373:24:
index out of bounds: the len is 0 but the index is 0
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:74:14
   2: core::panicking::panic_bounds_check
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:276:5
   3: <usize as core::slice::index::SliceIndex<[T]>>::index_mut
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/slice/index.rs:307:14
   4: core::slice::index::<impl core::ops::index::IndexMut<I> for [T]>::index_mut
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/slice/index.rs:28:9
   5: <alloc::vec::Vec<T,A> as core::ops::index::IndexMut<I>>::index_mut
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/alloc/src/vec/mod.rs:2924:9
   6: <datafusion::datasource::memory::MemSink as datafusion_physical_plan::insert::DataSink>::write_all::{{closure}}
...
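
The frames above show a mutable `Vec` index panicking inside `MemSink::write_all`. As a rough illustration only, the sketch below reproduces the same class of failure with stand-in types. It assumes batches are spread round-robin over per-partition buffers; that assumption and the names `distribute_round_robin` and `distribution_panics` are hypothetical and not taken from the DataFusion source.

```rust
use std::panic;

/// Hypothetical stand-in for a sink that spreads incoming batches over
/// per-partition buffers in round-robin order. `i32` stands in for a batch.
fn distribute_round_robin(batches: Vec<i32>, num_partitions: usize) -> Vec<Vec<i32>> {
    // One buffer per partition; with zero partitions this Vec is empty.
    let mut buffers: Vec<Vec<i32>> = vec![Vec::new(); num_partitions];
    let mut i = 0;
    for batch in batches {
        // With zero partitions, indexing the empty Vec panics here:
        // "index out of bounds: the len is 0 but the index is 0".
        buffers[i].push(batch);
        i = (i + 1) % num_partitions;
    }
    buffers
}

/// Returns true if distributing the given batches panics.
fn distribution_panics(batches: Vec<i32>, num_partitions: usize) -> bool {
    panic::catch_unwind(move || distribute_round_robin(batches, num_partitions)).is_err()
}
```

In this sketch, the panic only occurs once at least one batch arrives; an empty input with zero partitions returns without touching the buffers.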

To Reproduce

The following test case expects a descriptive error message for this situation; at present it panics with the backtrace above instead:

    // Test inserting a batch into a MemTable without any partitions
    #[tokio::test]
    async fn test_insert_into_zero_partition() -> Result<()> {
        // Create a new schema with one field called "a" of type Int32
        let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int32, false)]));

        // Create a new batch of data to insert into the table
        let batch = RecordBatch::try_new(
            schema.clone(),
            vec![Arc::new(Int32Array::from(vec![1, 2, 3]))],
        )?;
        // Run the experiment and expect an error
        let experiment_result = experiment(schema, vec![], vec![vec![batch.clone()]])
            .await
            .unwrap_err();
        // Ensure that there is a descriptive error message
        assert_eq!(
            "Error during planning: Cannot insert into MemTable with zero partitions.",
            experiment_result.strip_backtrace()
        );
        Ok(())
    }

Expected behavior

I think there are multiple ways to address this.

  1. An error during planning with a descriptive message.
  2. An error during execution with a descriptive message.
  3. Automatically create a single partition during insertion.
  4. Do not allow creating MemTables without any partitions.

From my point of view, option 1 is the preferred solution because it still allows creating "empty MemTables that cannot become non-empty".
However, I am not really familiar with the code base, so opinions may vary.
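
To make option 1 concrete, here is a minimal sketch of a planning-time check using stand-in types. `MemTableSketch` and `plan_insert` are hypothetical names, `i32` stands in for a record batch, and the plain `String` error only mimics the message the test above expects; how the actual fix is wired into DataFusion is not shown here.

```rust
/// Hypothetical stand-in for a table holding in-memory partitions.
struct MemTableSketch {
    partitions: Vec<Vec<i32>>,
}

impl MemTableSketch {
    /// Hypothetical planning step: reject the insert before any execution
    /// happens, instead of panicking later while writing batches.
    /// On success, returns the number of partitions available for writing.
    fn plan_insert(&self) -> Result<usize, String> {
        if self.partitions.is_empty() {
            return Err(
                "Error during planning: Cannot insert into MemTable with zero partitions."
                    .to_string(),
            );
        }
        Ok(self.partitions.len())
    }
}
```

With a check like this, a zero-partition table can still be created and queried; only the insert is refused, and the caller gets an error value rather than a panic.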

Additional context

No response
