Error using IN list on dictionary encoded data: InList does not support datatype Dictionary(Int32, Utf8). #3936

Closed
alamb opened this issue Oct 24, 2022 · 4 comments · Fixed by #4070
Labels
bug Something isn't working

Comments

@alamb
Contributor

alamb commented Oct 24, 2022

Describe the bug
I am writing a query that selects values from a dictionary-encoded string column, trace_id, and it fails with an error.

select * from spans where trace_id IN ('187dcbbc68a83a0d', '335424324532');

Results in this error

This feature is not implemented: InList does not support datatype Dictionary(Int32, Utf8).

To Reproduce
Run this test:

diff --git a/datafusion/core/tests/sql/predicates.rs b/datafusion/core/tests/sql/predicates.rs
index 07e016a27..5109f0167 100644
--- a/datafusion/core/tests/sql/predicates.rs
+++ b/datafusion/core/tests/sql/predicates.rs
@@ -427,6 +427,32 @@ async fn csv_in_set_test() -> Result<()> {
     Ok(())
 }

+#[tokio::test]
+async fn in_set_string_dictionaries() -> Result<()> {
+    let input = vec![Some("foo"), Some("bar"), None, Some("fazzz")]
+        .into_iter()
+        .collect::<DictionaryArray<Int32Type>>();
+
+    let batch = RecordBatch::try_from_iter(vec![("c1", Arc::new(input) as _)]).unwrap();
+
+    let ctx = SessionContext::new();
+    ctx.register_batch("test", batch)?;
+
+    let sql = "SELECT * FROM test WHERE c1 IN ('foo', 'Bar', 'fazzz')";
+    let actual = execute_to_batches(&ctx, sql).await;
+    let expected = vec![
+        "+-------+",
+        "| c1    |",
+        "+-------+",
+        "| foo   |",
+        "| fazzz |",
+        "+-------+",
+    ];
+
+    assert_batches_eq!(expected, &actual);
+    Ok(())
+}
+

Results in

thread 'sql::predicates::in_set_string_dictionaries' panicked at 'called `Result::unwrap()` on an `Err` value: "ArrowError(ExternalError(NotImplemented(\"InList does not support datatype Dictionary(Int32, Utf8).\"))) at ...

Expected behavior
The test should pass: an IN list should be usable on a Dictionary(Int32, Utf8) column.
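
To make the expected semantics concrete, here is a minimal, std-only sketch of one way an IN list could be evaluated over dictionary-encoded strings. It is only an illustration, not DataFusion's or arrow-rs's implementation: `DictColumn`, `from_rows`, `in_list`, and `filter_in_list` are hypothetical names made up for this example. The assumed idea is to compare each distinct dictionary value against the list once and then look up the result per row by key, with NULL rows staying NULL and therefore being dropped by the WHERE clause.

```rust
/// Toy stand-in for a `Dictionary(Int32, Utf8)` column.
/// Hypothetical type for illustration, not the arrow-rs `DictionaryArray`.
struct DictColumn {
    /// Distinct string values, each stored once.
    values: Vec<String>,
    /// One entry per row: an index into `values`, or `None` for a NULL row.
    keys: Vec<Option<i32>>,
}

impl DictColumn {
    /// Build a column from optional strings, reusing the key of a repeated value.
    fn from_rows(rows: &[Option<&str>]) -> Self {
        let mut values: Vec<String> = Vec::new();
        let mut keys: Vec<Option<i32>> = Vec::with_capacity(rows.len());
        for &row in rows {
            let key = match row {
                None => None,
                Some(s) => {
                    let pos = match values.iter().position(|v| v == s) {
                        Some(p) => p,
                        None => {
                            values.push(s.to_string());
                            values.len() - 1
                        }
                    };
                    Some(pos as i32)
                }
            };
            keys.push(key);
        }
        DictColumn { values, keys }
    }

    /// Evaluate `col IN (list)` for a list without NULLs.
    /// Non-NULL rows yield `Some(true)` or `Some(false)`; NULL rows yield `None`.
    fn in_list(&self, list: &[&str]) -> Vec<Option<bool>> {
        // Compare every distinct dictionary value against the list once...
        let hits: Vec<bool> = self
            .values
            .iter()
            .map(|v| list.contains(&v.as_str()))
            .collect();
        // ...so that each row only needs a lookup by key.
        self.keys.iter().map(|k| k.map(|i| hits[i as usize])).collect()
    }

    /// Rows kept by `WHERE col IN (list)`: only rows where the predicate is true.
    fn filter_in_list(&self, list: &[&str]) -> Vec<&str> {
        let mask = self.in_list(list);
        mask.iter()
            .zip(&self.keys)
            .filter_map(|(hit, key)| match (hit, key) {
                (Some(true), Some(k)) => Some(self.values[*k as usize].as_str()),
                _ => None,
            })
            .collect()
    }
}
```

With the rows from the test (`foo`, `bar`, NULL, `fazzz`), calling `filter_in_list(&["foo", "Bar", "fazzz"])` on this sketch keeps `foo` and `fazzz`, which is the table the test expects. Matching is exact and case-sensitive, so `Bar` does not match `bar` and a literal such as `fazz` would not match `fazzz`.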

Additional context
Found this in IOx as part of https://github.com/influxdata/influxdb_iox/issues/5959

@alamb alamb added the bug label Oct 24, 2022
@alamb
Contributor Author

alamb commented Oct 24, 2022

I plan to work on this later in the week if no one beats me to it

@alamb alamb changed the title Error using IN list on dictionary encoded data Error using IN list on dictionary encoded data: InList does not support datatype Dictionary(Int32, Utf8). Oct 24, 2022
@jackwener
Member

related issue: #3766

@alamb
Contributor Author

alamb commented Oct 24, 2022

@NGA-TRAN says she plans to work on this

@NGA-TRAN
Contributor

I am actively working on this

tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Nov 1, 2022

tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Nov 3, 2022
tustvold added a commit that referenced this issue Nov 3, 2022
* Support dictionary in InList (#3936)

* Update datafusion-cli
Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this issue Nov 5, 2022
* Support dictionary in InList (apache#3936)

* Update datafusion-cli