array_contains returning unexpected values with column parameter #6972

Closed
maxburke opened this issue Jul 15, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@maxburke
Contributor

Describe the bug

The array_contains code appears to over-flatten its input lists, which produces incorrect results when one of the arguments is a List-typed column.

To Reproduce

I've attached a parquet table containing a column with type List(String).

When I use array_contains on this data, I get this result set:

❯ create external table t0 stored as parquet location '/Users/max/tmp/array_contains.parquet';
0 rows in set. Query took 0.017 seconds.
❯ select bid_node_ids from t0 where array_contains(bid_node_ids, ['z+CPVybgUuCXlAE3A3jqyg==']);
+----------------------------+
| bid_node_ids               |
+----------------------------+
| [okwzcOFM3yjUzNFbc/BYBQ==] |
| [DbNysJTF560NzR/HLbAa/Q==] |
| [ivO3+Z+WMRqwhivy85d6KA==] |
+----------------------------+
3 rows in set. Query took 0.076 seconds.
❯

Note that none of the resulting bid_node_ids values contain the queried-for value of z+CPVybgUuCXlAE3A3jqyg==

array_contains.parquet.zip

Expected behavior

I expected 861 matching rows in the result set, each of which contains the value z+CPVybgUuCXlAE3A3jqyg==

❯ select bid_node_ids from t0 where array_contains(bid_node_ids, ['z+CPVybgUuCXlAE3A3jqyg==']);
+--------------------------------------------------------------------------------+
| bid_node_ids                                                                   |
+--------------------------------------------------------------------------------+
| [wFEkOS2AFYxekv7SzPrkiQ==, z+CPVybgUuCXlAE3A3jqyg==]                           |
[....snip...]
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [z+CPVybgUuCXlAE3A3jqyg==]                                                     |
| [iTd7HyShRr0PqSKyqKT0+A==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [O3GAOhhCbfxgXcZEwLI7aQ==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==]                           |
| [edSh3ZpG53UB+JMV875ipg==, z+CPVybgUuCXlAE3A3jqyg==]                           |
+--------------------------------------------------------------------------------+
861 rows in set. Query took 1.069 seconds.
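
The expected per-row behavior can be sketched in Python. This is only an illustration of the semantics described above, not DataFusion's implementation: the function names `array_has_all_per_row` and `filter_rows` are hypothetical, and the three sample rows are made up for the example (the values are borrowed from the output above; the attached parquet file is not read).

```python
from typing import List, Sequence


def array_has_all_per_row(column: Sequence[List[str]], needles: List[str]) -> List[bool]:
    """Return one boolean per row: True when that row's list holds every needle.

    Hypothetical helper: each row is checked on its own, so the column is
    never flattened into a single list.
    """
    return [all(needle in row for needle in needles) for row in column]


def filter_rows(column: Sequence[List[str]], needles: List[str]) -> List[List[str]]:
    """Keep only the rows whose list contains all of the needles."""
    mask = array_has_all_per_row(column, needles)
    return [row for row, keep in zip(column, mask) if keep]


# Made-up sample column of List(String) values (an assumption for illustration).
column = [
    ["wFEkOS2AFYxekv7SzPrkiQ==", "z+CPVybgUuCXlAE3A3jqyg=="],
    ["okwzcOFM3yjUzNFbc/BYBQ=="],
    ["z+CPVybgUuCXlAE3A3jqyg=="],
]
needles = ["z+CPVybgUuCXlAE3A3jqyg=="]

mask = array_has_all_per_row(column, needles)
matches = filter_rows(column, needles)
print(mask)
print(matches)
```

Under these semantics, a row such as [okwzcOFM3yjUzNFbc/BYBQ==] is never returned, because the check is made against that row's own list rather than against values gathered from other rows.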

Additional context

I've hacked together a change on our branch that gives us the results we expect: urbanlogiq@a381f10. However, I'm not sure whether this fix matches what the original author intended.

@maxburke maxburke added the bug Something isn't working label Jul 15, 2023
@izveigor
Contributor

izveigor commented Jul 15, 2023

Thanks for the report, @maxburke.
At the moment, this experimental function only works with scalars, and its purpose will change in the future (see #6855).
For a list of the fully completed array functions, see #6804.
P.S. It will be replaced by the array_has_all function (see ticket #6973).

@Weijun-H
Member

Weijun-H commented Aug 4, 2023

This issue can be closed now that the array_has_all task has been completed.

@nevi-me
Contributor

nevi-me commented Jan 12, 2024

We've confirmed that array_has_all meets our needs. We can close this.

@nevi-me nevi-me closed this as completed Jan 12, 2024