Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SELECT ... IN does not work on indexed fields that are arrays #281

Closed
cbrake opened this issue Oct 22, 2020 · 1 comment · Fixed by #324
Closed

SELECT ... IN does not work on indexed fields that are arrays #281

cbrake opened this issue Oct 22, 2020 · 1 comment · Fixed by #324
Assignees
Labels
bug Something isn't working sql SQL API and syntax
Milestone

Comments

@cbrake
Copy link

cbrake commented Oct 22, 2020

The SELECT WHERE x IN y works fine with non indexed fields, but as soon as I change to an indexed field, it fails with: IN operator takes an array.

I'm using the latest master branch from today (779c7a6).

Test case:

genji> create table nodes;
genji> insert into nodes VALUES {name:'node1', parents:['n1', 'n2', 'n3']};
genji> select * from nodes;
{
  "name": "node1",
  "parents": [
    "n1",
    "n2",
    "n3"
  ]
}
genji> select * from nodes where 'n1' in parents;
{
  "name": "node1",
  "parents": [
    "n1",
    "n2",
    "n3"
  ]
}
genji> create index nodes_parents_index on nodes(parents);
genji> reindex;
genji> select * from nodes where 'n1' in parents;
IN operator takes an array
genji> 

Background on this -- I'm trying to build a graph of data:

simpleiot/simpleiot#91

Where each node would have an array of parent ids. So I would like to index the parents field to be able to quickly find all children of a node. I'd rather not store a children array in the parent, as it could be quite large.

Open to better ways to do this -- perhaps I should have another table that defines relationships or something like that as then I could traverse the graph both ways. The following project calls them edges:

https://entgo.io/docs/schema-edges/

Thinking through this, that probably makes sense and would be most flexible long term, as then the edge type could hold useful information.

@yaziine yaziine added bug Something isn't working sql SQL API and syntax labels Oct 22, 2020
@tie
Copy link
Contributor

tie commented Oct 22, 2020

Thanks for the report! When IN operator iterates over index, it expects an array of values on the right-hand side to compare against. Looks like a bug in sql/planner since we shouldn’t be using index in this case—we can’t index array elements because currently Genji does not support GIN or GiST-like indexing algorithms (see #21).

genji> CREATE TABLE t;
genji> CREATE INDEX f_idx ON t(f);
genji> INSERT INTO t(f) VALUES ('v1'), ('v2');
genji> EXPLAIN SELECT * FROM t WHERE f IN ('v1', 'v2', '…');
{
  "plan": "Index(f_idx) -> ∏(*)"
}
genji> -- We should only use index for left-hand side value;
genji> -- The following shouldn’t use index (it does now);
genji> EXPLAIN SELECT * FROM t WHERE 'v1' IN f;
{
  "plan": "Index(f_idx) -> ∏(*)"
}

That said, you are right, it’s generally better to represent a graph as two tables of nodes (vertices) and edges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working sql SQL API and syntax
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants