Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AlchemyFDW: Clause pushdown not working with OR boolop #8

Open
LanceRadioactive opened this issue May 12, 2022 · 9 comments
Open

AlchemyFDW: Clause pushdown not working with OR boolop #8

LanceRadioactive opened this issue May 12, 2022 · 9 comments

Comments

@LanceRadioactive
Copy link

Another issue in AlchemyFDW: using OR in Where clauses involving foreign tables causes Multicorn to respond with the following warning:

unsupported expression for extractClauseFrom
along with refusing to perform pushdown and instead performing a scan over the entire table.

It seems the error happens explicitly with the boolop OR regardless of any operators under it. AND and other boolean operators are supported and pushdown as normal.

@luss
Copy link
Contributor

luss commented May 16, 2022

Can you provide a simple test that shows it on a simple table? I think I know where in the code this is likely happening.

@LanceRadioactive
Copy link
Author

LanceRadioactive commented May 18, 2022

Apologies for the late response again! Here are the reproduction steps for this:

  1. Refer to AlchemyFDW: 'Malformed array literal' error on retrieving data from foreign Postgres array columns #7 for the basic setup; create the second database, install Multicorn on the first one.
  2. At postgres_physical:
create table pushdown_example (id1 int, id2 int);
insert into pushdown_example values (0, 0)
  1. At postgres:
create foreign table pushdown_test
(id1 int, id2 int)
server alchemy_srv
options (db_url 'postgresql://postgres:postgres@localhost/postgres_physical', schema 'public', tablename 'pushdown_example')

explain verbose 
select * from pushdown_test pt 
where pt.id1 = 0 or pt.id2 = 0

creates the following output as well as 4 unsupported expression for extractClauseFrom warnings.

изображение

Removing pt.id2 = 0 or replacing OR with AND proves it otherwise works properly:
изображение

@luss
Copy link
Contributor

luss commented Jun 15, 2022

I added a blurb to the projects main README saying pg14 support is still experimental.

@ShaheedHaque
Copy link
Collaborator

I think I see the same thing. @luss, I presume the code at

default:
needs a case T_BoolOpExpr or similar?

@luss
Copy link
Contributor

luss commented May 11, 2024 via email

@ShaheedHaque
Copy link
Collaborator

ShaheedHaque commented May 13, 2024

So, here is what I have deduced:

  1. As per the OP, and some tests I performed, a T_BoolOpExpr occurs when the where clause qualifier has the form of an OR, or AND NOT (and, I suspect, anything other than an AND) [1].
  2. The analogous code in Postgres' contrib.postgres_fdw implmentation has - not surprisingly - a recursive expansion of the qualifier where we have code for 3 specific cases (equality, test for NULL and IN).

Now, even putting aside writing code to cover this case, the question is whether Multicorn should aspire, in principle, to support a wider set of qualifier expressions, starting with a recursive walk in the T_BoolOpExpr case?

The alternative is to say that (as I understand it) Postgres does not assume pushdown, and must therefore do its own qualifier checking, and few - if any - plausible use cases for Multicorn would need such a general approach, and so, simply make unwanted qualifiers a no-op and silence the messages while gradually adding other wanted qualifiers [2].

[1] Example of BOOLEXPR query:

('SELECT "jobs"."queue" FROM "jobs" WHERE ("jobs"."queue" = %s OR "jobs"."state" = %s)', (param1, param2))

which emits:

WARNING:  unsupported expression for extractClauseFrom
DETAIL:  {
    BOOLEXPR :boolop or :args ({
        OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ({
            RELABELTYPE :arg {
                VAR :varno 1 :varattno 2 :vartype 1043 :vartypmod 44 :varcollid 100 :varlevelsup 0 :varnosyn 1 :varattnosyn 2 :location 41}
               :resulttype 25 :resulttypmod -1 :resultcollid 100 :relabelformat 2 :location -1
            } {
               CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 58 :constvalue 5 [ 20 0 0 0 120 ]
        }) :location 56
    } {
        OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ({
            RELABELTYPE :arg {
                VAR :varno 1 :varattno 6 :vartype 1043 :vartypmod 24 :varcollid 100 :varlevelsup 0 :varnosyn 1 :varattnosyn 6 :location 65} :resulttype 25 :resulttypmod -1 :resultcollid 100 :relabelformat 2 :location -1
        } {
            CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 82 :constvalue 5 [ 20 0 0 0 120 ]
        }) :location 80
    }) :location -1
}

[2] Example of other currently unsupported qualifier:

('SELECT "queues"."vhost" FROM "queues" WHERE (NOT (NOT "queues"."durable") AND "queues"."state" = %s)', (p1,))

which emits:

WARNING:  unsupported expression for extractClauseFrom
DETAIL:  {VAR :varno 1 :varattno 5 :vartype 16 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnosyn 1 :varattnosyn 5 :location 150}

@mfenniak
Copy link
Collaborator

I think it is smart to be skeptical about supporting ORs, as it really complicates building a simple FDW. My own dynamodb_fdw based onto a system that doesn't support OR queries, so it would have to fail, ignore (do a full table scan and let PG handle it), or split the ORs into multiple conditions and make multiple independent internal operations.

That last one is an interesting idea that could be adapted into multicorn... I'd be imagining a solution like this, when an OR is identified by multicorn:

  • An unaltered FDW as exists today would have execute invoked multiple times for each branch; gaining the advantages of pushdown operators in this case, but not increasing the FDW complexity
  • An FDW could opt-in to a more advanced API (execute_advanced?) which gets a tree of quals, rather than a list, if it wants to become more complex

@ShaheedHaque
Copy link
Collaborator

To counter my own argument, I suppose an FDW which connects to a database (or an ORM), as in the case of the OP, is the exception where pushdown for OR and other complex "quals" would actually make sense.

@luss
Copy link
Contributor

luss commented May 23, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants