Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2539] [Bug] CTE SQL Injection in Postgres #7521

Closed
2 tasks done
LouisAuneau opened this issue May 5, 2023 · 5 comments
Closed
2 tasks done

[CT-2539] [Bug] CTE SQL Injection in Postgres #7521

LouisAuneau opened this issue May 5, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@LouisAuneau
Copy link

LouisAuneau commented May 5, 2023

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

We run models using dagster hosted in kubernetes (each run is therefore running in a fresh identical pod). Those model perform transformations on PostgreSQL tables.
Since upgrading to PostgreSQL 14 (from 11), our model runs fails with a Database Error around 10 to 20% of the time. I wasn't able to draw any correlation between the data that is transformed and the fails (our data mostly changes at daytime and the fails occur day and night).

The model structure looks like this:

WITH current_model_subquery AS (
    SELECT ... FROM {{ ref('upstream_ephemeral_model') }}
)

SELECT ... FROM current_model_subquery;

The error seems to be related to the SQL injection and CTE. Indeed, when succeeding, the compiled SQL looks like this:

WITH __dbt__cte__upstream_ephemeral_model( 
         ...
     ),
     current_model_subquery AS (
         SELECT ... FROM __dbt__cte__upstream_ephemeral_model
     )

SELECT ... FROM current_model_subquery;

When failing, the compiled SQL looks like this:

WITH __dbt__cte__upstream_ephemeral_modell( 
    ...
)
WITH current_model_subquery AS (
    SELECT ... FROM __dbt__cte__upstream_ephemeral_model
)
SELECT ... FROM current_model_subquery;

And therefore I get a Postgres error: syntax error at or near \"WITH\"\n.

To be very honest the only way I could reproduce was just to repeat the dbt run until it fails with a dbt clean in between each run.

Expected Behavior

Always generate the main model CTE with the upstream ephemeral CTE.

Steps To Reproduce

  1. Using a PostgreSQL 14 and dbt 1.2.5,
  2. With an upstream ephemeral model and a main model having a CTE referencing the upstream model,
  3. Run dbt run

Relevant log output

Environment

- OS: Debian GNU/Linux 11 (bullseye)
- Python: 3.8.16
- dbt: 1.2.5

Which database adapter are you using with dbt?

postgres

Additional Context

No response

@LouisAuneau LouisAuneau added bug Something isn't working triage labels May 5, 2023
@github-actions github-actions bot changed the title [Bug] CTE SQL Injection in Postgres [CT-2539] [Bug] CTE SQL Injection in Postgres May 5, 2023
@dbeatty10
Copy link
Contributor

Thanks for reporting this @LouisAuneau !

On first blush, the double WITH looks similar to #7396 (comment).

Which version of sqlparse do you have installed? Is it sqlparse==0.4.4, by any chance?

@LouisAuneau
Copy link
Author

Thanks for you quick reply. Exaclty, sqlparse 0.4.4 is installed. Should I downgrade to 0.4.3 as suggested in mentioned issue ?

@dbeatty10
Copy link
Contributor

Sure thing @LouisAuneau 👍

Yep, could you downgrade to sqlparse==0.4.3 and let us how that goes?

@LouisAuneau
Copy link
Author

Looks it solved our issue! Thanks a lot for your support !

@dbeatty10
Copy link
Contributor

You're welcome @LouisAuneau !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants