You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I has also been reported in the dbt-sqlserver repo here but this seems to be actually due to the different behaviour of data tests compared to schema tests in dbt.
Steps To Reproduce
Configure dbt to run with dbt-sqlserver
Create a test in the folder tests with a CTE (example: WITH src AS (SELECT COUNT(*) AS cnt FROM {{ ref('my table') }}) ) SELECT * FROM src )
Run dbt test --data
A database error is raised (see logs below)
Expected behavior
The test should run without any SQL error
Screenshots and log output
From dbt.log
020-07-02 08:39:12,208505 (Thread-1): On test.qwdv.cte: select count(*) as errors from (
WITH src AS (
<my_select_query>
)
SELECT * FROM src
) sbq
2020-07-02 08:39:12,213714 (Thread-1): Database error: ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near the keyword 'WITH'. (156) (SQLExecDirectW)")
2020-07-02 08:39:12,213930 (Thread-1): On test.qwdv.cte: ROLLBACK
System information
Which database are you using dbt with?
postgres
redshift
bigquery
snowflake
other (specify: sqlserver)
The output of dbt --version:
installed version: 0.15.2
latest version: 0.17.0
The operating system you're using:
Mac OS
The output of python --version:
Python 3.8.3
Additional context
Is there any reason why the data tests are wrapped around a SELECT COUNT(*) and could the behavior be changed to the same as schema tests?
The text was updated successfully, but these errors were encountered:
From some brief searching, it appears that SQLServer doesn't support nested with statements, i.e. CTEs within CTEs. (Spark and Presto don't, either.) This would mean that, in addition to data tests that have CTEs, SQLServer isn't happy about ephemeral models that have CTEs—is that right?
In any case, it sounds like we need to enable different implementations on different databases. We should move this SQL, currently defined as a python string (!), into a Jinja adapter macro that can be implemented differently on different plugins. We should do the same with built-in schema tests (see #2415).
In dbt-sqlserver, which doesn't support nested with in CTEs or subqueries, I could imagine needing to implement this very differently. Here's some pseudo-code:
Rationalizing and reconciling schema + data tests is on the 1.0.0 roadmap. Here's the trade-off we're facing:
It is beneficial to have users encode all logic (including select count(*)) directly in the test definition, as they do for schema tests today, because we avoid the syntactical differences that crop up on different databases (= this very issue).
It is also beneficial to have tests return a set of records (select *), as they do for data tests today, so that we can develop tooling to store those test results in the database (see Storing test results in the database #2593).
I'm leaning toward the latter. The adapter macro above would be a prerequisite for that change as well.
Also, we might take this opportunity to change the name of test.injected_sql.
Describe the bug
Using CTEs in data tests while using the dbt-sqlserver adapter raises SQL errors. This is due to the fact that data tests are wrapped around a
SELECT COUNT(*)
https://github.com/fishtown-analytics/dbt/blob/3af8a22d1319431115e7394f1235294613d36b35/core/dbt/task/test.py#L35 and SQL Server doesn't allow the CTEs inside other requests (it needs to start with WITH).I has also been reported in the dbt-sqlserver repo here but this seems to be actually due to the different behaviour of data tests compared to schema tests in dbt.
Steps To Reproduce
WITH src AS (SELECT COUNT(*) AS cnt FROM {{ ref('my table') }}) ) SELECT * FROM src
)Expected behavior
The test should run without any SQL error
Screenshots and log output
From dbt.log
System information
Which database are you using dbt with?
The output of
dbt --version
:The operating system you're using:
Mac OS
The output of
python --version
:Python 3.8.3
Additional context
Is there any reason why the data tests are wrapped around a
SELECT COUNT(*)
and could the behavior be changed to the same as schema tests?The text was updated successfully, but these errors were encountered: