This repository has been archived by the owner on May 29, 2021. It is now read-only.

Seed files throw pyodbc error HY000 #6

Closed
curious-bryan opened this issue Aug 12, 2019 · 6 comments

@curious-bryan
Contributor

Summary

I ran into error HY000 from pyodbc after running dbt seed while trying to recreate the jaffle_shop tutorial for dbt at https://github.com/fishtown-analytics/jaffle_shop

What happened:

  1. Cloned repository above
  2. Created jaffle_shop profile in ~/.dbt/profiles.yml using
    • type: mssql
    • driver: 'ODBC Driver 17 for SQL Server'
    • ...
  3. Changed into jaffle_shop directory
  4. Ensured the profile was set up correctly with dbt debug <- ALL GREEN SUCCESS OKs
  5. Ran dbt seed <- errors thrown for all three CSVs in the jaffle_shop data directory. See below.

What I expected to happen:

I expected the 3 CSVs in /path/to/jaffle_shop/data/ to be loaded into tables on my database.

Errors

Running with dbt=0.14.0
Found 8 models, 20 tests, 0 snapshots, 0 analyses, 113 macros, 0 operations, 3 seed files, 0 sources

11:18:05 | Concurrency: 1 threads (target='dev')
11:18:05 |
11:18:05 | 1 of 3 START seed file dbt_dev.raw_customers..................... [RUN]
11:18:05 | 1 of 3 ERROR loading seed file dbt_dev.raw_customers............. [ERROR in 0.14s]
11:18:05 | 2 of 3 START seed file dbt_dev.raw_orders........................ [RUN]
11:18:05 | 2 of 3 ERROR loading seed file dbt_dev.raw_orders................ [ERROR in 0.02s]
11:18:05 | 3 of 3 START seed file dbt_dev.raw_payments...................... [RUN]
11:18:05 | 3 of 3 ERROR loading seed file dbt_dev.raw_payments.............. [ERROR in 0.02s]
11:18:05 |
11:18:05 | Finished running 3 seeds in 1.45s.

Completed with 3 errors:

Database Error in seed raw_customers (data\raw_customers.csv)
  ('The SQL contains 0 parameter markers, but 300 parameters were supplied', 'HY000')

Database Error in seed raw_orders (data\raw_orders.csv)
  ('The SQL contains 0 parameter markers, but 396 parameters were supplied', 'HY000')

Database Error in seed raw_payments (data\raw_payments.csv)
  ('The SQL contains 0 parameter markers, but 452 parameters were supplied', 'HY000')

Done. PASS=0 ERROR=3 SKIP=0 TOTAL=3

Environment Details

  • local OS: Windows 10
  • remote OS: Windows Server 2016
  • remote MSSQL: MS SQL Server 2017
  • dbt version: 0.14.0
  • let me know if you need more details (Python version, pyodbc version, etc.)
@jacobm001
Owner

jacobm001 commented Aug 12, 2019

Someone in my office had a similar issue, which was referenced in the latest release, 1.0.3. Please upgrade via pip and report back if that resolves your issue.

@curious-bryan
Contributor Author

@jacobm001, this issue happened with the latest release (1.0.3).

I also ran pip install dbt-mssql --upgrade just in case, then followed the steps above, and encountered the same HY000 error from pyodbc after running dbt seed.

I also noticed that my profile with type: mssql does not pass dbt-integration-tests; dbt seed fails there too. See the errors below.

Errors from dbt-integration-tests

Test direct copying of source tables    # features/001_basic_materializations.feature
  Test materialized='view' -- @1.1   ......F
--------------------------------------------------------------------------------
FAILURE in step 'I successfully run "dbt seed"' (features/001_basic_materializations.feature:45):
Assertion Failed: Running with dbt=0.14.0
Found 1 models, 1 tests, 0 snapshots, 0 analyses, 225 macros, 0 operations, 1 seed files, 0 sources

14:08:33 | Concurrency: 1 threads (target='dev')
14:08:33 |
14:08:33 | 1 of 1 START seed file dbt_dev.seed.............................. [RUN]
14:08:33 | 1 of 1 ERROR loading seed file dbt_dev.seed...................... [ERROR in 0.11s]
14:08:33 |
14:08:33 | Finished running 1 seeds in 0.89s.

Completed with 1 errors:

Database Error in seed seed (data\seed.csv)
  ('The SQL contains 0 parameter markers, but 30 parameters were supplied', 'HY000')

Done. PASS=0 ERROR=1 SKIP=0 TOTAL=1

Expected: <0>
     but: was <1>

--------------------------------------------------------------------------------


Failing scenarios:
  features/001_basic_materializations.feature:51  Test materialized='view' -- @1.1

1 feature passed, 1 failed, 0 skipped, 1 untested
0 scenarios passed, 1 failed, 0 skipped, 12 untested
6 steps passed, 1 failed, 2 skipped, 0 undefined, 139 untested
Took 0m14.121s

@jacobm001
Owner

It looks like I ran into a build tool error and ended up committing a breaking change without realizing it... My apologies.

I am currently traveling for work, but will try to resolve these issues this weekend.

@curious-bryan
Contributor Author

@jacobm001 Roger that. Looking forward to it. In the meantime, I have been looking into adapters.sql for dbt-mssql (1.0.3), and I was wondering if you would be interested in collaborating on dbt-mssql. No worries if not.

@curious-bryan
Contributor Author

@jacobm001, heads up! When you get back to it this weekend: for Line 1 in dbt-mssql/dbt/include/mssql/macros/materializations/seed/seed.sql, I think there might be a typo in the macro name.

Currently (in both 1.0.3 and 1.0.2), the macro name is typed as basic_load_csv_rows(...). From the documentation, it appears the name might need to be changed to mssql__load_csv_rows(...) so that when adapter_macro(...) is called from the macro load_csv_rows in seed.sql in the core dbt package (see Lines 10-12 here), the macro resolves to dbt-mssql's code for loading CSV rows.

If so, I think the macro default__load_csv_rows is being called instead of mssql__load_csv_rows. And default__load_csv_rows uses %s as the parameter marker instead of ?, which may be throwing the pyodbc error.

I apologize; I haven't had a chance to test locally. However, if that change does need to be made, then Lines 36 through 39 in dbt-mssql/.../seed.sql may need to be reconsidered to see whether they are still needed.
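If it helps, the parameter-marker mismatch is easy to reproduce outside dbt. The sketch below is a standalone simulation, not pyodbc's actual code (the real check happens inside pyodbc / the ODBC driver, and check_parameter_markers is my own stand-in): pyodbc only recognizes ? as a parameter marker, so SQL built with psycopg2-style %s markers appears to contain 0 markers while one value per CSV cell is supplied.

```python
# Standalone simulation of the HY000 mismatch (check_parameter_markers is
# my own stand-in; the real check happens inside pyodbc / the ODBC driver).
def check_parameter_markers(sql: str, params: tuple) -> None:
    """Raise the pyodbc-style error when '?' markers don't match params."""
    markers = sql.count("?")
    if markers != len(params):
        raise ValueError(
            f"('The SQL contains {markers} parameter markers, "
            f"but {len(params)} parameters were supplied', 'HY000')"
        )

row = (1, "Michael", "2018-01-01")

# Roughly what default__load_csv_rows generates (psycopg2-style markers):
bad_sql = "insert into dbt_dev.raw_customers values (%s, %s, %s)"
# What an mssql__load_csv_rows override should generate for pyodbc:
good_sql = "insert into dbt_dev.raw_customers values (?, ?, ?)"

check_parameter_markers(good_sql, row)  # fine: 3 markers, 3 params
try:
    check_parameter_markers(bad_sql, row)
except ValueError as exc:
    print(exc)  # 0 markers, 3 params -> the error seen above
```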

jacobm001 added a commit that referenced this issue Aug 18, 2019
This issue was caused by the `package_data` variable being set up incorrectly in `setup.py`. Apparently the file listing is **not** recursive, which caused the `sdist` files to lack several macro overrides needed to function.
@jacobm001
Owner

I have confirmed that this issue is due to my tooling. While the 1.0.3 release on GitHub works fine, the setup.py file didn't correctly grab all the needed macros. That has now been fixed in the branch issue-6. I don't have my test harness configured on this machine, but I will run that branch through it on Monday, and if it passes, I'll merge it into master and do a 1.0.4 release on PyPI.
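For context, a sketch of the kind of fix this implies (the helper name and directory layout are illustrative, not the actual setup.py): setuptools `package_data` patterns are not applied recursively, so a pattern like `include/mssql/macros/*` misses files in nested folders such as `macros/materializations/seed/`. Enumerating the nested files explicitly sidesteps that.

```python
import os

def list_package_files(base_dir: str) -> list:
    """Recursively collect file paths under base_dir, relative to it."""
    paths = []
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            paths.append(os.path.relpath(os.path.join(root, name), base_dir))
    return sorted(paths)

# In setup.py, such a list could then feed package_data, e.g.:
# setup(..., package_data={"dbt": list_package_files("dbt/include")})
```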

jacobm001 added a commit that referenced this issue Aug 19, 2019
ThomasLaPiana pushed a commit to degreed/dbt-azuresa that referenced this issue Feb 6, 2020
* Fixed issue jacobm001#6.

This issue was caused by the `package_data` variable being set up incorrectly in `setup.py`. Apparently the file listing is **not** recursive, which caused the `sdist` files to lack several macro overrides needed to function.

* Added a note about the ODBC driver

As noted in issue jacobm001#5, the use of the `driver` variable was not particularly clear. I've added some info to try and explain that configuration better.

* altered mssql__create_view_as macros

Looks like this issue is being caused by the `mssql__create_view_as` macro. It works fine if the sql it's provided does not contain a CTE. If it does, sql server considers it a syntax error.

This commit removes the parentheses wrapping the `{{ sql }}` portion.
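To illustrate the change (a Python stand-in for the Jinja macro; the function names are mine, not the adapter's): SQL Server rejects `CREATE VIEW ... AS ( WITH ... )` because the wrapping parentheses turn the CTE into a parenthesized subquery, which is a syntax error, so the fix emits the model SQL unwrapped.

```python
def create_view_as_wrapped(relation: str, sql: str) -> str:
    """Pre-fix shape: parentheses around the model SQL break CTEs."""
    return f"create view {relation} as (\n{sql}\n)"

def create_view_as_bare(relation: str, sql: str) -> str:
    """Post-fix shape: no wrapping parentheses, so a leading WITH is legal."""
    return f"create view {relation} as\n{sql}"

model_sql = "with recent as (select * from orders)\nselect * from recent"
print(create_view_as_bare("dbo.recent_orders", model_sql))
```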

* Fixed CTEs with insert into statements

SQL Server's `insert into` syntax isn't nearly as forgiving as in other databases. In the previous version I had created a CTE as part of the insert statement that could later be referenced.

This worked so long as your source model didn't contain a CTE of its own. If it did, that put a CTE declaration inside another CTE, which broke everything. I've taken a cue from the dbt-sqlserver package and am now creating a "temporary" view to handle the issue.

Thanks @mikaelene for the example.
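A rough sketch of that pattern (a Python stand-in; the identifier conventions are mine, not the package's): instead of nesting the model SQL, which may begin with its own CTE, inside a CTE on the INSERT, the model SQL is first materialized as a view, and the INSERT then selects from that view, so CTEs never nest.

```python
def insert_via_temp_view(target: str, model_sql: str) -> list:
    """Return the statement sequence for a CTE-safe insert."""
    tmp_view = f"{target}__dbt_tmp_vw"
    return [
        f"create view {tmp_view} as\n{model_sql}",          # a view tolerates a leading WITH
        f"insert into {target}\nselect * from {tmp_view}",  # plain insert-select, no CTE
        f"drop view {tmp_view}",                            # clean up the helper view
    ]

for stmt in insert_via_temp_view(
    "dbo.orders_agg",
    "with daily as (select * from orders)\nselect * from daily",
):
    print(stmt + ";")
```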

* incremented version number

* Update README.md

* updated .gitignore

* dealt with empty column names

It appears that issue jacobm001#10 is caused by MSSQL not returning default column names for aggregate functions through the ODBC library. When the `''` column name hits the agate library, an error is thrown.

To handle this behavior, I've overridden the class method, `get_result_from_cursor()`. The new method loops through all the column names and replaces any instance of `''` with `unnamed_column-{i}`. This should provide a simple workaround that the user doesn't really see, but is also very easy for the user to avoid if it's undesired behavior.
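The renaming rule can be sketched as a standalone helper (the real code lives in the overridden `get_result_from_cursor()`; this function is just illustrative):

```python
def patch_column_names(names: list) -> list:
    """Replace empty column names with positional placeholders."""
    return [
        name if name != "" else f"unnamed_column-{i}"
        for i, name in enumerate(names)
    ]

# SELECT id, count(*), sum(total) ... can come back as ['id', '', '']:
print(patch_column_names(["id", "", ""]))
# -> ['id', 'unnamed_column-1', 'unnamed_column-2']
```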

* incremented version

* Update README.md

* initial commit

* added catalog

* added work from https://github.com/norton120/dbt-azuredatawarehouse

* moved to explicit varchar size per https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-data-types

* handle empty string column name from scalar

* update README to match dockerfile

* added sample profile.yml

* clean up git conflict noise in readme

* known issues in readme

* added process_results class method and updated sql connection type

Co-authored-by: Jacob Mastel <[email protected]>
Co-authored-by: Isaac Chavez <[email protected]>
Co-authored-by: Ethan Knox <[email protected]>