Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail to read simple .db file with sqlite3 #2312

Closed
gmartinonQM opened this issue Feb 14, 2023 · 8 comments
Closed

Fail to read simple .db file with sqlite3 #2312

gmartinonQM opened this issue Feb 14, 2023 · 8 comments
Labels
Community Issue/PR opened by the open-source community Issue: Bug Report 🐞 Bug that needs to be fixed

Comments

@gmartinonQM
Copy link

gmartinonQM commented Feb 14, 2023

Description

I have a simple database, a mere data/01_raw/compas.db file. I want to read it with the catalog. I cannot read it properly.

Context

I am just trying to read some open-source data in the form of an SQLite3 database.

Steps to Reproduce

Here is my base/catalog.yml :

compas:
  type: pandas.SQLQueryDataSet
  sql: "SELECT * from compas"
  credentials: compas_credentials

Here is my local/credentials.yml :

compas_credentials:
  con: sqlite:///data/01_raw/compas.db
  1. download the compas.db here : https://github.com/propublica/compas-analysis
  2. install kedro and sqlalchemy
  3. Initiate the kedro template
  4. put the data into data/01_raw in the kedro template
  5. reproduce the above base/catalog.yml and local/credentials.yml
  6. kedro ipython
  7. catalog.load("compas")

Expected Result

This should read the table and return a pandas dataframe.

Actual Result

Launching kedro ipython and then catalog.load("compas"), I get the following error :

DataSetError: Failed while loading data from data set SQLQueryDataSet(execution_options={}, filepath=None, load_args={}, sql=SELECT * from compas;).
'OptionEngine' object has no attribute 'execute'

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

  • Kedro version used (pip show kedro or kedro -V): 0.18/4
  • Python version used (python -V): 3.9.12
  • Operating system and version: MacOS Monterey 12.6
@datajoely
Copy link
Contributor

So this looks like there is a breaking change in the latest version of SQLAlchemy - you can resolve this by downgrading your latest version:
https://levelup.gitconnected.com/how-to-fix-attributeerror-optionengine-object-has-no-attribute-execute-in-pandas-eb635fbb89e4

It looks like it's an active issue on the Pandas side too:
pandas-dev/pandas#40686

For now - pin your SQLAlchemy version manually and I suspect Pandas will push a new release shortly that fixes this regression.

@merelcht
Copy link
Member

@gmartinonQM Have you been able to resolve the issue with the suggestion above or do you need some more help?

@gmartinonQM
Copy link
Author

HI @merelcht, well we circumvented the problem by loading the data outside kedro and saved it to CSV... A hack more than a solution I would say. Still I am interested in a kedro.dataset handling sqlite3 in the future.

@datajoely
Copy link
Contributor

I think the issue is resolved on the Pandas side without us making a change to Kedro:
pandas-dev/pandas#48576

@datajoely
Copy link
Contributor

That being said we may have to enable support for the forthcoming Pandas 2.x release...

@AhdraMeraliQB
Copy link
Contributor

Hi @gmartinonQM, did the pandas change resolve your issue?

@AhdraMeraliQB AhdraMeraliQB added Issue: Bug Report 🐞 Bug that needs to be fixed Community Issue/PR opened by the open-source community labels Mar 21, 2023
@gmartinonQM
Copy link
Author

Hi @AhdraMeraliQB , yes, it works now, without changing anything to the code and config.

@noklam
Copy link
Contributor

noklam commented Mar 21, 2023

@gmartinonQM Awesome, thanks for getting back to us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Community Issue/PR opened by the open-source community Issue: Bug Report 🐞 Bug that needs to be fixed
Projects
None yet
Development

No branches or pull requests

5 participants