-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TEST: add basic postgresql tests #6316
TEST: add basic postgresql tests #6316
Conversation
you should prob add to 3.3 build too |
The first failure is due to checking the type of the boolean column retrieved from the database. But it is checked if it is integer, while in postgresql the returned type is just bool (https://github.com/pydata/pandas/blob/master/pandas/io/tests/test_sql.py#L501). I suppose this is checked as integer for sqlite, which has no bool type. So, should I overwrite this test in the PostgreSQL test class? But this will also be a problem for mysql |
Yes, it checks integer because SQLite has no bool. But it should check for boolean type for the others. If it’s wrong for the MySQL case you should fix for both mySQL and Postgres. Be careful because boolean columns with Nulls should load as object rather than boolean, because pandas can’t store None in a boolean column otherwise (if you coerce to boolean None/NA gets converted to False). I think that’s also tested but good to make sure it works in postgress too. MySQL needs a pymsql requirement for Travis, i think i just never got round to adding it. I made it pymysql specific because it’s pure python (unlike other drivers) so easier to install, also i had trouble pip-installing the official mysql-connector driver. On 10 Feb 2014, at 14:03, Joris Van den Bossche [email protected] wrote:
|
The second failure is due to reading in the iris table:
which are all objects (while it should be floats for the first columns).
|
@jorisvandenbossche wierd. the read_sql version is probably correct due to pandas correctl guessing the datatype. For some reason the explicit type conversion is failing. I don't have time to work on this now, but the place to debug is the |
@mangecoeur Thanks for the pointer. I will have a look at it later. If I change the type from |
Ok then I know the issue. We check for On 10 Feb 2014, at 14:24, Joris Van den Bossche [email protected] wrote:
|
I added The PostgreSQL test that is failing is due to the Boolean column with None which is converted to object type and not float (the |
@jorisvandenbossche The behaviour you describe is correct: Boolean with None should become Object, see http://pandas.pydata.org/pandas-docs/dev/missing_data.html#missing-data-casting-rules-and-indexing However I can see that the tests are confusing - when testing with SQLite we look for Float, not Object. This is because we actually can't know that a column in SQLite is supposed to be treated as a boolean unless we already knew that before hand. Without extra info we have to treat it as numeric, so an integer column with Nulls gets casted to Float and its up to the user to know that their data represent boolean values. However for the Postgres/MySQL case we should check for Object, since we know that a column is supposed to be Boolean. Feel free to update the tests accordingly. |
you guys can do some heuristic stuff that is not that expensive - something like check first few values if 0/1 then it could be bool then in a try except you can astype to bool and compare for equality to original see core/ common/_possibly_downcast |
@jreback I think that's dangerous since you could potentially turn random integer columns into bool if they happen to only have 0s and 1s (and maybe Nulls). Data coming out of an sqlite DB just looks like a column of numbers and an arbitrary name, not much to hint at what it's supposed to be for. We could consider adding a coerce boolean optional arg that would work like parse dates and force columns to be loaded as boolean - would also be handy for other DBs if you have some ugly data where booleans have been stored as the wrong data type (I've seen them stored as ints, floats, and chars - never assume whoever created the dataset knew what they were doing!) |
I think adding a dtype = arg would be fine then; same interface as read_csv |
|
@mangecoeur Would you have time to look at the MySQL failures? It's failing in the
So the things that are wrong: float as object, bool as int and boolwithnull as float instead of object.
|
TEST: add postgresql to travis
One base class for tests with sqlalchemy backend. So test classes for mysql and postgresql don't have to overwrite tests that are different for sqlite.
OK, I removed the activating of MySQL tests (just this commit: jorisvandenbossche@036a3d4) from this PR and leave only the PostgreSQL and little refactoring stuff. So this can be merged. MySQL can be fixed in another PR. |
Travis is happy now! (without the mysql) |
@jorisvandenbossche think we ought to add |
@jorisvandenbossche looks good.. |
Updates to tests looks good. I'm doing some hardcore dogfooding right now with this and postgres so finding a number of issues. Just added support for Datetimes with timezone info in my private branch (since they are supported in postgres). Need to think what is the sane thing to do when saving to a DB without timezone support. |
OK, then I am going to merge this one. @mangecoeur Can you point to the branch with your datetime support? Then I can try it out if it solves the problems I reported (or already open a PR, no problem if it is still work inprogress) |
TEST: add basic postgresql tests
would you guys would like me to add the postgres test stuff to my windows builds ? don';t have too...but could I just setup a postgres server right? with any particular database setup? |
@jreback If you can, certainly welcome I think. |
ok running on 27-32,27-64,33-32,33-64 (mix of sqlalchemy 0.8.1 and 9.2), all the same psycopg2
|
Nice! The test for MySQL are still failing, so we should first fix that before enabling the tests |
@mangecoeur I just copied and adapted the mysql tests and provided some
SQL_STRINGS
that follow the postgresql dialect. Is this what you had in mind how other sql dialects would be added to the tests?@jreback I now added it to requirements-2.7.txt. Should I add it to all test environments? Or if only one, is this the correct one?
@mangecoeur Is it possible that there is a
pymysql
missing in the requirement files to run the MySQL tests (they are all skipped on travis)?At the moment, there are two tests failing with postgresql (see https://travis-ci.org/jorisvandenbossche/pandas/jobs/18576746).