Add dtype keyword to read_sql_query to control per column dtypes. #10285

Gerenuk · 2015-06-05T07:45:39Z

When reading from SQL queries - especially when reading chunk-wise - issues with type conversion can occur.

It would be ideal if one could specify the types of the columns, so that values are converted to the intended dtypes.

A very common case is reading pure float results chunk-wise from a large table. The nuisance arises when a chunk contains only NULL values in a column. In such cases these NULL values are stored as None rather than float("nan"). As I see it, this stems from inconsistent NULL types in pandas combined with missing type information in read_sql_query.
Another case is a text column that happens to contain only numbers within a chunk.

Specifying types for a query manually could resolve these issues.
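The chunk-wise case described above can be sketched as follows. This is only an illustration: the in-memory SQLite table `measurements` and its columns are hypothetical, and the manual `astype` cast is one possible workaround for the time before a dtype option existed, not something proposed in this thread.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table: the first two rows hold only NULLs in `value`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [(1, None), (2, None), (3, 1.5), (4, 2.5)],
)

query = "SELECT id, value FROM measurements ORDER BY id"

# Read two rows per chunk and record the dtype inferred for `value` in each.
# Per the report above, the all-NULL chunk is expected to differ from the
# chunk that actually holds floats.
inferred = [
    chunk["value"].dtype
    for chunk in pd.read_sql_query(query, conn, chunksize=2)
]
print(inferred)

# Manual workaround: cast every chunk explicitly before combining them.
fixed = [
    chunk.astype({"value": "float64"})
    for chunk in pd.read_sql_query(query, conn, chunksize=2)
]
combined = pd.concat(fixed, ignore_index=True)
conn.close()
```

After the explicit cast, every chunk carries `value` as float64, so the concatenated frame has one consistent dtype with NaN in place of None.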

jorisvandenbossche · 2015-08-11T11:36:50Z

This could be done analogously to the dtype argument in read_csv.
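A minimal sketch of that parallel, assuming pandas 1.3 or later (the milestone this issue was closed under), where `read_sql_query` accepts a `dtype` mapping much like `read_csv` does. The table `measurements` and the sample data are made up for illustration.

```python
import io
import sqlite3

import pandas as pd

# read_csv: dtype maps column names to the types to enforce.
csv_frame = pd.read_csv(
    io.StringIO("id,value\n1,\n2,\n"),
    dtype={"value": "float64"},
)

# Hypothetical in-memory table whose first chunk holds only NULLs in `value`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [(1, None), (2, None), (3, 1.5), (4, 2.5)],
)

# read_sql_query: the same kind of mapping, applied to every chunk
# (assumes a pandas version that has the dtype keyword).
chunks = list(
    pd.read_sql_query(
        "SELECT id, value FROM measurements ORDER BY id",
        conn,
        chunksize=2,
        dtype={"value": "float64"},
    )
)
chunk_dtypes = [str(chunk["value"].dtype) for chunk in chunks]
conn.close()
```

With the mapping in place, the all-NULL chunk and the chunk holding floats should both report float64 for `value`, so the chunks can be concatenated without a further cast.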

avinashpancham · 2020-10-31T21:29:32Z

take

asandeep · 2021-09-03T10:15:55Z

@avinashpancham Looks like this happens with read_sql as well. Are there plans to add a similar dtype argument to read_sql too?

jorisvandenbossche added the IO SQL (to_sql, read_sql, read_sql_query) label Jun 5, 2015

jorisvandenbossche added this to the Someday milestone Aug 11, 2015

jorisvandenbossche added Difficulty Novice labels Aug 11, 2015

jorisvandenbossche mentioned this issue Aug 29, 2015

EuroScipy 2015 pandas sprint #10877

Closed

TomAugspurger added the good first issue label Oct 11, 2017

jreback removed the Difficulty Novice label Dec 15, 2017

TomAugspurger removed the good first issue label Jun 6, 2019

TomAugspurger modified the milestones: Someday, Contributions Welcome Jun 6, 2019

TomAugspurger mentioned this issue Jun 6, 2019

read_sql_query type detection when missing data #14314

Closed

TomAugspurger changed the title ~~Add control over types in read_sql_query (to resolve NULL inconsistency)~~ Add dtype keyword to read_sql_query to control per column dtypes. Jun 6, 2019

This was referenced Jun 6, 2019

pd.read_sql_query() does not convert NULLs to NaN #14319

Closed

read_sql_query converts empty columns to object with no way to override #26682

Closed

jbrockmendel removed the Effort Medium label Oct 21, 2019

mroeschke added the Enhancement label May 16, 2020

github-actions bot assigned avinashpancham Oct 31, 2020

avinashpancham mentioned this issue Oct 31, 2020

ENH: Add dtype argument to read_sql_query (GH10285) #37546

Merged

5 tasks

jreback modified the milestones: Contributions Welcome, 1.3 Dec 23, 2020

jreback closed this as completed in #37546 Dec 23, 2020
