DOC: Document existing functionality of pandas.DataFrame.to_sql() #11886 #26795

Merged: 20 commits, Aug 30, 2019. The diff below shows changes from 10 commits.

36 changes: 25 additions & 11 deletions pandas/core/generic.py
@@ -6,7 +6,7 @@
import operator
import pickle
from textwrap import dedent
from typing import Callable, FrozenSet, List, Set
from typing import Any, Callable, Dict, FrozenSet, Iterator, List, Set, Union
import warnings
import weakref

@@ -34,6 +34,7 @@
from pandas.core.dtypes.missing import isna, notna

import pandas as pd
from pandas._typing import Dtype
from pandas.core import missing, nanops
import pandas.core.algorithms as algos
from pandas.core.base import PandasObject, SelectionMixin
@@ -50,6 +51,9 @@
from pandas.io.formats.printing import pprint_thing
from pandas.tseries.frequencies import to_offset

# mypy confuses the `bool()` method of NDFrame with the builtin `bool`
_bool = bool

Member commented:

Yeah, this is unfortunate and something we've seen before:

#26029 (comment)

The alias is the suggested approach, so no change required here I think, but cc @jreback for visibility
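
For readers landing here later, a minimal sketch of what the alias works around (the class and methods below are illustrative, not pandas code): once a class defines a method named bool, mypy resolves the bare name bool in annotations inside that class body to the method rather than the builtin, giving an error along the lines of "Function ... is not valid as a type". Capturing the builtin under another name before the class body keeps the annotations valid.

# Minimal sketch of the mypy issue behind the `_bool = bool` alias;
# the class and methods are illustrative, not pandas code.

_bool = bool  # capture the builtin before the class shadows the name


class Frame:
    def bool(self):
        """A method that happens to share its name with the builtin."""
        return True

    # Annotating the return type as plain `bool` would make mypy resolve the
    # name to the `bool` method above; the module-level alias keeps it
    # pointing at the builtin type.
    def is_empty(self) -> _bool:
        return False
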


# goal is to be able to define the docs close to function, while still being
# able to share
_shared_docs = dict()
@@ -2458,8 +2462,17 @@ def to_msgpack(self, path_or_buf=None, encoding='utf-8', **kwargs):
return packers.to_msgpack(path_or_buf, self, encoding=encoding,
**kwargs)

def to_sql(self, name, con, schema=None, if_exists='fail', index=True,
index_label=None, chunksize=None, dtype=None, method=None):
# TODO: Replace `Callable[[Any, Any, ...` when SQLTable and sqlalchemy
# can be imported. SQLTable can't be imported due to circular import.
# sqlalchemy can't be imported since it's an optional dependency.
def to_sql(self, name: str, con,

Member commented:

Sorry, I should have asked this before, but can you put each parameter on a separate line? It will help with readability.


Contributor (author) replied:

Done. Also added Any to con= and added an explanation to the note so that less thinking is required later.

schema: str = None, if_exists: str = 'fail',
index: _bool = True, index_label: Union[str, List[str]] = None,
chunksize: int = None,
dtype: Union[Dict[str, Dtype], Dtype] = None,

Member commented:

Whoops, sorry, I missed this before, but can you just import Dtype from pandas._typing and use that as the annotation here?


Contributor (author) replied:

You actually mentioned Dict[Any, Dtype], but I took the liberty of 'interpreting' your comment, since:

  1. Dict[Any, Dtype] doesn't cover the plain Dtype case, and
  2. Dict[str, Dtype] is more precise for the dtype={'column_name': dtype} case.

And now I don't see how dtype: Dtype annotates the dtype={'column_name': Dtype} situation. Seeing that you insist on this, I'll assume I'm really off the mark here and change it as follows:

dtype: Dtype = None

After that I'll take another look at core.dtypes (and probably chase you down on Gitter soon).
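
To make the options in this thread easier to compare side by side, a small sketch of the three candidate annotations (the alias names are made up for illustration; Dtype is the pandas._typing import used in this diff):

# The three dtype annotations discussed above; alias names are illustrative.
from typing import Any, Dict, Union

from pandas._typing import Dtype

DtypeAsAnyDict = Dict[Any, Dtype]              # dict form only, any key type
DtypeAsUnion = Union[Dict[str, Dtype], Dtype]  # per-column dict, or one Dtype for all columns
DtypeAsScalar = Dtype                          # bare Dtype; dict form not spelled out
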

method: Union[str, Callable[[Any, Any, List[str],
Iterator[List]], None]] = None
) -> None:
"""
Write records stored in a DataFrame to a SQL database.

@@ -2468,12 +2481,12 @@ def to_sql(self, name, con, schema=None, if_exists='fail', index=True,

Parameters
----------
name : string
name : str
Name of SQL table.
con : sqlalchemy.engine.Engine or sqlite3.Connection
Using SQLAlchemy makes it possible to use any DB supported by that
library. Legacy support is provided for sqlite3.Connection objects.
schema : string, optional
schema : str, optional
Specify the schema (if database flavor supports this). If None, use
default schema.
if_exists : {'fail', 'replace', 'append'}, default 'fail'
@@ -2486,18 +2499,19 @@
index : bool, default True
Write DataFrame index as a column. Uses `index_label` as the column
name in the table.
index_label : string or sequence, default None
index_label : string or sequence, optional
Column label for index column(s). If None is given (default) and
`index` is True, then the index names are used.
A sequence should be given if the DataFrame uses MultiIndex.
chunksize : int, optional
Rows will be written in batches of this size at a time. By default,
all rows will be written at once.
dtype : dict, optional
Specifying the datatype for columns. The keys should be the column
names and the values should be the SQLAlchemy types or strings for
the sqlite3 legacy mode.
method : {None, 'multi', callable}, default None
dtype : dict or scalar, optional
Specifying the datatype for columns. If a dictionary is used, the
keys should be the column names and the values should be the
SQLAlchemy types or strings for the sqlite3 legacy mode. If a
scalar is provided, it will be applied to all columns.
method : {None, 'multi', callable}, optional
Controls the SQL insertion clause used:

* None : Uses standard SQL ``INSERT`` clause (one per row).
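
Before the pandas/io/sql.py counterpart below, a usage sketch of the dtype behaviour documented above, run against an in-memory sqlite3 connection (the table and column names are made up for illustration):

# Dict and scalar forms of `dtype`, as documented in the docstring above.
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
df = pd.DataFrame({"name": ["a", "b"], "score": [1.5, 2.5]})

# Dict form: keys are column names, values are types
# (strings in sqlite3 legacy mode).
df.to_sql("scores", con, index=False,
          dtype={"name": "TEXT", "score": "REAL"})

# Scalar form: a single type applied to every column.
df.to_sql("scores_as_text", con, index=False, dtype="TEXT")

print(pd.read_sql("SELECT * FROM scores", con))
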
19 changes: 10 additions & 9 deletions pandas/io/sql.py
@@ -396,14 +396,14 @@ def to_sql(frame, name, con, schema=None, if_exists='fail', index=True,
Parameters
----------
frame : DataFrame, Series
name : string
name : str
Name of SQL table.
con : SQLAlchemy connectable(engine/connection) or database string URI
or sqlite3 DBAPI2 connection
Using SQLAlchemy makes it possible to use any DB supported by that
library.
If a DBAPI2 object, only sqlite3 is supported.
schema : string, default None
schema : str, optional
Name of SQL schema in database to write to (if database flavor
supports this). If None, use default schema (default).
if_exists : {'fail', 'replace', 'append'}, default 'fail'
@@ -412,18 +412,19 @@
- append: If table exists, insert data. Create if does not exist.
index : boolean, default True
Write DataFrame index as a column.
index_label : string or sequence, default None
index_label : str or sequence, optional
Column label for index column(s). If None is given (default) and
`index` is True, then the index names are used.
A sequence should be given if the DataFrame uses MultiIndex.
chunksize : int, default None
chunksize : int, optional
If not None, then rows will be written in batches of this size at a
time. If None, all rows will be written at once.
dtype : single SQLtype or dict of column name to SQL type, default None
Optional specifying the datatype for columns. The SQL type should
be a SQLAlchemy type, or a string for sqlite3 fallback connection.
If all columns are of the same type, one single value can be used.
method : {None, 'multi', callable}, default None
dtype : dict or scalar, optional
Specifying the datatype for columns. If a dictionary is used, the
keys should be the column names and the values should be the
SQLAlchemy types or strings for the sqlite3 fallback mode. If a
scalar is provided, it will be applied to all columns.
method : {None, 'multi', callable}, optional
Controls the SQL insertion clause used:

- None : Uses standard SQL ``INSERT`` clause (one per row).
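
For the callable form of method, a sketch of what a user-supplied insertion function can look like. The (table, conn, keys, data_iter) signature mirrors the Callable[[Any, Any, List[str], Iterator[List]], None] annotation added in generic.py; the SQLAlchemy-specific body (table.table, prefix_with) is an assumption for illustration, not part of this PR:

# Sketch of a custom insertion callable for `method`; assumes the SQLAlchemy
# backend, where `table` is the pandas SQLTable being written, `conn` a
# SQLAlchemy connection, `keys` the column names, and `data_iter` an iterator
# of row tuples.
def insert_or_ignore(table, conn, keys, data_iter):
    rows = [dict(zip(keys, row)) for row in data_iter]
    # `table.table` is the underlying SQLAlchemy Table; prefixing the INSERT
    # skips rows that would violate a unique constraint (SQLite syntax).
    conn.execute(table.table.insert().prefix_with("OR IGNORE"), rows)


# Usage sketch: df.to_sql("scores", engine, method=insert_or_ignore)
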