mysql support #2482

Closed
wants to merge 7 commits into from

Conversation

danielballan
Contributor

I added mysql support and (untested!) support for oracle. Some parts of the code are ready to support other flavors -- it should be easy to extend.
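For readers following along, here is a minimal sketch of what per-flavor handling can look like. It is not the code from this PR: `PLACEHOLDERS` and `build_insert` are hypothetical names, and the only driver facts assumed are that `sqlite3` uses `?` as its parameter marker while MySQLdb uses `%s`.

```python
import sqlite3

# Parameter marker expected by each driver (its DB-API "paramstyle"):
# sqlite3 uses "?", MySQLdb uses "%s". Other flavors would be added here.
PLACEHOLDERS = {"sqlite": "?", "mysql": "%s"}


def build_insert(table, columns, flavor="sqlite"):
    """Build a parameterized INSERT statement for the given flavor.

    Hypothetical helper for illustration; it does not quote identifiers.
    """
    try:
        marker = PLACEHOLDERS[flavor]
    except KeyError:
        raise NotImplementedError("unsupported SQL flavor: %r" % flavor)
    cols = ", ".join(columns)
    params = ", ".join([marker] * len(columns))
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, cols, params)


# Exercise the sqlite flavor against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, close REAL)")
conn.executemany(
    build_insert("prices", ["ticker", "close"]),
    [("AAA", 1.5), ("BBB", 2.0)],
)
rows = conn.execute("SELECT ticker, close FROM prices ORDER BY ticker").fetchall()
```

Keeping the flavor differences in one lookup table means a new flavor is a one-line addition, provided someone can test it against a real server.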

@changhiskhan
Contributor

Thanks for the PR!
One thing that will make it easier is if you can make the test cases optional. If you look at pandas/io/tests/test_excel.py, for example, xlwt/xlrd/openpyxl are all optional dependencies. If people don't have those drivers installed, the test suite should still pass.
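A rough sketch of the pattern, assuming the standard library's `unittest.SkipTest` rather than whatever test_excel.py actually does; `optional_import` and `TestMySQLFlavor` are hypothetical names used only for illustration:

```python
import importlib
import unittest


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# MySQLdb is the driver assumed here; any optional driver works the same way.
MySQLdb = optional_import("MySQLdb")


class TestMySQLFlavor(unittest.TestCase):
    def setUp(self):
        # Skip every test in this class when the driver is missing,
        # so the suite still passes on machines without it.
        if MySQLdb is None:
            raise unittest.SkipTest("MySQLdb not installed")

    def test_driver_is_importable(self):
        self.assertTrue(hasattr(MySQLdb, "connect"))
```

With this in place, a missing driver shows up as a skipped test instead of an import error or a failure.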

@paulproteus

The next step here would be for someone to review the pull request and see if they can add a commit on top that makes the test cases optional. See @changhiskhan's most recent comment for sample code to look at.

@danielballan
Contributor Author

OK, I think everything is in order.

@danielballan
Contributor Author

bump

@wesm
Member

wesm commented Jan 3, 2013

marked for review for 0.10.1

@ghost ghost assigned changhiskhan Jan 19, 2013
changhiskhan pushed a commit that referenced this pull request Jan 21, 2013
@changhiskhan
Contributor

@danielballan I merged in the MySQL flavor but took out the other flavors for lack of tests. I made some tweaks to clean things up just a little bit. Unfortunately while I was merging I f'ed up somewhere in the process and the whole thing ended up marked as my commit. I added a line note in each file in that commit. Since there's still other SQL flavors in there, I'm moving this to a later milestone so when you get a chance to implement more test cases, we'll merge in the rest.

@wesm wesm closed this Jan 21, 2013
@changhiskhan
Contributor

ok, properly attributed a835118
sorry for the snafu @danielballan

@wesm let's keep this open until someone has a chance to write tests for Postgres/Oracle/odbc?

@changhiskhan changhiskhan reopened this Jan 22, 2013
@wesm
Member

wesm commented Jan 22, 2013

okay

@danielballan
Contributor Author

Thanks. Good changes.

yarikoptic added a commit to neurodebian/pandas that referenced this pull request Jan 23, 2013
Version 0.10.1

* tag 'v0.10.1': (195 commits)
  RLS: set released to true
  RLS: Version 0.10.1
  TST: skip problematic xlrd test
  Merging in MySQL support pandas-dev#2482
  Revert "Merging in MySQL support pandas-dev#2482"
  BUG: don't let np.prod overflow int64
  RLS: note changed return type in DatetimeIndex.unique
  RLS: more what's new for 0.10.1
  RLS: some what's new for 0.10.1
  API: restore inplace=TRue returns self, add FutureWarnings. re pandas-dev#1893
  Merging in MySQL support pandas-dev#2482
  BUG: fix python 3 dtype issue
  DOC: fix what's new 0.10 doc bug re pandas-dev#2651
  BUG: fix C parser thread safety. verify gil release close pandas-dev#2608
  BUG: usecols bug with implicit first index column. close pandas-dev#2654
  BUG: plotting bug when base is nonzero pandas-dev#2571
  BUG: period resampling bug when all values fall into a single bin. close pandas-dev#2070
  BUG: fix memory error in sortlevel when many multiindex levels. close pandas-dev#2684
  STY: CRLF
  BUG: perf_HEAD reports wrong vbench name when an exception is raised
  ...
@garaud
Contributor

garaud commented Jan 24, 2013

I'm interested in the PostgreSQL support and testing in pandas. Do you know if anyone is working on it? I would like to dig into the io.sql and PostgreSQL code in a few weeks, if possible. I'll start by creating a branch from changhiskhan@a835118 (better idea?) to avoid any painful conflicts and to keep the MySQL test framework.

Cheers.

@danielballan
Contributor Author

Sounds good to me. Sqlite and MySQL are the flavors I use in my work, so I'm glad to see someone else pick up postgresql. I will direct my efforts to more specific data type detection, as noted in the comments of the current release.

@changhiskhan
Contributor

@garaud I don't think I merged @danielballan's PostgreSQL support in my fork there. If you don't want to mess with merging, just fork from @danielballan's PR branch and add test cases. It looked perfectly fine to me; I just didn't have time to add test cases for it. I can take care of merging into master at the end if you don't want to mess with it.

@garaud
Contributor

garaud commented Jan 24, 2013

OK. Thanks. I forked the branch mysql from @danielballan and created a branch postgre from it. I'll keep you posted.

@mangecoeur
Contributor

Just wondering: if this is to be an optional module in any case, wouldn't it make more sense to take advantage of SQLAlchemy's well-tested DB support and instead add SQLAlchemy as an optional dependency, with sqlite as a fallback? I'm interested in this approach because I use pandas and SQLAlchemy together quite a bit anyway, and it allows you to abstract away the differences between SQL flavours.

@danielballan
Contributor Author

I was not aware of SQLAlchemy. This looks dead useful. Can you post a gist of an example where you've used them together?

@mangecoeur
Contributor

I actually tried doing the integration myself; I have a work in progress here (very, very alpha):
https://github.com/mangecoeur/pandas/blob/sqlalchemy-integration/pandas/io/sql.py

I'm currently working on the "write_frame2" function so that I can compare results with the existing "write_frame" function (the idea would be to merge those later). It doesn't work for me yet because I need to find the best way to convert from numpy dtypes to types the SQL database supports, preferably with the least duplicated effort. I'm thinking there might be a way to reuse the CSV read/write parsers.
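To make the dtype problem concrete, here is a minimal sketch of one possible mapping, keyed on NumPy's one-letter dtype `kind` codes. It is not what the branch does: `KIND_TO_SQL`, `sql_type_for_kind` and `create_table_sql` are hypothetical names, and the SQL type names are generic assumptions that a real backend may spell differently.

```python
# Hypothetical mapping from NumPy dtype "kind" codes to generic SQL types.
# Kind codes: b=boolean, i=signed int, u=unsigned int, f=float, M=datetime64.
KIND_TO_SQL = {
    "b": "BOOLEAN",
    "i": "INTEGER",
    "u": "INTEGER",
    "f": "REAL",
    "M": "TIMESTAMP",
}


def sql_type_for_kind(kind):
    """Return a SQL column type for a dtype kind, falling back to TEXT."""
    return KIND_TO_SQL.get(kind, "TEXT")


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from (column name, dtype kind) pairs."""
    defs = ", ".join(
        "%s %s" % (name, sql_type_for_kind(kind)) for name, kind in columns
    )
    return "CREATE TABLE %s (%s)" % (table, defs)


# Object columns ("O") fall through to TEXT.
statement = create_table_sql(
    "prices", [("ticker", "O"), ("close", "f"), ("volume", "i")]
)
```

For a DataFrame, the `(name, kind)` pairs could presumably be read off each column's dtype. The hard cases (mixed object columns, missing values in integer columns) are exactly what this sketch leaves out.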

@jreback
Contributor

jreback commented Feb 12, 2013

you guys might find #2752 useful
(convert datetimes to nan when u astype to object)
and df.blocks property
(which gives u a dict of dtype to homogeneous frame)
these are both new in 0.11 and in current master

@jreback
Contributor

jreback commented Feb 12, 2013

you also might want to take a look at pandas/io/pytables.py
Table.create_axes for this kind of type mapping (which is somewhat non-trivial)

@garaud
Contributor

garaud commented Jun 20, 2013

Hi,

Is this issue still open? There are some features with dedicated tests in pandas/io/sql.py. See commit a835118.

However, this feature does not appear in the RELEASE.rst file for the 0.11 release.

@danielballan
Contributor Author

@garaud, this is still open because tests and write support for several more flavors of SQL remain to be written. (See above.) Also see the SQLAlchemy discussion. As for the docs, yes, they are incomplete. This was my first PR and I didn't put documentation everywhere that it belongs. I think changhiskhan took care of some of it though. Feel free to elaborate.

@hayd
Contributor

hayd commented Jul 8, 2013

Does it make more sense to break this into (several?) new issues? At least these two, I think, may make more sense as separate issues:

  • SQLAlchemy (how's this coming along @mangecoeur? Is there much still to do/can we help?)
  • postgres (would this be free with sqlalchemy?)

Or maybe also a global "SQL support" issue with several parts/roadmap?

@danielballan
Contributor Author

I think a SQL support issue is the way to go.

  • There are several open bugs.
  • We can support postgresql and oracle with very little additional effort. I'd rather see them tested by people who regularly use those flavors of SQL, but I think I could get them off the ground and minimally test them.
  • That said, I support adopting SQLAlchemy and eventually relying on it for all frame writing, if that branch works.

One additional thought: one way or another, we need to infer data types more carefully. jreback points to some helpful references in other parts of pandas. Before anyone takes that on, we should decide whether we are ultimately going to toss all of this out in favor of SQLAlchemy.

@hayd hayd mentioned this pull request Jul 8, 2013
20 tasks
@hayd hayd closed this Jul 8, 2013