Use name and type comparison when appending a dataframe into table #14

ghost · 2017-02-26T17:21:25Z

I modified the GbqConnector.verify_schema function to extract only name and type from each field of the remote schema (effectively dropping mode) and to compare just those fields.

Currently, when appending to a BQ table, the destination table's schema is compared with the DataFrame schema using the full BQ field definition (name, type, mode), whereas _generate_bq_schema derives only name and type from a DataFrame. The remote side is therefore a superset of the local side, and the comparison fails whenever the remote fields carry a mode.

IMO it would be inconvenient to implement the mode check in the module by inspecting each column for completeness (whether it contains null values or not), so raising a generic GBQ error is more convenient here.
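To illustrate the idea described above, here is a minimal sketch of a name-and-type-only comparison. It is not the code from this PR: the helper names `strip_to_name_and_type` and `schemas_match` are hypothetical, the schemas are plain dicts shaped like a BQ schema resource, and ignoring field order is an assumption made for the example.

```python
def strip_to_name_and_type(fields):
    """Keep only 'name' and 'type' from each field definition.

    Any other attribute (for example 'mode' or 'description') is dropped.
    """
    return [{'name': f['name'], 'type': f['type']} for f in fields]


def schemas_match(remote_schema, local_schema):
    """Compare two schema dicts on field name and type only.

    Hypothetical helper: field order is ignored by sorting on the name.
    """
    def by_name(field):
        return field['name']

    remote_fields = strip_to_name_and_type(remote_schema['fields'])
    local_fields = strip_to_name_and_type(local_schema['fields'])
    return sorted(remote_fields, key=by_name) == sorted(local_fields,
                                                        key=by_name)


# Remote fields carry a mode; the DataFrame-derived schema does not.
remote = {'fields': [
    {'name': 'A', 'type': 'FLOAT', 'mode': 'NULLABLE'},
    {'name': 'B', 'type': 'STRING', 'mode': 'REQUIRED'},
]}
local = {'fields': [
    {'name': 'A', 'type': 'FLOAT'},
    {'name': 'B', 'type': 'STRING'},
]}

match = schemas_match(remote, local)
```

With the full field dicts compared directly, `remote` and `local` would differ because of the `mode` key; after stripping, only a name or type mismatch makes the comparison fail.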

closes #13

codecov-io · 2017-02-26T17:24:29Z

Codecov Report

Merging #14 into master will decrease coverage by -37.22%.
The diff coverage is 14.28%.

@@             Coverage Diff             @@
##           master      #14       +/-   ##
===========================================
- Coverage   75.03%   37.81%   -37.22%     
===========================================
  Files           4        4               
  Lines        1450     1457        +7     
===========================================
- Hits         1088      551      -537     
- Misses        362      906      +544

Impacted Files	Coverage Δ
pandas_gbq/gbq.py	`30.97% <0%> (-46.11%)`	❌
pandas_gbq/tests/test_gbq.py	`39.55% <16.66%> (-46.35%)`	❌

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 065eb15...bf8c378. Read the comment docs.

jreback · 2017-02-26T18:44:47Z

can you add some tests?

jreback · 2017-02-26T19:07:04Z

also can you add a release note to here: https://github.com/pydata/pandas-gbq/blob/master/docs/source/changelog.rst (I just added this)

ghost · 2017-02-26T19:49:31Z

Added a test and supplied changelog.

jreback · 2017-02-26T20:06:50Z

this is going to close #13 right?

ghost · 2017-02-26T20:07:19Z

Yes, that's right.

jreback · 2017-02-26T20:08:03Z

docs/source/changelog.rst

 --------------

+Fixed an issue with appending to a BigQuery table where fields have modes (NULLABLE,REQUIRED,REPEATED). The changes concern solely the comparision of the local (DataFrame) and remote (BQ) schema in GbqConnector.verify_schema function. The fix is to omit other field attributes than name and type.


add (:issue:`13`).

jreback · 2017-02-26T20:09:10Z

is this also the same soln as pandas-dev/pandas#13086 (well is the issue the same)?

ghost · 2017-02-26T20:12:20Z

Yes, it's the same issue. That report involved a schema with descriptions, whereas my case involved only modes. Mode violations could be handled as a special case at some point in the future.

jreback · 2017-02-26T20:14:24Z

@mremes is there anything you can take / tests from that issue for this to be more robust? (e.g. a test?)

jreback · 2017-02-26T20:19:17Z

pandas_gbq/tests/test_gbq.py

+                                     'type': 'TIMESTAMP'}]}
+
+        self.table.create(TABLE_ID + test_id, test_schema_1)
+        self.assertTrue(self.sut.verify_schema(


just use

assert self.sut.verify_schema(.......), .....

the self.assertTrue was a nose convention; we're using pytest now, so we want to switch

OK, I just followed the convention from the tests above.

hah, I was going to change it on merge...but forgot...no worries
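For readers unfamiliar with the two styles discussed in this thread, the sketch below puts them side by side. `FakeConnector` and both test names are hypothetical stand-ins so the example runs without BigQuery access; the real test in this PR calls `self.sut.verify_schema` against a table created in the test dataset.

```python
import unittest


class FakeConnector:
    """Hypothetical stand-in for the connector under test."""

    def verify_schema(self, dataset_id, table_id, schema):
        # Always reports a match; enough to demonstrate assertion style.
        return True


class TestVerifySchemaUnittestStyle(unittest.TestCase):
    def setUp(self):
        self.sut = FakeConnector()

    def test_schema_matches(self):
        # unittest-style assertion method
        self.assertTrue(
            self.sut.verify_schema('dataset', 'table', {'fields': []}),
            'Expected schema to match')


def test_schema_matches_pytest_style():
    # pytest-style: a plain assert statement with an optional message
    sut = FakeConnector()
    assert sut.verify_schema('dataset', 'table', {'fields': []}), \
        'Expected schema to match'
```

pytest collects both forms, so switching is a matter of convention rather than necessity; the plain `assert` form relies on pytest's assertion rewriting to report useful failure details.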

ghost · 2017-02-26T20:21:36Z

@jreback the fix should be a solid approach both to my issue and to the pandas-dev issue you referenced. Because the local schema is constructed from the DataFrame's column names and types, it's appropriate to select only the name and type subset of the BQ fields when comparing.

I don't see a point in e.g. adding multiple discarded fields or anything like that.

jreback · 2017-02-26T20:23:29Z

@mremes ok great. merging.

jreback · 2017-02-26T20:32:31Z

thanks @mremes

all set (though for some reason inter-sphinx links not working)....
https://pandas-gbq.readthedocs.io/en/latest/changelog.html#id2

Use name and type of fields for comparing remote and local schemas when appending to a table

631d66c

fix bug with selecting key

5dafd55

mremes added 4 commits February 26, 2017 21:21

Added test for validate_schema ignoring field mode when comparing schemas

66aa616

Merge remote-tracking branch 'upstream/master'

45826f1

make the syntax of the test flake-pretty

70d08ef

changelog for verify_schema changes

77b1fd5

jreback reviewed Feb 26, 2017

added reference to issue #13

bf8c378

jreback reviewed Feb 26, 2017

jreback closed this in 89bf82d Feb 26, 2017

jreback added the "type: bug" label Feb 26, 2017

jreback added this to the 0.2.0 milestone Feb 26, 2017
