Don't raise exceptions splitting a blank string #20067

Merged
merged 2 commits into from
Mar 17, 2018

@montanalow montanalow commented Mar 9, 2018

whatsnew: bug fix on reshaping

pep8speaks commented Mar 9, 2018

Hello @montanalow! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 17, 2018 at 19:42 Hours UTC

jreback commented Mar 9, 2018

this looks ok, though failing some other tests. you may need to adjust them.

@jreback jreback added the Strings String extension data type and string data label Mar 9, 2018
@@ -1943,6 +1943,13 @@ def test_split(self):
                      [u('f'), u('g'), u('h')]])
        tm.assert_series_equal(result, exp)

        # expand blank split
can you make a separate test

@@ -1943,6 +1943,13 @@ def test_split(self):
                      [u('f'), u('g'), u('h')]])
        tm.assert_series_equal(result, exp)

        # expand blank split
        values = Series(['a b c', 'a b', '', ' '])
can you also test on a single string, like the original issue, and add the issue number as a comment?
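
A minimal sketch of the kind of check being requested here, assuming a pandas version that includes this fix. The variable names are illustrative and this is not the test that was merged; it only shows the shape of the result for a lone blank string and for a Series that mixes blank and non-blank values.

```python
import pandas as pd

# A single blank string, as in the linked issue: splitting on whitespace
# yields no tokens, so the expanded frame has one row and zero columns.
blank = pd.Series([""], name="test")
blank_result = blank.str.split(expand=True)
print(blank_result.shape)

# Mixed input: blank and whitespace-only entries become all-missing rows,
# and shorter rows are padded with missing values on the right.
mixed = pd.Series(["a b c", "a b", "", " "], name="test")
mixed_result = mixed.str.split(expand=True)
print(mixed_result)
```

Before the fix, the linked issue reports that the blank-string case raised an exception instead of returning a frame.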

codecov bot commented Mar 9, 2018

Codecov Report

Merging #20067 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #20067      +/-   ##
==========================================
+ Coverage   91.79%    91.8%   +<.01%     
==========================================
  Files         152      152              
  Lines       49206    49206              
==========================================
+ Hits        45170    45172       +2     
+ Misses       4036     4034       -2

Flag         Coverage Δ
#multiple    90.18% <100%> (ø) ⬆️
#single      41.85% <100%> (ø) ⬆️

Impacted Files            Coverage Δ
pandas/core/strings.py    98.32% <100%> (ø) ⬆️
pandas/util/testing.py    83.95% <0%> (+0.2%) ⬆️


@jreback jreback left a comment

can you add a whatsnew note. bug fix on reshaping is ok.

@@ -1992,6 +1992,19 @@ def test_rsplit(self):
        exp = Series([['a_b', 'c'], ['c_d', 'e'], NA, ['f_g', 'h']])
        tm.assert_series_equal(result, exp)

    def test_split_blank_string(self):
        # expand blank split GH 20067
        values = Series([''])
can you add a name= to the Series so it's clear where the name goes.

Contributor Author

Not sure I follow exactly what you mean. Are you suggesting adding name='test' as an arg to the constructor?

Contributor

yes add a name, the point is that this generates the column name for the frame.

@montanalow
Contributor Author

@jreback It looks like Travis and AppVeyor canceled their builds after I added names per your request. Is there a retry mechanism for them?

jreback commented Mar 13, 2018

looks fine. can you rebase on master & add a whatsnew note? ping on green.

@montanalow montanalow force-pushed the blank_strings branch 2 times, most recently from dc2af53 to 6915c33 Compare March 13, 2018 23:58
@montanalow
Contributor Author

@jreback all green

@TomAugspurger
Contributor

@montanalow can you add a release note to whatsnew/v0.23.0.txt under bug fixes? LGTM otherwise.

@montanalow
Contributor Author

@TomAugspurger note added

@TomAugspurger TomAugspurger merged commit a192650 into pandas-dev:master Mar 17, 2018
@TomAugspurger
Contributor

Thanks @montanalow !

nehiljain added a commit to nehiljain/pandas that referenced this pull request Mar 21, 2018
…ame_describe

* upstream/master: (158 commits)
  Add link to "Craft Minimal Bug Report" blogpost (pandas-dev#20431)
  BUG: fixed json_normalize for subrecords with NoneTypes (pandas-dev#20030) (pandas-dev#20399)
  BUG: ExtensionArray.fillna for scalar values (pandas-dev#20412)
  DOC" update the Pandas core window rolling count docstring" (pandas-dev#20264)
  DOC: update the pandas.DataFrame.plot.hist docstring (pandas-dev#20155)
  DOC: Only use ~ in class links to hide prefixes. (pandas-dev#20402)
  Bug: Allow np.timedelta64 objects to index TimedeltaIndex (pandas-dev#20408)
  DOC: add disallowing of Series construction of len-1 list with index to whatsnew (pandas-dev#20392)
  MAINT: Remove weird pd file
  DOC: update the Index.isin docstring (pandas-dev#20249)
  BUG: Handle all-NA blocks in concat (pandas-dev#20382)
  DOC: update the pandas.core.resample.Resampler.fillna docstring (pandas-dev#20379)
  BUG: Don't raise exceptions splitting a blank string (pandas-dev#20067)
  DOC: update the pandas.DataFrame.cummax docstring (pandas-dev#20336)
  DOC: update the pandas.core.window.x.mean docstring (pandas-dev#20265)
  DOC: update the api.types.is_number docstring (pandas-dev#20196)
  Fix linter (pandas-dev#20389)
  DOC: Improved the docstring of pandas.Series.dt.to_pytimedelta (pandas-dev#20142)
  DOC: update the pandas.Series.dt.is_month_end docstring (pandas-dev#20181)
  DOC: update the window.Rolling.min docstring (pandas-dev#20263)
  ...
Development

Successfully merging this pull request may close these issues.

0.22.0 - Splitting blank string causes exception
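
For context on why a blank string is an edge case at all: Python's own `str.split()` with no separator returns an empty list for blank or whitespace-only input, so any code that expands split results into fixed-width rows has to cope with zero-length rows. The sketch below is a plain-Python illustration of that edge case only. `pad_rows` is a hypothetical helper and is not taken from the pandas source.

```python
def pad_rows(rows, fill=None):
    """Pad ragged rows to a common width (hypothetical helper, not pandas code)."""
    # default=0 keeps max() from raising when there are no rows at all.
    width = max((len(row) for row in rows), default=0)
    # Padding by length never indexes into a row, so an empty row
    # is handled the same way as any other short row.
    return [list(row) + [fill] * (width - len(row)) for row in rows]


# str.split() with no separator returns [] for blank or whitespace-only input.
tokens = [s.split() for s in ["a b c", "a b", "", " "]]
padded = pad_rows(tokens)

# A lone blank string produces a single zero-width row rather than an error.
only_blank = pad_rows(["".split()])
```

Logic that instead inspects the first element of each row (for example `row[0]`) would raise `IndexError` on an empty row, which is the general kind of failure a blank-string test guards against.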
4 participants