BUG: Ignore versionadded directive when checking for periods at docstring end #22423

Merged

bengineer19
Contributor

Will ignore the versionadded directive when checking for '.' at the end of descriptions.

@WillAyd
Member

WillAyd commented Aug 19, 2018

Can you add a test case for this? We also have a versionchanged directive which we should account for here

@TomAugspurger
Contributor

And the deprecated directive as well probably.

@codecov

codecov bot commented Aug 19, 2018

Codecov Report

Merging #22423 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #22423      +/-   ##
==========================================
- Coverage   92.05%   92.04%   -0.01%     
==========================================
  Files         169      169              
  Lines       50714    50740      +26     
==========================================
+ Hits        46684    46705      +21     
- Misses       4030     4035       +5
Flag         Coverage Δ
#multiple    90.45% <ø> (-0.01%) ⬇️
#single      42.24% <ø> (-0.01%) ⬇️

Impacted Files                    Coverage Δ
pandas/util/_depr_module.py       65.11% <0%> (-2.33%) ⬇️
pandas/core/reshape/pivot.py      96.55% <0%> (-0.63%) ⬇️
pandas/core/arrays/integer.py     94.55% <0%> (-0.12%) ⬇️
pandas/util/testing.py            85.75% <0%> (-0.11%) ⬇️
pandas/core/reshape/merge.py      94.15% <0%> (-0.01%) ⬇️
pandas/core/groupby/grouper.py    98.16% <0%> (-0.01%) ⬇️
pandas/core/generic.py            96.44% <0%> (-0.01%) ⬇️
pandas/core/frame.py              97.24% <0%> (-0.01%) ⬇️
pandas/core/series.py             93.73% <0%> (ø) ⬆️
pandas/io/parsers.py              95.48% <0%> (ø) ⬆️
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 140c7bb...3039d78.

@gfyoung gfyoung added the Docs label Aug 19, 2018

if index < period_check_index or period_check_index == -1:
    period_check_index = index

if doc.parameter_desc(param)[period_check_index] != '.':

Member

What do you think about creating a new property in the class Docstring that returns the parameter_desc but without directives?

I think the code will be much more readable if we have this logic there, and in this part of the code where all validations happen we simply have something like if doc.parameter_desc_without_directives(param)[-1] != '.':

Contributor Author

Yes, I agree that would be more useful and readable; I'll make a property.
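
A minimal sketch of what such a helper could look like, assuming the parameter description is available as a single string. The names `DIRECTIVE_MARKERS` and `desc_without_directives` are illustrative and are not taken from the PR:

```python
# Hypothetical marker list; the names and helper below are illustrative only.
DIRECTIVE_MARKERS = ['.. versionadded', '.. versionchanged', '.. deprecated']


def desc_without_directives(desc):
    """Return the part of ``desc`` that precedes the first directive marker."""
    cut = len(desc)
    for marker in DIRECTIVE_MARKERS:
        index = desc.find(marker)
        if index != -1:
            cut = min(cut, index)
    return desc[:cut].rstrip()


desc = 'Number of rows to skip. .. versionadded:: 0.23.0'
stripped = desc_without_directives(desc)
# endswith avoids an IndexError when the stripped description is empty.
ends_with_period = stripped.endswith('.')
```

With the stripping isolated in one place, the validation step only needs to look at the last character of the returned text.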

@@ -42,6 +42,7 @@


PRIVATE_CLASSES = ['NDFrame', 'IndexOpsMixin']
DIRECTIVES = ['.. versionadded', '.. versionchanged', '.. deprecated']

Member

afaik, all sphinx directives start with .. so I think it's better to have just the names here.
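
One way to follow this suggestion, sketched under the assumption that only the bare directive names are stored and the `.. ` prefix is added when searching. `DIRECTIVE_PATTERN` and `first_directive_index` are hypothetical names:

```python
import re

# Only the directive names are listed; the '.. ' prefix and '::' suffix
# are added once, when the pattern is built.
DIRECTIVES = ['versionadded', 'versionchanged', 'deprecated']
DIRECTIVE_PATTERN = re.compile(r'\.\. +(?:{})::'.format('|'.join(DIRECTIVES)))


def first_directive_index(desc):
    """Return the index of the first known directive in ``desc``, or -1."""
    match = DIRECTIVE_PATTERN.search(desc)
    return match.start() if match else -1


desc = 'Axis to use. .. versionchanged:: 0.24.0'
index = first_directive_index(desc)
```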

@@ -193,6 +193,53 @@ def contains(self, pat, case=True, na=np.nan):
        """
        pass

    def mode(self, axis=0, numeric_only=False, dropna=True):

Member

I think a smaller, more directed example would be preferable here (it doesn't need to match the original docstring and probably won't over time anyway). Can you strip this down to just what's important?

Contributor Author

Sure thing, sounds good.
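
For illustration, a stripped-down fixture might look like the sketch below. The class name, method name, and docstring text are hypothetical and only show the shape of a test docstring whose parameter description ends in a period and is followed by a directive:

```python
import inspect


class GoodDocstrings(object):
    """Hypothetical fixture class holding docstrings that should validate."""

    def scale(self, factor=1):
        """
        Multiply the stored values by a constant.

        Parameters
        ----------
        factor : int, default 1
            Value to multiply by.

            .. versionadded:: 0.1.2
        """
        pass


# inspect.getdoc removes the common leading indentation.
doc = inspect.getdoc(GoodDocstrings.scale)
```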

@@ -236,6 +237,15 @@ def parameter_type(self, param):
    def parameter_desc(self, param):
        return self.doc_parameters[param][1]

    def parameter_desc_without_directives(self, param):

Member

What happens if you just use this logic in parameter_desc instead of as a separate method? I can't think of a case where we want the directives to be considered part of the description, and the way you've coded this works well for the missing period but may not be generalizable to other issues that could come up

Member

That's a good point. I'd probably keep a raw_parameter_desc with what parameter_desc has now, as in the future I'd probably like to have the values of the directives (for example to find the deprecated parameters we need to delete in the next version).

@@ -233,9 +234,19 @@ def correct_parameters(self):
     def parameter_type(self, param):
         return self.doc_parameters[param][0]

-    def parameter_desc(self, param):
+    def raw_parameter_desc(self, param):

Member

@datapythonista I know you wanted this but I would prefer not to include unless it serves a purpose. May even be better served as a keyword argument going forward instead of a dedicated method so would rather not go down this path until needed.

Outside of that I think this change looks good

Member

Sure. We can just have parameter_desc for now, and see when we start using the directives what makes more sense.
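
As a rough sketch of the keyword-argument alternative mentioned above: the `strip_directives` parameter is hypothetical, and the `Docstring` class here is a simplified stand-in that only assumes `doc_parameters` maps a name to a `(type, description)` pair:

```python
DIRECTIVE_MARKERS = ('.. versionadded', '.. versionchanged', '.. deprecated')


class Docstring(object):
    """Simplified stand-in for the validator's Docstring class."""

    def __init__(self, doc_parameters):
        # Maps parameter name -> (type, description), as assumed here.
        self.doc_parameters = doc_parameters

    def parameter_desc(self, param, strip_directives=True):
        desc = self.doc_parameters[param][1]
        if not strip_directives:
            return desc
        # Keep only the text that comes before any known directive.
        for marker in DIRECTIVE_MARKERS:
            if marker in desc:
                desc = desc[:desc.index(marker)]
        return desc.rstrip()


doc = Docstring({'sep': ('str', 'Delimiter to use. .. deprecated:: 0.21.0')})
```

A single method with a flag keeps the default behaviour simple while still leaving a way to reach the raw text if the directive values are needed later.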

Member

@datapythonista datapythonista left a comment

Looks good, but I was checking some docstrings, and there are a couple of cases we didn't consider. Can you take care of them?

Sentence ending in period, followed by multiple directives.

.. versionadded:: 0.1.2
.. deprecated:: 0.00.0

Member

I'm not sure about versionadded and versionchanged, but deprecated can have a description after it, for example:

          .. deprecated:: 0.21.0
              Use :func:`pandas.read_csv` instead.

And it can be even multiline. Do you mind adding a test for that? I'm not sure if this is working with the current implementation.
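
A small sketch of the multi-line case, assuming the parser joins the description lines into one string before validation (the join used here is an assumption, and the example text is made up). Cutting at the directive marker also discards the directive's own description, however many lines it spans:

```python
raw_lines = [
    'Path to the input file.',
    '',
    '.. deprecated:: 0.21.0',
    '    Use :func:`pandas.read_csv` instead',
    '    and update any calling code.',
]

# Assumption: the parser joins the description lines into one string.
desc = ' '.join(line.strip() for line in raw_lines if line.strip())

marker = '.. deprecated'
index = desc.find(marker)
# Everything from the marker onwards belongs to the directive.
text_before = desc[:index].rstrip() if index != -1 else desc
```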

Member

Also, if you check convert_datetime64 in to_records, there are cases where the directives come before the description. I'm happy if we only consider it valid to have them in one place (before or after the description). But can we make the script generate a descriptive error for it? I guess with the current implementation we'll report that the parameter has no description.
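
A sketch of how a more descriptive error could be produced when a directive precedes the description. The function name and the error messages are hypothetical, and the description is assumed to be a single string:

```python
DIRECTIVE_MARKERS = ('.. versionadded', '.. versionchanged', '.. deprecated')


def check_description(param, desc):
    """Return a list of error messages for one parameter description."""
    errors = []
    positions = [desc.find(m) for m in DIRECTIVE_MARKERS if m in desc]
    # Text that appears before the first directive, if any.
    text = desc[:min(positions)].strip() if positions else desc.strip()
    if not text and positions:
        errors.append('Parameter "{}": directives must come after '
                      'the description'.format(param))
    elif not text:
        errors.append('Parameter "{}" has no description'.format(param))
    elif not text.endswith('.'):
        errors.append('Parameter "{}" description should finish '
                      'with "."'.format(param))
    return errors


misplaced = check_description('x', '.. deprecated:: 0.1 Some text.')
valid = check_description('x', 'Good. .. versionadded:: 0.1')
```

This only detects a description that starts with a directive; it does not try to decide whether trailing text belongs to the directive.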

Contributor Author

I've added a test case for multi-line descriptions.
Directive positioning is a bit more tricky. Enforcing them to be in one place would help, but the problem comes when trying to determine if text after the directive is directive description, or just generic parameter description. We need to make this distinction in order to produce a nice error message.
This is made harder by the fact that we're currently working with doc_parameters, which smooshes the whole description into one single-line string.
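
To illustrate why the raw lines would help: if indentation were still available, text indented under a directive could be attributed to that directive and everything else to the parameter description. The sketch below assumes access to the unjoined lines, which the current `doc_parameters` does not provide, and `split_description` is a hypothetical helper:

```python
def split_description(lines):
    """Split raw description lines into (description, directive_blocks)."""
    description = []
    directives = []
    current = None  # the directive block currently being collected
    indent = 0
    for line in lines:
        stripped = line.strip()
        line_indent = len(line) - len(line.lstrip())
        if stripped.startswith('.. '):
            # A new directive starts; remember its indentation level.
            current = [stripped]
            directives.append(current)
            indent = line_indent
        elif current is not None and stripped and line_indent > indent:
            # Indented deeper than the directive: part of its body.
            current.append(stripped)
        elif stripped:
            current = None
            description.append(stripped)
    return ' '.join(description), [' '.join(d) for d in directives]


lines = [
    '.. deprecated:: 0.21.0',
    '    Use another function instead.',
    'Whether to convert the values.',
]
description, blocks = split_description(lines)
```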

Member

I think enforcement after description is fine. I think @datapythonista is correct in that it will generate an error, albeit with the wrong message. If we wanted to clean that up I'd suggest a separate PR, though @datapythonista I'll leave that decision up to you

Member

@WillAyd WillAyd left a comment

This looks good to me

Member

@datapythonista datapythonista left a comment

This will fix most cases, happy to merge on green, and take care of other cases in separate PRs. Thanks @bengineer19

@jreback jreback added this to the 0.24.0 milestone Aug 23, 2018
@datapythonista datapythonista merged commit 0f656f7 into pandas-dev:master Aug 23, 2018
@bengineer19 bengineer19 deleted the validate_docstrings_versionadded branch August 23, 2018 17:57
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Successfully merging this pull request may close these issues.

Consider directives when validating docstrings parameters
6 participants