-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add docstring checker. #9848
Add docstring checker. #9848
Changes from 12 commits
4ec1d19
c022e83
f7aedcf
ed17573
6cdccfe
ff60d9b
90cbd0f
c53c579
89ebccd
f43e88b
b698d5f
cdb489d
eb87574
cc85360
4379cad
a2a8213
6986f62
1a6c864
3974801
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,333 @@ | ||
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
"""DocstringChecker is used to check python doc string's style.""" | ||
|
||
import six | ||
import astroid | ||
|
||
from pylint.checkers import BaseChecker, utils | ||
from pylint.interfaces import IAstroidChecker | ||
|
||
|
||
def register(linter): | ||
"""Register checkers.""" | ||
linter.register_checker(DocstringChecker(linter)) | ||
|
||
|
||
class Docstring(object): | ||
"""Docstring class holds the parsed doc string elements. | ||
""" | ||
|
||
def __init__(self): | ||
from collections import defaultdict | ||
self.d = defaultdict(list) #name->[] | ||
self.clear() | ||
|
||
def clear(self): | ||
self.d['Args'] = [] | ||
self.d['Examples'] = [] | ||
self.d['Returns'] = [] | ||
self.d['Raises'] = [] | ||
self.args = {} #arg_name->arg_type | ||
|
||
def get_level(self, string, indent=' '): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Have we agreed that the indent is 4, instead of 2? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Come from google python style guide. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. SG |
||
level = 0 | ||
unit_size = len(indent) | ||
while string[:unit_size] == indent: | ||
string = string[unit_size:] | ||
level += 1 | ||
|
||
return level | ||
|
||
def parse(self, doc): | ||
"""parse gets sections from doc | ||
Such as Args, Returns, Raises, Examples s | ||
Args: | ||
doc (string): is the astroid node doc string. | ||
Returns: | ||
True if doc is parsed successfully. | ||
""" | ||
self.clear() | ||
|
||
lines = doc.splitlines() | ||
state = ("others", -1) | ||
for l in lines: | ||
c = l.strip() | ||
if len(c) <= 0: | ||
continue | ||
|
||
level = self.get_level(l) | ||
if c.startswith("Args:"): | ||
state = ("Args", level) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we need to check with our auto-doc generator and see it can parse the new style There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 在我们的文档里边其实规定了参数的风格:https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/dev/api_doc_std_cn.md。 |
||
elif c.startswith("Returns:"): | ||
state = ("Returns", level) | ||
elif c.startswith("Raises:"): | ||
state = ("Raises", level) | ||
elif c.startswith("Examples:"): | ||
state = ("Examples", level) | ||
else: | ||
if level > state[1]: | ||
self.d[state[0]].append(c) | ||
continue | ||
|
||
state = ("others", -1) | ||
self.d[state[0]].append(c) | ||
|
||
self._arg_with_type() | ||
return True | ||
|
||
def get_returns(self): | ||
return self.d['Returns'] | ||
|
||
def get_raises(self): | ||
return self.d['Raises'] | ||
|
||
def get_examples(self): | ||
return self.d['Examples'] | ||
|
||
def _arg_with_type(self): | ||
import re | ||
|
||
for t in self.d['Args']: | ||
m = re.search('([A-Za-z0-9_-]+)\s{0,4}(\(.+\))\s{0,4}:', t) | ||
if m: | ||
self.args[m.group(1)] = m.group(2) | ||
|
||
return self.args | ||
|
||
|
||
class DocstringChecker(BaseChecker): | ||
"""DosstringChecker is pylint checker to | ||
check docstring style. | ||
""" | ||
__implements__ = (IAstroidChecker, ) | ||
|
||
POSITIONAL_MESSAGE_ID = 'str-used-on-positional-format-argument' | ||
KEYWORD_MESSAGE_ID = 'str-used-on-keyword-format-argument' | ||
|
||
name = 'doc-string-checker' | ||
symbol = "doc-string" | ||
priority = -1 | ||
msgs = { | ||
'W9001': ('One line doc string on > 1 lines', symbol + "-one-line", | ||
'Used when a short doc string is on multiple lines'), | ||
'W9002': | ||
('Doc string does not end with "." period', symbol + "-end-with", | ||
'Used when a doc string does not end with a period'), | ||
'W9003': ('All args with their types must be mentioned in doc string', | ||
symbol + "-with-all-args", | ||
'Used when not all arguments are in the doc string '), | ||
'W9005': ('Missing docstring or docstring is too shorter', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. shorter->short There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
symbol + "-missing", 'Add docstring longer >=10'), | ||
'W9006': ('Docstring indent error, use 4 space for indent', | ||
symbol + "-indent-error", 'Use 4 space for indent'), | ||
'W9007': ('You should add `Returns` in comments', | ||
symbol + "-with-returns", | ||
'There should be a `Returns` section in comments'), | ||
'W9008': ('You should add `Raises` section in comments', | ||
symbol + "-with-raises", | ||
'There should be a `Raises` section in comments'), | ||
} | ||
options = () | ||
|
||
def visit_functiondef(self, node): | ||
"""visit_functiondef checks Function node docstring style. | ||
Args: | ||
node (astroid.node): The visiting node. | ||
Returns: | ||
True if successful other wise False. | ||
""" | ||
|
||
self.check_doc_string(node) | ||
|
||
if node.tolineno - node.fromlineno <= 10: | ||
return True | ||
|
||
if not node.doc: | ||
return True | ||
|
||
doc = Docstring() | ||
doc.parse(node.doc) | ||
|
||
self.all_args_in_doc(node, doc) | ||
self.with_returns(node, doc) | ||
self.with_raises(node, doc) | ||
|
||
def visit_module(self, node): | ||
self.check_doc_string(node) | ||
|
||
def visit_classdef(self, node): | ||
self.check_doc_string(node) | ||
|
||
def check_doc_string(self, node): | ||
self.missing_doc_string(node) | ||
self.one_line(node) | ||
self.has_period(node) | ||
self.indent_style(node) | ||
|
||
def missing_doc_string(self, node): | ||
if node.tolineno - node.fromlineno <= 10: | ||
return True | ||
|
||
if node.doc is None or len(node.doc) < 10: | ||
self.add_message('W9005', node=node, line=node.fromlineno) | ||
return False | ||
|
||
# FIXME(gongwb): give the docstring line-no | ||
def indent_style(self, node, indent=4): | ||
"""indent_style checks docstring's indent style | ||
Args: | ||
node (astroid.node): The visiting node. | ||
indent (int): The default indent of style | ||
Returns: | ||
True if successful other wise False. | ||
""" | ||
if node.doc is None: | ||
return True | ||
|
||
doc = node.doc | ||
lines = doc.splitlines() | ||
|
||
for l in lines: | ||
cur_indent = len(l) - len(l.lstrip()) | ||
if cur_indent % indent != 0: | ||
self.add_message('W9006', node=node, line=node.fromlineno) | ||
return False | ||
|
||
return True | ||
|
||
def one_line(self, node): | ||
"""one_line checks if docstring (len < 40) is on one line. | ||
Args: | ||
node (astroid.node): The node visiting. | ||
Returns: | ||
True if successful otherwise False. | ||
""" | ||
|
||
doc = node.doc | ||
if doc is None: | ||
return True | ||
|
||
if len(doc) > 40: | ||
return True | ||
elif sum(doc.find(nl) for nl in ('\n', '\r', '\n\r')) == -3: | ||
return True | ||
else: | ||
self.add_message('W9001', node=node, line=node.fromlineno) | ||
return False | ||
|
||
return True | ||
|
||
def has_period(self, node): | ||
"""has_period checks if one line doc end-with '.' . | ||
Args: | ||
node (astroid.node): the node is visiting. | ||
Returns: | ||
True if successful otherwise False. | ||
""" | ||
if node.doc is None: | ||
return True | ||
|
||
if len(node.doc.splitlines()) > 1: | ||
return True | ||
|
||
if not node.doc.strip().endswith('.'): | ||
self.add_message('W9002', node=node, line=node.fromlineno) | ||
return False | ||
|
||
return True | ||
|
||
def with_raises(self, node, doc): | ||
"""with_raises checks if one line doc end-with '.' . | ||
Args: | ||
node (astroid.node): the node is visiting. | ||
doc (Docstring): Docstring object. | ||
Returns: | ||
True if successful otherwise False. | ||
""" | ||
|
||
find = False | ||
for t in node.body: | ||
if not isinstance(t, astroid.Raise): | ||
continue | ||
|
||
find = True | ||
break | ||
|
||
if not find: | ||
return True | ||
|
||
if len(doc.get_raises()) == 0: | ||
self.add_message('W9008', node=node, line=node.fromlineno) | ||
return False | ||
|
||
return True | ||
|
||
def with_returns(self, node, doc): | ||
"""with_returns checks if docstring comments what are returned . | ||
Args: | ||
node (astroid.node): the node is visiting. | ||
doc (Docstring): Docstring object. | ||
Returns: | ||
True if successful otherwise False. | ||
""" | ||
|
||
find = False | ||
for t in node.body: | ||
if not isinstance(t, astroid.Return): | ||
continue | ||
|
||
find = True | ||
break | ||
|
||
if not find: | ||
return True | ||
|
||
if len(doc.get_returns()) == 0: | ||
self.add_message('W9007', node=node, line=node.fromlineno) | ||
return False | ||
|
||
return True | ||
|
||
def all_args_in_doc(self, node, doc): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's talk offline. I'm excited that you can get args from a python function. We can use this solution to detect API change and enforce API-compatibility. |
||
"""all_args_in_doc checks if arguments are mentioned in doc | ||
Args: | ||
node (astroid.node): the node is visiting. | ||
doc (Docstring): Docstring object | ||
Returns: | ||
True if successful otherwise False. | ||
""" | ||
args = [] | ||
for arg in node.args.get_children(): | ||
if (not isinstance(arg, astroid.AssignName)) \ | ||
or arg.name == "self": | ||
continue | ||
args.append(arg.name) | ||
|
||
if len(args) <= 0: | ||
return True | ||
|
||
parsed_args = doc.args | ||
if len(args) > 0 and len(parsed_args) <= 0: | ||
print "debug:parsed args: ", parsed_args | ||
self.add_message('W9003', node=node, line=node.fromlineno) | ||
return False | ||
|
||
for t in args: | ||
if t not in parsed_args: | ||
print t, " with (type) not in ", parsed_args | ||
self.add_message('W9003', node=node, line=node.fromlineno) | ||
return False | ||
|
||
return True |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/bin/bash | ||
|
||
TOTAL_ERRORS=0 | ||
|
||
|
||
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" | ||
export PYTHONPATH=$DIR:$PYTHONPATH | ||
|
||
# The trick to remove deleted files: https://stackoverflow.com/a/2413151 | ||
for file in $(git diff --cached --name-status | awk '$1 != "D" {print $2}'); do | ||
pylint --disable=all --load-plugins=docstring_checker \ | ||
--enable=doc-string-one-line,doc-string-end-with,doc-string-with-all-args,doc-string-triple-quotes,doc-string-missing,doc-string-indent-error,doc-string-with-returns,doc-string-with-raises $file; | ||
TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?); | ||
done | ||
|
||
#exit $TOTAL_ERRORS | ||
#For now, just warning: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we don't turn on it in CI, but fails locally? (user can still use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 我想了一下,是不是给使用者一个一致的认知比较好? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 我担心如果本地也不报失败,没有人会去花时间修复问题,理论上CI需要强制验证,不过可能我们可以有一个试运行期,本地报错,CI不强制认证。 |
||
exit 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It will be great that pylint only requires the author to polish the code style for the codes he/she has touched, instead of covering the whole file. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's a great idea, but maybe we need consistency in the whole Python file? E.g., the old file is 2-space indentation, the rule could be 4-space indentation, we probably want the whole file to have the same 4-space indentation, rather than the modified lines. Some edge case is hard to handle as well, for example the developer changed the second line of a doc string, does the linter have to modify the first line as well? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks. |
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move import to the top?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.