Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Associate comment of first statement properly in a Module #405

Open
jimmylai opened this issue Oct 20, 2020 · 4 comments
Open

Associate comment of first statement properly in a Module #405

jimmylai opened this issue Oct 20, 2020 · 4 comments
Labels
parsing Converting source code into CST nodes

Comments

@jimmylai
Copy link
Contributor

In a local scope, e.g. inside a function, class, if condition.
The comment is associated with the statement correctly.

In [10]: code = """
    ...: def f():
    ...:   # abc
    ...:   import d
    ...: """

In [11]: cst.parse_module(code)
Out[11]:
Module(
    body=[
        FunctionDef(
            name=Name(
                value='f',
                lpar=[],
                rpar=[],
            ),
            ...
            body=IndentedBlock(
                body=[
                    SimpleStatementLine(
                        body=[
                            Import(
                                names=[
                                    ImportAlias(
                                        name=Name(
                                            value='d',
                                            lpar=[],
                                            rpar=[],
                                        ),
                                        asname=None,
                                        comma=MaybeSentinel.DEFAULT,
                                    ),
                                ],
                                semicolon=MaybeSentinel.DEFAULT,
                                whitespace_after_import=SimpleWhitespace(
                                    value=' ',
                                ),
                            ),
                        ],
                        leading_lines=[
                            EmptyLine(
                                indent=True,
                                whitespace=SimpleWhitespace(
                                    value='',
                                ),
                                comment=Comment(
                                    value='# abc',
                                ),
                                newline=Newline(
                                    value=None,
                                ),
                            ),
                        ],
                        trailing_whitespace=TrailingWhitespace(
                            whitespace=SimpleWhitespace(
                                value='',
                            ),
                            comment=None,
                            newline=Newline(
                                value=None,
                            ),
                        ),
                    ),
                ],
                ...
)

However, in a module, the comment is parsed as module headers. So some codemods may insert statement between the comment and the first statement, e.g. EnsureImportPresentCommand .

In [8]: code = """
   ...: # abc
   ...: import d
   ...: """

In [9]: cst.parse_module(code)
Out[9]:
Module(
    body=[
        SimpleStatementLine(
            body=[
                Import(
                    names=[
                        ImportAlias(
                            name=Name(
                                value='d',
                                lpar=[],
                                rpar=[],
                            ),
                            asname=None,
                            comma=MaybeSentinel.DEFAULT,
                        ),
                    ],
                    semicolon=MaybeSentinel.DEFAULT,
                    whitespace_after_import=SimpleWhitespace(
                        value=' ',
                    ),
                ),
            ],
            leading_lines=[],
            trailing_whitespace=TrailingWhitespace(
                whitespace=SimpleWhitespace(
                    value='',
                ),
                comment=None,
                newline=Newline(
                    value=None,
                ),
            ),
        ),
    ],
    header=[
        EmptyLine(
            indent=True,
            whitespace=SimpleWhitespace(
                value='',
            ),
            comment=None,
            newline=Newline(
                value=None,
            ),
        ),
        EmptyLine(
            indent=True,
            whitespace=SimpleWhitespace(
                value='',
            ),
            comment=Comment(
                value='# abc',
            ),
            newline=Newline(
                value=None,
            ),
        ),
    ],
    footer=[],
    encoding='utf-8',
    default_indent='    ',
    default_newline='\n',
    has_trailing_newline=True,
)

It's probably not easy to differentiate whether a comment is for the first statement or not to handle it properly when parsing a module. It could be solved by building a helper to post processing the CST by giving some hints (e.g. # lint-ignore comment should be associate with the first statement).

CC @zsol @thatch

@Kronuz
Copy link
Contributor

Kronuz commented Oct 20, 2020

What if we take as headers only comments starting with #! or #\s*@ and also comments that are not followed by a statement?, in other words any of the following cases:

#!/usr/bin/env python
# @oncall xxx
# @oncall xxx
# some comments with no immediate statements after
# going to the header.

# statement comment
import xxx

@jimmylai
Copy link
Contributor Author

What if we take as headers only comments starting with #! or #\s*@ and also comments that are not followed by a statement?, in other words any of the following cases:

That sounds like a great idea! We can try implement this and run on existing code to see if it works well.

@thatch
Copy link
Contributor

thatch commented Nov 22, 2020

I don't know that there's a perfect, general, easily-configurable way to handle this. For context, I'm assuming this is trying to solve the special case of Instagram/Fixit#143 in libcst itself, which would be great.

My proposal would be not specialcase @ because of @manual autodeps. Just #! (in the first line) and PEP 263 comments (in the first two lines). Are you already working on a PR for this?

@thatch
Copy link
Contributor

thatch commented Nov 30, 2020

Was pointed at another example today where the first line was a # pyre-fixme... comment being considered module header.

@zsol zsol added the parsing Converting source code into CST nodes label Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parsing Converting source code into CST nodes
Projects
None yet
Development

No branches or pull requests

4 participants