Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support request: how do I get a DocBlock when there are no nodes #36

Closed
mvriel opened this issue Sep 21, 2012 · 9 comments
Closed

Support request: how do I get a DocBlock when there are no nodes #36

mvriel opened this issue Sep 21, 2012 · 9 comments

Comments

@mvriel
Copy link

mvriel commented Sep 21, 2012

Hey @nikic,

I received the following bug report in phpDocumentor/phpDocumentor2: phpDocumentor/phpDocumentor#604

The reporter indicates that the first DocBlock could not be found when there are no nodes. This is verified behaviour since I try to claim the file-level DocBlock in a NodeVisitor using the beforeTraverse method.

Can you provide me with a clue what the best way would be to still obtain the first DocComment in a file using PHP-Parser? (if PHP-Parser supports that)

@nikic
Copy link
Owner

nikic commented Sep 21, 2012

The issue here is that ?> in PHP comes with an implicit semicolon, so from the parsers perspective there is a semicolon-statement (;) after the docblock. As those are no-ops the parser doesn't generate nodes for them, so the doc block is dropped.

There are several ways this could be solved:

  1. Fetching the file doc block using the tokenizer. Code would look similar to this:

    <?php
    function getFileDocComment($sourceCode) {
        foreach (token_get_all($sourceCode) as $token) {
            if ($token[0] === T_DOC_COMMENT) {
                return $token[1];
            } elseif ($token[0] !== T_OPEN_TAG && $token[0] !== T_WHITESPACE) {
                // only allow an opening tag and whitespace before file doc comment
                return false;
            }
        }
    
        return false;
    }

    This has the advantage that it'll work even if there is nothing after the doc comment at all (i.e. the file contains only a doc comment and nothing else).

  2. Instead of just dropping semicolon-statements I could return a node for them. Those would capture the doc comment. (This wouldn't work if the doc comment it the only thing in the file and there isn't even a closing ?> php tag.)

  3. I could probably add the doc comment to the node following the semicolon (not yet sure how this would be done, but there probably is a way). (This wouldn't work if there is only a closing ?> tag and no inline HTML after it.)

All variants seem somewhat ugly, so I'm not really sure what to do about this :/

@mvriel
Copy link
Author

mvriel commented Sep 21, 2012

I agree with you that the above solutions are less than optimal. Let's see if another solution is possible.
For me to venture there I need some additional information regarding current behaviour. I have (mostly) read the Lexer and Parser to try and get an idea how the data is used but especially the parser is too hard to comprehensively study in a short amount of time.

As such, is the following use case correct behaviour?

Code

<?php
// this is a test

/**
 * Short description
 */
/* this is a common multiline comment */
$a * $b == 1 + 2;

According to a var_dump of the nodes will 'Expr_Equal' node, the 'Expr_Mul' node and the 'Expr_Variable' $a ALL have the comment 'this is a test', 'Short description' and 'this is a common multiline' comment.

I would have expected that 'this is a test' and 'this is a common multiline comment' would have belonged to the file (shouldn't the file actually be a top parent node?) and that the Short description would only have belonged to the Expr_Equal node.

Example clarifying my last statement:

<?php
/** @var \SimpleXMLElement $a */
$a * $b == 1 + 2;

The above indicates that the Expr_Equal contains a variable $a which is of type SimpleXMLElement in this specific context. It may be assumed that $a is the same type in subsequent contexts but only here we know for sure.

@mvriel
Copy link
Author

mvriel commented Oct 13, 2012

Hey @nikic,

I'd love to hear your thoughts on the above if you have time,

Thanks in advance.

@nikic
Copy link
Owner

nikic commented Oct 13, 2012

Hey @mvriel. I must have missed the notification for your comment, sorry :/

Regarding the comments, no, this is not the intended behavior and I didn't notice that this happens until now. Though thinking about it again, it's quite logical that it happens: Just like all (also nested) nodes get assigned a line number, nested nodes also get assigned the comments.

Fixing this will be rather hard though, as it can't really be done in a generic way, at least I don't see one. Rather I'd have to assign comments only to statements (and expressions in statement use). But it's probably still worth to do it for the memory savings.

@pscheit
Copy link

pscheit commented Oct 13, 2012

maybe a node / container / something with "lostandfound"-comments ? :) aka "not assignable" - comments?

@mfn
Copy link

mfn commented Nov 13, 2015

Ran into this today too.

All docblocks/comments get accumulated to the next node found or dropped if none.

This makes it really inconvenient to use the same visitor/parser concept for a mixed tree where some files have docblocks without comments, etc.

I wish there was a mode where docblock/comments where just emmited as individual nodes like anything else, without being required to be attached somewhere.

nikic added a commit that referenced this issue Mar 9, 2016
Adding this as an option to avoid breaking people's tests.

Some of the test results show pretty clearly that we are incorrectly
assigning the same comment multiple times for nested nodes (mentioned
in #36).
@nikic
Copy link
Owner

nikic commented Mar 9, 2016

Commit 7eac2cf adds a Nop statement if there any trailing comments in a statement list. This should also include the case of the original report.

I've create a separate issue for the duplicate comment assignment (#253), so we can close this issue at last :)

@nikic nikic closed this as completed Mar 9, 2016
@mvriel
Copy link
Author

mvriel commented Mar 10, 2016

Thanks @nikic! I had missed the addition of the NOP statement so that should enable me to introduce file-level docblocks. Thanks!

@nikic
Copy link
Owner

nikic commented Mar 10, 2016

@mvriel You didn't miss anything, I've only added them yesterday ;) The date info on that commit is off, because I rebased it off an older branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants