Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for generating methodsynopses from stubs #6367

Closed
wants to merge 3 commits into from

Conversation

kocsismate
Copy link
Member

@kocsismate kocsismate commented Oct 22, 2020

2nd take of #5895.

@kocsismate kocsismate requested a review from cmb69 October 22, 2020 08:43
@kocsismate kocsismate marked this pull request as draft October 22, 2020 08:43
@kocsismate kocsismate changed the base branch from master to PHP-8.0 October 22, 2020 08:44
build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
ext/bcmath/methodsynopses/bcadd.xml Outdated Show resolved Hide resolved
@kocsismate kocsismate marked this pull request as ready for review October 23, 2020 10:38
@kocsismate
Copy link
Member Author

I've just implemented a few things:

I generated the methodsynopses for a few smaller extensions to showcase the different scenarios (e.g. FFI uses namespaces, ext/dba has variadic parameters, ext/tokenizer has the static[] PHPDoc-based type).

There's a smaller caveat I noticed in the meanwhile: currently we don't require magic methods to have a return type, although I believe there's nothing stopping us after the related RFC, except for constructors, right? I'd like to create a separate PR to change this (+ add PHPDOc return type hints).

@nikic
Copy link
Member

nikic commented Oct 23, 2020

There's a smaller caveat I noticed in the meanwhile: currently we don't require magic methods to have a return type, although I believe there's nothing stopping us after the related RFC, except for constructors, right? I'd like to create a separate PR to change this (+ add PHPDOc return type hints).

Right, we only need to exclude __construct and __destruct, everything else can have types.

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good to me, and I think we should already merge what you have here, though more work will be needed to support aliases.

I think one of the main questions I have is how we'll get these methodsynopses into the manual. Ideally we'd want to do that automatically as well, by inserting the new synopsis, extracting the old parameter names and comparing them against the new ones and then automatically renaming any references. But this is probably easier said than done.

build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
@@ -399,6 +510,10 @@ public function getArgInfoName(): string {
return "arginfo_$underscoreName";
}

public function getMethodsynopsisFilename(): string {
return implode('_', $this->name->parts);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is out of place here imho.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was also not very happy with it, however the interface doesn't let me to access $this->name->parts directly. And since the implementation is different in case of the MethodName class, I think we'll need a method similar to getMethodsynopsisFilename in the FunctionOrMethodName interface. Or do you have other suggestion?

build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Outdated Show resolved Hide resolved
build/gen_stub.php Show resolved Hide resolved
@kocsismate
Copy link
Member Author

On question came to my mind: Should I disable the generation of methodsynopsis if a function/method contains an UNKNOWN parameter? I'd say yes, but I'm curious if there's any counter-argument you see?

@nikic
Copy link
Member

nikic commented Oct 27, 2020

@kocsismate I think that's reasonable. If there's UNKNOWN parameters we'll usually want to specify two signatures instead, as the function uses some form of overloading. Maybe those signatures can be generated as well, but it may be more fruitful to handle manually.

@kocsismate kocsismate marked this pull request as draft October 28, 2020 09:40
@kocsismate
Copy link
Member Author

I'm working on something, so I'm marking this as draft again for a while :)

@kocsismate kocsismate marked this pull request as ready for review October 29, 2020 00:07
@kocsismate
Copy link
Member Author

I managed to implement an MVP for the direct methodsynopses replacer tool! Contrarily to my initial plans, I completely did it in gen_stub.php because it's more comfortable for users (they have to run only 1 script), and for me as well, since all the needed information is directly available, there's no need to parse the generated methodsynopses.

Usage is: ./build/gen_stub.php --replace-methodsynopses -f ext/bcmath /path/php-doc-en/reference/bc

Support for aliases is still missing, but I'll continue with it. Also, I haven't yet got time for double-checking the sanity of the code :) Please disregard the entity replacement hack :D

Bonus question: the manual uses 2 spaces at the root level, and 1 space afterwards for indentation, so I followed this with the replacement tool. Is it ok, right?

I've put up the following PR (php/doc-en#168) for introducing the result of the script.

@kocsismate
Copy link
Member Author

I managed to add support for generating methodsynopsis for function-method counterparts, and I made the replacement much more clean (now, it's almost entirely XML-based!). I think I'll be able to replace parameters the same way, rather than by using str_replace.

XML-based replacement has the consequence of causing some unintended changes (mostly whitespace). I think that's tolerable, but it's worth a mention. Besides this, my code tries to compare the original and the new methodsynopses before replacing in order not to overwrite the file if there is no change. This is currently not working, because a whitespace is still left in the original XML even though I normalize it, and try not to preserve whitespaces.

@cmb69
Copy link
Member

cmb69 commented Oct 30, 2020

Did you try to preserve whitespace, and not do any normalization?

@kocsismate
Copy link
Member Author

Did you try to preserve whitespace, and not do any normalization?

Yes, I did. However, I tried another approach now, and it works now! :)

@cmb69
Copy link
Member

cmb69 commented Nov 2, 2020

the manual uses 2 spaces at the root level, and 1 space afterwards for indentation

Hmm, does it? Actually, the root element of every file should not be indented at all, and all children should be indented by 1 space.

@kocsismate
Copy link
Member Author

Hmm, does it? Actually, the root element of every file should not be indented at all, and all children should be indented by 1 space.

Yes, you are right. I think I just can't count :)

I believe I am almost ready with the script, and review is needed to continue. There's also a few things which we should find out:

  • what should the change log entry contain? Currently, it's just a dummy text, but since we have all the necessary information, we could describe all the changes, if we want to. But it's not clear for me how much detail we should mention.
  • I noticed that a few manual pages (e.g. https://www.php.net/manual/en/datetime.construct.php) discusses function/method counterparts together which are not real aliases. I think the only negative effect this has is that the role="procedural/oop" attribute gets lost. I'm not sure how much of a problem it is.
  • Currently, a change log entry gets generated for each methodsynopsis, even for the same page. Should I deduplicate these, right? However, I'm afraid that this has its own edge cases.
  • The code became a spaghetti in the meanwhile, so any ideas for improvement are welcome :)

@kocsismate
Copy link
Member Author

@cmb69 I was thinking about how to display long function signatures, and it came to my mind that we could maybe add an methodsynopsis attribute (e.g. display="long") for functions with many parameters (more than x?), so that we would have something like this:

Screenshot 2020-11-04 at 11 18 07

Does it sound reasonable/feasible?

Of course, the number of parameters is not always the best indicator for long signatures, so we should find a better one if want to go this way.

@nikic
Copy link
Member

nikic commented Nov 4, 2020

I think to move forward here we should have php/doc-en#168 for a single, not too large and not too "complicated" extension and try to land that.

I also think that we should not include the change log generation here and handle that on a case-by-case basis instead (and preferably not as part of the signatures update -- let's keep a side-list of functions where a change log entry should be added instead). A lot of the generated changelogs seem to be "false positives" -- yes, the signature did change, but not in a way that requires explicit mention. Changing an int parameter to bool requires a changelog mention, but a change from string to ?string probably doesn't (both in the sense that it's unnecessarily pedantic, and because often this just documents pre-existing behavior).

@cmb69
Copy link
Member

cmb69 commented Nov 4, 2020

@kocsismate, I don't think that putting each parameter on a separate line would be a real improvement. That would quickly require excessive scrolling to see the rest of the page. If we wanted to go fancy, we could add a button to toggle between different views via JS.

@kocsismate
Copy link
Member Author

I also think that we should not include the change log generation here and handle that on a case-by-case basis instead (and preferably not as part of the signatures update -- let's keep a side-list of functions where a change log entry should be added instead). A lot of the generated changelogs seem to be "false positives" -- yes, the signature did change, but not in a way that requires explicit mention. Changing an int parameter to bool requires a changelog mention, but a change from string to ?string probably doesn't

Ok, you are right that we shouldn't add a change log entry for each tiny little change. However, adding the change log entries manually will be a lot of work, and I think we could spare some manual efforts by generating change log only for the following cases:

  • when the number of the parameters has changed
  • when the type of a parameter has changed so that it wasn't previously or isn't currently mixed, and it doesn't currently include any previous types, not regarding null
  • when a return type has changed so that it wasn't previously or isn't currently mixed, and it doesn't currently include any previous types, not regarding null

These rules would guarantee that we only generate an entry for functions where there were actual changes. E.g. ob_get_flush() wouldn't be affected, even though the return type is changed from string to string|false, neither range() would, where the mixed param types were specified as string|int|float. On the other hand, functions where we changed types (e.g. int to bool), or removed unnecessary parameters would have a change log. Of course, I don't calculate with the cases when the manual currently reports a type plain wrong...

What do you think about this idea? If it still doesn't make much sense, I'm also ok to completely drop the change log generator from the script for now, but it could be merged only to the master branch later.

@cmb69
Copy link
Member

cmb69 commented Nov 7, 2020

I agree with Nikita that we shouldn't worry about the changelog entries right now, but instead start by actually doing a single extension. This should help us to see whether some automatic changelog generation is helpful, and may point out further automation options. And I wouldn't worry too much about the stability of the doc generation from stubs; IMHO some changes can still be applied to 8.0 after 8.0.0 GA.

@kocsismate
Copy link
Member Author

Alright, I restricted generation only to the Zend directory. :) I hope that you'll also find it to be a suitable first candidate.

@MCMic
Copy link
Contributor

MCMic commented Nov 23, 2020

Wonderful work!

I tried it on ext/ldap, the diff seems solid and consistent with the stubs.

The only problem appears with ldap_connect, because it is documented as two signatures: https://www.php.net/manual/en/function.ldap-connect.php

Maybe the deprecated signature should be removed from documentation? Otherwise I’m not sure how to document it, since there is no parameter actually named «$host».

@cmb69
Copy link
Member

cmb69 commented Nov 23, 2020

@MCMic, yes, in this case it is probably best to remove the deprecated signature, and move the info about that signature to the $uri parameter.

@kocsismate
Copy link
Member Author

Can I go ahead with merging this (after fixing the conflicts)? I'd need some of the functionality available in this PR for a future work. Also, I think my last commit ("Add warning when a function is not defined in stubs") should be removed, because it can cause false positive errors when you don't specify the target directory exactly precisely. E.g. the documentation for ext/standard is is in multiple subdirectories inside doc-en/reference, so one can just set doc-en/reference as a target - but then lots of warnings are generated).

Copy link
Member

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objection to merge this. Actually, at this point you're the expert on this, since you already ran it for a lot of extensions, and the results looked good to me.

@kocsismate
Copy link
Member Author

I have no objection to merge this. Actually, at this point you're the expert on this, since you already ran it for a lot of extensions, and the results looked good to me.

Yeah, but there might be coding style or code structuring problems, that's the main reason of the question. :) But since it's not a core part in the narrow sense, I'll merge this PR, and will later fix any issues that arise, whenever @nikic has a look.

Retrieve type info from PHPDoc and add support for proper union types and variadic params

Do not display the modifier for free standing functions

Do not generate return type for constructors and destructors

Removed generated methodsynopses

Address review comments

Exclude functions with unknown param default value

Add support for direct replacement of methodsynopses

Improve replace-methodsynopses subcommand

Add support for change log generation

Remove change log generation

A few tweaks

Address a few feedbacks

Do not generate the role attribute for methodsynopses

Revert "Do not generate the role attribute for methodsynopses"

This reverts commit 22e1f3b.

Add support for generating destructorsynopsis

Fix conditions for generating the role attribute
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants