DOC: autosummary usage to avoid warnings and generate extra pages seems incorrect #54454

Closed
1 task done
vyasr opened this issue Aug 7, 2023 · 4 comments · Fixed by #54457

@vyasr
Contributor

vyasr commented Aug 7, 2023

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

There are a few (related) issues. One is with the class templates, especially
https://github.com/pandas-dev/pandas/blob/main/doc/_templates/autosummary/class_without_autosummary.rst

The other issue is present in a handful of independent rst files, such as (using raw links because the problematic pieces are commented out)
https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/source/reference/extensions.rst
or
https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/source/reference/arrays.rst

Documentation problem

There are multiple places that seem to rely on commented-out rst sections containing autosummary or toctree directives to have some effect. It is not clear to me what effect these are intended to have. In some places there are comments saying that this is meant to either 1) silence Sphinx warnings about pages not being included in toctrees or 2) generate pages that would not otherwise be generated. However, I think both of those would only work if those sections were not commented out. Some of these appear to have been added intentionally, though, which makes me wonder if there is some subtle behavior that I'm missing here.

The class_without_autosummary.rst template is particularly confusing. The dev docs say that its purpose is to support classes where only a subset of methods is documented on the class's manually curated summary page (here I distinguish, e.g., the manually managed DataFrame summary page from the autodoc-managed DataFrame class page). However, a closer inspection of the default class.rst template shows that the two are in fact the same. The default class page inherits from autosummary's base class page but then overrides the methods and attributes sections as empty. With that change, class.rst and class_without_autosummary.rst are effectively the same: just a title, a currentmodule directive, and an autoclass directive with no arguments. The real difference between classes that list all their methods and those that only include a subset is that the latter have Attributes and Methods sections in the class docstring. This works because (AFAIK from testing; I can't find it documented) numpydoc generates class summary lists only if none are already present, so by including an explicit list in some classes pandas prevents numpydoc from generating a full list of APIs (e.g. for CategoricalIndex).
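
To make the docstring distinction concrete, here is a minimal sketch of a class whose docstring carries its own Attributes and Methods sections, next to one that does not. The class names and the `explicit_member_sections` helper are hypothetical and exist only for illustration; the helper merely detects the section headers and does not reproduce numpydoc's actual logic.

```python
import inspect
import re

# Matches a numpydoc-style section header ("Attributes" or "Methods")
# followed by its dashed underline.
SECTION_RE = re.compile(r"^(Attributes|Methods)\n-+$", re.MULTILINE)


class PartiallyDocumented:
    """
    Hypothetical class that curates its own member lists.

    Attributes
    ----------
    codes

    Methods
    -------
    rename
    """

    codes = ()

    def rename(self, name):
        return name

    def internal_helper(self):
        # Not listed in the docstring, so it would stay off the summary.
        return None


class FullyAutomatic:
    """Hypothetical class with no explicit member sections."""

    def rename(self, name):
        return name


def explicit_member_sections(cls):
    """Return the member section names written by hand in the docstring."""
    doc = inspect.getdoc(cls) or ""
    return set(SECTION_RE.findall(doc))


print(explicit_member_sections(PartiallyDocumented))
print(explicit_member_sections(FullyAutomatic))
```

If the behavior described above holds, only the second class would get an automatically generated full member list, because the first already supplies both sections.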

Suggested fix for documentation

If I have understood correctly, I think pandas can remove the class_without_autosummary.rst template entirely and use the class.rst template alone for all classes. Furthermore, all the commented out sections can be removed.

I tested out the proposed template changes locally and saw no difference in the rendered docs. I also didn't notice any changes in the number of warnings when I removed a few of the commented out sections, but I did not test that change thoroughly so it's possible there are some edge cases that I missed.

If the devs agree with my suggestions, I'm happy to make a PR since I already have a lot of the changes locally.

vyasr added the Docs and Needs Triage labels Aug 7, 2023
@vyasr
Contributor Author

vyasr commented Aug 7, 2023

@jorisvandenbossche looks like you may have been the last person to touch some of this stuff (back in 2019 though, so not exactly top of mind). Curious if you have any insights beyond what I've listed.

CC @mroeschke

vyasr mentioned this issue Aug 8, 2023
@GieralZiomal

take

@vyasr
Contributor Author

vyasr commented Aug 8, 2023

It seems like the hacks are indeed somehow working to get autosummary to generate stub pages even when that part is commented out and not visible. Perhaps a simpler option would be to disable numpydoc's generation of the summary tables and rely entirely on autosummary. It might make things a bit more verbose, but would at least be more explicit.
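
One plausible explanation, offered as an assumption rather than a confirmed description of Sphinx internals, is that stub generation scans the source files line by line for autosummary directives instead of parsing the rst, so a directive nested inside a comment is still found. The sketch below illustrates that idea with a hand-written regex; the pattern, the function name, and the sample text are all hypothetical.

```python
import re

# Assumed pattern: an autosummary directive at any indentation.
# Written for illustration, not copied from Sphinx.
AUTOSUMMARY_RE = re.compile(r"^(\s*)\.\.\s+autosummary::\s*$")


def find_autosummary_entries(text):
    """Return entry names listed under any autosummary directive in text.

    The scan is purely line based, so it has no notion of rst comments.
    """
    entries = []
    base_indent = None  # indentation of the directive being read, if any
    for line in text.splitlines():
        match = AUTOSUMMARY_RE.match(line)
        if match:
            base_indent = len(match.group(1))
            continue
        if base_indent is None:
            continue
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not end the directive body
        indent = len(line) - len(line.lstrip())
        if indent <= base_indent:
            base_indent = None  # dedent: the directive body has ended
        elif not stripped.startswith(":"):
            entries.append(stripped)  # skip options such as :toctree:
    return entries


# Hypothetical rst source: the directive sits inside an rst comment ("..").
SAMPLE = """\
Title
=====

..
    .. autosummary::
        :toctree: api/

        pandas.Example.method_a
        pandas.Example.method_b

Visible paragraph.
"""

print(find_autosummary_entries(SAMPLE))
```

A scan like this reports both entries even though a real rst parser would treat the whole indented block as a comment and render nothing, which would match the observation that the pages are generated but not visible.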

@mroeschke
Member

Perhaps a simpler option would be to disable numpydoc's generation of the summary tables and rely entirely on autosummary. It might make things a bit more verbose, but would at least be more explicit.

+1 to this suggestion. I don't think many of the core developers know how or why the current setup works.
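
For reference, a rough sketch of what that suggestion could look like in a Sphinx conf.py. This is an illustrative fragment under the assumption that numpydoc and autosummary are both enabled; it is not pandas' actual configuration, and the reference pages would then need explicit autosummary listings for every member that should be documented.

```python
# conf.py fragment: a sketch, not pandas' actual configuration.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "numpydoc",
]

# numpydoc option: do not automatically list every method and attribute
# of a class in its rendered docstring.
numpydoc_show_class_members = False

# autosummary option: generate stub pages for the entries listed under
# autosummary directives that use :toctree:.
autosummary_generate = True
```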

lithomas1 removed the Needs Triage label Aug 12, 2023
4 participants