Revamp and simplify Sphinx documentation process #2454

Closed
astrojuanlu opened this issue Mar 22, 2023 · 7 comments

Labels
Component: DevSetup Issue/PR that addresses technical setup of the project repository Component: Documentation 📄 Issue/PR for markdown and API documentation Type: Parent Issue

Comments

@astrojuanlu
Member

Lately we've been having some hiccups with the docs (kedro-org/kedro-plugins#134, #2451, #2453), which are difficult to fix because building the full documentation locally without warnings is nearly impossible. In gh-2452 I ended up disabling strict mode on Read the Docs so that we can come back to these more calmly.

The amount of non-standard customization we're doing in our docs/conf.py is quite large. Parts of that are necessary (like installing kedro-datasets to a custom location until 0.19 is released, see gh-2006, gh-1651) but we should try to keep it to a minimum.

And finally, we are using a really old version of Sphinx, which prevents us from benefiting from recent quality-of-life improvements, like sphinx-doc/sphinx#9481 and sphinx-doc/sphinx#7989.

Without boiling the ocean, and in the broader context of gh-2025, I think we should:

  • Remove unused extensions and configuration (low hanging fruit - this includes things like LaTeX conf, man pages conf, and so on)
  • Upgrade to Sphinx 6 and docutils >= 0.18 (which requires updates to the CSS, otherwise the styles are not properly applied)
  • Stop editing docstrings on the fly (make those changes to the docs directly) or replace that functionality with a third-party extension, like https://sphinx-codeautolink.readthedocs.io/en/latest/:

kedro/docs/conf.py

Lines 354 to 359 in c144b87

def autolink_replacements(what: str) -> List[Tuple[str, str, str]]:
    """
    Create a list containing replacement tuples of the form:
    (``regex``, ``replacement``, ``obj``) for all classes and methods which are
    imported in ``KEDRO_MODULES`` ``__init__.py`` files. The ``replacement``
    is a reStructuredText link to their documentation.

  • Try to remove the need to move files around and remove directories before and after documentation builds:

kedro/docs/conf.py

Lines 496 to 507 in c144b87

def _prepare_build_dir(app, config):
    """Get current working directory to the state expected
    by the ReadTheDocs builder. Shortly, it does the same as
    ./build-docs.sh script except not running `sphinx-build` step."""
    build_root = Path(app.srcdir)
    build_out = Path(app.outdir)
    copy_tree(str(here / "source"), str(build_root))
    copy_tree(str(build_root / "api_docs"), str(build_root))
    shutil.rmtree(str(build_root / "api_docs"))
    shutil.rmtree(str(build_out), ignore_errors=True)
    copy_tree(str(build_root / "css"), str(build_out / "_static" / "css"))
    shutil.rmtree(str(build_root / "css"))

One more thing that could greatly improve the process, but that needs some investigation, is moving away from sphinx.ext.autosummary and using either sphinx-autoapi or sphinx-autodoc2 instead (probably the latter, since it can generate Markdown). The reason is that autosummary needs to import the code, and therefore all the dependencies of all kedro-datasets need to be installed. On the other hand, autoapi and autodoc2 use static analysis, and therefore don't need any of the dependencies installed.
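
To illustrate why static analysis sidesteps the dependency problem, here is a small sketch using only the standard library `ast` module. It is not how sphinx-autoapi or sphinx-autodoc2 are implemented; it only shows the principle that docstrings can be read from source text without executing its imports. `some_missing_dependency`, `ExampleDataSet` and `collect_docstrings` are hypothetical names.

```python
import ast
import textwrap
from typing import Dict

# Source of a module whose import would fail if the dependency is not installed.
SOURCE = textwrap.dedent('''
    import some_missing_dependency  # hypothetical, assumed not installed

    class ExampleDataSet:
        """Load and save example data."""

        def load(self):
            """Return the loaded data."""
            return some_missing_dependency.read()
''')


def collect_docstrings(source: str) -> Dict[str, str]:
    """Map qualified names to docstrings by parsing, never importing."""
    found: Dict[str, str] = {}

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(
                child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
            ):
                name = f"{prefix}{child.name}"
                doc = ast.get_docstring(child)
                if doc:
                    found[name] = doc
                visit(child, name + ".")

    visit(ast.parse(source), "")
    return found


docs = collect_docstrings(SOURCE)
for qualified_name, doc in docs.items():
    print(f"{qualified_name}: {doc}")
```

An import-based tool would have to execute `import some_missing_dependency` first, which is why every dataset dependency has to be installed for the current build.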

@astrojuanlu astrojuanlu added Component: Documentation 📄 Issue/PR for markdown and API documentation Component: DevSetup Issue/PR that addresses technical setup of the project repository labels Mar 22, 2023
@astrojuanlu astrojuanlu added this to the Sphinx enhancements milestone Mar 22, 2023
@astrojuanlu astrojuanlu moved this to In Progress in Kedro Framework Mar 29, 2023
@astrojuanlu astrojuanlu self-assigned this Mar 29, 2023
@astrojuanlu astrojuanlu moved this from In Progress to To Do in Kedro Framework Mar 30, 2023
@astrojuanlu
Member Author

First revamp happened in gh-2459. Pending:

  • Attempt to move to sphinx-autoapi or sphinx-autodoc2 to avoid having to install all the datasets dependencies to build the docs (high priority)
  • Move all API docs to /api/ #2483 (high priority)

Then some low priority stuff:

@stichbury stichbury moved this from To Do to Done in Kedro Framework Mar 31, 2023
@stichbury
Contributor

Marking this as a parent ticket for #2459

@astrojuanlu
Member Author

Maybe you meant some other issue? (That one is the PR we merged last week)

@astrojuanlu
Member Author

Note to self: sphinx-autodoc2 does not support Napoleon (Google style) docstrings yet: executablebooks/MyST-Parser#228

@astrojuanlu
Member Author

Another note to self: sphinx-autoapi cannot generate one page per class, as sphinx.ext.autosummary does (see readthedocs/sphinx-autoapi#226). Therefore, even if it's technically feasible to migrate to it (I did some successful local experiments), it would completely break the current URL structure.

@astrojuanlu
Member Author

Conclusion: it won't be possible to cleanly migrate to statically building our API docs without significant breakage (or redirects toil), significant coding efforts, or both.

I'm abandoning that effort for now (keeping it in mind for our upcoming discussions on #2072). @stichbury do you still want to keep this as a parent issue of gh-2483 and gh-2394? I think those two are not so much about the process (assuming unchanged outputs, as we did with gh-2459), but more about the outputs themselves (URLs, and look & feel).

@stichbury
Contributor

Agreed, let's close this and keep those issues separate, not children of this one.
