[FEA] Use a single integrated documentation solution across components #11481

Closed
vyasr opened this issue Aug 5, 2022 · 6 comments · Fixed by #13846
Assignees
Labels
doc Documentation feature request New feature or request proposal Change current process or code pylibcudf Issues specific to the pylibcudf package Python Affects Python cuDF API.

Comments

vyasr commented Aug 5, 2022

Is your feature request related to a problem? Please describe.
Currently the C++ documentation and Python documentation are managed and published completely separately. The Python documentation uses Sphinx, while the C++ documentation uses doxygen. Sphinx docs make it significantly easier to include documentation beyond API docs (e.g. user guides or detailed topic references), while doxygen is much more focused on API documentation alone. #11475 demonstrates how non-API documentation can be integrated with doxygen docs, but this approach is limited relative to the flexibility that Sphinx supports. Additionally, Sphinx styling is easier to modify due to the large number of available themes and the knobs that can be easily turned for them. It would be nice if all of our documentation for the different language libraries (as well as different components like developer docs and API docs) could be centralized and presented in a unified manner.

Describe the solution you'd like
We should consider migrating all of our documentation to the new Omniverse documentation system. It provides a single, unified platform for building both C++ and Python documentation into one Sphinx document. It supports the same documentation sources that we already use (doxygen for C++ API docs and rST for Python API docs) while also making it easy to add whatever extra pages we might want (such pages already exist for the Python documentation).

Describe alternatives you've considered
One oft-cited benefit of our current documentation layout is that it maintains an alignment with pandas documentation. This makes it easier for users to find the corresponding APIs between the two libraries. While migrating to the Omniverse documentation system would be a great solution for unifying our documentation and providing a layout and style that is very on-brand for NVIDIA tooling, the different styling may cause some dissonance for readers. If we think this is a significant issue (although I don't anticipate this being the case) we could consider using Breathe directly in our Sphinx docs. Breathe is what allows Sphinx docs to talk to doxygen and parse those API docs (it's what the Omniverse documentation system uses under the hood), and we could leverage it directly in our existing Sphinx documentation. This approach would give us unified documentation while still retaining the pandas-compatible style.
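
For a rough sense of what using Breathe directly would involve, here is a minimal sketch of the relevant `conf.py` settings. The project name `libcudf` and the XML path are placeholders for illustration, not the repository's actual layout, and the sketch assumes doxygen is run with `GENERATE_XML = YES`.

```python
# conf.py (sketch): wiring doxygen XML output into an existing Sphinx build.
# Assumption: doxygen has already been run with GENERATE_XML = YES and its
# XML output lives at the hypothetical path used below.

extensions = [
    "sphinx.ext.autodoc",  # Python API docs generated from docstrings
    "breathe",             # bridge from doxygen XML to Sphinx
]

# Map a Breathe project name to the directory containing doxygen's XML.
# Both the name "libcudf" and the path are placeholders.
breathe_projects = {"libcudf": "../../cpp/doxygen/xml"}
breathe_default_project = "libcudf"

# The existing theme stays unchanged, which keeps the pandas-like layout.
html_theme = "pydata_sphinx_theme"
```

With settings along these lines, an rST page can pull in C++ API docs through Breathe directives such as `doxygenfunction` or `doxygenclass`.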

Additional context
Migrating all of our documentation -- whether to Omniverse or to Breathe -- is a large change that will need to be synchronized across all of RAPIDS. It will affect both user- and developer-facing documentation, so the effects should be carefully considered. Moreover, we should expect that the combination of Breathe and the PyData Sphinx theme that we use will have some incompatibilities that will need to be addressed, and Breathe may affect formatting in surprising ways, so we'll need to do a thorough review. As such, any effort to modify the cudf documentation in this manner should be viewed as a proof of concept to be demoed to and discussed across all of RAPIDS before any changes are finalized.

One additional minor point: we need to make sure that whatever system we choose supports documentation in Cython files appropriately. This shouldn't be a problem for direct usage of Breathe, but I don't know enough about how the Omniverse documentation system works under the hood to be entirely certain that it doesn't make additional assumptions that we would need relaxed to support Cython docstrings.

@vyasr vyasr added feature request New feature or request proposal Change current process or code doc Documentation Python Affects Python cuDF API. labels Aug 5, 2022

vyasr commented Aug 5, 2022

CC @rlratzel @dantegd @harrism @shwina for perspectives from cuGraph/cuML/cuSpatial/RMM. Feel free to tag others as well of course.

shwina commented Aug 5, 2022

It sounds like this is proposing two things:

  1. Migrating C++ documentation from Doxygen to Sphinx via Breathe
  2. In addition, integrating both Python and C++ documentation via the Omniverse documentation system

I'd defer the decision about (1) to yourself, @GregoryKimball and the other libcudf C++ devs.

Regarding (2), I have several questions, but primarily:

  1. Will it be open source?
  2. Is the primary goal uniform branding across the Python and C++ documentation? If so, does this rule out integrating node-rapids or potentially bindings for other languages?
  3. What changes, if any, are required to the Python code and documentation sources?

Overall, I feel like (1) can be done independently of, and in support of, (2). So we could forge ahead with (1) and make a decision about (2) later.

vyasr commented Aug 5, 2022

Those are all great questions. I'll address those that I can.

My main reservation with your suggestion to move forward with (1) independently is that I don't know to what extent doing (1) helps with (2). It would probably help us iron out any conflicts between our style and Breathe, but I don't know if that will translate to issues that we run into with the Omniverse template. Ultimately all of these docs boil down to playing with project-specific config files and I don't think the Omniverse ones are that similar.

  1. I am not sure whether the Omniverse docs will eventually become open source. I am also not sure whether that's necessarily a requirement for us, since that would just be the tool that we use to build and publish our documentation. Granting that we would prefer an open source tool, I don't think it's a showstopper if it stays closed source.
  2. I would say that the goals are to 1) unify to the greatest extent possible, and 2) improve the quality of the non-API components of the C++ documentation. It would be great to unify Java and JS documentation as well, but I'm not trying to boil the ocean here. I would say that getting C++ onto Sphinx alone would be a marked improvement in our ability to add additional (non-API) types of documentation much more easily. The C++ code could use user guides, for example. In that respect, moving forward with Breathe even without Omniverse could indeed be beneficial.
  3. I think that the main change would be that a lot of what currently goes in conf.py would now go into the repo.toml file. I don't know whether repo.toml supports everything that we currently use or need.

shwina commented Aug 5, 2022

> I don't think it's a showstopper if it stays closed source.

I'd be very hesitant to replace open-source tooling for something as fundamental as building docs with anything closed-source - especially when we (cuDF/RAPIDS) cannot support the latter. Doc contributions are typically low-hanging fruit for new contributors and I'd love for that to be the case for RAPIDS as well.

github-actions bot commented Sep 4, 2022

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

rapids-bot bot pushed a commit that referenced this issue Oct 18, 2022
This PR adds a section to the developer documentation about various libcudf design decisions that affect users. These policies are important for us to document and communicate consistently. I am not sure what the best place for this information is, but I think the developer docs are a good place to start since until we address #11481 we don't have a great way to publish any non-API user-facing libcudf documentation. I've created this draft PR to solicit feedback from other libcudf devs about other policies that we should be documenting in a similar manner. Once everyone is happy with the contents, I would suggest that we merge this into the dev docs for now and then revisit a better place once we've tackled #11481.

Partly addresses #5505, #1781.

Resolves #4511.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Jake Hemstad (https://github.com/jrhemstad)
  - Bradley Dice (https://github.com/bdice)
  - David Wendt (https://github.com/davidwendt)

URL: #11853

vyasr commented Jul 11, 2023

Having an integrated solution would also be very beneficial as we move towards exposing pylibcudf as a public API. Since pylibcudf functions will all be minimal wrappers around libcudf functions, being able to cross-link libcudf docs from pylibcudf docstrings would be very valuable to help simplify writing those docs.
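
As a sketch of what that cross-linking could look like, a pylibcudf-style docstring might defer to the C++ docs through Sphinx's C++ domain role. The function below is a hypothetical pure-Python stand-in, not the real pylibcudf signature, and whether the `:cpp:func:` reference resolves depends on the C++ declaration being emitted into the same Sphinx build (for example via Breathe).

```python
def gather(source, gather_map):
    """Select elements of ``source`` at the positions given in ``gather_map``.

    For details, see :cpp:func:`cudf::gather`.
    """
    # Hypothetical stand-in body operating on plain Python lists; a real
    # wrapper would dispatch to the underlying libcudf function instead.
    return [source[i] for i in gather_map]


# The docstring carries the cross-reference that Sphinx would turn into a link.
print(gather.__doc__)
```

The docstring stays short because the behavioral details live in one place, the C++ documentation, and the Python page only needs to link to it.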

@vyasr vyasr self-assigned this Dec 20, 2023
rapids-bot bot pushed a commit that referenced this issue Jan 17, 2024
This PR leverages [Breathe](https://breathe.readthedocs.io/en/latest/) to pull the cudf C++ API documentation into the python Sphinx docs build, generating a single unified build of the documentation that supports cross-linking between language libraries and also simplifies cross-linking from other libraries that wish to link here.

This PR also revealed numerous other issues with our doxygen docs. I've submitted those as separate patches to control the diff here, but it's worth noting that Sphinx is much louder with warnings than doxygen and will help us avoid many more issues with broken documentation than doxygen alone could.

Resolves #11481

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Ashwin Srinath (https://github.com/shwina)
  - Karthikeyan (https://github.com/karthikeyann)
  - David Wendt (https://github.com/davidwendt)

URL: #13846
@vyasr vyasr added the pylibcudf Issues specific to the pylibcudf package label May 28, 2024
@vyasr vyasr moved this from Todo to Done in cuDF Python May 28, 2024

3 participants