Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize doc builds #14856

Merged
merged 9 commits into from
Jan 26, 2024
Merged

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Jan 24, 2024

Description

cudf docs are generally very slow to build. This problem was exacerbated by the recent addition of libcudf C++ API documentation to the Sphinx build. This PR aims to ameliorate this issue for both local and CI builds by making the following changes:

The net result is roughly a halving of the CI run time for the builds (~40 min to ~20 min). Further potential optimizations:

  • Reenabling parallel builds. We cannot fully revert Disable parallel build #14796 until the theme is fixed, but if we can put in a warning filter we could reenable parallelism and have it work on just the reading steps of the build and not the writes. That would still improve performance.
  • Better caching of notebooks. nbsphinx supports caching, but there are various caveats w.r.t. 1) local vs CI builds, 2) proper cache invalidation, e.g. when notebook source does not change but underlying libraries do, and 3) forcing rebuilds. Alternatively, we could enable some environment variable that allows devs to turn off notebook execution locally. Making it opt-in would make the default behavior safe while providing an escape hatch for power users who want the builds to be fast.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added doc Documentation non-breaking Non-breaking change labels Jan 24, 2024
@vyasr vyasr self-assigned this Jan 24, 2024
@github-actions github-actions bot added the conda label Jan 24, 2024
@github-actions github-actions bot added the ci label Jan 24, 2024
@vyasr vyasr force-pushed the feat/optimize_doc_builds branch from 0c99a05 to 0095b7d Compare January 24, 2024 07:17
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Jan 24, 2024
@vyasr vyasr marked this pull request as ready for review January 24, 2024 21:57
@vyasr vyasr requested a review from a team as a code owner January 24, 2024 21:57
ci/build_docs.sh Outdated Show resolved Hide resolved
ci/build_docs.sh Outdated Show resolved Hide resolved
ci/build_docs.sh Outdated Show resolved Hide resolved
ci/build_docs.sh Outdated Show resolved Hide resolved
Comment on lines +134 to +135
with tempfile.NamedTemporaryFile() as tmp_fn:
tree.write(tmp_fn.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this save time? How is it faster to write to a tempfile, compare to fn, and then write fn versus just writing fn?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that this particular part is slower, but sphinx probably has some make-like timestamp-based "do I rebuild dependents of this file" logic. So by avoiding updating a file with a newer timestamp and the same contents, something downstream does less work.

Copy link
Contributor Author

@vyasr vyasr Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. This change no longer has a meaningful impact in PR CI because we don't build text docs, but in branch/nightly builds (because we build html and then text) or local rebuilds it will make things faster because without this Sphinx tries to read the entire set of files again thinking that they have changed. Instead of calling tree.write here I could also have used os.rename/shutil.move etc but the amount of time taken by this part of the code is negligible and I didn't feel like dealing with issues around manual clean up of the temp file etc.

@wence-
Copy link
Contributor

wence- commented Jan 25, 2024

Alternatively, we could enable some environment variable that allows devs to turn off notebook execution locally. Making it opt-in would make the default behavior safe while providing an escape hatch for power users who want the builds to be fast.

I think this is the right approach. Most of the time when one is updating docs, it's a back-and-forth process to figure out exactly which syntax combination in a single page will result in (say) the correct link. The faster we can make that iteration, the better.

@wence-
Copy link
Contributor

wence- commented Jan 25, 2024

(~40 min to ~20 min). Further potential optimizations:

Wed, 24 Jan 2024 22:31:49 GMT Run ci/build_docs.sh
[...] environment is created
Wed, 24 Jan 2024 22:36:02 GMT |    Build CPP docs    |
[...] doxygen runs
Wed, 24 Jan 2024 22:36:08 GMT |    Build Python docs    |
[...] sphinx runs
Wed, 24 Jan 2024 22:41:24 GMT 10min.ipynb: Executed notebook in 235.79 seconds [mystnb]
[...] sphinx runs more
Wed, 24 Jan 2024 22:43:44 GMT preparing documents... done
[...] conversion to HTML runs
Wed, 24 Jan 2024 22:49:14 GMT writing output... [100%] user_guide/performance-comparisons/performance-comparisons
[...] cudf docs build done
Wed, 24 Jan 2024 22:49:26 GMT |    Build dask-cuDF Sphinx docs  
[...] dask-cudf build + uploading to S3
Wed, 24 Jan 2024 22:51:09 GMT  | ALL DONE

So:

  • 4 mins solving for and building the conda environment
  • 6 seconds to run doxygen
  • 8 mins to parse the docs (~5 mins of which is spent executing 2 notebooks, and 4 mins executing 10 mins to cudf)
  • 6 mins writing output
  • 2 mins to upload everything

If we tacked the docs build on to the end of one of (say) the python-build jobs, we would shave 4 mins (no need to recreate the environment). If we skip notebook execution on PR runs (maybe worth it), we save another 3 mins.

Co-authored-by: Bradley Dice <[email protected]>
@vyasr
Copy link
Contributor Author

vyasr commented Jan 26, 2024

/merge

@rapids-bot rapids-bot bot merged commit a41238f into rapidsai:branch-24.04 Jan 26, 2024
69 checks passed
raydouglass pushed a commit that referenced this pull request May 29, 2024
Reverts #15842

The files the original PR added documentation for appear to contain some
text that is problematic for the Sphinx parser to extract from doxygen.
My best guess is that it's something in a table, since parsing doxygen
tables via Breathe is something I know can be tricky. We didn't catch
this issue because [we currently only build the text docs in nightly
builds, not
PRs](https://github.com/rapidsai/cudf/blob/branch-24.08/ci/build_docs.sh#L49),
and this issue only arises in those text builds. We can revisit adding
these docs in 24.08. For the sake of correctness, I have added back
building text docs in PRs in this PR (see #14856 for context on the
removal).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Documentation non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants