Optimize doc builds #14856
with tempfile.NamedTemporaryFile() as tmp_fn:
    tree.write(tmp_fn.name)
How does this save time? How is it faster to write to a tempfile, compare to fn, and then write fn versus just writing fn?
My guess is that this particular part is slower, but Sphinx probably has some make-like timestamp-based "do I rebuild dependents of this file" logic. So by avoiding updating a file with a newer timestamp and the same contents, something downstream does less work.
Correct. This change no longer has a meaningful impact in PR CI because we don't build text docs, but in branch/nightly builds (because we build html and then text) or local rebuilds it will make things faster, because without this Sphinx tries to read the entire set of files again, thinking that they have changed. Instead of calling tree.write here I could also have used os.rename/shutil.move etc., but the amount of time taken by this part of the code is negligible and I didn't feel like dealing with issues around manual cleanup of the temp file etc.
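For readers following the thread, the write-only-if-changed idea discussed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `write_tree_if_changed` and the use of `filecmp.cmp` for the comparison are assumptions, and reopening a `NamedTemporaryFile` by name while it is still open is assumed to work (true on POSIX systems, not necessarily on Windows).

```python
import filecmp
import os
import tempfile
import xml.etree.ElementTree as ET


def write_tree_if_changed(tree, fn):
    """Write `tree` to `fn` only if the serialized output differs.

    Hypothetical helper: returns True when `fn` was (re)written and
    False when it was left untouched, so its timestamp is preserved.
    """
    with tempfile.NamedTemporaryFile() as tmp_fn:
        # Serialize to a scratch file first (assumes POSIX semantics
        # for writing to an open temp file by name).
        tree.write(tmp_fn.name)
        # Byte-for-byte comparison; shallow=False ignores timestamps.
        if os.path.exists(fn) and filecmp.cmp(tmp_fn.name, fn, shallow=False):
            return False
    # Contents differ (or fn does not exist yet): do the real write.
    tree.write(fn)
    return True


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    target = os.path.join(out_dir, "index.xml")
    doc = ET.ElementTree(ET.fromstring("<root><a>1</a></root>"))
    print(write_tree_if_changed(doc, target))  # first write
    print(write_tree_if_changed(doc, target))  # identical content, skipped
```

The extra serialization makes this step slightly slower in isolation, but a timestamp-based consumer downstream sees an unchanged file and can skip re-reading it.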
I think this is the right approach. Most of the time when one is updating docs, it's a back-and-forth process to figure out exactly which syntax combination in a single page will result in (say) the correct link. The faster we can make that iteration, the better.
So:
If we tacked the docs build on to the end of one of (say) the python-build jobs, we would shave 4 mins (no need to recreate the environment). If we skip notebook execution on PR runs (maybe worth it), we save another 3 mins.
Co-authored-by: Bradley Dice <[email protected]>
/merge
Reverts #15842 The files the original PR added documentation for appear to contain some text that is problematic for the Sphinx parser to extract from doxygen. My best guess is that it's something in a table, since parsing doxygen tables via Breathe is something I know can be tricky. We didn't catch this issue because [we currently only build the text docs in nightly builds, not PRs](https://github.com/rapidsai/cudf/blob/branch-24.08/ci/build_docs.sh#L49), and this issue only arises in those text builds. We can revisit adding these docs in 24.08. For the sake of correctness, I have added back building text docs in PRs in this PR (see #14856 for context on the removal).
Description
cudf docs are generally very slow to build. This problem was exacerbated by the recent addition of libcudf C++ API documentation to the Sphinx build. This PR aims to ameliorate this issue for both local and CI builds by making the following changes:
The net result is roughly a halving of the CI run time for the builds (~40 min to ~20 min). Further potential optimizations: