Generated content uses a lot of disk space for a small subset of packages. #56

Closed
tfoote opened this issue May 19, 2023 · 3 comments · Fixed by #89
Labels
bug Something isn't working

Comments

@tfoote
Member

tfoote commented May 19, 2023

We got a disk space alert from our hosts. A quick look at our documentation usage showed most packages well under 100MB, but quite a few had much larger generated content.

Here are the top offenders in rolling:

99M smacc2
100M rmf_traffic
102M rosidl_runtime_c
105M osrf_testing_tools_cpp
106M sm_dance_bot_warehouse_2
132M nav2z_client
133M eigenpy
135M sm_advanced_recovery_1
145M rviz_default_plugins
160M rmw
167M rcl
169M rmf_utils
319M hpp-fcl
869M rclcpp
1.2G sm_multi_stage_1
1.3G sm_pack_ml
1.8G proxsuite
2.0G vitis_common
7.1G fastrtps
22G total

It would be good to understand why these are blowing up and keep that from happening.

@mikeferguson

Out of curiosity - I did some poking around here in rviz_default_plugins:

11M	./.doctrees/generated
25M	./.doctrees
3.2M	./_static/css/fonts
3.4M	./_static/css
44K	./_static/collapsible-lists/css
12K	./_static/collapsible-lists/js
64K	./_static/collapsible-lists
28K	./_static/js
3.6M	./_static
2.0M	./_sources/generated
2.0M	./_sources
776K	./generated/doxygen/html/search
12M	./generated/doxygen/html
6.2M	./generated/doxygen/xml
18M	./generated/doxygen
292M	./generated
326M	.

In the generated folder, we have 401 generated files, each of which is at least 692KB in size. That 692KB is the navigation menu on the side of the page - it's nearly the same for every one of those 401 files, other than specifying which portions are default open/closed.
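As a rough sanity check on that explanation, the arithmetic can be sketched as below. It uses only the figures quoted above (401 pages, at least 692KB of navigation each, 292M for ./generated) and assumes 1 MB = 1024 KB; the function name is made up for illustration.

```python
# Figures quoted in the comment above for rviz_default_plugins.
PAGES = 401               # generated pages
NAV_KB_PER_PAGE = 692     # lower bound on the navigation menu size per page, in KB
GENERATED_TOTAL_MB = 292  # `du` figure reported for ./generated


def duplicated_nav_mb(pages: int, nav_kb_per_page: int) -> float:
    """Space used by one copy of the navigation menu per page, in MB (1 MB = 1024 KB)."""
    return pages * nav_kb_per_page / 1024


nav_mb = duplicated_nav_mb(PAGES, NAV_KB_PER_PAGE)
share = nav_mb / GENERATED_TOTAL_MB
print(f"navigation copies: ~{nav_mb:.0f} MB ({share:.0%} of ./generated)")
```

This prints roughly 271 MB, or about 93% of the 292M reported for ./generated, which is consistent with the duplicated menu accounting for almost all of the space.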

@tfoote
Member Author

tfoote commented Mar 28, 2024

I can confirm that this appears to be the problem. There is now about 1.3MB of toctree content in every page of rclcpp, while most pages have only a few dozen lines of other content. That is across the 890 files now generated for rolling.

Looking inside one of them, it appears to have some 5000 toctree entries spanning 9000 lines, alongside the few dozen other elements of actual content. It looks like this is coming from the exhale Clickable Hierarchies: https://exhale.readthedocs.io/en/latest/reference/configs.html#clickable-hierarchies

This is one of the core features of the system and a large part of its value, but finding a way to include the content once instead of duplicating it on every page would be very valuable. There may be some options to explore here. Hopefully someone else has run into this issue before us.

$ grep -rI 'class="toctree' docs_output/rclcpp/ | wc -l
4921466
$ grep -rI 'class="toctree' docs_output/sensor_msgs/ | wc -l
66706
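To see which pages contribute most to those totals, a per-file version of the same count can be sketched in Python. This is a minimal sketch, not a drop-in replacement for the grep commands: it only looks at `*.html` files (grep -rI scans every non-binary file), and the function names are hypothetical.

```python
from pathlib import Path

MARKER = 'class="toctree'  # same pattern as in the grep commands above


def count_marker_lines(text: str, marker: str = MARKER) -> int:
    """Count the lines of `text` that contain `marker` (like `grep | wc -l`)."""
    return sum(1 for line in text.splitlines() if marker in line)


def toctree_lines_per_file(root: str, marker: str = MARKER) -> dict[str, int]:
    """Map each HTML file under `root` to its number of matching lines."""
    counts: dict[str, int] = {}
    for path in Path(root).rglob("*.html"):
        # Ignore undecodable bytes rather than failing on a stray binary blob.
        text = path.read_text(encoding="utf-8", errors="ignore")
        matches = count_marker_lines(text, marker)
        if matches:
            counts[str(path)] = matches
    return counts
```

Calling `toctree_lines_per_file("docs_output/rclcpp")` and sorting the result by value would list the heaviest pages first, and summing the values gives a total comparable to the `wc -l` figure.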

@rkent
Contributor

rkent commented Mar 31, 2024

The sphinx-rtd-theme page has the following note:

Setting collapse_navigation to False and using a high value for navigation_depth on projects with many files and a deep file structure can cause long compilation times and can result in HTML files that are significantly larger in file size.

That matches our issue to some extent. I'd like to try adjusting those values to see what the effect is.
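For reference, the two options named in that note are set through `html_theme_options` in a Sphinx `conf.py`. The fragment below is only a sketch of what such an experiment might look like: the values are illustrative assumptions, not a tested fix, and where this would be configured in the toolchain that generates these docs is not covered here.

```python
# conf.py (fragment): illustrative values only, not a tested fix for this issue.
html_theme = "sphinx_rtd_theme"

html_theme_options = {
    # The note warns about setting this to False, so keep it True.
    "collapse_navigation": True,
    # Maximum depth of the sidebar table of contents; 2 is an arbitrary
    # low value chosen for the experiment.
    "navigation_depth": 2,
}
```

Comparing the size of the generated HTML before and after a change like this would show how much of the duplication comes from the theme sidebar rather than from the exhale hierarchies.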


3 participants