
Out of memory, process killed #536

Closed
ianare opened this issue Nov 22, 2018 · 23 comments
ianare commented Nov 22, 2018

Machine info:

32 GB of RAM
8-core CPU: Intel(R) Xeon(R) CPU E3-1270 v6 @ 3.80GHz

Running zola v0.5.0

The site I'm generating has a content directory with 105336 files, 105732 sub-folders. Almost all of the files are section markdown.

I understand this is probably not a very common use case, but hopefully it will be useful for optimizing Zola's memory management.

dmesg -T | grep zola

Out of memory: Kill process 29015 (zola) score 908 or sacrifice child
Killed process 29015 (zola) total-vm:30630456kB, anon-rss:30549904kB, file-rss:0kB, shmem-rss:0kB
oom_reaper: reaped process 29015 (zola), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Grafana chart over 3 unsuccessful runs:

[screenshot: screenshot_20181122_204147]

Keats commented Nov 23, 2018

100k files :o
Where does it error? Does it manage to load everything (ie do you see something like -> Creating 43 pages (0 orphan), 1 sections, and processing 0 images)?

Keats added the bug label Nov 23, 2018
caemor commented Nov 23, 2018

It seems like the hashmaps for get_taxonomy, get_page and get_section, which are made from register_tera_global_fns, are just getting too big for blogs over 10k files

For huge blog:
[heaptrack screenshot]

For giant blog (bench example with 100k files generated) until it ran out of memory:
[heaptrack screenshot]

About the tool: heaptrack was the first thing I found by searching, so I used it for this (it is also in the Arch repos). One needs to add the following to the main file (for more info see https://speice.io/2018/10/case-study-optimization.html; note that this might change the results a little bit):

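// Use the system allocator (instead of the jemalloc that Rust bundled at the time)
// so that heaptrack can intercept and track the allocations: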
use std::alloc::System;

#[global_allocator]
static GLOBAL: System = System;

ianare commented Nov 23, 2018

100k files :o
Where does it error? Does it manage to load everything (ie do you see something like -> Creating 43 pages (0 orphan), 1 sections, and processing 0 images)?

No, it never got to that point; the only console output was "building site", followed by "Killed".

For what it's worth, I was able to generate the site on a machine with 128 GB of RAM.

Usage was around 39 GB on each run.

[screenshot: screenshot_20181123_173455]

Keats commented Nov 23, 2018

It seems like the hashmaps for get_taxonomy, get_page and get_section, which are made from register_tera_global_fns, are just getting too big for blogs over 10k files

Yeah, that was my guess as well, but it is tricky to fix: Keats/tera#340
I don't have any good ideas on that currently, other than writing a custom serde format that doesn't clone the way serde_json does.
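For illustration, here is a minimal sketch of the pattern being discussed (hypothetical code, not Zola's actual implementation; it only assumes serde and serde_json): each global template function is given its own fully serialized copy of the content up front, so a 100k-page site ends up held in memory several times over before rendering even starts.

use std::collections::HashMap;
use serde::Serialize;
use serde_json::{to_value, Value};

#[derive(Serialize)]
struct Page {
    title: String,
    content: String,
}

// Serialize every page into an owned JSON value, keyed by title.
fn serialize_all(pages: &[Page]) -> HashMap<String, Value> {
    pages
        .iter()
        .map(|p| (p.title.clone(), to_value(p).expect("page serializes")))
        .collect()
}

fn main() {
    let pages: Vec<Page> = (0..100_000)
        .map(|i| Page {
            title: format!("post-{}", i),
            content: "lorem ipsum ".repeat(100),
        })
        .collect();

    // One pre-serialized copy per registered function (get_page, get_section,
    // get_taxonomy, ...) multiplies the memory cost of the whole site.
    let _for_get_page = serialize_all(&pages);
    let _for_get_section = serialize_all(&pages);
    let _for_get_taxonomy = serialize_all(&pages);
}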

@ianare
What's the average size of a section?

Keats commented Nov 27, 2018

get_page and get_section are redundant now; we can have only get_content, so it should be better.
Most of the improvements are likely to be in Tera rather than Zola though.
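As a rough illustration (hypothetical types, not Zola's actual code), consolidating the lookups could mean keeping a single index keyed by relative path and serializing each piece of content only once, instead of one pre-serialized map per function:

use std::collections::HashMap;
use serde_json::Value;

// One shared index for pages and sections, used by a single get_content-style
// lookup, rather than separate serialized maps for get_page and get_section.
struct ContentIndex {
    by_path: HashMap<String, Value>,
}

impl ContentIndex {
    fn get(&self, path: &str) -> Option<&Value> {
        self.by_path.get(path)
    }
}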

ianare commented Dec 6, 2018

Sorry for the delay.

I've run some more generations and it seems that when the sections are small (~ 5 kb) and not complex the memory problem crops up.

When the sections are more complex and larger (10-15 kb), and processing times are slower, the memory usage is better (even if it still has large peaks).

It might be a suitable workaround to be able to specify the number of CPUs used during generation.

This would also be useful on machines that are both generating the site and serving it with a dedicated server like nginx at the same time, so as not to introduce lag on the site during generation. Although this would be better in a dedicated ticket... should I open one?
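As a rough sketch of what such an option could look like (assuming the parallel work goes through rayon; the --threads flag and the function below are hypothetical, not an existing Zola option):

use rayon::ThreadPoolBuilder;

// Cap the global rayon thread pool before any parallel work starts.
// `num_threads` would come from a hypothetical --threads flag or config key.
fn init_thread_pool(num_threads: Option<usize>) {
    if let Some(n) = num_threads {
        ThreadPoolBuilder::new()
            .num_threads(n)
            .build_global()
            .expect("failed to configure the rayon thread pool");
    }
    // With None, rayon keeps its default of one thread per logical CPU.
}

rayon also honours the RAYON_NUM_THREADS environment variable, which can act as a stopgap without any code changes.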

Also really wanted to thank everyone for taking the time to look into this... and apologies for not having (yet) the ability to help out in the code.

ianare commented Dec 6, 2018

I've run some more generations and it seems that when the sections are small (~ 5 kb) and not complex the memory problem crops up.

When the sections are more complex and larger (10-15 kb), and processing times are slower, the memory usage is better (even if it still has large peaks).

It might be a suitable workaround to be able to specify the number of CPUs used during generation.

Apologies, there was an error in the configuration: the reason it was OK on the 32 GB server was that there were "only" 60K files.

Keats commented Jan 27, 2019

@caemor @ianare

I've updated the next branch and it should use way less memory now; can you try again?
Running heaptrack again surfaced another source of allocations which will be fairly easy to fix, AFAIK.

Keats commented Feb 4, 2019

@caemor @ianare ping

caemor commented Feb 5, 2019

I'm gonna try it once more later today. Sorry for the late answer :-D

ianare commented Feb 7, 2019

Sorry for the delay, I'm getting an error message related to macros:

--> 4:1
|
4 | {% import "macros/image_list.html" as image_list %}
| ^---
|
= unexpected tag; expected end of input or some content

Keats commented Feb 7, 2019 via email

caemor commented Feb 7, 2019

The next branch doesn't compile for me; I get the following errors:

cargo bench bench_loading_huge_blog
.......
Compiling site v0.1.0 (/home/caemor/git/tests/zola/components/site)                                                        
error[E0599]: no method named `pages_values` found for type `std::sync::Arc<std::sync::RwLock<library::Library>>` in the current scope
  --> components/site/benches/site.rs:46:49                                                                                   
   |                                                                                                                          
46 |     b.iter(|| site.render_rss_feed(site.library.pages_values(), None).unwrap());                                         
   |                                                 ^^^^^^^^^^^^                                                             
                                                                                                                              
error[E0599]: no method named `sections_values` found for type `std::sync::Arc<std::sync::RwLock<library::Library>>` in the current scope
  --> components/site/benches/site.rs:64:32                                                                                   
   |                                                                                                                          
64 |     let section = site.library.sections_values()[0];                                                                     
   |                                ^^^^^^^^^^^^^^^                                                                           
                                                                                                                              
error[E0308]: mismatched types                                                                                                
  --> components/site/benches/site.rs:65:55                                                                                   
   |                                                                                                                          
65 |     let paginator = Paginator::from_section(&section, &site.library);                                                    
   |                                                       ^^^^^^^^^^^^^ expected struct `library::Library`, found struct `std::sync::Arc`
   |                                                                                                                          
   = note: expected type `&library::Library`                                                                                  
              found type `&std::sync::Arc<std::sync::RwLock<library::Library>>`                                               
                                                                                                                              
error: aborting due to 3 previous errors                                                                                      
                                                                                                                              
Some errors occurred: E0308, E0599.   

Keats commented Feb 7, 2019 via email

Keats commented Feb 8, 2019

Both issues should be fixed on the next branch now

caemor commented Feb 8, 2019

Looks much better than last time: approximately only 25% of last time's maximum heap usage for the huge blog. And the giant blog, which didn't build last time, also worked, with a max usage of 5.8 GB.
Great work Keats! 👍

Huge Blog (10000 pages, 0 orphans, 0 sections, 0 images): 579.5 MB at max

[heaptrack screenshot: zola_huge_blog_18858]

Giant Blog (100000 pages, 0 orphans, 0 sections, 0 images): 5.8 GB at max

[heaptrack screenshot: zola_giant_20174]

For comparison, current master with the same huge blog from above needs 2.3 GB at max

[heaptrack screenshot]

Building time also got reduced to about half its previous value (zola build):

| Type | Without heaptrack | With heaptrack |
| --- | --- | --- |
| huge on next | ~6s | ~70s |
| huge on master | ~15s | ~148s |
| giant on next | ~71s | ~711s |

Keats commented Feb 8, 2019 via email

caemor commented Feb 8, 2019

After setting the rss_limit to 100, the time got down to ~65s/~5.8s (with/without heaptrack) and ~455 MB max heap usage for the huge blog on next. Also be aware that the screenshot only shows the state at maximum heap usage.
Now the serialization of the taxonomies takes the most memory.

[heaptrack screenshot]

@Keats maybe you want to set that limit in the gen.py autogeneration in the future for the benches?

Note for myself: Steps to reproduce these heaptracks later:

pacman -S heaptrack
cd zola/components/site/benches
python gen.py
cd zola/
cargo build --release
cd zola/components/site/benches/huge_blog/
heaptrack ../../../../target/release/zola build

Now use the command line that heaptrack returns to open and analyze the recorded heaptrack (something similar to heaptrack --analyze "path_to_folder/.../heaptrack.zola.25242.zst")

Keats commented Feb 9, 2019

Now the serializing of the taxonomies takes the most memory.

I see that the toc serialization is taking lots of time because it is on the page itself rather than being added to the context of the specific page being rendered. Probably an easy win, as I don't think many people want to show the table of contents while displaying a list of pages, but I could be wrong...

Still using a bit too much memory for my taste...

Keats commented Feb 9, 2019

I moved the toc out of page and that's a bit better.
The reason the taxonomy rendering takes so much time is that taxonomies are not paginated, so it is basically serializing all the pages, which is going to take time/memory in those benches.
The blog benches are not super realistic; the huge-kb one has 10k pages as well but renders in 3.6s, for example.
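A minimal sketch of the kind of change described above (hypothetical types, not Zola's actual code): the heading tree is skipped when pages are serialized into listings, and is passed only in the context used to render that one page.

use serde::Serialize;

// The page as serialized into listings (sections, taxonomies, ...): the toc
// is skipped so thousands of listed pages don't each carry it.
#[derive(Serialize)]
struct SerializedPage {
    title: String,
    content: String,
    #[serde(skip_serializing)]
    toc: Vec<String>, // simplified stand-in for the heading tree
}

// The context used only when rendering this one page carries the toc explicitly.
#[derive(Serialize)]
struct PageContext<'a> {
    page: &'a SerializedPage,
    toc: &'a [String],
}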

Keats commented Mar 25, 2019

It should be fine with 0.6.0 (currently being built).
Re-open the issue if you still encounter problems!

Keats closed this as completed Mar 25, 2019
ianare commented Apr 10, 2019

Can confirm that memory usage is much much lower now. Great job!

sr2ds commented Mar 11, 2024

Hello guys,
@ianare, how big is your blog now?

I'm dealing with something similar to your setup: my blog now has 104k pages in markdown and one image for each post. I'm not doing any resizing or processing of the images; I'm only copying and using the original image.

I have the same problem that you had some years ago. The memory usage is high, though not as high as back then; @Keats really did a good job, thanks.

My problem is that I'm using Netlify's free tier, which has a memory limit for builds of around 6 GB. My build takes around 7.5 GB on my computer, so on Netlify the process can't finish. For now I'm doing the build locally and pushing the static files directly, but I would like to improve this behavior because my blog will keep growing and it will soon be impossible to deal with builds this way.

Some ideas about possible solutions:

Is there some way to allow an asynchronous, batched process? Something like building X posts at a time, so the build process can free memory while it works?

Another idea is to be able to do incremental builds. Maybe something like:

zola build --increase

In this mode we would need to keep the public directory versioned, but only the new files would be built and public would not be erased every time.

I can write something in Rust. I haven't started looking for a solution yet; I wanted to ask for some guidance before starting, and that's how I found this good thread.
