Fix memory leak in CommonMark implementation #308

Closed
Witiko opened this issue Jun 26, 2023 · 4 comments · Fixed by #318
Labels

• bug
• commonmark: Related to making the syntax of markdown follow the CommonMark spec
• lua: Related to the Lua interface and implementation

Milestone

3.0.1

Witiko commented Jun 26, 2023

In #226 (comment), we identified a memory leak in our implementation of CommonMark. #226 (comment) showed that large data structures are created at every call to markdown.new() and are not freed by garbage collection. The issue was hotfixed in 1c4b844 by adding the option singletonCache, which caches the result of calling markdown.new(options) in an LRU cache of size 1 keyed by options; however, documents that typeset a large number of Markdown fragments with different options are still affected.
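The size-1 cache described above can be sketched as follows (Python used for illustration; `make_cached_new`, `expensive_new`, and the key scheme are hypothetical stand-ins, not the actual Lua implementation):

```python
# Sketch of a size-1 cache keyed by the options, analogous to the
# singletonCache hotfix: only the most recently used options are
# cached, so alternating between two option sets defeats the cache.

def make_cached_new(new):
    cache = {}  # holds at most one entry: {key: converter}

    def cached_new(options):
        key = tuple(sorted(options.items()))  # hashable key for the options
        if key not in cache:
            cache.clear()           # evict the previous entry (LRU of size 1)
            cache[key] = new(options)
        return cache[key]

    return cached_new

# Hypothetical expensive constructor standing in for markdown.new()
def expensive_new(options):
    return lambda text: (options, text)

cached = make_cached_new(expensive_new)
converter_a = cached({"smartEllipses": True})
converter_b = cached({"smartEllipses": True})
assert converter_a is converter_b  # cache hit: the converter is reused
```

This also illustrates why documents alternating between different option sets remain affected: each switch evicts the only cached entry.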

Acceptance criteria

First, navigate to the root folder of the git repository and save the following script to a file named memory-leak.lua:

#!/usr/bin/env texlua
local ran_ok, kpse = pcall(require, "kpse")
if ran_ok then kpse.set_program_name("luatex") end
local markdown = require("markdown")

for _ = 1, 100 do
  markdown.new({singletonCache = false})("Hello *world*!")
end

Next, create executable files named bisect.sh, build-docker-image.sh, run-test.sh, and remove-docker-image.sh, as described in #226 (comment), section Experimental setup.

Finally, run the command ./bisect.sh. Here is the expected result:

$ ./bisect.sh
No problems were detected, success.

Afterwards, you may want to clear the build cache of Docker:

$ docker builder prune
@Witiko Witiko added bug lua Related to the Lua interface and implementation commonmark Related to making the syntax of markdown follow the CommonMark spec labels Jun 26, 2023
@Witiko Witiko added this to the 3.0.1 milestone Jun 26, 2023
Witiko commented Aug 12, 2023

Here is the average memory leaked at tag 2.23.0 before #226 was merged:

  • M.new(): Average memory leak based on 100 runs: 0.91MiB

Here is the average memory leaked at commit bf6f0c5 from the current branch main, drilled down to the affected call sites:

This shows that there was no major memory leak before #226. Furthermore, it also shows that the memory leak is related to the patterns for links, images, and emphasis, and that the brunt of the memory leak occurs while copying syntax tables at the end of reader->finalize_grammar().

Experimental details

First, I updated file build-docker-image.sh to prevent discarding my changes to markdown.dtx as follows:

*** build-docker-image.sh.old   2023-08-12 16:29:14.796017962 +0200
--- build-docker-image.sh.new   2023-08-12 15:25:45.246758729 +0200
***************
*** 32,34 ****
--- 32,36 ----
  # Clean up
+ git add markdown.dtx
  git checkout .
+ git restore --staged markdown.dtx
   

Then, I added the following sentinel code into different parts of the Lua code in file markdown.dtx:

-- Log RAM at the beginning of the test.
os.execute("free -m | head -2 | awk '{ print $1,$2,$3,$4 }' | sed 's/ shared//'")
-- Log RAM at the end of the test.
os.execute("free -m | head -2 | tail -1 | awk '{ print $1,$2,$3,$4 }'")

Next, I created a file analyze-profile.py with the following contents:

import statistics
import sys

deltas = []
previous_used = None
for line in sys.stdin:
    if line.startswith('Mem: '):
        _, total, used, free = line.split()
        used = int(used)
        if previous_used is None:
            # First 'Mem:' line of a pair: remember the baseline.
            previous_used = used
            continue
        # Second 'Mem:' line of a pair: record the difference.
        delta = used - previous_used
        deltas.append(delta)
        previous_used = None
print(f'Average memory leak based on {len(deltas)} runs: {statistics.mean(deltas):.2f}MiB')

Finally, I measured the average memory leaked in M.new() and its subsets by running ./bisect.sh 2>/dev/null | python3 analyze-profile.py, moving the sentinel code to measure different parts of the code. I only listed memory leaks greater than 1MiB, unless the entire call to M.new() leaked less than 1MiB.
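The pairing logic of analyze-profile.py can be exercised on synthetic input (the sample values below are made up for illustration):

```python
import statistics

def memory_deltas(lines):
    """Pair up consecutive 'Mem:' lines and return the used-memory deltas."""
    deltas = []
    previous_used = None
    for line in lines:
        if line.startswith('Mem: '):
            _, total, used, free = line.split()
            used = int(used)
            if previous_used is None:
                previous_used = used  # first line of a pair: the baseline
                continue
            deltas.append(used - previous_used)  # second line: the delta
            previous_used = None
    return deltas

# Two simulated runs, each logging RAM before and after the test
sample = [
    'Mem: 16000 4000 12000',
    'Mem: 16000 4010 11990',
    'Mem: 16000 4010 11990',
    'Mem: 16000 4022 11978',
]
deltas = memory_deltas(sample)
print(f'Average memory leak based on {len(deltas)} runs: '
      f'{statistics.mean(deltas):.2f}MiB')
# prints: Average memory leak based on 2 runs: 11.00MiB
```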

Witiko commented Aug 12, 2023

@lostenderman Above, I have narrowed down the memory leaks to tens of lines of code. However, I have mainly focused on discovery so far and I don't have a precise idea about the nature of the memory leak yet. If you can spare the time, I would appreciate it if you could glance over the affected code and help me brainstorm about the possible causes.

Witiko commented Aug 12, 2023

I had a short brainstorming session about the issue with AI-enabled Bing, which suggested the following: "One possible cause is that you are defining recursive patterns that create reference cycles between them." I can see how this could be a problem and how it would prevent garbage collection. I haven't yet reviewed the code to see whether this is actually the issue, but the solution would be to store the cyclic patterns in a grammar and make the references between them indirect using lpeg.V().
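The suggested indirection can be modeled with a short sketch (Python used for illustration; the `Rule` and `Grammar` names are hypothetical, and lpeg.V() is only approximated here by a name lookup through the grammar table):

```python
# With direct references, two mutually recursive patterns form a
# reference cycle: inline -> emphasis -> inline -> ...
# lpeg.V("name") avoids direct cycles by storing only a rule *name*,
# which is resolved through the grammar table when the grammar is
# compiled, so rule objects never point at each other directly.

class Rule:
    def __init__(self, body):
        self.body = body  # another Rule, or a rule name (str) for indirection

class Grammar:
    def __init__(self, rules):
        self.rules = rules  # name -> Rule

    def resolve(self, name):
        # Indirect lookup, analogous to compiling lpeg.V(name)
        body = self.rules[name].body
        return self.rules[body] if isinstance(body, str) else body

# Indirect style: rules reference each other by name, not by object
grammar = Grammar({
    "inline": Rule("emphasis"),
    "emphasis": Rule("inline"),
})
assert grammar.resolve("inline") is grammar.rules["emphasis"]
```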

Witiko commented Aug 16, 2023

10.48MiB > 4.92MiB is likely due to delayed garbage collection

To improve the measurement accuracy, I prefixed both parts of the sentinel code from #308 (comment) with collectgarbage("collect"). As an unintended result, this has caused the memory leak to disappear. Apparently, there is no memory leak, just a lazy garbage collector that fails to reclaim memory before an out-of-memory event. I used this theory to construct the fix in #318.
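The forced-collection technique behind the fix can be demonstrated with an analogous sketch (Python used for illustration; gc.collect() plays the role of Lua's collectgarbage("collect")):

```python
import gc

# Force a full collection before measuring memory, so that garbage
# awaiting lazy collection is not mistaken for a leak. This mirrors
# prefixing the Lua sentinel code with collectgarbage("collect").

class Node:
    def __init__(self):
        self.ref = None

def allocate_cycle():
    a, b = Node(), Node()
    a.ref, b.ref = b, a  # a reference cycle that refcounting alone misses

for _ in range(100):
    allocate_cycle()

gc.collect()  # force a full collection before taking the measurement
# Any Node instance still alive now would indicate a true leak
leaked = [o for o in gc.get_objects() if isinstance(o, Node)]
assert leaked == []
```

Without the gc.collect() call, the unreachable cycles may still be resident and would inflate a naive memory measurement, just as the lazy Lua collector inflated the numbers above.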

Witiko added a commit that referenced this issue Jan 6, 2024
As reported by @TeXhackse, testing can take up to 8G RAM with 4 CPUs and
batch size 100, which seems excessive. This is related to #226, #308,
and #318.