Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve DOM parser memory #1211

Merged

Conversation

angelozerr
Copy link
Contributor

@angelozerr angelozerr commented May 31, 2022

Improve DOM parser memory

Signed-off-by: azerr [email protected]

@angelozerr
Copy link
Contributor Author

angelozerr commented May 31, 2022

To test this PR you mut use the large XML file content.xml (38.5 MB) you open it and you type some characters or start the main https://github.com/eclipse/lemminx/blob/master/org.eclipse.lemminx/src/test/java/org/eclipse/lemminx/performance/DOMParserPerformance.java:

Before the PR, it takes around 500MB :

image

After the PR, it takes around 250MB :

image

The improvement is to extract attribute name and value (with substring) only if we need it https://github.com/eclipse/lemminx/pull/1211/files#diff-2dd7761f255392dfab2161385e7b8c870ebe1e60de527b95c3552e98bf2b4e44R142 and not while parsing https://github.com/eclipse/lemminx/pull/1211/files#diff-c84b54a638e856ac5f58de713b3308cac2acf6171f8548c21c959de451b8cf79L229

Indeed when user types several characters the attribute name don't need to extract since DOM document becomes dirty.

@angelozerr angelozerr force-pushed the improve-memory-parser branch from 9241327 to 71e738a Compare May 31, 2022 17:00
@angelozerr
Copy link
Contributor Author

If you start https://github.com/eclipse/lemminx/blob/master/org.eclipse.lemminx/src/test/java/org/eclipse/lemminx/performance/DOMParserPerformance.java in master:

image

You can see that problem comes byte[] because there are for each attribute name and value a substring:

image

with this PR:

image

You can see byte[] uses les memory:

image

@angelozerr angelozerr changed the title Improve memory of parser. Improve DOM parser memory May 31, 2022
@angelozerr angelozerr force-pushed the improve-memory-parser branch from 71e738a to 631127f Compare May 31, 2022 17:07
@angelozerr angelozerr marked this pull request as ready for review May 31, 2022 17:10
@angelozerr angelozerr force-pushed the improve-memory-parser branch from 631127f to 1f7ef1f Compare May 31, 2022 17:17
@rgrunber
Copy link
Contributor

rgrunber commented Jun 1, 2022

I had to set my xml vmargs to about 1G to test content.xml (defaul tis 64MB). It went from a heapsize of ~450MB after opening, to around ~320MB, so that seems about right. I think this is fine to merge.

@angelozerr
Copy link
Contributor Author

@rgrunber the real benefit is when you update the editor.

@angelozerr angelozerr merged commit c0eec05 into eclipse-lemminx:master Jun 1, 2022
@angelozerr
Copy link
Contributor Author

Thanks @rgrunber for you review!

@angelozerr angelozerr added this to the 0.21.0 milestone Jun 1, 2022
@angelozerr angelozerr added the performance This issue or enhancement is related to performance concerns label Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance This issue or enhancement is related to performance concerns
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants