Extreme memory usage with large blocks of text #627
Comments
Could you provide the code you used to produce this? |
Not sure what's going on with the OP's test. I used the LineIndicatorDemo and pasted ALL of the text from: http://www.gutenberg.org/cache/epub/1661/pg1661.txt until I got over 50,000 lines. Memory usage, as reported by the Windows Task Manager, started at 80MB with 13,000 lines and went up to 205MB with 52,000 lines (after manually scrolling/paging through every line). I didn't notice a drop in performance with scrolling or paging. One thing to note however is that as you page up and down through the text the memory usage never decreases, it will constantly, slowly go up. Not that this means that much with the garbage collector and heap management, the memory manager may just be keeping that memory around but not used. edit: I left the demo running with the 52,000 lines and the memory usage, after rising to 245MB has now fallen to 213MB so the garbage collector is cleaning things up. |
This was about 400,000 lines, so 245/213MB sounds about in line with what I'm seeing (maybe even high). I'd be willing to bet that the garbage collector is the source of the (or much of the) performance issue, but memory usage shouldn't be this high in the first place. When I scroll, memory usage goes up and down pretty wildly. I suspect when you up the data, it will go down also like I'm seeing. Maybe it's not getting high enough in your test to bother with GC. I think I basically just cloned the ContentScalabilityTest and added some text generation. I originally noticed the memory issue loading an 8MB text file, which came to about 400,000 rows of text. I reduced the number of characters it adds per line but it still behaves the same. Please don't be a Linux-only issue. Please don't be a Linux-only issue. Please don't be a Linux-only issue. Please don't be a Linux-only issue. Please don't be a Linux-only issue. 😅😅
package org.fxmisc.richtext.demo;
import java.text.NumberFormat;
import java.util.Collection;
import java.util.Collections;
import java.util.Timer;
import java.util.TimerTask;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.fxmisc.flowless.VirtualizedScrollPane;
import org.fxmisc.richtext.CodeArea;
import org.fxmisc.richtext.LineNumberFactory;
import org.fxmisc.richtext.model.StyleSpans;
import org.fxmisc.richtext.model.StyleSpansBuilder;
import javafx.application.Application;
import javafx.application.Platform;
import javafx.scene.Scene;
import javafx.scene.control.Label;
import javafx.scene.layout.StackPane;
import javafx.scene.layout.VBox;
import javafx.stage.Stage;
public class ScalabilityTest extends Application {
private static final String[] KEYWORDS = new String[] {
"abstract", "assert", "boolean", "break", "byte",
"case", "catch", "char", "class", "const",
"continue", "default", "do", "double", "else",
"enum", "extends", "final", "finally", "float",
"for", "goto", "if", "implements", "import",
"instanceof", "int", "interface", "long", "native",
"new", "package", "private", "protected", "public",
"return", "short", "static", "strictfp", "super",
"switch", "synchronized", "this", "throw", "throws",
"transient", "try", "void", "volatile", "while"
};
private static final String KEYWORD_PATTERN = "\\b(" + String.join("|", KEYWORDS) + ")\\b";
private static final String PAREN_PATTERN = "\\(|\\)";
private static final String BRACE_PATTERN = "\\{|\\}";
private static final String BRACKET_PATTERN = "\\[|\\]";
private static final String SEMICOLON_PATTERN = "\\;";
private static final String STRING_PATTERN = "\"([^\"\\\\]|\\\\.)*\"";
private static final String COMMENT_PATTERN = "//[^\n]*" + "|" + "/\\*(.|\\R)*?\\*/";
private static final Pattern PATTERN = Pattern.compile(
"(?<KEYWORD>" + KEYWORD_PATTERN + ")"
+ "|(?<PAREN>" + PAREN_PATTERN + ")"
+ "|(?<BRACE>" + BRACE_PATTERN + ")"
+ "|(?<BRACKET>" + BRACKET_PATTERN + ")"
+ "|(?<SEMICOLON>" + SEMICOLON_PATTERN + ")"
+ "|(?<STRING>" + STRING_PATTERN + ")"
+ "|(?<COMMENT>" + COMMENT_PATTERN + ")"
);
private static final int TEST_ROWS = 400 * 1000;
private String sampleCode;
private final Runtime runtime = Runtime.getRuntime();
private Timer updateTimer;
public static void main(String[] args) {
System.setProperty("prism.lcdtext", "false");
System.setProperty("prism.text", "t2k");
launch(args);
}
public ScalabilityTest() {
StringBuilder sb = new StringBuilder(String.join("\n", new String[] {
"package com.example;",
"",
"import java.util.*;",
"",
"public class Foo extends Bar implements Baz {",
"",
" /*",
" * multi-line comment",
" */",
" public static void main(String[] args) {",
" // single-line comment",
" for(String arg: args) {",
" if(arg.length() != 0)",
" System.out.println(arg);",
" else",
" System.err.println(\"Warning: empty string as argument\");",
" }",
"\n"
}));
for (int i=0 ; i < TEST_ROWS ; i++) {
sb.append(" System.out.println(\"");
sb.append(Math.random());
sb.append("\");\n");
}
sb.append(" }\n");
sb.append("}\n");
sampleCode = sb.toString();
}
@Override
public void start(Stage primaryStage) {
CodeArea codeArea = new CodeArea();
codeArea.setParagraphGraphicFactory(LineNumberFactory.get(codeArea));
codeArea.richChanges()
.filter(ch -> !ch.getInserted().equals(ch.getRemoved())) // XXX
.subscribe(change -> {
codeArea.setStyleSpans(0, computeHighlighting(codeArea.getText()));
});
codeArea.replaceText(0, 0, sampleCode);
VBox vBox = new VBox();
Label memlbl = new Label();
StackPane p = new StackPane(new VirtualizedScrollPane<>(codeArea));
p.setPrefSize(1200, 880);
vBox.getChildren().add(memlbl);
vBox.getChildren().add(p);
Scene scene = new Scene(vBox, 1200, 900);
scene.getStylesheets().add(JavaKeywordsAsync.class.getResource("java-keywords.css").toExternalForm());
primaryStage.setScene(scene);
primaryStage.setTitle("CodeArea Scalability Test");
primaryStage.show();
updateTimer = new Timer(true);
updateTimer.schedule(new TimerTask() {
@Override
public void run() {
Platform.runLater(() -> {
memlbl.setText(memoryStatus());
});
}
}, 0, 1000);
}
public String memoryStatus() {
NumberFormat format = NumberFormat.getInstance();
StringBuilder sb = new StringBuilder();
long maxMemory = runtime.maxMemory();
long allocatedMemory = runtime.totalMemory();
long freeMemory = runtime.freeMemory();
sb.append("Used memory: ");
sb.append(format.format((allocatedMemory - freeMemory) / 1024)); // reuse the values read above so both figures come from the same snapshot
sb.append("KB Avail memory: ");
sb.append(format.format((freeMemory + (maxMemory - allocatedMemory)) / 1024));
sb.append("KB");
return sb.toString();
}
private static StyleSpans<Collection<String>> computeHighlighting(String text) {
Matcher matcher = PATTERN.matcher(text);
int lastKwEnd = 0;
StyleSpansBuilder<Collection<String>> spansBuilder
= new StyleSpansBuilder<>();
while(matcher.find()) {
String styleClass =
matcher.group("KEYWORD") != null ? "keyword" :
matcher.group("PAREN") != null ? "paren" :
matcher.group("BRACE") != null ? "brace" :
matcher.group("BRACKET") != null ? "bracket" :
matcher.group("SEMICOLON") != null ? "semicolon" :
matcher.group("STRING") != null ? "string" :
matcher.group("COMMENT") != null ? "comment" :
null; /* never happens */
assert styleClass != null;
spansBuilder.add(Collections.emptyList(), matcher.start() - lastKwEnd);
spansBuilder.add(Collections.singleton(styleClass), matcher.end() - matcher.start());
lastKwEnd = matcher.end();
}
spansBuilder.add(Collections.emptyList(), text.length() - lastKwEnd);
return spansBuilder.create();
}
} |
I'll add one thing here which may quickly explain this when we look into it closer (no time right now). I thought I had tested this before, but I just tried disabling the codeArea.richChanges() block of code in start() which calls the computeHighlighting method. This reduced memory usage drastically, still maybe a bit high (I see 500MB now), but the test code itself might be causing some of the issue. Maybe some optimizations in the test itself are needed. Performance was fine after disabling that richChanges() block. |
Yep sorry, I missed the extra zero. Just tried the test again with 500,000 lines and memory is up to 675MB. Performance seems to be ok, everything is responsive. |
That number seems a little low, but that's waaaaay higher than I'd expect for what we're doing here. That's over 1MB per 1,000 lines of text. Not sure how much RAM you're working with, but if you get it up to between 1.0GB and 1.5GB (like that test code above does for me), I think you'll start running into what are likely GC pauses. |
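To put the numbers in this thread side by side, here is a small sketch of the per-line arithmetic. The class and method names are made up for illustration, and MB is taken as 1024 * 1024 bytes, which is an assumption about how the figures were reported. It works out to roughly 1.4KB of heap per line for the 675MB / 500,000-line run, against about 21 bytes of raw text per line for the 8MB / 400,000-line file.

```java
// Back-of-the-envelope arithmetic for figures quoted in this thread.
// MemoryPerLine and bytesPerLine are hypothetical names, not part of RichTextFX.
final class MemoryPerLine {

    /** Average number of bytes attributed to each line of text. */
    static double bytesPerLine(long totalBytes, long lines) {
        return (double) totalBytes / lines;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L; // assumption: MB means MiB here

        // 675MB of process memory reported for 500,000 lines
        System.out.printf("heap per line: %.0f bytes%n", bytesPerLine(675 * mb, 500_000));

        // 8MB text file that came to about 400,000 lines
        System.out.printf("raw text per line: %.0f bytes%n", bytesPerLine(8 * mb, 400_000));
    }
}
```

The gap between the two results is the per-line overhead being argued about; whether that overhead is reasonable is the open question in this issue.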
I'm not sure what number you would expect though. 1.3KB per line seems ok to me given the number of objects involved. Interestingly, I left my test running overnight and in the morning the memory usage was down to 4MB, so clearly the actual usage is far less and the GC isn't needing to clean things up immediately. I haven't been able to get a test over 1GB on Windows, I can't get gradle to allow the JavaExec task to have enough memory (I don't know gradle very well). As a comparison, I use the Atom editor for my code editing (sorry RichTextFX) and created a buffer with 480,000 lines, the memory usage is now up to 1.7GB and isn't dropping. When I try and get to 500,000 lines it crashes. |
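Since the Task Manager figure includes garbage the collector has not reclaimed yet, one way to get closer to the retained size is to read the heap from inside the JVM after requesting a collection. This is only a sketch: `HeapProbe` is a hypothetical helper, and `System.gc()` is a hint that the JVM is free to ignore, so the second reading is an approximation rather than an exact live-set size.

```java
// Hypothetical helper for comparing heap readings before and after a GC hint.
final class HeapProbe {

    /** Heap currently occupied, including garbage that has not been collected yet. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Same reading after requesting a collection; System.gc() is only a hint to the JVM. */
    static long usedHeapAfterGcHint() {
        System.gc();
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        // Allocate some short-lived data so there is something to reclaim.
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[1024 * 1024];
        }
        long before = usedHeapBytes();

        garbage = null; // drop the only reference to the arrays
        long after = usedHeapAfterGcHint();

        System.out.println("before hint: " + before / 1024 + "KB, after hint: " + after / 1024 + "KB");
    }
}
```

Calling `usedHeapAfterGcHint()` from the demo's timer instead of reading `Runtime` directly would show something nearer to what the text area actually retains, at the cost of forcing collections while measuring.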
I use Linux, so running this on Linux Mint, I noticed:
Unfortunately, this seems to be a Linux-only issue... |
Removing the |
@garybentley Well, this tells us that Atom/Electron are also a pig:pig:...? :satisfied: (I guess in its defense, it's not meant to load a 500k line file, being a code editor.) Here's the thing, and I know that under the hood we have a lot going on object-creation-wise, but that 400,000 line file was only 8MB of actual data. For editing a file of that size to require well over 1GB of RAM... something incredibly wasteful is going on. I mean, is every glyph getting an object? There isn't anything inherently wrong with making use of memory like that (these days) when the usual data set is small, but it might mean that some alternate architecture is needed for editing large files. For the heck of it I tried opening a structured text 8MB file with Kate text editor (Linux). @JordanMartinez Yep, I noticed the same with removing So I guess I have a few options: |
If I'd have to guess, I'd say that some of the memory usage stems from a single paragraph's segment styles and segments being stored twice.
Thus, it creates a However, since removing the This means the
// locals captured by a lambda must be effectively final,
// so the previous range is kept in a small array: {start, end}
int[] previousStyle = {0, area.getLength()};
// something like this? ReactFX handles list changes
// a bit differently, so I can't recall the exact API
EventStreams.changesOf(area.visibleParagraphs())
.subscribe(change -> {
    // clear only the area where style spans were applied previously
    area.clearStyle(previousStyle[0], previousStyle[1]);
    // calculate the style spans for just these visible paragraphs
    StyleSpans<S> visibleStyles = calculateSyntaxHighlighting();
    // find the start of where to apply them
    int firstVisibleIndex = area.getFirstVisibleParIndex();
    int start = area.position(firstVisibleIndex, 0).toOffset();
    // set the new style spans
    area.setStyleSpans(start, visibleStyles);
    // store the new previous style start/end for later clearing
    previousStyle[0] = start;
    previousStyle[1] = start + visibleStyles.length();
});
Also, perhaps using a different way to calculate the syntax highlighting other than a |
One other side note... Memory usage could be optimized if a |
Also, I believe ReactFX's FingerTree is only a 2-3 FingerTree. One other person commented that memory usage could be improved if it was changed into a 2-3-4 FingerTree. It sounded like supporting that was trivial but hadn't been done because it would require writing a lot of "boilerplate code" based on what's already there and just make it work for a 4-FingerTree. |
Thanks for looking into that. I'll spend some time looking at this next weekend or so. |
Also, do we know if this high memory usage also occurs on Mac? Does it only affect Linux? |
The only crApple product I own is a forgotten original iPod. 😝 But it would be interesting to know. |
I've tested the code again but using a commit that occurred before I decoupled the style from its segment. Here's the results on Linux:
To say the least, it seems like decoupling the style from its segment has come at the cost of performance to a degree. I think this could be minimized by making the |
Here's a way to style only the visible text in an efficient manner. The only flaw is something in Flowless, which seems to change its scrolling values whenever it lays out its content:
SuspendableYes allowRichChange = new SuspendableYes();
EventStream<?> plainTextChange = codeArea.richChanges()
.hook(c -> System.out.println("rich change occurred, firing event"))
.successionEnds(Duration.ofMillis(500))
.hook(c -> System.out.println("user has not typed anything for the given duration: firing event"))
.conditionOn(allowRichChange)
.hook(c -> System.out.println("this is a valid user-initiated event, not us changing the style; so allow it to pass through"))
.filter(ch -> !ch.isIdentity())
.hook(c -> System.out.println("this is a valid plain text change. firing plaintext change event"));
EventStream<?> dirtyViewport = EventStreams
.merge(
codeArea.estimatedScrollXProperty().values(),
codeArea.estimatedScrollYProperty().values())
.hook(e -> System.out.println("y or x property value changed"))
.successionEnds(Duration.ofMillis(200))
.hook(e -> System.out.println("We've waited long enough, now fire an event"));
EventStreams.merge(plainTextChange, dirtyViewport)
.hook(c -> System.out.println("either the viewport has changed or a plain text change has occurred. fire an event"))
.subscribe(dirtyStyles -> {
// rather than clearing the previously-visible text's styles and setting the current
// visible text's styles....
//
// codeArea.clearStyle(startOfLastVisibleTextStyle, endOfLastVisibleTextStyle);
// codeArea.setStyleSpans(offsetIntoContentBeforeReachingCurrentVisibleStyles, newStyles);
//
// do this entire action in one move
//
// codeArea.setStyleSpans(0, styleSpans)
//
// where `styleSpans` parameters are...
// | unstyled | previously styled | currently visible text | previously styled | unstyled |
// | empty list | computeHighlighting() | empty list |
// compute the styles for the currently visible text
StyleSpans<Collection<String>> visibleTextStyles = computeHighlighting(getVisibleText());
// calculate how far into the content is the first part of the visible text
// before modifying the area's styles
int firstVisibleParIdx = codeArea.visibleParToAllParIndex(0);
int startOfVisibleStyles = codeArea.position(firstVisibleParIdx, 0).toOffset();
int lengthFollowingVisibleStyles = codeArea.getLength() - startOfVisibleStyles - visibleTextStyles.length();
StyleSpans<Collection<String>> styleSpans = visibleTextStyles
// 1 single empty list before visible styles
.prepend(new StyleSpan<>(Collections.emptyList(), startOfVisibleStyles))
// 1 single empty list after visible styles
.append(new StyleSpan<>(Collections.emptyList(), lengthFollowingVisibleStyles));
// no longer allow rich changes as setStyleSpans() will emit a rich change event,
// which will lead to an infinite loop that will terminate with a StackOverflowError
allowRichChange.suspendWhile(() ->
codeArea.setStyleSpans(0, styleSpans)
);
});
where
public String getVisibleText() {
return codeArea.getVisibleParagraphs().map(Paragraph::getText).reduce((a, b) -> a + "\n" + b).getOrElse("");
} |
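The single `setStyleSpans(0, styleSpans)` call above depends on three lengths adding up to the length of the whole document: the unstyled text before the viewport, the highlighted visible text, and the unstyled text after it. The sketch below reproduces just that arithmetic on plain strings so it can be checked without JavaFX. `VisibleSpanLayout` is a hypothetical class, and it assumes paragraphs are joined by a single `\n`, as `getVisibleText()` does.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: computes the | empty | highlighted | empty | span lengths.
final class VisibleSpanLayout {
    final int before;   // unstyled characters ahead of the viewport
    final int visible;  // characters covered by the computed highlighting
    final int after;    // unstyled characters behind the viewport

    VisibleSpanLayout(int before, int visible, int after) {
        this.before = before;
        this.visible = visible;
        this.after = after;
    }

    /**
     * @param paragraphs all paragraphs of the document (joined by '\n')
     * @param first      index of the first visible paragraph
     * @param count      number of visible paragraphs
     */
    static VisibleSpanLayout of(List<String> paragraphs, int first, int count) {
        int total = Math.max(0, paragraphs.size() - 1); // newlines between paragraphs
        for (String p : paragraphs) {
            total += p.length();
        }
        int before = 0; // each earlier paragraph plus its trailing newline
        for (int i = 0; i < first; i++) {
            before += paragraphs.get(i).length() + 1;
        }
        int visible = Math.max(0, count - 1); // newlines inside the visible text
        for (int i = first; i < first + count; i++) {
            visible += paragraphs.get(i).length();
        }
        return new VisibleSpanLayout(before, visible, total - before - visible);
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("ab", "cde", "f", "gh");
        VisibleSpanLayout layout = VisibleSpanLayout.of(doc, 1, 2);
        System.out.println(layout.before + " + " + layout.visible + " + " + layout.after);
    }
}
```

If the three lengths do not sum to the document length, the padded spans would cover the wrong range, which is why the offset is computed before the area's styles are modified.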
It seems a bit better doing the above, at least memory wise. Side effect I see here is that highlighting doesn't work during scrolling and takes a second to engage once scrolling stops. Not ideal but maybe okay. Still kinda sluggish when paging up/down. Another thing I noticed was this Error being thrown (so, maybe I have something screwed up), although it seems to run fine regardless.
ScalabilityTest.java:145: I did remove this bit from the original example above:
I have to admit though, the functional reactive development style is painful for me. Either I'm getting old and stubborn or it's just unfamiliarity, but it really obfuscates tracking what the code is doing, for me. Between that and the fact that I really don't need highlighting (or even editing), maybe I'll just skip it. |
You could adjust the viewport delay (currently 200 ms in the code) to be less or remove it altogether and that might make things better.
This probably throws an error because I assumed (when writing the code) that the area is displayed on one's screen before that method ever gets called. So, it does sound like an initialization bug that could be resolved by adding a check like
EventStreams.merge(plainTextChange, dirtyViewport)
.hook(c -> System.out.println("either the viewport has changed or a plain text change has occurred. fire an event"))
// only run when the area is visible since it will always have
// at least 1 paragraph that is visible
.conditionOn(area.visibleProperty())
.subscribe(dirtyStyles -> { /* rest of the code above */ });
It's probably just unfamiliarity with the style coupled with the frustration that something as simple as this shouldn't take up so much memory (like when compared with your basic already-installed text editors on Windows/Mac/Linux that respond very quickly to such changes) all adding up and compounding one another. |
Hi, I also noticed incredible amounts of RAM being used when I added a lot of lines. After some investigation, I found out there's a memory leak. Each ParagraphText adds various listeners to the Caret, and those listeners directly reference the ParagraphText that adds them. I've been working on a fix using WeakListeners that I'll be pushing shortly. |
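For readers unfamiliar with this kind of leak, the sketch below shows the general weak-listener idea in plain Java. It is not the code from the fix: `CaretModel` and `WeakCaretListener` are hypothetical stand-ins for the caret and for the wrapper that keeps the caret from strongly referencing a paragraph view.

```java
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical stand-in for a long-lived observable such as the caret. */
final class CaretModel {
    // Copy-on-write so a listener may remove itself while being notified.
    private final List<Consumer<Integer>> listeners = new CopyOnWriteArrayList<>();

    void addListener(Consumer<Integer> listener) { listeners.add(listener); }
    void removeListener(Consumer<Integer> listener) { listeners.remove(listener); }
    int listenerCount() { return listeners.size(); }

    void moveTo(int position) {
        for (Consumer<Integer> listener : listeners) {
            listener.accept(position);
        }
    }
}

/** Forwards events to a weakly held target and unregisters once the target is gone. */
final class WeakCaretListener implements Consumer<Integer> {
    private final WeakReference<Consumer<Integer>> target;
    private final CaretModel source;

    WeakCaretListener(CaretModel source, Consumer<Integer> target) {
        this.source = source;
        this.target = new WeakReference<>(target);
        source.addListener(this);
    }

    @Override
    public void accept(Integer position) {
        // Read the reference once: the target could be collected between two get() calls.
        Consumer<Integer> live = target.get();
        if (live == null) {
            source.removeListener(this); // target was collected, stop listening
        } else {
            live.accept(position);
        }
    }
}
```

With this arrangement the caret only holds the small wrapper strongly, so a view that is no longer referenced elsewhere can be collected; the wrapper itself is dropped the next time an event arrives.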
…-leak-in-paragraphtext Issue #627 Extreme memory usage with large blocks of text
…ret on dispose instead of waiting for GC (this improves PR FXMisc#779, issue FXMisc#627) without this change, disposed ParagraphText objects still listen to selection and caret and do some useless stuff
…s on disposal (#791) * ParagraphText: immediately remove listeners from SelectionPath and Caret on dispose instead of waiting for GC (this improves PR #779, issue #627) * ParagraphText: since weak listeners are no longer required for selection and caret listeners, remove listener classes again and move listener code to constructor (saves 100 lines of code) * ParagraphText: MapChangeListener.Change.wasAdded() and wasRemoved() may both return true when replacing an item. So check both and handle wasRemoved() before wasAdded(). * Fixed (rare) NPE in class ParagraphText when GC frees object between two WeakReference.get() invocations
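The commit notes above describe removing listeners as soon as a ParagraphText is disposed rather than waiting for the GC. A minimal sketch of that pattern follows, using hypothetical `SelectionSource` and `ParagraphView` classes rather than the real RichTextFX types. The listener is stored in a field so that the same instance that was registered is the one removed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical long-lived object that views listen to (e.g. a selection or caret). */
final class SelectionSource {
    private final List<Runnable> listeners = new ArrayList<>();

    void addListener(Runnable listener) { listeners.add(listener); }
    void removeListener(Runnable listener) { listeners.remove(listener); }
    int listenerCount() { return listeners.size(); }

    void fireChanged() {
        // Iterate over a copy so listeners may unregister while being notified.
        for (Runnable listener : new ArrayList<>(listeners)) {
            listener.run();
        }
    }
}

/** Hypothetical view that registers on construction and unregisters on dispose. */
final class ParagraphView {
    private final SelectionSource selection;
    // Keep the listener in a field: evaluating this::repaint again would create a new object.
    private final Runnable onSelectionChanged = this::repaint;
    int repaints = 0;

    ParagraphView(SelectionSource selection) {
        this.selection = selection;
        selection.addListener(onSelectionChanged);
    }

    void repaint() { repaints++; }

    /** Removes the listener immediately so the source no longer references this view. */
    void dispose() {
        selection.removeListener(onSelectionChanged);
    }
}
```

After `dispose()` the view receives no further events and is no longer reachable through the source, so it stops doing work for paragraphs that are off screen.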
Is it normal/expected to see this kind of memory usage? Memory is shown at the top of the screencaps.
The first one is a modified version of one of the demos which adds 4,000 lines. Memory usage on that one spikes to about twice what's shown here, and then drops down. It also functions fine.
This second screenshot is with 400,000 extra rows of text added. Demo becomes unusable with this much text. When I add a single letter to a row, it freezes for a few seconds before showing the character. Scrolling still works okay, but the rendering speed isn't great. Memory usage... 2GB? ❗ ❗ ❗ It also fluctuates wildly (up to 1GB or so) when scrolling, which is only going to make things worse.
I made every line different on purpose since it looks like some optimizations are going on under the hood, but even without that I still see this behavior.
I'm running on Linux. Oracle JRE 8u144