Improve memory usage when reaching diff limits #2990
Codecov Report
@@            Coverage Diff             @@
##           master    #2990      +/-   ##
==========================================
- Coverage   33.04%   33.01%   -0.03%
==========================================
  Files         269      269
  Lines       39484    39492       +8
==========================================
- Hits        13047    13039       -8
- Misses      24584    24603      +19
+ Partials     1853     1850       -3
I will refactor this to use |
7441a6e to 46d987a
Okay, I'm not going to rewrite this with |
46d987a to 3990d7d
I fixed an issue where it would hard crash when trying to parse https://try.gitea.io/mrexodia/DarkSouls3.TextViewer/commit/629cf9b3d6b295bbcddf76d1f6167259b764d9dc
LGTM
} else {
return nil, fmt.Errorf("ReadString: %v", err)
var linebuf bytes.Buffer
for {
From what I can tell, we only break from this loop when we reach an EOF or new-line. Should we break once linebuf.Len() >= maxLineCharacters? Otherwise it's not clear to me how this helps memory usage.
Never mind
@ethantkoenig I think it's right. See line 272.
Line 272 sets curFile.IsIncomplete to true, but the loop will still run for more iterations.
Yes, so that we can find the start of the next line.
Yeah, at first I used break to stop reading, but that caused https://try.gitea.io/mrexodia/DarkSouls3.TextViewer/commit/629cf9b3d6b295bbcddf76d1f6167259b764d9dc to crash, because the parser would think the next diff line started at the next character.
Looks good, I just have one minor suggestion. Since the ParsePatch(..) function is already quite long, could we move the newly-added code to a helper function? Something like
func ReadLineWithMaxLength(reader io.Reader, maxLen int) (string, error) {
...
}
I personally don't mind adding it to a function, but it would become rather ugly:
I could read |
Signed-off-by: Duncan Ogilvie <[email protected]>
3990d7d to 1031bc2
Fair enough, LGTM
@lunny why backport for this?
Related to #2669, also related to go-gitea/git#93
This change introduces a hand-rolled implementation of input.ReadString() that stops reading if the line buffer gets bigger than maxLineCharacters.
I thought the performance of this would be terrible, but it actually appears to be only slightly slower when timing the ParsePatch function (my timing was very loose, so it would be good to verify).
For commits that are shown completely, the memory usage is slightly better, but for commits where files are hidden because they are too big, the memory usage is much, much better (my test was a 100mb one-liner, which went from 322mb to 55mb).
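To illustrate the idea (this is not the code from this PR), here is a minimal sketch of a capped line reader. The names readLineCapped and capAll, the 16-byte reader size, and the use of bufio.Reader.ReadSlice are all assumptions made for the example. It keeps at most maxLen bytes of each line but still consumes the input up to the newline, so the next read starts at the next line, which is the behaviour discussed in the review thread.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// readLineCapped is a hypothetical helper. It reads one line (without the
// trailing '\n'), keeps at most maxLen bytes of it, and discards the rest
// while still consuming input up to the newline.
func readLineCapped(r *bufio.Reader, maxLen int) (line string, truncated bool, err error) {
	var buf bytes.Buffer
	for {
		chunk, readErr := r.ReadSlice('\n')
		if readErr == nil {
			chunk = chunk[:len(chunk)-1] // delimiter found: drop the '\n'
		}
		// Store only what still fits; anything beyond that is dropped.
		if room := maxLen - buf.Len(); len(chunk) > room {
			chunk = chunk[:room]
			truncated = true
		}
		buf.Write(chunk)

		switch readErr {
		case nil:
			return buf.String(), truncated, nil
		case bufio.ErrBufferFull:
			continue // line is longer than the reader's buffer: keep draining
		default:
			return buf.String(), truncated, readErr // io.EOF or a real error
		}
	}
}

// capAll runs readLineCapped over a whole input string. The deliberately
// small reader size forces the ErrBufferFull path for long lines.
func capAll(input string, maxLen int) (lines []string, cut []bool) {
	r := bufio.NewReaderSize(strings.NewReader(input), 16)
	for {
		line, truncated, err := readLineCapped(r, maxLen)
		if line != "" || truncated || err == nil {
			lines = append(lines, line)
			cut = append(cut, truncated)
		}
		if err != nil {
			return lines, cut
		}
	}
}

func main() {
	lines, cut := capAll("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\nnext\n", 10)
	for i := range lines {
		fmt.Printf("%q truncated=%v\n", lines[i], cut[i])
	}
}
```

Because the stored buffer never grows past maxLen, peak memory in this sketch is bounded by the cap plus the reader's internal buffer, regardless of how long the input line is.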
The ParsePatch function no longer behaves identically, because line is now truncated to whatever the user sets as the maximum. I briefly checked all usages and it does not appear to matter (diffs are only shown to the user, and if IsIncomplete is set the line member is not used); however, this needs some attention during review.
A possible follow-up is to completely rewrite the diff functions to use things like git diff --numstat to figure out whether diffs have to be truncated and which files are part of a diff, instead of this ugly parser.
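As a rough sketch of what such a follow-up might look like (nothing here exists in the codebase; fileStat, parseNumstat and overLimit are hypothetical names), the tab-separated output of git diff --numstat could be parsed like this. Each line has the form added, deleted, path, and binary files are reported with "-" in both count columns. The sample input is hard-coded rather than produced by running git.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// fileStat is a hypothetical record for one line of numstat output.
type fileStat struct {
	Path    string
	Added   int
	Deleted int
	Binary  bool
}

// parseNumstat parses "added<TAB>deleted<TAB>path" lines.
func parseNumstat(out string) ([]fileStat, error) {
	var stats []fileStat
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if line == "" {
			continue
		}
		parts := strings.SplitN(line, "\t", 3)
		if len(parts) != 3 {
			return nil, fmt.Errorf("unexpected numstat line: %q", line)
		}
		st := fileStat{Path: parts[2]}
		if parts[0] == "-" && parts[1] == "-" {
			st.Binary = true // git prints "-" for binary files
		} else {
			var err error
			if st.Added, err = strconv.Atoi(parts[0]); err != nil {
				return nil, fmt.Errorf("bad added count in %q: %v", line, err)
			}
			if st.Deleted, err = strconv.Atoi(parts[1]); err != nil {
				return nil, fmt.Errorf("bad deleted count in %q: %v", line, err)
			}
		}
		stats = append(stats, st)
	}
	return stats, sc.Err()
}

// overLimit returns the paths whose changed-line count exceeds maxLines.
func overLimit(stats []fileStat, maxLines int) []string {
	var paths []string
	for _, st := range stats {
		if !st.Binary && st.Added+st.Deleted > maxLines {
			paths = append(paths, st.Path)
		}
	}
	return paths
}

func main() {
	sample := "3\t1\tREADME.md\n-\t-\tlogo.png\n1200\t0\tdata.txt\n"
	stats, err := parseNumstat(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(overLimit(stats, 1000))
}
```

Note that numstat counts lines, not bytes, so on its own it would not flag a file like the 100mb one-liner mentioned above; a per-line length cap would still be needed for that case.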