tracking issue: ghostty uses too much memory #254

Closed
andrewrk opened this issue Aug 8, 2023 · 18 comments

Comments

@andrewrk
Collaborator

andrewrk commented Aug 8, 2023

Here we observe an order of magnitude difference in memory usage:

image

In this picture, ghostty (GTK) uses 812 MiB compared to 84 MiB used by xfce4-terminal (also GTK).

To reproduce, open 20 ghostty terminals and then close all of them except for 1. Do the same for xfce4-terminal. When I had 20 terminal windows open with xfce4-terminal, it still only used 84 MiB of RSS.
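On Linux, one way to script this kind of measurement is to read the `VmRSS` field from `/proc/<pid>/status`. The sketch below is only an assumption about how it could be done, not how the numbers above were collected; `parse_vmrss_kib` and `rss_mib` are hypothetical helper names.

```python
import re
from pathlib import Path


def parse_vmrss_kib(status_text: str) -> int:
    """Return the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found (kernel threads have none)")
    return int(match.group(1))


def rss_mib(pid: int) -> float:
    """Read the resident set size of a live process in MiB (Linux only)."""
    text = Path(f"/proc/{pid}/status").read_text()
    return parse_vmrss_kib(text) / 1024


# Ratio of the two figures reported above (812 MiB vs 84 MiB).
ratio = 812 / 84
```

With the reported figures the ratio comes out just under 10x, which is where "an order of magnitude" comes from.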

For me, I can part with 1 GiB of RAM to use a nice terminal. But I think this is in the realm of limiting use cases for some folks; if someone barely has enough RAM to compile LLVM or something like that, then ghostty might actually prevent that compilation from succeeding.

Once ghostty goes public, you might also find people drinking haterade about the fact that a terminal application, traditionally associated with being slim, is using more memory than an electron application (Signal in the picture above), associated with being bloated. An unnuanced comparison for sure, but nice to avoid that talking point entirely if possible.

I don't know if this is fixable or not, maybe it has to do with GPU rendering or something, but at least there will be an issue to point future users to if this is just one of the tradeoffs of ghostty vs a more bare-bones terminal application.

@mitchellh
Contributor

This is a good tracking issue for this.

There are a number of things at play here. GPU-based terminals definitely have to use a bit more RAM. But the biggest issue at the moment is we don’t do any sharing between terminal instances (windows, tabs, etc.). The main culprit here is the font data: we re-rasterize and cache all the font textures separately for every instance. I don’t think that adds up to 900 MB, but I know that’s a big one.

Regardless, let’s use this as a tracking issue to improve memory usage.

@andrewrk
Collaborator Author

andrewrk commented Aug 8, 2023

I think there's a second issue, which is that it didn't free up any of the resources after closing 19 windows. That does feel like a memory leak to the user (I'll need to exit my nix-shell and restart ghostty if I want that memory back).

@goonzoid
Member

goonzoid commented Aug 31, 2023

#364 may account for some of this memory, but almost certainly not all of it.

I took a quick peek at the allocations profiler in macOS Instruments while opening and closing a bunch of windows, and I see that the rendering and IO POSIX threads for each surface seem to continue to live on in memory after a window is closed. I haven't done enough POSIX thread programming to know whether that's normal, but a quick search turned up several mentions of a flag that might solve the issue: PTHREAD_CREATE_DETACHED.

Apologies for the very hand-wavy info. I didn't have time to dig into it tonight, but I didn't want to forget to mention what I saw.

@mitchellh
Contributor

> I took a quick peek at the allocations profiler in macOS Instruments while opening and closing a bunch of windows, and I see that the rendering and IO POSIX threads for each surface seem to continue to live on in memory after a window is closed. I haven't done enough POSIX thread programming to know whether that's normal, but a quick search turned up several mentions of a flag that might solve the issue: PTHREAD_CREATE_DETACHED.

This is resolved by #375. It is related to general memory usage but has no impact on Linux. I know YOU know this but just saying that for any future issue readers. macOS memory usage is greatly improved when closing windows (by improved I mean... it no longer leaks), but the general memory issues still exist.

@xdBronch
Collaborator

was told to add this here: ghostty's RAM usage seems to increase a notable amount when resizing, about 90 MB -> 120 MB for me. this happens on any resize, even when shrinking the window, and it's never reduced after that, except sometimes when it's resized to the exact original dimensions

@mitchellh
Contributor

In #754, I clear the renderer memory on resizing (it rebuilds on the next frame) so that buffers shrink when resizing smaller. This doesn't account for a very large amount of savings (~5MB per surface from full screen to smallest window on my machine) but it is something we might as well do.
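As a rough illustration of the "clear on resize, rebuild on next frame" idea, here is a minimal sketch. It is not the code from #754; `CellBuffer` and its methods are hypothetical names, and a plain Python list stands in for the renderer's per-surface buffers.

```python
from typing import List, Optional


class CellBuffer:
    """Hypothetical per-surface buffer of renderer cells."""

    def __init__(self) -> None:
        # None means "nothing allocated, rebuild on the next frame".
        self.cells: Optional[List[int]] = None
        self.cols = 0
        self.rows = 0

    def resize(self, cols: int, rows: int) -> None:
        # Drop the old buffer outright instead of reusing it, so a smaller
        # grid does not keep the larger allocation alive.
        self.cols, self.rows = cols, rows
        self.cells = None

    def draw_frame(self) -> int:
        # Rebuild lazily, sized exactly for the current grid.
        if self.cells is None:
            self.cells = [0] * (self.cols * self.rows)
        return len(self.cells)
```

The cost of this choice is one full rebuild after every resize, which is presumably acceptable because resizes are rare compared to frames.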

mitchellh changed the title from "ghostty uses an order of magnitude more memory than xfce4-terminal" to "tracking issue: ghostty uses too much memory" on Nov 11, 2023

@mitchellh
Contributor

mitchellh commented Mar 14, 2024

Noting some memory usage updates from the paged-terminal branch where I'm actively working on terminal state memory usage.

20 empty windows:

  • Before: 146 MB
  • After: 107 MB

5 windows, each running cat on a 20 MB file (simulating scrollback):

  • Before: 678 MB
  • After: 152 MB

All tests on a 6k display on Linux GTK.
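To put the before/after figures in relative terms, the reduction can be computed directly from the numbers quoted above. The helper name `reduction_pct` is just for illustration.

```python
def reduction_pct(before_mb: float, after_mb: float) -> float:
    """Percentage by which memory usage dropped from before to after."""
    return (before_mb - after_mb) / before_mb * 100


empty_windows = reduction_pct(146, 107)  # 20 empty windows
scrollback = reduction_pct(678, 152)     # 5 windows with a 20 MB cat each
print(f"{empty_windows:.1f}% {scrollback:.1f}%")
```

That works out to roughly 27% for the empty windows and roughly 78% for the scrollback case.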

I want to note that the memory usage saved isn't dramatic in all scenarios. For other performance reasons we make some preallocations still and the current code isn't completely stupid and does allocate some things on demand. For example, 5 windows running neovim use identical memory between current and paged-terminal. However, memory regressions should be very rare, only in scenarios where the window is very, very small (I argue impractically small, like 30 columns by 30 rows).
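The comment does not describe the internals of paged-terminal, so the following is only a generic sketch of why allocating in pages on demand helps with empty windows: nothing is reserved until rows actually arrive. `PagedScrollback`, `PAGE_ROWS`, and the eviction policy are all assumptions made for illustration.

```python
from typing import List

PAGE_ROWS = 4  # hypothetical page size, kept tiny for illustration


class PagedScrollback:
    """Stores rows in fixed-size pages that are allocated only when needed."""

    def __init__(self, max_rows: int) -> None:
        if max_rows < PAGE_ROWS:
            raise ValueError("max_rows must be at least one page")
        self.max_rows = max_rows
        self.pages: List[List[str]] = []  # nothing reserved up front
        self.row_count = 0  # rows currently stored across all pages

    def append(self, row: str) -> None:
        # Allocate a new page only when the last one is full.
        if not self.pages or len(self.pages[-1]) == PAGE_ROWS:
            self.pages.append([])
        self.pages[-1].append(row)
        self.row_count += 1
        # Evict the oldest page as a unit once the limit is exceeded, so
        # slightly fewer than max_rows rows may remain afterwards.
        if self.row_count > self.max_rows:
            self.row_count -= len(self.pages.pop(0))
```

An empty `PagedScrollback` holds no pages at all, while a preallocated design would pay for `max_rows` rows per window regardless of use.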

I also want to note that paged-terminal is specifically optimizing terminal state memory usage. There are other parts of Ghostty that still have bloated memory usage (the renderer, font stack) and I plan to look into those separately as part of a future body of work.

The focus of paged-terminal is on memory usage but I've also in the process made the data structures more cache friendly for a lot of common operations (in addition to simply being more cache friendly by being smaller). As a result, IO throughput has generally increased in all scenarios as well. So the branch uses less memory and is much faster [most of the time]. It is only slower in pathological scenarios I haven't ever experienced in real life usage, such as the "doom fire" benchmark.

The state of the paged-terminal branch at the time of authoring this comment is that it passes all existing tests (no tests removed) and is only missing two features: URLs and the terminal inspector. It is stable enough that I'm now using it as my daily driver to finish up the branch to try to catch any more issues. I think it'll still take some weeks to get it to a PR ready state, hopefully 1 or 2 weeks but I'm not pressuring myself on it.

@andrewrk
Collaborator Author

> but I'm not pressuring myself on it.

impressive, I need to learn this skill

@mitchellh
Contributor

mitchellh commented Mar 28, 2024

#1584 has now merged and Ghostty generally uses ~40% less memory.

For various scenarios, Ghostty now uses less memory than GPU-rendered terminals I've tested on both macOS and Linux. Not for all scenarios, but it is extremely competitive. On macOS in general, Ghostty uses less memory than any terminal emulator (including Terminal.app much of the time). Again, it is somewhat situational but competitive. For non-GPU rendered terminals on Linux, Ghostty still uses significantly more memory (i.e. compared to foot, st, etc.).

#1584 focused on terminal state, and I'm confident our terminal state memory usage is very good now. There are other areas of Ghostty that still use too much memory, and I'm not going to close this issue until those are resolved and our memory usage is as competitive as we can get given some of our dependencies (i.e. GTK).

Other areas that need investigation:

  • Font stack. We currently duplicate the font stack for each surface. I haven't measured how much RAM this is but it can't be tiny. (Dedupe font stack for terminals with identical font configuration #1662)
  • Renderer CPU data. We currently store the full list of GPUCell structs in CPU memory per surface. We arguably don't need the GPUCells at all after they're synced to the GPU. The last time I measured this was only a couple megabytes per surface. Not huge but it's something.
  • Font Atlas. We currently have a unique font atlas per surface. This is dynamically sized based on used glyphs, but the default size is:

This is all I can think of at the moment that would use a measurable amount of RSS. If anyone has any other ideas, please share, but these are the things I intend to look at next.

@andrewrk
Collaborator Author

For me personally, I think ghostty is going to win against the incumbent now that I've switched to KDE:

image

I don't have ghostty in the list yet due to ziglang/zig#19457

@andrewrk
Collaborator Author

andrewrk commented Apr 5, 2024

Well, switching to KDE has put things into perspective:

image

@mitchellh
Contributor

@andrewrk Happy to hear it. But, improvements are on the way. 🫡

@mitchellh
Contributor

mitchellh commented Apr 6, 2024

Working on font stack memory usage over in fontmem. It's now in a working state (some features missing that don't affect memory usage, like changing font size at runtime). Early results below.

Analysis: The CoreText stack on macOS must be lighter weight. The memory savings in general aren't as big. Still, they're nothing to sniff at. But on Linux the memory savings are pretty significant even in the most conservative case (20 blank tabs), so the fontconfig/freetype/harfbuzz stack must use significantly more memory than CoreText. This is a win.

What does the fontmem branch do? It shares a font stack (discovery, rasterization, shaping) for terminals that have an identical font configuration. Currently on main, each terminal gets its own distinct font stack instance. There are pros/cons. The shared approach is more complex with locking and being careful about data races. It also likely makes the renderer slower with more terminals doing lots of printing due to lock contention. However, long term I think the font stack should be very rarely hit once we add screen dirty tracking so this is a non-issue. And today, if I run htop in 20 tabs, I can't see any slowdown or hiccups.
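The sharing scheme described above can be sketched as a lock-protected, reference-counted map keyed by font configuration. This is a minimal illustration under stated assumptions, not the fontmem implementation: `FontStack` is an empty stand-in, `SharedFontStacks` is a hypothetical name, and the configuration is assumed to be any hashable value.

```python
import threading
from typing import Dict, Hashable, List


class FontStack:
    """Stand-in for discovery, rasterization, and shaping state."""

    def __init__(self, config: Hashable) -> None:
        self.config = config


class SharedFontStacks:
    """Hands out one FontStack per distinct font configuration."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # config -> [stack, reference count]
        self._entries: Dict[Hashable, List] = {}

    def acquire(self, config: Hashable) -> FontStack:
        # The lock guards against two terminals creating the same stack
        # at once; it is also the source of the contention noted above.
        with self._lock:
            entry = self._entries.get(config)
            if entry is None:
                entry = [FontStack(config), 0]
                self._entries[config] = entry
            entry[1] += 1
            return entry[0]

    def release(self, config: Hashable) -> None:
        with self._lock:
            entry = self._entries[config]
            entry[1] -= 1
            if entry[1] == 0:
                # Last terminal using this configuration closed: free it.
                del self._entries[config]
```

Terminals with the same configuration get the same object back from `acquire`, so 20 identically configured tabs would hold one stack instead of 20.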

(Note: sorry the benchmarks below aren't apples to apples between platforms, it's all WIP anyways)

macOS

10 windows running htop:

Old RSS: 158 MB
New: 151 MB

5 windows loading a file of emojis:

Old RSS: 170 MB
New: 154 MB

Linux

Linux 20 blank tabs:
Old: 145.6 MB
New: 112.5 MB

Linux 5 tabs with 20 emoji on screen:
Old: 110.1 MB
New: 95.4 MB

@marler8997
Collaborator

Here's Windows for comparison:
image

@mitchellh
Contributor

Opened #1662 which improves the Linux 20-empty-window memory benchmark by another 23% on my machine. I'm not sure what we're at since the original issue was opened by Andrew but we're making some good, good progress on this.

@mitchellh
Contributor

#1662 merged. Next up is looking at our renderer memory usage. After that, I think I'll feel comfortable closing this issue. Ghostty won't be the lowest memory usage terminal but I think it's firmly in the category of very good and it'd be irrational to say it's bad (relative to the ecosystem of terminals).

@mitchellh
Contributor

mitchellh commented Jun 1, 2024

I'm going to close this. At this point I've gone through and optimized the memory decently well for all major subsystems in Ghostty. There is still plenty of room for improvement, but as I stated earlier, I don't think it'd be rational to state that Ghostty's memory is bad. It isn't the best but it isn't bad.

A lot of the baseline memory usage at this point also comes from our usage of native GUI toolkits. i.e. we can't ever be as memory lightweight as say plain old st because spinning up a full GTK app can't compete with the bare minimum X windows interaction. That's a tradeoff we chose as a project to support nicer GUI features. Similarly, on an aarch64 Mac, an empty window Cocoa project uses around 25MB of RAM for me. An empty Ghostty window (also a Cocoa-based app) is now using 45MB at the time of writing this. That's really not bad at all!

Again, I'll repeat: there are plenty more improvements to be made, and I'd love to do them myself or see contributions around them. In any case, I don't think we need a dedicated issue around it anymore.

@andrewrk
Collaborator Author

andrewrk commented Jun 1, 2024

Sounds good! The original title for this was "ghostty uses an order of magnitude more memory than xfce4-terminal" and that is no longer the case :-)
