Add a DxRenderer based on a glyph atlas #10461

lhecker · 2021-06-19T12:59:41Z

Description of the new feature/enhancement

While the initial layouting and rasterization of glyphs is computationally expensive, the composition using a classic texture atlas and glyph-lookup-texture is extremely fast and, unless new glyphs appear on the screen, can be rendered in a single pass. Such an implementation would provide us with a high-framerate, low-latency renderer.
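To make the idea concrete, here is a minimal sketch of the two pieces named above: a glyph cache that only rasterizes on a miss, and a per-cell grid of atlas slot indices standing in for the glyph-lookup-texture. The names `GlyphAtlas`, `slot_for` and `build_lookup_grid` are made up for illustration, and the actual rasterization and GPU upload are reduced to a counter.

```python
from dataclasses import dataclass, field


@dataclass
class GlyphAtlas:
    """Maps a glyph key to an atlas slot; "rasterizes" only on a cache miss."""
    slots: dict = field(default_factory=dict)
    rasterized: int = 0  # counts how often the expensive path ran

    def slot_for(self, glyph: str) -> int:
        slot = self.slots.get(glyph)
        if slot is None:
            # Expensive path: layout and rasterization would happen here.
            slot = len(self.slots)
            self.slots[glyph] = slot
            self.rasterized += 1
        return slot


def build_lookup_grid(atlas, rows):
    """Return one atlas slot index per cell (a stand-in for the lookup texture)."""
    return [[atlas.slot_for(ch) for ch in row] for row in rows]


atlas = GlyphAtlas()
frame1 = build_lookup_grid(atlas, ["hello", "world"])
after_first_frame = atlas.rasterized
# The second frame shows the same glyphs in different cells: no new rasterization.
frame2 = build_lookup_grid(atlas, ["world", "hello"])
```

Once every glyph on screen has a slot, a frame is just the grid of indices, which is what makes the single-pass composition possible.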

Proposed technical implementation details

The initial implementation to just render pure, colored, single-glyph-per-code-point text is quite trivial obviously. Guidance for this can be obtained from many sources throughout the web (even WikiBooks!). After an initial implementation has been drafted additional features could be added incrementally over time.
The renderer should be an optional feature that users can toggle on if they want to.

Further experience can be gained from the alacritty project.

Alternative solutions

DirectWrite uses a glyph atlas internally and we could continue to rely on it, but just optimize our render pass instead.
For instance a hybrid-approach would be feasible: Render everything but glyphs using a shader.

lhecker · 2021-06-19T13:02:07Z

Feedback welcome! 🤗
(I'll keep the issue description updated with any feedback.)

skyline75489 · 2021-06-19T15:34:03Z

Sure. Why not?

We have a software renderer option for people who face hardware rendering issues. Adding a “performance” renderer for people who want the most performance (& use only English & want none of the Unicode features) sounds reasonable. Also, Alacritty has done this before, if I’m not mistaken. So there’s at least one popular example.

I humbly ask for basic CJK support for the initial implementation, because, well, I live in a cave where people use CJK as their primary languages.

cedric-h · 2021-06-19T15:40:37Z

I don't think ClearType in particular has been done before.

Here's a picture I took of Alacritty in Renderdoc. All of the glyphs are rendered in two draw calls, one for background and one for foreground. I believe this is done to facilitate ClearType.

EDIT: Confirmation from Alacritty contributor that ClearType is in use: alacritty/alacritty#2645 (comment)

cedric-h · 2021-06-19T15:51:15Z

Sure. Why not?

We have a software renderer option for people who face hardware rendering issues. Adding a “performance” renderer for people who want the most performance (& use only English & want none of the Unicode features) sounds reasonable. Also, Alacritty has done this before, if I’m not mistaken. So there’s at least one popular example.

I humbly ask for basic CJK support for the initial implementation, because, well, I live in a cave where people use CJK as their primary languages.

The obvious answer to "Why not?" is because the benchmarking experiments done here and here suggest that the naive string parsing is a bottleneck long before rendering comes into play.

With regard to supporting multiple languages, that should not be a significant limitation: it's simply a matter of adding more glyphs to your atlas. See this example created in response to the controversy in this issue.

mdtauk · 2021-06-19T16:18:11Z

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

cedric-h · 2021-06-19T16:22:55Z

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

You render your glyph atlas very rarely (whenever a never-before-rendered character is encountered) compared to how often the actual grid of glyphs is rendered (every frame), and larger font sizes mean fewer cells in your grid. So larger font sizes will generally be more performant in a terminal emulator which uses this technique to render.
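A quick back-of-the-envelope sketch of the "fewer cells" point. The window size and cell sizes are arbitrary illustration values, not measurements from any terminal, and `grid_cells` is a made-up helper.

```python
def grid_cells(window_px, cell_px):
    """Number of whole cells that fit in a window; both arguments are (width, height)."""
    cols = window_px[0] // cell_px[0]
    rows = window_px[1] // cell_px[1]
    return cols * rows


# Doubling the cell size in both dimensions cuts the per-frame cell count to roughly a quarter.
small_font = grid_cells((1920, 1080), (8, 16))   # 240 columns x 67 rows
large_font = grid_cells((1920, 1080), (16, 32))  # 120 columns x 33 rows
```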

mdtauk · 2021-06-19T16:24:14Z

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

You render your glyph atlas very rarely compared to how often the actual grid of glyphs is rendered (every frame) and larger font sizes mean fewer cells in your grid, so generally larger font sizes will always be more performant in a terminal emulator.

Ah so the Atlas is for the window display itself, not as a texture source off-screen for use when rendering?

cedric-h · 2021-06-19T16:25:00Z

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

You render your glyph atlas very rarely compared to how often the actual grid of glyphs is rendered (every frame) and larger font sizes mean fewer cells in your grid, so generally larger font sizes will always be more performant in a terminal emulator.

Ah so the Atlas is for the window display itself, not as a texture source off-screen for use when rendering?

No, that is the opposite of what I intended to convey.

lhecker · 2021-06-19T16:54:46Z

@skyline75489
I believe most of CJK is exactly the "single-glyph-per-char" case I mentioned and should be simple to render.

@cedric-h
Nice find! As far as I can see they only use ClearType for rendering the alpha texture of the glyph, which we can do as well. That doesn't mean the final composition is proper ClearType though (since that requires knowledge of the background color among others). They draw in two passes because they draw background and text separately. But it'd be absolutely awesome if I'm wrong! Can you point me to a comment explaining how they implement ClearType during the final render? I can't find anything in the code nor issues about it unfortunately.

The obvious answer to "Why not?" is because [...]

Please don't be inflammatory, alright? 🤗
I'm sure you know full well that "Why not?" is an idiom.

skyline75489 · 2021-06-19T17:23:02Z

@lhecker For Chinese, I believe most characters belong to the “single-glyph-per-char" category. But there are other issues for Japanese and Hangul demonstrated in #3546, namely IVS (Ideographic Variation Sequence) for Japanese and complex composition in Hangul.

IVS I believe also exists in Traditional Chinese (#8731) but I don’t know that side of the story much, meaning that I don’t know how important IVS is in Traditional Chinese environment.

skyline75489 · 2021-06-19T17:46:08Z

I don’t know Hangul, but it seems to be even harder when it comes to composition, which requires NFKC composition. See #3578 (comment)
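For readers unfamiliar with what composition means here, a small sketch using Python's standard `unicodedata` module: three conjoining jamo compose into a single precomposed syllable. This only illustrates normalization in general; it is not a claim about what the linked comment requires or how Terminal handles it.

```python
import unicodedata

# Three conjoining jamo: U+1112 (choseong), U+1161 (jungseong), U+11AB (jongseong).
jamo = "\u1112\u1161\u11ab"

# Canonical composition merges them into the single syllable U+D55C.
composed_nfc = unicodedata.normalize("NFC", jamo)
composed_nfkc = unicodedata.normalize("NFKC", jamo)

code_points_before = len(jamo)
code_points_after = len(composed_nfc)
```

A renderer that keys its atlas on single code points would see three entries before normalization and one after.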

Man, finding those comments brings back memories of the good old days 😅

mmozeiko · 2021-06-19T17:56:30Z

FreeType is dual-licensed: the BSD-like FTL license and the GPL. It is not GPL-only.

The shader from Alacritty does FreeType-style ClearType rendering using Dual Source Blending, which both OpenGL and D3D11 support. The fragment shader outputs three blend factors separately for the r/g/b channels. I'm not sure if they are doing that in the correct colorspace (linear/sRGB), though; I haven't looked at their code closely enough to understand that completely.

That said - there is no reason to do this kind of blending. A renderer that draws the text knows exactly what both the background and foreground colors are. So it can do whatever style of blending & colorspace it wants directly in the shader, and just output the final ClearType color.
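As a rough CPU-side sketch of that last point: if both colors are known, each channel can be blended with its own subpixel coverage value, here in linear space using the standard sRGB transfer functions. This is not the shader from this thread or from Alacritty; `blend_subpixel` and the coverage triple are illustrative assumptions.

```python
def srgb_to_linear(c8):
    """Convert an 8-bit sRGB channel value to linear light in [0, 1]."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lin):
    """Convert linear light in [0, 1] back to an 8-bit sRGB channel value."""
    s = lin * 12.92 if lin <= 0.0031308 else 1.055 * lin ** (1 / 2.4) - 0.055
    return round(min(max(s, 0.0), 1.0) * 255)


def blend_subpixel(fg, bg, coverage):
    """Blend fg over bg per channel; coverage holds one alpha per r/g/b subpixel."""
    out = []
    for f, b, a in zip(fg, bg, coverage):
        lin = srgb_to_linear(b) * (1.0 - a) + srgb_to_linear(f) * a
        out.append(linear_to_srgb(lin))
    return tuple(out)


# White text on black at a glyph edge: red fully covered, green half, blue not at all.
edge = blend_subpixel((255, 255, 255), (0, 0, 0), (1.0, 0.5, 0.0))
```

Because the blend happens in linear space, the half-covered green channel comes out noticeably brighter than the 128 a naive sRGB-space blend would give.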

lhecker · 2021-06-19T18:06:32Z

@mmozeiko I'm currently considering implementing at least the initial version, as mentioned in the issue.
As you probably know, I cannot legally look at your (unlicensed) shader source code and haven't done so far.
If you ever feel like it, you can accelerate the development of the new renderer by contributing any small demo application that contains your shader and optionally its DirectWrite logic. This would help me skip the experimental phase and I could focus on integrating it into the existing render interface. Of course, please don't feel pressured to do so. 🙂

edit: I have been told that this message "sounds so nice it's borderline sarcastic". 😄
But if you look at my other comments on my profile you'll notice that I almost always write like this!
I'm simply considering this a reset from the mood of the very heated previous issue.

mmozeiko · 2021-06-19T18:14:27Z

No problem.
I donate my shader to public domain, feel free to use it or relicense it however you wish: https://gist.github.com/mmozeiko/c7cd68ba0733a0d9e4f0a97691a50d39

lhecker · 2021-06-20T12:16:26Z

@skyline75489 Yeah I personally agree that it's hard to reason about what's possible and what isn't when unicode comes into play.

But to be fair monospace-only support simplifies most things. For instance our console buffer currently is N characters wide, and should be N glyphs wide (and not just N graphemes either!). Solving this issue seems awfully similar to solving the layout issue for a glyph atlas. If we know what forms entire glyphs I believe there aren't many issues anymore you could possibly have.
In other words: A glyph atlas is entirely capable of drawing almost all of unicode!

Now the nice thing about using DirectWrite for entire runs of glyphs (and not just single glyphs) though is that we know we can trust it to perform entirely correctly no matter what we call it with. Fixing our unicode support in the buffer and non-DirectWrite related rendering code is far more trivial than actually figuring out what forms a glyph and could be done "relatively" soon. This would give us full unicode rendering support, even if the width of the buffer rows doesn't match the glyph count (for instance Hebrew would look correct, but you can only have let's say 20 glyphs on an 80-cell wide buffer row). A glyph atlas forces us to solve that prematurely and entirely.

The list of fundamentally unsolved glyph atlas issues that Alacritty has for instance shouldn't be taken lightly.
Many of those issues, again fundamentally, don't exist if you use DirectWrite for entire glyph runs.
Certainly one of the many reasons it was called as hard as a dissertation before. Skia's implementation certainly is...

That said I'm convinced we should make the glyph atlas renderer separate from the current DxRenderer.
While it slightly increases the maintenance burden, both approaches are fundamentally different. Additionally there are some concerns which I'll be mentioning later.

With all the above being said, it doesn’t hurt anyone if we or the enthusiasts from the gaming industry take a shot at this specific project.

Yep! And I'm entirely on it!
It did hurt though, to be an actual human with feelings and stuff, who might get offended if people are rude. 😐
Sarcasm aside: It is on us, for not taking the technical advice at face value, no matter the delivery!
This is one of the reasons we've since reopened the gates for @cmuratori et al. again, after some time of self-reflection.

skyline75489 · 2021-06-20T20:27:43Z

Thanks @lhecker for the explanation. I learned a lot and I guess I still have a lot to learn.

DHowett · 2021-06-21T18:10:21Z

I was clearly mistaken as to how hard this work would be. I'm glad, and I appreciate being corrected.

Terminal cannot turn away valuable performance work simply on ideological grounds.

Anyway-
I want to establish some ground rules:

- This renderer should be behind a til::feature (see src/features.xml); you can decide whether it is compiled-in or compiled-out by default. These are not toggleable at runtime, but it will give us the ability to make sure the code doesn't go out in Stable or even Preview until it's ready for people to turn it on.
  - (I want to use til::feature for more "out in the open" development, rather than having long-running feature branches)
- We need to determine how to expose this switch to the user, and the cost of parallel development on both renderers.
- Before we begin in earnest bringing an atlas renderer into the codebase, I'd like to see some progress on reducing the renderer's dependency on the global console lock to be as small as possible. That will help even the venerable GDI renderer.

Fair?

Tyriar · 2021-07-04T03:40:36Z

If you want a reference for this the webgl renderer in xterm.js uses a texture atlas as well. Some thoughts from implementing this:

- We only use a single 1024x1024 texture currently as it didn't seem worth it to manage multiple based on real world usage. I was brand new to webgl when writing this but would be much more comfortable with it now; multiple textures would help high-dpi displays the most.
- When the atlas is filled we just clear it and start over instead of trying to evict old glyphs and compact it.
- To keep the texture a little more compact we trim white space around the glyphs before we put them into the atlas.
- On startup we warm up the cache by writing a subset of ascii in the default color: https://github.com/xtermjs/xterm.js/blob/094bcbd81d1f080d204331088aa99f4d7173d56c/addons/xterm-addon-webgl/src/atlas/WebglCharAtlas.ts#L104-L114
- Before a render happens we prep the texture by rendering all required glyphs.
- We use a string as the key to store combining characters; this is relatively fast in JS: https://github.com/xtermjs/xterm.js/blob/094bcbd81d1f080d204331088aa99f4d7173d56c/addons/xterm-addon-webgl/src/atlas/WebglCharAtlas.ts#L50
- Here's our glyph glsl shaders if that's helpful, they're very simple.
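The "clear and start over" strategy with string keys can be sketched roughly as follows. This is not the xterm.js code; `ShelfAtlas` is a hypothetical class that packs rectangles left to right in rows, and the tiny 4x4 atlas is only there to make the overflow easy to follow.

```python
class ShelfAtlas:
    """Fixed-size atlas that packs glyph rectangles left to right in rows ("shelves")."""

    def __init__(self, size=1024):
        self.size = size
        self.clears = 0
        self._reset()

    def _reset(self):
        self.entries = {}  # string key -> (x, y, w, h)
        self.x = 0         # next free x position on the current shelf
        self.y = 0         # top of the current shelf
        self.shelf_h = 0   # height of the tallest glyph on the current shelf

    def get(self, key, w, h):
        """Return the rectangle for key, allocating w x h pixels on a miss."""
        rect = self.entries.get(key)
        if rect is not None:
            return rect
        if w > self.size or h > self.size:
            raise ValueError("glyph is larger than the atlas")
        if self.x + w > self.size:
            # Current shelf is full: start a new one below it.
            self.x, self.y, self.shelf_h = 0, self.y + self.shelf_h, 0
        if self.y + h > self.size:
            # Atlas is full: drop everything and start over, no eviction logic.
            self._reset()
            self.clears += 1
        rect = (self.x, self.y, w, h)
        self.entries[key] = rect
        self.x += w
        self.shelf_h = max(self.shelf_h, h)
        return rect


# A 4x4 atlas holds four 2x2 glyphs; the fifth one forces a clear.
atlas = ShelfAtlas(size=4)
rects = [atlas.get(key, 2, 2) for key in "abcde"]
```

Keys are plain strings, so a base character plus its combining marks (for example `"e\u0301"`) can share one entry. After a clear, glyphs that are still on screen have to be rasterized again on their next lookup, which is the price for skipping eviction.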

hfhchan · 2021-08-27T17:44:15Z

Addendum: I think the people implementing this know the following points already, and there are valid reasons for not supporting the following scenarios in a quick-path implementation. But I have included them here because I found some of the points mentioned in previous comments to be insensitive to the requirements of East Asian scripts. East Asian scripts are typically not even considered complex.

This is done by calling the GetTextComplexity API) because of the existence of “locale based” letter forms(locl), which depends on the font being used and the current locale. This feature is used in languages like Turkish & Polish & etc. Some might think “this is just nonsense. No one really needs this”.

locl support is required for correctly rendering CJK characters when using pan-CJK fonts. Windows does not ship with any fonts that support proper glyphs for Chinese Traditional (Hong Kong), only for Chinese Traditional (Taiwan). The currently most popular way to get fonts that adhere to the Hong Kong government reference orthography is by installing Source Han Sans, which relies on the locl tag to deliver the correct glyphs.

But to be fair monospace-only support simplifies most things. For instance our console buffer currently is N characters wide, and should be N glyphs wide (and not just N graphemes either!). Solving this issue seems awfully similar to solving the layout issue for a glyph atlas. If we know what forms entire glyphs I believe there aren't many issues anymore you could possibly have.

CJK characters are supposed to render as full-width characters, i.e. take up double the space used by a single ASCII character. You should never assume that N characters wide will be N glyphs wide. For certain Latin based scripts such as Vietnamese and Chinese pinyin you need to support combining characters which have no pre-composed character equivalents.
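A small sketch of why code points and cells diverge, using Python's standard `unicodedata` module. `cell_width` and `text_cells` are illustrative helpers with deliberately simplified rules: wide and full-width characters take two cells, combining marks take none, and everything else takes one. Real terminals need more than this (ambiguous-width characters, emoji sequences, joiners).

```python
import unicodedata


def cell_width(ch):
    """Approximate number of terminal cells one code point occupies."""
    if unicodedata.combining(ch):
        return 0  # combining marks attach to the preceding character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and full-width characters span two cells
    return 1


def text_cells(text):
    """Total cells for a string under the simplified rules above."""
    return sum(cell_width(ch) for ch in text)


# "m" + U+0300 is a pinyin syllable written with a combining grave accent.
samples = ["abc", "\u4e2d\u6587", "m\u0300"]
cells = {s: text_cells(s) for s in samples}
code_points = {s: len(s) for s in samples}
```

Here the two CJK characters need four cells, while the two code points of the pinyin letter need only one.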

Support for IVS characters with CJK characters is also necessary for Chinese locales, not just the Japanese locale. The Macao Special Administrative Region has already registered an IVD collection including glyphs which need to be supported for Chinese (Traditional) use for Macao users. Other regions have IVD collections in the works.

ndwork · 2021-11-07T15:15:19Z

Any progress here? It’s been four months since the issue was opened and two since the last meaningful comment.

@lhecker

lhecker · 2021-11-07T19:34:30Z

@ndwork It's something that already exists and is being tested internally... 🙂

As I've mentioned in the other issue, I'm aiming for the 1.13 Preview release. I hope you can understand that we had to work on our pre-existing roadmap first over the last few months. But as of about 2 weeks ago I've been working on this most of the time. Once something that's stable, tested and usable has landed in Windows Terminal I'll make sure to let everyone know in this issue.

rbeesley · 2022-01-04T06:34:29Z

I think I'm seeing this problem. I checked and I only have 1.12 Preview, so I can't see if this fixes it for me.

I was running this crt.hlsl experimental pixelShader, and I was looking for anything else I might have missed in the Ansi-Color.cmd tool (hopefully part of a future release, but the attached PR would let you replicate this), and running the plaid.def through this tool, I was seeing a 10% CPU increase. I thought maybe the shader was computationally expensive for some reason, so I also tried grid.hlsl which does nothing more than to render a grid of squares across the terminal, it too showed the same CPU overhead. None of the other definition files were causing this issue, but it is notable that plaid.def uses the block element █ (U+2588), and the shade elements ░ (U+2591), ▒ (U+2592), and ▓ (U+2593), and shows a tight grid of 1024 of these characters (4 each for a 16x16 combination of colors).

So I think what is happening is that rendering these 4 glyphs is very expensive for Terminal and it is costing a lot to render this for each frame of the shader? Is this something I should just wait on until 1.13 Preview and file a performance bug if I'm still seeing a CPU hit? It seems like #10362 is the same problem I ran into, but maybe adding the shader just amplifies the problem for me significantly?

zadjii-msft · 2022-01-04T11:33:23Z

@rbeesley you may be more specifically hitting #6974

rbeesley · 2022-01-04T17:22:02Z

@zadjii-msft, reading through that bug report, it does seem like it fits. I'll put more information there.

lhecker · 2022-02-03T19:04:17Z

Windows Terminal Preview v1.13.10336.0 was released today, which features a new rendering engine. While it doesn't solve all of the performance issues reported here yet, it's a significant improvement nonetheless.

You can enable it this way:

1. Open settings
2. Select any profile (including "Defaults")
3. Select "Advanced" at the bottom
4. Select "Enable experimental text rendering engine"

The performance should be about the same in the worst case (regular black/white text), but significantly better for highly colored text (text that exceeds 20 distinct colors on a screen). Additionally this engine won't be limited to 60 FPS anymore.
Additionally I'm currently writing a blog post detailing why this issue occurs in Direct2D.

If you find any issues or got any feedback for this new text renderer, please let us know in #9999. 🙂

lhecker added the Issue-Feature, Area-Rendering, Area-Performance, and Product-Terminal labels Jun 19, 2021

lhecker added this to the Terminal Backlog milestone Jun 19, 2021

ghost added the Needs-Triage label Jun 19, 2021

DHowett mentioned this issue Jun 19, 2021

Extremely slow performance when processing virtual terminal sequences #10462

Open

DHowett removed the Needs-Triage label Jul 2, 2021

DHowett assigned lhecker Jul 2, 2021

This was referenced Jul 12, 2021

Windows Terminal is Terrible #10623

Closed

Multithreaded Direct2D for text layout performance #6015

Closed

Text layout caching by run #6300

Closed

zadjii-msft mentioned this issue Jul 30, 2021

Slow terminal rendering, bad utf support and a few other things. #10825

Closed

miniksa mentioned this issue Aug 26, 2021

1px line between powerline glyphs #8993

Closed

lhecker mentioned this issue Oct 15, 2021

Extremely slow performance when processing virtual terminal sequences #10362

Closed

miniksa mentioned this issue Oct 25, 2021

Raster/VGA/Code page 437 font #2013

Closed

zadjii-msft modified the milestones: Terminal Backlog, Terminal v1.13 Jan 4, 2022

DHowett modified the milestones: Terminal v1.13, Terminal v1.14 Jan 27, 2022

zadjii-msft added the Resolution-Fix-Committed label Jan 28, 2022

zadjii-msft modified the milestones: Terminal v1.14, Terminal v1.13 Jan 28, 2022

DHowett modified the milestones: Terminal v1.13, Terminal v1.14 Jan 31, 2022

zadjii-msft modified the milestones: Terminal v1.14, Terminal v1.13 Feb 2, 2022

lhecker closed this as completed Feb 3, 2022

lhecker added the Area-AtlasEngine label Feb 3, 2022

lhecker removed their assignment Feb 4, 2022

LinuxOnTheDesktop mentioned this issue Apr 23, 2023

WT should start up fast: profile the startup path and trim anything that takes a while #5907

Open

3 tasks

matu3ba mentioned this issue Dec 3, 2023

Windows Support ghostty-org/ghostty#437

Closed

19 tasks
