Font / Emoji rendering (spacing issue) #16852

Closed
Florian-Thake opened this issue Mar 9, 2024 · 27 comments
Labels
Issue-Bug: It either shouldn't be doing this or needs an investigation. Needs-Triage: It's a new issue that the core contributor team needs to triage at the next triage meeting.

Comments

@Florian-Thake

Windows Terminal version

1.19.10573.0

Windows build number

10.0.19045.4046

Other Software

not required

Steps to reproduce

Open Windows Terminal (I tried cmd.exe, PC, and Ubuntu 22.04 LTS, but I believe it does not matter which shell is used).
The font is the default Cascadia Mono; I believe the size does not matter (it occurs with the default 12 as well as 13).
Paste the Skull emoji ☠ into the command prompt. (Note that the cursor is now in the middle of the skull)
Type a letter, e.g., x

The x is not visible because it is behind the skull.

I don't know if other emojis are affected. I discovered this issue by accident, since I use the string "🚀 🍀 ☠ 🔥" for some of my internal UTF-8 testing. Of these 4 emojis, only the skull is affected.
I believe this issue started after the auto update to 1.19.10573.0. I don't know exactly which version was installed before, but since it is maintained by the system, it should be the previously released version.
If I remember correctly, the skull ☠ was rendered too small in the old version. This is fixed now.
I also noticed that pasting some emojis now produces a correct echo instead of question marks. Thank you for these fixes!
Maybe the new issue is caused by the bigger rendering? Is the stored size / space information that is used to calculate the start position of the next character still too small?

See the screenshot for how it looks.

WindowsTerminal_EmojiSpacing_marked

Expected Behavior

I can see
☠x

Actual Behavior

I only see
☠
because the x is behind the skull and not to the right of it.
(See screenshot in the 'steps to reproduce' section)

@Florian-Thake added the Issue-Bug and Needs-Triage labels Mar 9, 2024

github-actions bot commented Mar 9, 2024

Hi I'm an AI powered bot that finds similar issues based off the issue title.

Please view the issues below to see if they solve your problem, and if the issue describes your problem please consider closing this one and thumbs upping the other issue to help us prioritize it. Thank you!

Closed similar issues:

Note: You can give me feedback by thumbs upping or thumbs downing this comment.

@Florian-Thake
Author

Florian-Thake commented Mar 9, 2024

I checked the mentioned issues and I believe this new issue is not a duplicate of one of the mentioned.

@PhMajerus

PhMajerus commented Mar 9, 2024

I tested and noticed that the skull is a pre-emoji symbol from the Miscellaneous Symbols block, some of whose symbols have been reused as emojis. I checked the full block, and several others show the same issue:
image

There probably is some special handling in a function calculating the width of a character that needs to be updated to return the correct double-width size for those.

Oh, and the x is actually visible in front of the ☠ in your screenshot, you can see it if you look closely, but it just happens to align right between the nasal and right orbital holes, and over the bottom-right bone.

@DHowett
Member

DHowett commented Mar 10, 2024

In this case--and for all pre-emoji-standardization iconographic codepoints--a one-cell overlap is correct. This is the same treatment given by iTerm2 and Terminal.app on macOS (one of which I believe was the first terminal emulator to support emoji(?).)

Those characters are expected to occupy a single column in the backing buffer, have a standard emoji representation, and be displayed as though they occupy two columns. This is one of the weird quirks of being correct.

@PhMajerus

PhMajerus commented Mar 10, 2024

@DHowett
Interesting, I didn't know we would have to handle some double-width characters as using a single cell and compensate with a space to adjust. Thanks for the explanations.

We'll be facing the opposite issue with MouseText in Unicode 16.0.
They decided to add all the characters needed for the Apple II MouseText enhanced character set, but without duplicating the existing ones.
Unfortunately, one of those is the hourglass:⌛ U+231B that is also an emoji in most fonts and is currently handled as double-width in Terminal, while MouseText would require it to be single-width.
Do you have some insight into that issue?

image
(Ignore the Run/Execute glyphs for now; it should be the running man, but I haven't done that glyph pair yet.)

The original characters are the following:
image
image

I have a single column version of the glyph ready, but it is useless as long as it's handled as double-width.

@Florian-Thake
Author

In this case--and for all pre-emoji-standardization iconographic codepoints--a one-cell overlap is correct. This is the same treatment given by iTerm2 and Terminal.app on macOS (one of which I believe was the first terminal emulator to support emoji(?).)

Those characters are expected to occupy a single column in the backing buffer, have a standard emoji representation, and be displayed as though they occupy two columns. This is one of the weird quirks of being correct.

Thank you for your input to my issue.

I am not deep into the Unicode specification, so please excuse me if I say something technically incorrect or miss an important detail.

I cannot judge whether your statement is correct in the sense that it really must be rendered the way it is now.
But regardless of that, what could be the purpose of it or the intended use case?
From my view point it does not make any sense to draw the next char on top of the last one but starting from the middle of it.
None of the icons I saw in the other comment seem to benefit from it.
I mean, there are other Unicode characters that can be combined to form something new.
But in this case it produces only garbage on the screen.

I did a quick check of the behavior of other applications.
I did not find any which is doing the same as Windows Terminal.
Notepad++
Notepad++_example
Visual Studio
VisualStudio_Example
MS Word
Word_example

From my point of view the rendering of Windows Terminal is clearly wrong in this case. The affected Unicode signs are IMHO not meant to be and not designed to be overlapping with the next sign.

@DHowett
Member

DHowett commented Mar 10, 2024

But regardless of that, what could be the purpose of it or the intended use case?
From my view point it does not make any sense to draw the next char on top of the last one but starting from the middle of it.

Sorry, I was writing my comment up on my phone and didn't put in the right amount of explanation! 😄

The critical difference here is that a terminal emulator (of any type) is expected to act in a consistent way for another application to interface with. A text editor does not have the same requirement, as the only correctness loop exists between the user and the editor.

So, terminals receive text from other applications and display that text on a screen. Those other applications can be microseconds away (on the same machine) or thousands of miles away (running on a remote server, connected over SSH). That immediately imposes a couple constraints on how that text gets displayed:

  1. Apps running inside terminals may not ever know the font their text is being displayed in; that font may not exist on the remote server, nor might the code for parsing it. That application may not even have the concept of a font.
  2. Text measurement takes place twice - once on the application side and again on the terminal side. This is because queries that the terminal has to answer can be quite expensive because they need to be synchronous (the app has to wait for them) and they may be slow (due to distance).

Now, an application can instruct a terminal to position the cursor somewhere with absolute or relative coordinates. The same application can display the "right" amount of text to fill up the screen. Consider this example of an application called "Midnight Commander". It has to be able to predict where every bit of text will be placed on the screen, or it will put the pseudographic characters (for the borders and stuff) in the wrong places.

image

Because of (2), that prediction has to be the same prediction the terminal emulator would have made. And because of (1), it will not be able to use the font to predict those things.

There's this handful of bad C APIs that every application these days seems to use: wcwidth and wcswidth. They fall back on some built-in Unicode tables to tell an app how big a character is.

U+2620 is unfortunately only one cell wide.
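
To make that prediction concrete, here is a minimal Python sketch of a table-driven width lookup and of an app using it to place a border. It is not the real wcwidth, nor Windows Terminal's code: `cell_width`, `text_columns`, and `boxed_line` are hypothetical helpers, and the sketch assumes that only East Asian Width values W and F take two cells (it also ignores combining marks).

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Hypothetical table lookup: 2 cells for East Asian Wide/Fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def text_columns(text: str) -> int:
    """Columns the app predicts the text will occupy, one code point at a time."""
    return sum(cell_width(ch) for ch in text)

def boxed_line(text: str, inner_columns: int) -> str:
    """Pad text so the right border lands in the same column on every row."""
    padding = max(inner_columns - text_columns(text), 0)
    return "\u2502" + text + " " * padding + "\u2502"

# U+2620 (skull and crossbones) is not Wide in the table, so it counts as 1 cell.
print(cell_width("\u2620"))
print(boxed_line("\u2620 x", 6))
```

If the terminal counted U+2620 as two cells while the app counted one, the right border of that row would end up one column too far to the right.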

That leaves us with three options:

  1. Don't care, and set U+2620 to be 2 cells wide.
  2. Display it 1 cell wide, but shrink it down to fit.
  3. Display it 2 cells wide, but only allocate it one cell of space.

Of all the options, 1 is the most incorrect. This is what happens in Midnight Commander if we do that:

image

wezterm chooses treatment 2, as did we before the new rendering engine:

WezTerm shrinks the skull glyph

iTerm2 and Terminal.app choose treatment 3.

We elected to go with treatment 3 because it also improves the display of text from other languages, where there may be ascenders and descenders that poke out of the top or bottom of the cell. It's not more right, but it's definitely less wrong.

Hope that helps!

Unfortunately, one of those is the hourglass:⌛ U+231B that is also an emoji in most fonts and is currently handled as double-width in Terminal, while MouseText would require it to be single-width.

@PhMajerus - I've got no idea how Unicode expects this to work. Maybe by some manner of variation selector?

@PhMajerus

PhMajerus commented Mar 10, 2024

@Florian-Thake

I'm sure DHowett has a much better grasp on all of this than I do, and I don't know how the decision process went internally, but I can provide some context to justify their decision as I see it from the outside, and so he can spend more time on getting things done and just correct me if he sees something I got wrong.

I agree it's counterintuitive and seems wrong, but I don't think the Terminal is wrong in this case.
The problem is that terminals are based on a grid of characters.
Other GUI apps "simply" render text using the font renderer, and find out the width of the text using the same font renderer. If an app needs to align things in a window or dialog box, it can request whatever the system UI font is and still calculate the width and height of any text string as it draws the UI.

History

Terminals and consoles, on the other hand, come from a day when text screens were a grid of character cells. OEMs could design their own character sets, but each character would take exactly one cell. The app didn't need to ask a renderer to calculate the width of a string of text to know how to align UI elements; it could calculate everything in terms of columns and rows. A 10-character string would always take 10 columns, regardless of the "font" (character set) and the characters it contained.

Then Japan joined the party, and needed both more than 256 characters and wider characters, so we got MBCS (multibyte character set). Fortunately, since characters on PCs were basically half-square rectangles, they could fit their characters in two cells, making square ideograms, and encode those as two characters, extending the number of characters available by basically having a mix of 8-bit and 16-bit character values. Chinese and Korean used the same principle.
It was still easy to compute text width, because one byte would always take one cell, and double-width characters would only ever be two-byte characters. (Note that from their point of view, double-width is full-width, and normal-width is half-width.)

Then Unicode came along and decided to unify all character sets. One of the original rules of Unicode was that it doesn't care about font or looks, only about a character's intent.
They included normal-width and full-width characters, and some characters that were normal-width in some countries and full-width in others were mapped to the same Unicode character because they had the same intent. After all, some fonts were proportional instead of monospaced, so even Latin letters would have different widths; surely any app would always ask the font renderer to calculate the width of specific text strings, not try to guess the width from the text alone.

That was the beginning of the mess of building a terminal or console emulator using Unicode, because a text-based app doesn't use a font renderer. It cannot query how wide some text string is; it knows from the text content how many columns it will use on screen… except now it doesn't anymore, because that depends on whether the terminal is in Western or CJK (Asian) mode.
So some terminals would include an option to choose whether ambiguous-width characters are displayed as single or double cells.

At that point, fonts included some semigraphic characters from legacy code pages, such as line drawing, box elements, card suits, etc., and these were normal-width characters because they came from the original computers' character sets.
Then Emojis arrived. Emojis were mostly used in Japan on their mobile phones, and were normal size characters for Japan … which means full-width, so double-width from our point of view.
That could have been all fine and good if Unicode just added the Emojis in their own blocks and everybody knew that these were always double-width, but remember that original rule that Unicode identifies characters by their intent/meaning, not their design in a specific font?
So Emojis that represented the same things as existing characters were not given new code points, instead they reused the existing characters, saying they can be normal-size semigraphic characters or double-width Emojis depending on the font and even context selected by the app, using the font renderer options.

Except that, again, text-mode apps do not use a font renderer, they send code points to display to a terminal emulator and have no idea of fonts and renderer options.
See my screenshot of Apple MouseText characters, they are facing exactly that issue: The hourglass:⌛ was already included as a semigraphic character taking a single cell, then reused as an Emoji. So now expectation depends on the app, and even text-based apps won't agree on whether it takes one or two cells in a terminal, because Apple probably considers the Emoji to be the definitive character, while Apple II emulators or technical documents using text to show screen contents consider the semigraphic to be the definitive character so it aligns properly with the other MouseText characters.
That would be no problem in a Web page or an app that can pick a font, or even specify which one to use, but a text-mode app only sends U+231B to the terminal and expects it to be right… except a modern shell script may expect the Emoji, while an Apple II emulator may expect the semigraphic.

Unicode should probably have encoded them as two different characters but here we are, living with past decisions. It's easy to be wiser in retrospect once you know which issues each choice will bring over 30 years later.
The other solution could be an in-band command that a text-mode app can send to the terminal to select semigraphic or Emoji mode for the dual-purpose characters. But the terminal protocol is complex, with a lot of complex features coming from the last 60 years of its history, and just getting all those legacy features working to ensure compatibility with existing text-based apps is already a huge undertaking, before anyone can even start thinking about adding new features for new problems introduced by a new character system for some fancy new character.

In the future we'll probably see both some option for a text-based app to tell the terminal to use semigraphics or Emojis, and a way to query the terminal for the cell width of a specific text string (I believe DHowett is pushing for this already).

Current solution isn't that bad, probably the best with what we have today

In the meantime, handling these characters as using a single cell is actually a pretty good solution. This means text-mode apps that expect them to be Emojis can just add a space after the Emojis to make the terminal use two cells for it. The proper font will have a double-width glyph for that character that will extend from the first cell into the following space, and it will display as the app intended.
Another text-mode app that expects it to be a single-width semigraphic character can send that character to the terminal knowing that whatever happens, it will not break alignments. If the user picked a font that renders it as an Emoji, it will overlap the next character, but will not ruin the rest of the grid alignment, and the user will notice and can find out if that terminal has an option to switch between Emoji and semigraphic mode, or change to a font that contains semigraphic versions of those characters.
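
As a rough illustration of the "add a space" approach, here is a minimal Python sketch. The set `WIDE_GLYPH_SYMBOLS` and the helper `pad_emoji_symbols` are hypothetical; which code points belong in the set is an assumption the app has to make for itself.

```python
# Hypothetical set: single-cell symbols this app wants shown as two-cell emoji.
WIDE_GLYPH_SYMBOLS = {"\u2620"}  # U+2620 skull and crossbones

def pad_emoji_symbols(text: str, symbols=WIDE_GLYPH_SYMBOLS) -> str:
    """Append a space after each listed symbol so a two-cell glyph has room to overflow."""
    return "".join(ch + " " if ch in symbols else ch for ch in text)

# The x now lands in the cell after the space instead of under the skull.
print(pad_emoji_symbols("\u2620x"))
```

The padded string is only meant for output to the terminal; an app would keep the unpadded original for files or other destinations.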

It's far from perfect, I agree, but decades of history with an evolving technology is never perfect, at most it is working in a predictable way.
Terminal emulation comes with decades of baggage and concepts that Unicode probably didn't consider hard enough. I'm actually amazed at how well it works considering all the history involved.
With the features that exist today, the current solution seems like the better tradeoff, and if other terminals behave the same way, the most important goal, compatibility and predictability, is achieved.

@DHowett
Member

DHowett commented Mar 10, 2024

Wow, these comments were literally seconds apart! High five!

@PhMajerus

@DHowett 🫸🫷

@Florian-Thake
Author

@PhMajerus

Wow, thank you very much for your deep and awesome insights into the history and current situation! 😎

@DHowett

To you also a big thank you for the deep explanations! 👍

Below, I respond to the comments from both of you.

I understand that from the history point of view.
And, yeah, to stay backward compatible is always an issue and often a burden.

From the user point of view, I personally would expect this:

  • The grid will not break
  • The font will render correct without overlapping signs.

To summarize both of your explanations, I would say:

To achieve this for the long term there can be 2 options.
Either the skull must be rendered into 1 cell or the size must be changed to 2 cells (important: also in wcwidth)
Both could be correct.
But the Terminal does not necessarily know which option is actually chosen or possible because the font rendering might be decoupled from the grid layout and other things, etc.

This leads me to 3 possible scenarios:

  1. The user of the Terminal App has an option for it in the setting (Maybe can be auto detected by the chosen font / renderer)
  2. The application running inside the Terminal can query and change the setting.
  3. Choose one of the options as the one and only (and possibly breaking older terminals / terminal apps by intention)

Do you agree with these options and scenarios?

Personally, I would prefer scenario 3 and pick the option to increase the used cells to 2.
Why?
Because these icons look much better when they use 2 cells.
This will break old apps that use the old wcwidth. But is that bad?
I would say that in the long term this can be OK, because we are talking about the rendering of skull icons and similar symbols.
It is not so important to keep backward compatibility forever in these particular cases.
If really needed, there could be a user option to activate backward compatibility, which would use item 2 from @DHowett's comment.

So, I think there is the chance at least to choose the best fitting solution for the future.

For the meanwhile, personally I would prefer the old way, to render it in one cell.
This works IMHO for all use cases but the icon itself does not look so nice.

I don't think that inserting extra spaces is the way to go. How can I know whether I need to insert a space, and where?
Then I must remove it again when I send the string to a file or somewhere else.
So, from the programming point of view, I am interested in an API to change the behavior.
Does something like that exist already?

(Sorry, I am running out of time now. Maybe I will add something more or clarify at a later time.)

@Florian-Thake
Author

I have some additions to my last comment:

@DHowett

  1. Now that I have thought about your comment again, I have come to the personal conclusion that, of the variants you mentioned, item 2 is the best fit:

Display it 1 cell wide, but shrink it down to fit.

This will not break any grid and does not produce any overlapping. So nothing is broken, and it neither produces unreadable text nor requires extra work.
The only drawback is that the icon is smaller. Not nice, but better than overlapping text or a broken grid.
It would also be the best solution for your well-done Midnight Commander examples.

For the long term it would be great if the size can be increased to 2 cells (I mean for rendering and the stored information in wcwidth).
I know this is difficult to achieve because it is not only a Windows Terminal matter but Unicode related. But it would be the cleanest solution. Older terminal apps would then have to be rebuilt against a new version of the underlying API, or they stay broken.
But... what is better for this specific error category? Unbroken legacy apps with overlapping, unreadable text in modern apps, or a possibly broken grid in legacy apps with everything working in modern apps?
In the long term the new apps will remain. Most existing apps can simply be updated to a new version that uses the updated API. Some older apps, which exist only in binary form, will disappear one day.

  2. Are you aware of a Windows variant of wcwidth?
It seems to be a function from the POSIX standard and not available on Windows.

  3. Is there a way to programmatically detect whether my app is running in Windows Terminal, and which version of it? (For C and C++)
3b) Is there an API for Windows Terminal that my app can use to interact with it?

  4. I may have discovered a similar issue.
Is this also related to the history of terminals?

If I use the Thai phrase for politely saying hello: สวัสดี ครับ
Windows Terminal adds extra spaces:
WindowsTerminal_Thai_rendering

I believe this is because Thai letters are sometimes composed of 2 Unicode code points that form only 1 letter.
It seems that the second code point still occupies one cell, but it would need to occupy 0 cells to look good.
(Or is this related to the "Asian mode" mentioned by @PhMajerus?)
Is this an issue that can be addressed?

@lhecker
Member

lhecker commented Mar 12, 2024

Display it 1 cell wide, but shrink it down to fit.

This will not break any grid and does not produce any overlapping.

Surprisingly, this isn't true. Over the past decade terminal applications have come to expect Emojis to not be scaled down if they don't fit. We did downscale overly large glyphs before and it was heavily disliked by a lot of people.

For the long term it would be great if the size can be increased to 2 cells

That actually already exists: You simply need to use ☠️ (U+2620 U+FE0F) instead of ☠ (U+2620). That's the "variation selector" that Dustin mentioned. For "ambiguous" emojis like this (technically called "unqualified emoji"), its existence is the difference between whether the glyph is drawn colored (U+FE0F), black/white (U+FE0E), or whether the presentation is unspecified and left open to the text renderer. You can find the full list of such emojis here: https://unicode.org/emoji/charts/emoji-variants.html
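
A minimal Python sketch of building such sequences follows. The helper names are hypothetical, and the sketch does not check whether the base character is actually on the emoji-variants list linked above; that check is left to the caller.

```python
VS15 = "\uFE0E"  # variation selector-15: request text (black/white) presentation
VS16 = "\uFE0F"  # variation selector-16: request emoji (colored) presentation

def emoji_style(ch: str) -> str:
    """Return ch followed by VS16 (hypothetical helper)."""
    return ch + VS16

def text_style(ch: str) -> str:
    """Return ch followed by VS15 (hypothetical helper)."""
    return ch + VS15

def codepoints(s: str) -> str:
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

print(codepoints(emoji_style("\u2620")))  # U+2620 U+FE0F
print(codepoints(text_style("\u2620")))   # U+2620 U+FE0E
```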

However, Windows Terminal up until the current version 1.20 is not really Unicode aware at all, as it relies on something like wcwidth (as mentioned before). The good news is that I'm going to address this in 1.21. Here's a preview for my combining marks support using Wikipedia's example Zalgo:
image

It's not quite there yet (in particular the couple marks that are too far to the left at the start of the line), but it's significantly better than what we have now: It allocates at least 1 cell per character, making a complete mess of the Zalgo text.

In my opinion, kitty does an excellent job when it comes to Unicode in terminals, and I'm trying to replicate its behavior for Windows Terminal. If I succeed, your issue will be consistently gone as long as you make sure to use either minimally or fully qualified emojis. Unqualified emojis will always have these overlap issues, because it's what both terminal applications and their users have come to expect.

@PhMajerus

@lhecker Oh, you just solved my problem with the hourglass:⌛ U+231B for MouseText, it is on that list.
So now I just need to find out how to handle the U+FE0E Variation Selector-15 in Cascadia to provide the specific glyph for that black/white version.

@j4james
Collaborator

j4james commented Mar 12, 2024

@PhMajerus Note that the VS15 selector can change the rendering from an emoji presentation to a text presentation, but it can't change the width. You can make a narrow character wider with VS16, but you can't make a wide character narrower with VS15. And as far as I understand, U+231B is emoji presentation by default, so will always be wide.
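
The asymmetry can be sketched in Python like this. It only encodes the rule as stated in this comment, not what any particular terminal implements; `sequence_width` is a hypothetical helper, and the base width is assumed to come from the East Asian Width property.

```python
import unicodedata

VS15 = "\uFE0E"  # text presentation selector
VS16 = "\uFE0F"  # emoji presentation selector

def sequence_width(seq: str) -> int:
    """Cells for a base character plus optional variation selector (hypothetical rule)."""
    base, rest = seq[0], seq[1:]
    width = 2 if unicodedata.east_asian_width(base) in ("W", "F") else 1
    if VS16 in rest:
        width = 2  # VS16 can widen a narrow character
    # VS15 is deliberately ignored here: it changes presentation, not width.
    return width

for seq in ("\u2620", "\u2620" + VS16, "\u231B", "\u231B" + VS15):
    print([f"U+{ord(c):04X}" for c in seq], sequence_width(seq))
```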

@Florian-Thake
Author

@lhecker

Surprisingly, this isn't true. Over the past decade terminal applications have come to expect Emojis to not be scaled down if they don't fit. We did downscale overly large glyphs before and it was heavily disliked by a lot of people.

Yes, sure, I dislike it also.

For the long term it would be great if the size can be increased to 2 cells

That actually already exists: You simply need to use ☠️ (U+2620 U+FE0F) instead of ☠ (U+2620). That's the "variation selector" that Dustin mentioned. For "ambiguous" emojis like this (technically called "unqualified emoji"), its existence is the difference between whether the glyph is drawn colored (U+FE0F), black/white (U+FE0E), or whether the presentation is unspecified and left open to the text renderer. You can find the full list of such emojis here: https://unicode.org/emoji/charts/emoji-variants.html

Ah, I didn't know that this exists. Thank you for providing these details.
Is there some rule which selector is possible for a sign? I wonder why not U+FE00 or U+FE01 is used instead?

The Unicode specification is more complex than I thought. With the current approach, you must always check whether a selector follows a sign from some defined range, so that you do not lose it by accident.
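
A minimal Python sketch of that check: when splitting a string into units, keep any variation selector (U+FE00 to U+FE0F) attached to the character before it. `split_units` is a hypothetical helper and a simplification; it is not full grapheme cluster segmentation.

```python
def split_units(text: str) -> list[str]:
    """Split text into code points, keeping variation selectors with their base."""
    units: list[str] = []
    for ch in text:
        if units and 0xFE00 <= ord(ch) <= 0xFE0F:
            units[-1] += ch  # attach the selector to the preceding character
        else:
            units.append(ch)
    return units

# The skull keeps its U+FE0F, so truncating after unit 2 does not drop the selector.
print(split_units("a\u2620\uFE0Fb"))
```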

OK, so for this issue it means, it is working as designed / specified and can be closed?

Do you have some input for my item 4 of my last comment?

I may have discovered a similar issue.
Is this also related to the history of terminals?

If I use the Thai phrase for politely saying hello: สวัสดี ครับ
Windows Terminal adds extra spaces:
WindowsTerminal_Thai_rendering
I believe this is because Thai letters are sometimes composed of 2 Unicode code points that form only 1 letter.
It seems that the second code point still occupies one cell, but it would need to occupy 0 cells to look good.

@lhecker
Member

lhecker commented Mar 13, 2024

Is there some rule which selector is possible for a sign?

The emoji-variants.html I linked shows all Emojis that are affected by the selector.

I wonder why not U+FE00 or U+FE01 is used instead?

Wikipedia has a nice list what the other VS are used for: https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)

OK, so for this issue it means, it is working as designed / specified and can be closed?

Yes, if we ignore bugs in our implementation, I'd say it's working as intended. I'll close the issue then. 🙂

Do you have some input for my item 4 of my last comment?

Ah, I apologize! Your text contains 3 non-spacing marks (Unicode category "Mn"): ​ ั, ​ ี, and ​ ั. As the "non-spacing" indicates, they're generally supposed to not take up any space. But Windows Terminal doesn't yet support Unicode beyond basic surrogate pairs. That's why it allocates 1 cell for each non-spacing mark. I'm hoping to fix this issue in version 1.21 that will release in a few months. It already works in my debug build:
image
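
For reference, a small Python sketch that finds those marks and compares the two ways of counting cells. The zero-width treatment of category "Mn" is the assumption described above; it is not the terminal's actual code.

```python
import unicodedata

# The Thai greeting from the report, written with escapes to keep the code points explicit.
text = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35 \u0e04\u0e23\u0e31\u0e1a"

# Non-spacing marks have Unicode general category "Mn".
marks = [ch for ch in text if unicodedata.category(ch) == "Mn"]

cells_one_per_code_point = len(text)           # one cell per code point
cells_zero_width_marks = len(text) - len(marks)  # marks take no cell of their own

print([f"U+{ord(ch):04X}" for ch in marks])
print(cells_one_per_code_point, cells_zero_width_marks)
```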

@lhecker closed this as not planned Mar 13, 2024
@Florian-Thake
Author

@lhecker
Thank you very much for your answers, time and effort 🙂

@mplattner

@lhecker, what might be related: Windows Terminal renders 🟥 and 🟩 correctly (as 2-char-width), but ⬜ is rendered as a traditional character (see screenshot), which causes mis-alignments, also as shown on the screenshot. Is it possible to change how ⬜ is displayed? I think it should be consistent to the other square-"emojis"? Thanks!

image

@lhecker
Member

lhecker commented Apr 18, 2024

If you paste ⬜ into a cmd.exe prompt, does it still treat it as narrow? On my end this works.
PowerShell's support for emojis is not great at the moment. If you paste ⬜ into its prompt you'll immediately notice that it's treated as narrow and that the prompt will overall behave weirdly. If that table is generated by PowerShell, then this is the most likely cause for the issue.

@lhecker
Member

lhecker commented Apr 18, 2024

Oh, right, you were probably not asking just about the misalignment...
The ⬜ renders as an empty square, because that's how Cascadia Code has designed it. If you use another font, like Consolas, it'll show up as an emoji. I'm not entirely sure why Cascadia contains this glyph and I'll try to ask around if it's intentional.

@mplattner

mplattner commented Apr 19, 2024

Thanks @lhecker. I just opened pwsh in Windows Terminal and pasted 🟥🟩, pressed ENTER; then pasted ⬜, pressed ENTER; then pasted ⬜ followed by typing hi.

As you can see on the screenshot below:

  • the ⬜ is not rendered like the other emojis (I think it should be, as here in the browser), and
  • the cursor is rendered in between h and i, while it should be after the i (I did not use arrow keys.) That's what you mean with "[...] prompt will overall behave weirdly" I guess.

I'm using oh-my-posh (v19.19.0), Windows Terminal (v1.19.10821.0) and the font is: JetBrainsMono Nerd Font, but the rendering/issue is the same with font Consolas.

image

@PhMajerus

PhMajerus commented Apr 19, 2024

This is because the black and white large squares predate emojis, they are part of the misc. symbols and arrows:
image
(command: Array.from("🟥🟩⬛⬜").forEach(function(c){ echo(c+" "+c.codePointAt(0).toString(16).toUpperCase()); }))

@mplattner

mplattner commented Apr 19, 2024

Thanks, that makes sense, but to my understanding it's up to the application how to display it, e.g., the browser and VS Code use the "modern" version of ⬜. Maybe Windows Terminal can be changed so that it uses the modern version as well. Possible side-effects might occur for apps that expect the "old-style" version to display some kind of terminal user-interface; however, rendering as-is is definitely a bug, see the cursor.

@PhMajerus

I believe that's a bug with PowerShell, I can reproduce it with PowerShell, but not with cmd.exe or my ActiveScript Shell in the same Terminal.

@lhecker
Member

lhecker commented Apr 19, 2024

  • the cursor is rendered in between h and i, while it should be after the i (I did not use arrow keys.) That's what you mean with "[...] prompt will overall behave weirdly" I guess.
    [...]
    however, rendering as-is is definitely a bug, see the cursor.

Actually, that's what I meant with the following:

PowerShell's support for emojis is not great at the moment. If you paste ⬜ into its prompt you'll immediately notice that it's treated as narrow and that the prompt will overall behave weirdly. If that table is generated by PowerShell, then this is the most likely cause for the issue.

PowerShell's support for modern Unicode is not great. They assume that ⬜ is 1 cell wide, but the Unicode spec clearly says it's 2 cells wide (ea=W). This leads to a disagreement between the shell and the terminal and results in incorrect cursor positions, as @PhMajerus said.
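
The ea=W claim can be checked with Python's bundled Unicode data, as in this small sketch. `describe` is a hypothetical helper, and the result depends on the Unicode version shipped with the Python build.

```python
import unicodedata

def describe(ch: str) -> tuple[str, str]:
    """Return the code point (U+XXXX) and East Asian Width property of ch."""
    return (f"U+{ord(ch):04X}", unicodedata.east_asian_width(ch))

# Red square, green square, black large square, white large square.
for ch in "\U0001F7E5\U0001F7E9\u2B1B\u2B1C":
    print(ch, *describe(ch))
```

A shell that assumes one cell for U+2B1C while the terminal uses two will disagree about the cursor column by one for every such character on the line.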

BTW this is also why I can't quite agree with you, @PhMajerus, either: Cascadia has designed both glyphs (⬛ and ⬜) to be 1 cell wide, but that doesn't match the East Asian property. I'm not sure why they're designated as wide by Unicode, but I suppose that's beside the point. And so while the 1-cell wide variant works just fine outside of terminals, it looks a little weird inside them. Additionally, as far as I can tell, it doesn't supply a colored version for the Emojis either (i.e. with U+FE0F). Segoe UI on the other hand ships both, a colored and a grayscale version of these glyphs.
I do see why Cascadia wants to ship these glyphs though (since it ships many other symbols/arrows as well), so I'm not sure what to do here...

@PhMajerus

@lhecker I didn't mean to imply that Cascadia was right. I just think they included it because it's part of the more "core" symbols, so that explains the discrepancy between the black and white squares and the color emoji-only squares.

I think support for emojis will improve over time, and these code points that existed as symbols and now have emoji versions as well will require some work on Cascadia's side, especially if it tries to support both symbol and emoji versions.
That's something that would be good to document and work with @aaronbell to consider early in Cascadia Next/Reboot.
