Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle things above U+FFFF in GDI renderer #10477

Closed
alabuzhev opened this issue Jun 21, 2021 · 10 comments · Fixed by #10580
Closed

Handle things above U+FFFF in GDI renderer #10477

alabuzhev opened this issue Jun 21, 2021 · 10 comments · Fixed by #10580
Labels
Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Impact-Compatibility Sure don't work like it used to. Impact-Correctness It be wrong. Issue-Task It's a feature request, but it doesn't really need a major design. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.
Milestone

Comments

@alabuzhev
Copy link
Contributor

alabuzhev commented Jun 21, 2021

Why not let Windows draw surrogate pairs? It can do that.

Basically, the comment says everything:

// Our GDI renderer hasn't and isn't going to handle things above U+FFFF or sequences.
// So replace anything complicated with a replacement character for drawing purposes.

However, handling things above U+FFFF doesn't really require extra effort.

Proposed technical implementation details (optional)

  • Put all characters to the output buffer
  • Set the first width to cluster width and the rest to 0
  • Sit back and relax while Windows does the rest
diff --git a/src/renderer/gdi/paint.cpp b/src/renderer/gdi/paint.cpp
index 9295d3d6..e436b156 100644
--- a/src/renderer/gdi/paint.cpp
+++ b/src/renderer/gdi/paint.cpp
@@ -328,11 +328,13 @@ using namespace Microsoft::Console::Render;
 
         const auto pPolyTextLine = &_pPolyText[_cPolyText];
 
-        auto& polyString = _polyStrings.emplace_back(cchLine, UNICODE_NULL);
+        auto& polyString = _polyStrings.emplace_back();
+        polyString.reserve(cchLine);
 
         COORD const coordFontSize = _GetFontSize();
 
-        auto& polyWidth = _polyWidths.emplace_back(cchLine, 0);
+        auto& polyWidth = _polyWidths.emplace_back();
+        polyWidth.reserve(cchLine);
 
         // Sum up the total widths the entire line/run is expected to take while
         // copying the pixel widths into a structure to direct GDI how many pixels to use per character.
@@ -343,11 +345,11 @@ using namespace Microsoft::Console::Render;
         {
             const auto& cluster = til::at(clusters, i);
 
-            // Our GDI renderer hasn't and isn't going to handle things above U+FFFF or sequences.
-            // So replace anything complicated with a replacement character for drawing purposes.
-            polyString[i] = cluster.GetTextAsSingle();
-            polyWidth[i] = gsl::narrow<int>(cluster.GetColumns()) * coordFontSize.X;
-            cchCharWidths += polyWidth[i];
+            const auto text = cluster.GetText();
+            polyString += text;
+            polyWidth.push_back(gsl::narrow<int>(cluster.GetColumns()) * coordFontSize.X);
+            cchCharWidths += polyWidth.back();
+            polyWidth.append(text.size() - 1, 0);
         }
 
         // Detect and convert for raster font...
@@ -396,7 +398,7 @@ using namespace Microsoft::Console::Render;
         const auto bottomOffset = _currentLineRendition == LineRendition::DoubleHeightTop ? halfHeight : 0;
 
         pPolyTextLine->lpstr = polyString.data();
-        pPolyTextLine->n = gsl::narrow<UINT>(clusters.size());
+        pPolyTextLine->n = gsl::narrow<UINT>(polyString.size());
         pPolyTextLine->x = ptDraw.x;
         pPolyTextLine->y = ptDraw.y;
         pPolyTextLine->uiFlags = ETO_OPAQUE | ETO_CLIPPED;
@@ -436,10 +438,23 @@ using namespace Microsoft::Console::Render;
 
     if (_cPolyText > 0)
     {
+#if 1
+        assert(_cPolyText == 1);
+        for (size_t i = 0; i != _cPolyText; ++i)
+        {
+            const auto& t = _pPolyText[i];
+            if (!ExtTextOutW(_hdcMemoryContext, t.x, t.y, t.uiFlags, &t.rcl, t.lpstr, t.n, t.pdx))
+            {
+                hr = E_FAIL;
+                break;
+            }
+        }
+#else
         if (!PolyTextOutW(_hdcMemoryContext, _pPolyText, (UINT)_cPolyText))
         {
             hr = E_FAIL;
         }
+#endif
 
         _polyStrings.clear();
         _polyWidths.clear();
  • This patch also includes PolyTextOutW -> ExtTextOutW replacement, discussed in Missing glyph substitution #10472 - it's needed to show these glyphs in the first place.

Testing

@echo off
chcp 65001

echo 𠜎𠝹𠱓𠱸𠲖𠳏𠳕𠴕𠵼𠵿𠸎
echo 👨👩👧👦

Save this as a UTF-8 cmd file and run.

Before the change

image

After the change

image

@alabuzhev alabuzhev added the Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. label Jun 21, 2021
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Jun 21, 2021
@lhecker lhecker added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Impact-Compatibility Sure don't work like it used to. Impact-Correctness It be wrong. Issue-Task It's a feature request, but it doesn't really need a major design. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase and removed Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. labels Jun 21, 2021
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Jun 21, 2021
@lhecker lhecker added this to the Windows vNext milestone Jun 21, 2021
@lhecker
Copy link
Member

lhecker commented Jun 21, 2021

This looks awesome! But it's also a bit scary for me personally. Your other issue is quite easy for me to understand, but the changes you suggest here would add surrogate support, despite us not properly supporting surrogate pairs anywhere else yet... I'm not 100% sure if your change would work in all corner cases.

But granted, I'm relatively new in these parts of our codebase and other maintainers of this project are a better fit to review the changes you suggest.

@DHowett
Copy link
Member

DHowett commented Jun 21, 2021

Surrogates definitely need a little more love on the text buffer side, for sure. Right now, they (perhaps puzzlingly) only work for double-width glyphs because technically they're being stored directly int eh buffer.

This complicates a number of things:

  1. The clipboard code doesn't know what to do if you manage to select half of one.
    • you end up being able to select partial surrogate pair wide glyphs because the codepoint width estimator is asked about the individual code units and probably says "heck, i don't know"
  2. single-width surrogates will be rendered either broken or by removing an entire column from the line (something we measured as two columns (see 1.x above) takes up one, everything after it is offset)

As opposed to #10472 (which I would love to see a pull request for! for attribution's sake 😄) I'm more hesitant on this one. It will make things look like they're working more correctly than they actually are, and that may get us into more compliance trouble than simply rejecting them would.

@DHowett
Copy link
Member

DHowett commented Jun 21, 2021

(We're working on surrogate pairs et al over in #8000)

@alabuzhev
Copy link
Contributor Author

despite us not properly supporting surrogate pairs anywhere else yet...

@lhecker I'd say you do support surrogates more or less acceptably already, except for this rendering gap. This already works in Windows Terminal and, philosophically, as a middleman you don't have to do much at all:

  • display what you've been given using the correct number of cells
  • resize the cursor properly when needed.

Everything else is up to the client.

Here's an example of an app working with surrogate pairs in a patched OpenConsole:
image

As you can see, the cursor occupies two cells already, everything is sized, clipped and positioned as expected.

Indeed, it's a quite complex subject and there are known issues with surrogates that logically occupy one cell etc., but I'd say that a partially garbled output is still better than a bunch of perfectly drawn "�".

@alabuzhev

This comment has been minimized.

@alabuzhev

This comment has been minimized.

@zadjii-msft
Copy link
Member

I'm gonna say if this just so happens to work for the DX renderer today, and a trivial change will make it work in GDI, then lets go for it. It might not be technically correct in the buffer, but hey, it's just as technically incorrect in Terminal as in conhost, and Terminal renders it fine.

@miniksa for a veto vote though.

@zadjii-msft zadjii-msft removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Jul 7, 2021
@miniksa
Copy link
Member

miniksa commented Jul 7, 2021

I'm good with @alabuzhev's proposed code. As far as I can recall... the substitution was based on it not working with PolyTextOut and with it working fine with ExtTextOut... I don't really have an excuse.

I concur with the argument that it is better than the existing... anything is better than a substitution glyph... and much of the code can handle basic surrogates already. Sure there will be rough edges... but its not worse than today.

And again... I think real characters over replacements isn't going to stop an assembly line. :P

@alabuzhev
Copy link
Contributor Author

Thanks, I'll turn it into a PR then.

alabuzhev added a commit to alabuzhev/terminal that referenced this issue Jul 7, 2021
@ghost ghost added the In-PR This issue has a related PR label Jul 7, 2021
@ghost ghost closed this as completed in #10580 Jul 8, 2021
ghost pushed a commit that referenced this issue Jul 8, 2021
<!-- Enter a brief description/summary of your PR here. What does it fix/what does it change/how was it tested (even manually, if necessary)? -->
## Implementation of #10477 - handle surrogate pairs in GDI renderer.

<!-- Please review the items on the PR checklist before submitting-->
## PR Checklist
* [ ] Closes #10477
* [x] CLA signed.
* [ ] Tests added/passed
* [ ] Documentation updated.
* [ ] Schema updated.
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #10477

<!-- Provide a more detailed description of the PR, other things fixed or any additional comments/features here -->
## Detailed Description of the Pull Request / Additional comments
### Why not let Windows draw surrogate pairs? It can do that.

Basically, the comment says everything:
https://github.com/microsoft/terminal/blob/c90de692509b074bfde191910d67154cfe389911/src/renderer/gdi/paint.cpp#L346-L347

However, handling things above U+FFFF doesn't really require extra effort. It's enough to:
- Put *all* characters to the output buffer
- Set the first width to cluster width and the rest to 0
- Sit back and relax while Windows does the rest

<!-- Describe how you validated the behavior. Add automated tests wherever possible, but list manual validation steps taken as well -->
## Validation Steps Performed
```CMD
@echo off
chcp 65001

echo 𠜎𠝹𠱓𠱸𠲖𠳏𠳕𠴕𠵼𠵿𠸎
echo 👨👩👧👦
```
Save this as a UTF-8 cmd file and run.

### Before the change
![image](https://user-images.githubusercontent.com/11453922/122832196-ed438880-d2e2-11eb-93dd-931954efedbf.png)

### After the change
![image](https://user-images.githubusercontent.com/11453922/122832217-f2a0d300-d2e2-11eb-99f0-e129e5544667.png)

An example of a third party app working with surrogate pairs in a patched OpenConsole:
![image](https://user-images.githubusercontent.com/11453922/122838225-837cac00-d2ed-11eb-8faf-dbeb52f77916.png)

As discussed, this change doesn't claim to be the full support for surrogate pairs (there are still corner cases possible), but brings it on par with Terminal with minimal effort.
@ghost ghost added Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. and removed In-PR This issue has a related PR labels Jul 8, 2021
@ghost
Copy link

ghost commented Jul 14, 2021

🎉This issue was addressed in #10580, which has now been successfully released as Windows Terminal Preview v1.10.1933.0.:tada:

Handy links:

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Impact-Compatibility Sure don't work like it used to. Impact-Correctness It be wrong. Issue-Task It's a feature request, but it doesn't really need a major design. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants