Implement til::au16 and til::u16a conversion functions & make first use in WriteConsoleAImpl #4493

Closed
32 commits
a8bd2df
implement til::au16 and til::u16a conversion functions & update relat…
german-one Feb 6, 2020
b19b066
eliminate redundant code, update comments, rearrange functions
german-one Feb 6, 2020
37324d5
add unit test for DBCS partials
german-one Feb 6, 2020
fcc9bf1
make first use in `WriteConsoleAImpl`
german-one Feb 6, 2020
2571d23
pass static analysis
german-one Feb 6, 2020
c7bdc90
try harder to pass the SA
german-one Feb 6, 2020
7f77579
have fun with C array
german-one Feb 6, 2020
c07cb48
get over the brace barrier?
german-one Feb 7, 2020
20f1d2b
see what til::at makes out of it
german-one Feb 7, 2020
aff87e0
try again to suppress warnings
german-one Feb 7, 2020
519c374
make the CPINFO a private member and only update it if necessary
german-one Feb 7, 2020
26ac753
update `til::at` to make it applicable for arrays
german-one Feb 10, 2020
542bfd2
reverse `til::at` update
german-one Feb 10, 2020
fb74524
add partials handling to GB 18030 and GSM 7 bit codepages
german-one Feb 17, 2020
3003ff2
make sure caching of partials still works if the string consists of a…
german-one Feb 21, 2020
c8735d5
Merge branch 'master' into master
german-one Feb 21, 2020
986b269
Merge branch 'master' into master
german-one Mar 4, 2020
34b6b1d
keep track of u8u16 PR crossfire and findings
german-one Mar 5, 2020
3462298
update `til::at` and use it
german-one Mar 6, 2020
f9acdfc
try to suppress array to pointer decay
german-one Mar 6, 2020
fe60901
funny business: try to call the pointer overload explicitly
german-one Mar 6, 2020
0d83b56
use rvalue reference
german-one Mar 6, 2020
385b735
remove array overload, use pointer
german-one Mar 6, 2020
11602df
make aState a public member of SCREEN_INFORMATION & use it in WriteCo…
german-one Mar 7, 2020
f2a4b3a
spelling
german-one Mar 8, 2020
9008eff
risk on more red X: explicitly exclude arrays in til::at overloads
german-one Mar 9, 2020
bd2be6b
undo, even that didn't work
german-one Mar 9, 2020
79f154e
Merge branch 'master' into master
german-one Mar 25, 2020
b857c3e
add `au` and `GSM` to `whitelist.txt`
german-one Mar 25, 2020
83443da
Merge remote-tracking branch 'upstream/master'
Mar 26, 2020
ab4e0d7
keep initialization style consistent
german-one Mar 26, 2020
4983479
Merge remote-tracking branch 'upstream/master'
Mar 28, 2020
2 changes: 2 additions & 0 deletions .github/actions/spell-check/whitelist/whitelist.txt
@@ -114,6 +114,7 @@ ATest
atg
attr
ATTRCOLOR
au
aumid
Authenticode
AUTOBUDDY
@@ -976,6 +977,7 @@ Greyscale
gridline
groupbox
gsl
GSM
GTP
guc
gui
2 changes: 1 addition & 1 deletion src/cascadia/TerminalConnection/ConptyConnection.h
@@ -53,7 +53,7 @@ namespace winrt::Microsoft::Terminal::TerminalConnection::implementation
wil::unique_static_pseudoconsole_handle _hPC;
wil::unique_threadpool_wait _clientExitWait;

til::u8state _u8State;
til::astate _u8State;
std::wstring _u16Str;
std::array<char, 4096> _buffer;

2 changes: 1 addition & 1 deletion src/host/VtInputThread.hpp
@@ -39,6 +39,6 @@ namespace Microsoft::Console
HRESULT _exitResult;

std::unique_ptr<Microsoft::Console::VirtualTerminal::StateMachine> _pInputStateMachine;
til::u8state _u8State;
til::astate _u8State;
};
}
86 changes: 5 additions & 81 deletions src/host/_stream.cpp
@@ -1082,94 +1082,18 @@ constexpr unsigned int LOCAL_BUFFER_SIZE = 100;
auto unlock{ wil::scope_exit([&] { UnlockConsole(); }) };

auto& screenInfo{ context.GetActiveBuffer() };
const auto& consoleInfo{ ServiceLocator::LocateGlobals().getConsoleInformation() };
const auto codepage{ consoleInfo.OutputCP };
auto leadByteCaptured{ false };
auto leadByteConsumed{ false };
const auto codepage{ ServiceLocator::LocateGlobals().getConsoleInformation().OutputCP };
std::wstring wstr{};
static til::u8state u8State{};

// Convert our input parameters to Unicode
const auto leadByteConsumed{ !screenInfo.aState.empty(codepage) }; // only used for DBCS
RETURN_IF_FAILED(til::au16(codepage, buffer, wstr, screenInfo.aState));
const auto leadByteCaptured{ !screenInfo.aState.empty(codepage) }; // only used for DBCS

if (codepage == CP_UTF8)
{
RETURN_IF_FAILED(til::u8u16(buffer, wstr, u8State));
read = buffer.size();
}
else
{
// In case the codepage changes from UTF-8 to another,
// we discard partials that might still be cached.
u8State.reset();

int mbPtrLength{};
RETURN_IF_FAILED(SizeTToInt(buffer.size(), &mbPtrLength));

// (buffer.size() + 2) I think because we might be shoving another unicode char
// from screenInfo->WriteConsoleDbcsLeadByte in front
// because we previously checked that buffer.size() fits into an int, +2 won't cause an overflow of size_t
wstr.resize(buffer.size() + 2);

wchar_t* wcPtr{ wstr.data() };
auto mbPtr{ buffer.data() };
size_t dbcsLength{};
if (screenInfo.WriteConsoleDbcsLeadByte[0] != 0 && gsl::narrow_cast<byte>(*mbPtr) >= byte{ ' ' })
{
// there was a portion of a dbcs character stored from a previous
// call so we take the 2nd half from mbPtr[0], put them together
// and write the wide char to wcPtr[0]
screenInfo.WriteConsoleDbcsLeadByte[1] = gsl::narrow_cast<byte>(*mbPtr);

try
{
const auto wFromComplemented{
ConvertToW(codepage, { reinterpret_cast<const char*>(screenInfo.WriteConsoleDbcsLeadByte), ARRAYSIZE(screenInfo.WriteConsoleDbcsLeadByte) })
};

FAIL_FAST_IF(wFromComplemented.size() != 1);
dbcsLength = sizeof(wchar_t);
wcPtr[0] = wFromComplemented.at(0);
mbPtr++;
}
catch (...)
{
dbcsLength = 0;
}

// this looks weird to be always incrementing even if the conversion failed, but this is the
// original behavior so it's left unchanged.
wcPtr++;
mbPtrLength--;

// Note that we used a stored lead byte from a previous call in order to complete this write
// Use this to offset the "number of bytes consumed" calculation at the end by -1 to account
// for using a byte we had internally, not off the stream.
leadByteConsumed = true;
}

screenInfo.WriteConsoleDbcsLeadByte[0] = 0;

// if the last byte in mbPtr is a lead byte for the current code page,
// save it for the next time this function is called and we can piece it
// back together then
if (mbPtrLength != 0 && CheckBisectStringA(const_cast<char*>(mbPtr), mbPtrLength, &consoleInfo.OutputCPInfo))
{
screenInfo.WriteConsoleDbcsLeadByte[0] = gsl::narrow_cast<byte>(mbPtr[mbPtrLength - 1]);
mbPtrLength--;

// Note that we captured a lead byte during this call, but won't actually draw it until later.
// Use this to offset the "number of bytes consumed" calculation at the end by +1 to account
// for taking a byte off the stream.
leadByteCaptured = true;
}

if (mbPtrLength != 0)
{
// convert the remaining bytes in mbPtr to wide chars
mbPtrLength = sizeof(wchar_t) * MultiByteToWideChar(codepage, 0, mbPtr, mbPtrLength, wcPtr, mbPtrLength);
}

wstr.resize((dbcsLength + mbPtrLength) / sizeof(wchar_t));
}

// Hold the specific version of the waiter locally so we can tinker with it if we must to store additional context.
std::unique_ptr<WriteData> writeDataWaiter{};
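
The hunk above replaces the hand-written lead-byte bookkeeping with a state object that is queried before and after the conversion. The following is a minimal, portable sketch of that idea only. It is not the `til::au16` or `til::astate` implementation from this PR: `PartialState`, `isLeadByte`, and `completeChunk` are hypothetical names, and the lead-byte ranges assume a Shift-JIS style code page instead of asking the OS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a conversion state object (not the PR's til::astate).
struct PartialState
{
    unsigned char lead{ 0 }; // cached lead byte, 0 if nothing is cached
    bool empty() const { return lead == 0; }
    void reset() { lead = 0; }
};

// Assumption: Shift-JIS style lead-byte ranges, hard-coded for illustration.
constexpr bool isLeadByte(const unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Returns only the bytes that form complete characters.
// A lead byte cached by a previous call is prepended to the input;
// a lead byte left dangling at the end of the input is cached for the next call.
std::string completeChunk(const std::string_view in, PartialState& state)
{
    std::string out;
    if (!state.empty())
    {
        out.push_back(static_cast<char>(state.lead));
        state.reset();
    }
    out.append(in);

    // Walk from the start: a trail byte may fall into the lead-byte range,
    // so looking at the last byte alone is not sufficient.
    std::size_t i = 0;
    while (i < out.size())
    {
        if (isLeadByte(static_cast<unsigned char>(out[i])))
        {
            if (i + 1 == out.size())
            {
                state.lead = static_cast<unsigned char>(out[i]);
                out.pop_back();
                break;
            }
            i += 2; // skip lead byte and its trail byte
        }
        else
        {
            i += 1;
        }
    }
    assert(i == out.size()); // every remaining byte belongs to a complete character
    return out;
}
```

Checking `state.empty()` before and after a call to `completeChunk` corresponds to how `leadByteConsumed` and `leadByteCaptured` are derived in the new code: a non-empty state before the call means a cached byte is about to be consumed, and a non-empty state afterwards means a byte was taken off the stream but not yet converted.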
3 changes: 1 addition & 2 deletions src/host/screenInfo.cpp
@@ -40,10 +40,9 @@ SCREEN_INFORMATION::SCREEN_INFORMATION(
HWheelDelta{ 0 },
_textBuffer{ nullptr },
Next{ nullptr },
WriteConsoleDbcsLeadByte{ 0, 0 },
FillOutDbcsLeadChar{ 0 },
ConvScreenInfo{ nullptr },
ScrollScale{ 1ul },
aState{},
_pConsoleWindowMetrics{ pMetrics },
_pAccessibilityNotifier{ pNotifier },
_stateMachine{ nullptr },
4 changes: 2 additions & 2 deletions src/host/screenInfo.hpp
@@ -172,14 +172,14 @@ class SCREEN_INFORMATION : public ConsoleObjectHeader, public Microsoft::Console

public:
SCREEN_INFORMATION* Next;
BYTE WriteConsoleDbcsLeadByte[2];
BYTE FillOutDbcsLeadChar;

// non ownership pointer
ConversionAreaInfo* ConvScreenInfo;

UINT ScrollScale;

til::astate aState{};

bool IsActiveScreenBuffer() const;

const Microsoft::Console::VirtualTerminal::StateMachine& GetStateMachine() const;
51 changes: 40 additions & 11 deletions src/inc/til/at.h
@@ -3,19 +3,48 @@

#pragma once

// The at function declares that you've already sufficiently checked that your array access
// is in range before retrieving an item inside it at an offset.
// This is to save double/triple/quadruple testing in circumstances where you are already
// pivoting on the length of a set and now want to pull elements out of it by offset
// without checking again.
// gsl::at will do the check again. As will .at(). And using [] will have a warning in audit.

namespace til
{
// The at function declares that you've already sufficiently checked that your array access
// is in range before retrieving an item inside it at an offset.
// This is to save double/triple/quadruple testing in circumstances where you are already
// pivoting on the length of a set and now want to pull elements out of it by offset
// without checking again.
// gsl::at will do the check again. As will .at(). And using [] will have a warning in audit.
template<class T>
constexpr auto at(T& cont, const size_t i) -> decltype(cont[cont.size()])
// Routine Description:
// - Takes a reference to a sequence of constant data and returns the item at the given offset.
// NOTE: The function relies on a sufficient check that the access is in range.
// Arguments:
// - sequence - Reference to constant data in a range of data accessible using the subscript operator.
// Such as constant STL strings, containers, bitsets, random-access iterators, or pointers.
// - index - Offset of the element to be returned.
// Return Value:
// - Element in the sequence at the offset given by the index parameter.
template<class T, class U>
constexpr auto at(const T& sequence, const U index) -> typename std::enable_if<std::is_integral<U>::value, decltype(sequence[0])>::type
{
#pragma warning(push)
#pragma warning(suppress : 26481 26482 26446) // Suppress checks for pointer arithmetic, indexing with constant expressions, and subscript operator.
return sequence[index];
#pragma warning(pop)
}

// Routine Description:
// - Takes a reference to a sequence of data and returns the item at the given offset.
// NOTE: The function relies on a sufficient check that the access is in range.
// Arguments:
// - sequence - Reference to data in a range of data accessible using the subscript operator.
// Such as STL strings, containers, bitsets, random-access iterators, or pointers.
// - index - Offset of the element to be returned.
// Return Value:
// - Element in the sequence at the offset given by the index parameter.
template<class T, class U>
constexpr auto at(T& sequence, const U index) -> typename std::enable_if<std::is_integral<U>::value, decltype(sequence[0])>::type
{
#pragma warning(suppress : 26482) // Suppress bounds.2 check for indexing with constant expressions
#pragma warning(suppress : 26446) // Suppress bounds.4 check for subscript operator.
return cont[i];
#pragma warning(push)
#pragma warning(suppress : 26481 26482 26446) // Suppress checks for pointer arithmetic, indexing with constant expressions, and subscript operator.
return sequence[index];
#pragma warning(pop)
}
}
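
To illustrate what the two overloads in this hunk are meant to accept, here is a small self-contained sketch. It copies the shape of the templates into a hypothetical `sketch` namespace (so it does not depend on the real `til` headers) and leaves out the `#pragma warning` lines, which only matter for the static analysis run. As in the diff, the caller is assumed to have range-checked the index already.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace sketch // hypothetical namespace, stands in for til
{
    // Overload for constant sequences and for temporaries such as the result of data().
    template<class T, class U>
    constexpr auto at(const T& sequence, const U index)
        -> typename std::enable_if<std::is_integral<U>::value, decltype(sequence[0])>::type
    {
        return sequence[index]; // unchecked on purpose
    }

    // Overload for mutable sequences; the returned reference can be assigned to.
    template<class T, class U>
    constexpr auto at(T& sequence, const U index)
        -> typename std::enable_if<std::is_integral<U>::value, decltype(sequence[0])>::type
    {
        return sequence[index]; // unchecked on purpose
    }
}
```

Because the return type goes through `std::enable_if`, a call with a non-integral index such as `sketch::at(v, 1.5)` drops both overloads from overload resolution and fails to compile. A pointer works as the sequence because `sequence[0]` is valid for pointers, which is what the trailing `decltype` relies on instead of the `cont.size()` call in the removed version.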