Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Support Unicode in the prompt #66312

Merged
merged 1 commit into from
Sep 14, 2023

Conversation

JDevlieghere
Copy link
Member

Account for Unicode when computing the prompt column width. Previously, the string length (i.e. number of bytes) rather than the width of the Glyph was used to compute the cursor position. The result was that the cursor would be offset to the right when using a prompt containing Unicode.

Account for Unicode when computing the prompt column width. Previously,
the string length (i.e. number of bytes) rather than the width of the
Glyph was used to compute the cursor position. The result was that the
cursor would be offset to the right when using a prompt containing
Unicode.
@JDevlieghere JDevlieghere requested a review from a team as a code owner September 14, 2023 01:59
@llvmbot llvmbot added the lldb label Sep 14, 2023
@JDevlieghere
Copy link
Member Author

Related to #66218

@llvmbot
Copy link
Member

llvmbot commented Sep 14, 2023

@llvm/pr-subscribers-lldb

Changes Account for Unicode when computing the prompt column width. Previously, the string length (i.e. number of bytes) rather than the width of the Glyph was used to compute the cursor position. The result was that the cursor would be offset to the right when using a prompt containing Unicode. -- Full diff: https://github.com//pull/66312.diff

4 Files Affected:

  • (modified) lldb/include/lldb/Host/Editline.h (+2-3)
  • (modified) lldb/packages/Python/lldbsuite/test/lldbpexpect.py (+2)
  • (modified) lldb/source/Host/common/Editline.cpp (+16-9)
  • (modified) lldb/test/API/terminal/TestEditline.py (+11)
diff --git a/lldb/include/lldb/Host/Editline.h b/lldb/include/lldb/Host/Editline.h
index 35fd179abb509c3..64c384151facc91 100644
--- a/lldb/include/lldb/Host/Editline.h
+++ b/lldb/include/lldb/Host/Editline.h
@@ -248,9 +248,8 @@ class Editline {
   void SetCurrentLine(int line_index);
 
   /// Determines the width of the prompt in characters.  The width is guaranteed
-  /// to be the same for
-  /// all lines of the current multi-line session.
-  int GetPromptWidth();
+  /// to be the same for all lines of the current multi-line session.
+  size_t GetPromptWidth();
 
   /// Returns true if the underlying EditLine session's keybindings are
   /// Emacs-based, or false if
diff --git a/lldb/packages/Python/lldbsuite/test/lldbpexpect.py b/lldb/packages/Python/lldbsuite/test/lldbpexpect.py
index 85102db7d0a0536..9fedfa70c3f65dc 100644
--- a/lldb/packages/Python/lldbsuite/test/lldbpexpect.py
+++ b/lldb/packages/Python/lldbsuite/test/lldbpexpect.py
@@ -26,6 +26,7 @@ def launch(
         dimensions=None,
         run_under=None,
         post_spawn=None,
+        encoding=None,
         use_colors=False,
     ):
         logfile = getattr(sys.stdout, "buffer", sys.stdout) if self.TraceOn() else None
@@ -58,6 +59,7 @@ def launch(
             timeout=timeout,
             dimensions=dimensions,
             env=env,
+            encoding=encoding,
         )
         self.child.ptyproc.delayafterclose = timeout / 10
         self.child.ptyproc.delayafterterminate = timeout / 10
diff --git a/lldb/source/Host/common/Editline.cpp b/lldb/source/Host/common/Editline.cpp
index 544804db3879eb5..e84b451435a999d 100644
--- a/lldb/source/Host/common/Editline.cpp
+++ b/lldb/source/Host/common/Editline.cpp
@@ -25,6 +25,7 @@
 #include "lldb/Utility/Timeout.h"
 
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Locale.h"
 #include "llvm/Support/Threading.h"
 
 using namespace lldb_private;
@@ -101,6 +102,10 @@ bool IsOnlySpaces(const EditLineStringType &content) {
   return true;
 }
 
+static size_t ColumnWidth(llvm::StringRef str) {
+  return llvm::sys::locale::columnWidth(str);
+}
+
 static int GetOperation(HistoryOperation op) {
   // The naming used by editline for the history operations is counter
   // intuitive to how it's used in LLDB's editline implementation.
@@ -328,14 +333,16 @@ std::string Editline::PromptForIndex(int line_index) {
   std::string continuation_prompt = prompt;
   if (m_set_continuation_prompt.length() > 0) {
     continuation_prompt = m_set_continuation_prompt;
-
     // Ensure that both prompts are the same length through space padding
-    while (continuation_prompt.length() < prompt.length()) {
-      continuation_prompt += ' ';
-    }
-    while (prompt.length() < continuation_prompt.length()) {
-      prompt += ' ';
-    }
+    const size_t prompt_width = ColumnWidth(prompt);
+    const size_t cont_prompt_width = ColumnWidth(continuation_prompt);
+    const size_t padded_prompt_width =
+        std::max(prompt_width, cont_prompt_width);
+    if (prompt_width < padded_prompt_width)
+      prompt += std::string(padded_prompt_width - prompt_width, ' ');
+    else if (cont_prompt_width < padded_prompt_width)
+      continuation_prompt +=
+          std::string(padded_prompt_width - cont_prompt_width, ' ');
   }
 
   if (use_line_numbers) {
@@ -353,7 +360,7 @@ void Editline::SetCurrentLine(int line_index) {
   m_current_prompt = PromptForIndex(line_index);
 }
 
-int Editline::GetPromptWidth() { return (int)PromptForIndex(0).length(); }
+size_t Editline::GetPromptWidth() { return ColumnWidth(PromptForIndex(0)); }
 
 bool Editline::IsEmacs() {
   const char *editor;
@@ -441,7 +448,7 @@ void Editline::DisplayInput(int firstIndex) {
 int Editline::CountRowsForLine(const EditLineStringType &content) {
   std::string prompt =
       PromptForIndex(0); // Prompt width is constant during an edit session
-  int line_length = (int)(content.length() + prompt.length());
+  int line_length = (int)(content.length() + ColumnWidth(prompt));
   return (line_length / m_terminal_width) + 1;
 }
 
diff --git a/lldb/test/API/terminal/TestEditline.py b/lldb/test/API/terminal/TestEditline.py
index 87fb3cba610c83c..4e766cff286f63a 100644
--- a/lldb/test/API/terminal/TestEditline.py
+++ b/lldb/test/API/terminal/TestEditline.py
@@ -44,3 +44,14 @@ def test_left_right_arrow(self):
             )
 
         self.quit()
+
+    @skipIfAsan
+    @skipIfEditlineSupportMissing
+    def test_prompt_unicode(self):
+        """Test that we can use Unicode in the LLDB prompt."""
+        self.launch(use_colors=True, encoding="utf-8")
+        self.child.send('settings set prompt "🐛 "\n')
+        # Check that the cursor is at position 4 ([4G)
+        # Prompt: 🐛 _
+        # Column: 1..4
+        self.child.expect(re.escape("🐛 \x1b[0m\x1b[4G"))

@JDevlieghere JDevlieghere merged commit 850e90c into llvm:main Sep 14, 2023
@JDevlieghere JDevlieghere deleted the jdevlieghere/unicode-prompt branch September 14, 2023 03:08
kstoimenov pushed a commit to kstoimenov/llvm-project that referenced this pull request Sep 14, 2023
Account for Unicode when computing the prompt column width. Previously,
the string length (i.e. number of bytes) rather than the width of the
Glyph was used to compute the cursor position. The result was that the
cursor would be offset to the right when using a prompt containing
Unicode.
@gulfemsavrun
Copy link
Contributor

We started seeing a test failure after this commit.

UNRESOLVED: lldb-api :: terminal/TestEditline.py (2430 of 2430)
******************** TEST 'lldb-api :: terminal/TestEditline.py' FAILED ********************
Script:
--
/b/s/w/ir/x/w/lldb_install/python3/bin/python3 /b/s/w/ir/x/w/llvm-llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env ARCHIVER=/b/s/w/ir/x/w/cipd/bin/llvm-ar --env OBJCOPY=/b/s/w/ir/x/w/cipd/bin/llvm-objcopy --env LLVM_LIBS_DIR=/b/s/w/ir/x/w/llvm_build/./lib --env LLVM_INCLUDE_DIR=/b/s/w/ir/x/w/llvm_build/include --env LLVM_TOOLS_DIR=/b/s/w/ir/x/w/llvm_build/./bin --libcxx-include-dir /b/s/w/ir/x/w/llvm_build/include/c++/v1 --libcxx-include-target-dir /b/s/w/ir/x/w/llvm_build/include/x86_64-unknown-linux-gnu/c++/v1 --libcxx-library-dir /b/s/w/ir/x/w/llvm_build/./lib/x86_64-unknown-linux-gnu --arch x86_64 --build-dir /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex --lldb-module-cache-dir /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /b/s/w/ir/x/w/llvm_build/./bin/lldb --compiler /b/s/w/ir/x/w/llvm_build/./bin/clang --dsymutil /b/s/w/ir/x/w/llvm_build/./bin/dsymutil --llvm-tools-dir /b/s/w/ir/x/w/llvm_build/./bin --lldb-libs-dir /b/s/w/ir/x/w/llvm_build/./lib /b/s/w/ir/x/w/llvm-llvm-project/lldb/test/API/terminal -p TestEditline.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 18.0.0git (https://llvm.googlesource.com/a/llvm-project revision 850e90c47b3caaadb78581fb7a93655a8a60cbe0)
  clang revision 850e90c47b3caaadb78581fb7a93655a8a60cbe0
  llvm revision 850e90c47b3caaadb78581fb7a93655a8a60cbe0
Skipping the following test categories: ['dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
PASS: LLDB (/b/s/w/ir/x/w/llvm_build/bin/clang-x86_64) :: test_left_right_arrow (TestEditline.EditlineTest)
FAIL: LLDB (/b/s/w/ir/x/w/llvm_build/bin/clang-x86_64) :: test_prompt_unicode (TestEditline.EditlineTest)
======================================================================
ERROR: test_prompt_unicode (TestEditline.EditlineTest)

https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8770012743174630257/+/u/lldb/test/stdout

@mysterymath
Copy link
Contributor

I dug into this a little bit; it looks like the autodetection for LLDB_EDITLINE_USE_WCHAR was broken on our build, so it was turned OFF, even though our libedit supports the wchar API. I can fix this on our end, but more generally, is there a way to condition this test on the value of that define? It seems like this wouldn't work otherwise, although I'm very fuzzy on the precise semantics of Unicode in editline/LLDB.

@JDevlieghere
Copy link
Member Author

I added a configuration entry and corresponding decorator in #66447.

ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
Account for Unicode when computing the prompt column width. Previously,
the string length (i.e. number of bytes) rather than the width of the
Glyph was used to compute the cursor position. The result was that the
cursor would be offset to the right when using a prompt containing
Unicode.
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Oct 16, 2023
Account for Unicode when computing the prompt column width. Previously,
the string length (i.e. number of bytes) rather than the width of the
Glyph was used to compute the cursor position. The result was that the
cursor would be offset to the right when using a prompt containing
Unicode.

(cherry picked from commit 850e90c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants