Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Support frame recognizer regexp on mangled names. #105756

Closed
wants to merge 1 commit into from

Conversation

adrian-prantl
Copy link
Collaborator

Instead of doing the coarse-grained initial matching of frame recognizers on fully demangled names, it can be much more efficient and reliable to filter on all functions of a particular language by discriminating on the mangled symbol name. This way a recognizer can be registered that should run on all functions of a particular language by matching on its mangling prefix(es).

Instead of doing the coarse-grained initial matching of frame
recognizers on fully demangled names, it can be much more efficient
and reliable to filter on all functions of a particular language by
discriminating on the mangled symbol name. This way a recognizer can
be registered that should run on all functions of a particular
language by matching on its mangling prefix(es).
@llvmbot
Copy link
Member

llvmbot commented Aug 23, 2024

@llvm/pr-subscribers-lldb

Author: Adrian Prantl (adrian-prantl)

Changes

Instead of doing the coarse-grained initial matching of frame recognizers on fully demangled names, it can be much more efficient and reliable to filter on all functions of a particular language by discriminating on the mangled symbol name. This way a recognizer can be registered that should run on all functions of a particular language by matching on its mangling prefix(es).


Full diff: https://github.com/llvm/llvm-project/pull/105756.diff

8 Files Affected:

  • (modified) lldb/include/lldb/Target/StackFrameRecognizer.h (+8-1)
  • (modified) lldb/source/Commands/CommandObjectFrame.cpp (+35-19)
  • (modified) lldb/source/Plugins/LanguageRuntime/CPlusPlus/CPPLanguageRuntime.cpp (+1)
  • (modified) lldb/source/Target/AssertFrameRecognizer.cpp (+1)
  • (modified) lldb/source/Target/StackFrameRecognizer.cpp (+14-9)
  • (modified) lldb/source/Target/VerboseTrapFrameRecognizer.cpp (+2-1)
  • (modified) lldb/test/API/commands/frame/recognizer/TestFrameRecognizer.py (+3-3)
  • (modified) lldb/unittests/Target/StackFrameRecognizerTest.cpp (+2-1)
diff --git a/lldb/include/lldb/Target/StackFrameRecognizer.h b/lldb/include/lldb/Target/StackFrameRecognizer.h
index 8acebc12c4b1dc..a1757530870a02 100644
--- a/lldb/include/lldb/Target/StackFrameRecognizer.h
+++ b/lldb/include/lldb/Target/StackFrameRecognizer.h
@@ -109,15 +109,21 @@ class StackFrameRecognizerManager {
                      ConstString module, llvm::ArrayRef<ConstString> symbols,
                      bool first_instruction_only = true);
 
+  /// Add a new recognizer that triggers on a symbol regex.
+  ///
+  /// \param symbol_mangling controls whether the regex should apply
+  /// to the mangled or demangled name.
   void AddRecognizer(lldb::StackFrameRecognizerSP recognizer,
                      lldb::RegularExpressionSP module,
                      lldb::RegularExpressionSP symbol,
+                     Mangled::NamePreference symbol_mangling,
                      bool first_instruction_only = true);
 
   void ForEach(std::function<
                void(uint32_t recognizer_id, std::string recognizer_name,
                     std::string module, llvm::ArrayRef<ConstString> symbols,
-                    bool regexp)> const &callback);
+                    Mangled::NamePreference name_reference, bool regexp)> const
+                   &callback);
 
   bool RemoveRecognizerWithID(uint32_t recognizer_id);
 
@@ -142,6 +148,7 @@ class StackFrameRecognizerManager {
     lldb::RegularExpressionSP module_regexp;
     std::vector<ConstString> symbols;
     lldb::RegularExpressionSP symbol_regexp;
+    Mangled::NamePreference symbol_mangling;
     bool first_instruction_only;
   };
 
diff --git a/lldb/source/Commands/CommandObjectFrame.cpp b/lldb/source/Commands/CommandObjectFrame.cpp
index 46c75e3dd159c0..2a9605d2072dae 100644
--- a/lldb/source/Commands/CommandObjectFrame.cpp
+++ b/lldb/source/Commands/CommandObjectFrame.cpp
@@ -899,7 +899,8 @@ void CommandObjectFrameRecognizerAdd::DoExecute(Args &command,
     auto func =
         RegularExpressionSP(new RegularExpression(m_options.m_symbols.front()));
     GetTarget().GetFrameRecognizerManager().AddRecognizer(
-        recognizer_sp, module, func, m_options.m_first_instruction_only);
+        recognizer_sp, module, func, Mangled::NamePreference::ePreferDemangled,
+        m_options.m_first_instruction_only);
   } else {
     auto module = ConstString(m_options.m_module);
     std::vector<ConstString> symbols(m_options.m_symbols.begin(),
@@ -927,6 +928,32 @@ class CommandObjectFrameRecognizerClear : public CommandObjectParsed {
   }
 };
 
+static void
+PrintRecognizerDetails(Stream &strm, const std::string &module,
+                       llvm::ArrayRef<lldb_private::ConstString> symbols,
+                       Mangled::NamePreference preference, bool regexp) {
+  if (!module.empty())
+    strm << ", module " << module;
+  for (auto &symbol : symbols) {
+    strm << ", ";
+    if (!regexp)
+      strm << "symbol";
+    else
+      switch (preference) {
+      case Mangled::NamePreference ::ePreferMangled:
+        strm << "mangled symbol regexp";
+        break;
+      case Mangled::NamePreference ::ePreferDemangled:
+        strm << "demangled symbol regexp";
+        break;
+      case Mangled::NamePreference ::ePreferDemangledWithoutArguments:
+        strm << "demangled (no args) symbol regexp";
+        break;
+      }
+    strm << " " << symbol;
+  }
+}
+
 class CommandObjectFrameRecognizerDelete : public CommandObjectParsed {
 public:
   CommandObjectFrameRecognizerDelete(CommandInterpreter &interpreter)
@@ -947,19 +974,13 @@ class CommandObjectFrameRecognizerDelete : public CommandObjectParsed {
     GetTarget().GetFrameRecognizerManager().ForEach(
         [&request](uint32_t rid, std::string rname, std::string module,
                    llvm::ArrayRef<lldb_private::ConstString> symbols,
-                   bool regexp) {
+                   Mangled::NamePreference preference, bool regexp) {
           StreamString strm;
           if (rname.empty())
             rname = "(internal)";
 
           strm << rname;
-          if (!module.empty())
-            strm << ", module " << module;
-          if (!symbols.empty())
-            for (auto &symbol : symbols)
-              strm << ", symbol " << symbol;
-          if (regexp)
-            strm << " (regexp)";
+          PrintRecognizerDetails(strm, module, symbols, preference, regexp);
 
           request.TryCompleteCurrentArg(std::to_string(rid), strm.GetString());
         });
@@ -1016,22 +1037,17 @@ class CommandObjectFrameRecognizerList : public CommandObjectParsed {
   void DoExecute(Args &command, CommandReturnObject &result) override {
     bool any_printed = false;
     GetTarget().GetFrameRecognizerManager().ForEach(
-        [&result, &any_printed](
-            uint32_t recognizer_id, std::string name, std::string module,
-            llvm::ArrayRef<ConstString> symbols, bool regexp) {
+        [&result,
+         &any_printed](uint32_t recognizer_id, std::string name,
+                       std::string module, llvm::ArrayRef<ConstString> symbols,
+                       Mangled::NamePreference preference, bool regexp) {
           Stream &stream = result.GetOutputStream();
 
           if (name.empty())
             name = "(internal)";
 
           stream << std::to_string(recognizer_id) << ": " << name;
-          if (!module.empty())
-            stream << ", module " << module;
-          if (!symbols.empty())
-            for (auto &symbol : symbols)
-              stream << ", symbol " << symbol;
-          if (regexp)
-            stream << " (regexp)";
+          PrintRecognizerDetails(stream, module, symbols, preference, regexp);
 
           stream.EOL();
           stream.Flush();
diff --git a/lldb/source/Plugins/LanguageRuntime/CPlusPlus/CPPLanguageRuntime.cpp b/lldb/source/Plugins/LanguageRuntime/CPlusPlus/CPPLanguageRuntime.cpp
index c60200ab186d09..a24cc884e5799c 100644
--- a/lldb/source/Plugins/LanguageRuntime/CPlusPlus/CPPLanguageRuntime.cpp
+++ b/lldb/source/Plugins/LanguageRuntime/CPlusPlus/CPPLanguageRuntime.cpp
@@ -82,6 +82,7 @@ CPPLanguageRuntime::CPPLanguageRuntime(Process *process)
     process->GetTarget().GetFrameRecognizerManager().AddRecognizer(
         StackFrameRecognizerSP(new LibCXXFrameRecognizer()), {},
         std::make_shared<RegularExpression>("^std::__1::"),
+        Mangled::NamePreference::ePreferDemangledWithoutArguments,
         /*first_instruction_only*/ false);
 }
 
diff --git a/lldb/source/Target/AssertFrameRecognizer.cpp b/lldb/source/Target/AssertFrameRecognizer.cpp
index da7c102645c014..1fc8f77d6f15ce 100644
--- a/lldb/source/Target/AssertFrameRecognizer.cpp
+++ b/lldb/source/Target/AssertFrameRecognizer.cpp
@@ -112,6 +112,7 @@ void RegisterAssertFrameRecognizer(Process *process) {
       std::make_shared<AssertFrameRecognizer>(),
       std::make_shared<RegularExpression>(std::move(module_re)),
       std::make_shared<RegularExpression>(std::move(symbol_re)),
+      Mangled::ePreferMangled,
       /*first_instruction_only*/ false);
 }
 
diff --git a/lldb/source/Target/StackFrameRecognizer.cpp b/lldb/source/Target/StackFrameRecognizer.cpp
index 44411afc65dda9..2664c91fdc16d0 100644
--- a/lldb/source/Target/StackFrameRecognizer.cpp
+++ b/lldb/source/Target/StackFrameRecognizer.cpp
@@ -63,25 +63,28 @@ void StackFrameRecognizerManager::BumpGeneration() {
 void StackFrameRecognizerManager::AddRecognizer(
     StackFrameRecognizerSP recognizer, ConstString module,
     llvm::ArrayRef<ConstString> symbols, bool first_instruction_only) {
-  m_recognizers.push_front({(uint32_t)m_recognizers.size(), recognizer, false,
-                            module, RegularExpressionSP(), symbols,
-                            RegularExpressionSP(), first_instruction_only});
+  m_recognizers.push_front(
+      {(uint32_t)m_recognizers.size(), recognizer, false, module,
+       RegularExpressionSP(), symbols, RegularExpressionSP(),
+       Mangled::NamePreference::ePreferMangled, first_instruction_only});
   BumpGeneration();
 }
 
 void StackFrameRecognizerManager::AddRecognizer(
     StackFrameRecognizerSP recognizer, RegularExpressionSP module,
-    RegularExpressionSP symbol, bool first_instruction_only) {
+    RegularExpressionSP symbol, Mangled::NamePreference symbol_mangling,
+    bool first_instruction_only) {
   m_recognizers.push_front({(uint32_t)m_recognizers.size(), recognizer, true,
                             ConstString(), module, std::vector<ConstString>(),
-                            symbol, first_instruction_only});
+                            symbol, symbol_mangling, first_instruction_only});
   BumpGeneration();
 }
 
 void StackFrameRecognizerManager::ForEach(
     const std::function<void(uint32_t, std::string, std::string,
-                             llvm::ArrayRef<ConstString>, bool)> &callback) {
-  for (auto entry : m_recognizers) {
+                             llvm::ArrayRef<ConstString>,
+                             Mangled::NamePreference, bool)> &callback) {
+  for (const auto &entry : m_recognizers) {
     if (entry.is_regexp) {
       std::string module_name;
       std::string symbol_name;
@@ -92,11 +95,13 @@ void StackFrameRecognizerManager::ForEach(
         symbol_name = entry.symbol_regexp->GetText().str();
 
       callback(entry.recognizer_id, entry.recognizer->GetName(), module_name,
-               llvm::ArrayRef(ConstString(symbol_name)), true);
+               llvm::ArrayRef(ConstString(symbol_name)), entry.symbol_mangling,
+               true);
 
     } else {
       callback(entry.recognizer_id, entry.recognizer->GetName(),
-               entry.module.GetCString(), entry.symbols, false);
+               entry.module.GetCString(), entry.symbols, entry.symbol_mangling,
+               false);
     }
   }
 }
diff --git a/lldb/source/Target/VerboseTrapFrameRecognizer.cpp b/lldb/source/Target/VerboseTrapFrameRecognizer.cpp
index fe72c8aec570d3..3d20eb2136b5ad 100644
--- a/lldb/source/Target/VerboseTrapFrameRecognizer.cpp
+++ b/lldb/source/Target/VerboseTrapFrameRecognizer.cpp
@@ -116,7 +116,8 @@ void RegisterVerboseTrapFrameRecognizer(Process &process) {
       std::make_shared<VerboseTrapFrameRecognizer>();
 
   process.GetTarget().GetFrameRecognizerManager().AddRecognizer(
-      srf_recognizer_sp, module_regex_sp, symbol_regex_sp, false);
+      srf_recognizer_sp, module_regex_sp, symbol_regex_sp,
+      Mangled::ePreferMangled, false);
 }
 
 } // namespace lldb_private
diff --git a/lldb/test/API/commands/frame/recognizer/TestFrameRecognizer.py b/lldb/test/API/commands/frame/recognizer/TestFrameRecognizer.py
index 6174ac61a709dd..df2ad60fddf088 100644
--- a/lldb/test/API/commands/frame/recognizer/TestFrameRecognizer.py
+++ b/lldb/test/API/commands/frame/recognizer/TestFrameRecognizer.py
@@ -45,7 +45,7 @@ def test_frame_recognizer_1(self):
         self.expect(
             "frame recognizer list",
             substrs=[
-                "1: recognizer.MyOtherFrameRecognizer, module a.out, symbol bar (regexp)",
+                "1: recognizer.MyOtherFrameRecognizer, module a.out, demangled symbol regexp bar",
                 "0: recognizer.MyFrameRecognizer, module a.out, symbol foo",
             ],
         )
@@ -56,7 +56,7 @@ def test_frame_recognizer_1(self):
         self.expect(
             "frame recognizer list",
             substrs=[
-                "1: recognizer.MyOtherFrameRecognizer, module a.out, symbol bar (regexp)"
+                "1: recognizer.MyOtherFrameRecognizer, module a.out, demangled symbol regexp bar"
             ],
         )
         self.expect(
@@ -79,7 +79,7 @@ def test_frame_recognizer_1(self):
         self.expect(
             "frame recognizer list",
             substrs=[
-                "1: recognizer.MyOtherFrameRecognizer, module a.out, symbol bar (regexp)"
+                "1: recognizer.MyOtherFrameRecognizer, module a.out, demangled symbol regexp bar"
             ],
         )
         self.expect(
diff --git a/lldb/unittests/Target/StackFrameRecognizerTest.cpp b/lldb/unittests/Target/StackFrameRecognizerTest.cpp
index 695f091227d6a0..450e4babe0cd08 100644
--- a/lldb/unittests/Target/StackFrameRecognizerTest.cpp
+++ b/lldb/unittests/Target/StackFrameRecognizerTest.cpp
@@ -55,7 +55,7 @@ void RegisterDummyStackFrameRecognizer(StackFrameRecognizerManager &manager) {
   StackFrameRecognizerSP dummy_recognizer_sp(new DummyStackFrameRecognizer());
 
   manager.AddRecognizer(dummy_recognizer_sp, module_regex_sp, symbol_regex_sp,
-                        false);
+                        Mangled::NamePreference::ePreferDemangled, false);
 }
 
 } // namespace
@@ -72,6 +72,7 @@ TEST_F(StackFrameRecognizerTest, NullModuleRegex) {
   manager.ForEach([&any_printed](uint32_t recognizer_id, std::string name,
                                  std::string function,
                                  llvm::ArrayRef<ConstString> symbols,
+                                 Mangled::NamePreference preference,
                                  bool regexp) { any_printed = true; });
 
   EXPECT_TRUE(any_printed);


} else {
callback(entry.recognizer_id, entry.recognizer->GetName(),
entry.module.GetCString(), entry.symbols, false);
entry.module.GetCString(), entry.symbols, entry.symbol_mangling,
false);
}
}
}
Copy link
Member

@vogelsgesang vogelsgesang Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaict, the matching in GetRecognizerForFrame also needs to be adjusted.

In #105695, you can find an implementation of pretty much the same idea. However, I implemented only GetRecognizerForFrame and skipped, e.g., ForEach

Copy link
Member

@vogelsgesang vogelsgesang Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to copy-paste / adjust whichever code from #105695 you might find useful. I will rebase that commit later on, after your changes landed. (I cannot land #105695 before #104523 gets re-applied, anyway)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah looks like we had the same idea!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let you commit your PR instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I relanded everything in 11d2de4.

Mangled::NamePreference preference, bool regexp) {
if (!module.empty())
strm << ", module " << module;
for (auto &symbol : symbols) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could be fancy and use an llvm::interleaveComma here

@adrian-prantl
Copy link
Collaborator Author

Retiring in favor of #105695.

@vogelsgesang
Copy link
Member

@adrian-prantl Now that #105695 is merged, do you still want to open a follow-up pull request to change the assert-recognizer and the verbose-trap-recognizer to use ePreferMangled?

@adrian-prantl
Copy link
Collaborator Author

@adrian-prantl Now that #105695 is merged, do you still want to open a follow-up pull request to change the assert-recognizer and the verbose-trap-recognizer to use ePreferMangled?

That would be a nice improvement!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants