Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clang] [NFC] Introduce DynamicRecursiveASTVisitor #105195

Closed
wants to merge 32 commits into from

Conversation

Sirraide
Copy link
Member

@Sirraide Sirraide commented Aug 20, 2024

This pr has been split into multiple parts. The first one is here: #110040.

First, please see #93462 if you’re not already familiar w/ what this is about.

This introduces a variant of RecursiveASTVisitor (RAV), called DynamicRecursiveASTVisitor (DRAV), that uses virtual functions instead of CRTP in an attempt to combat long compile times and increasing binary size incurred by stamping out dozens of RAV instantiations.

This pr may look big, but most of the changes are very repetitive (it’s mostly just adding override like 200 places or so and changing : RecursiveASTVisitor<Foo> to : DynamicRecursiveASTVisitor).

This pr does NOT remove the old RAV, because

  1. Downstream forks may be reliant on it, and
  2. There are some things that the DRAV currently does not support, mostly because they are either
    1. complicated or impossible to implement using virtual functions (post-order traversal); or
    2. basically no-one is using them (e.g. overriding WalkUpFromX or getStmtChildren(); there are about 5 or so visitors
      in the entire codebase that override those functions; also post-order traversal) and making them virtual might have a negative impact on performance; or
    3. I candidly have no idea what to do with them (LexicallyOrderedRecursiveASTVisitor; I’m still not sure I understand how that one works).

There is TableGen support for traversing attributes, but for the same reasons as 2ii and 2iii above, it isn’t currently supported.

Implementation

The DRAV is currently based on the RAV, mostly using a bunch of X macros. This means that, when a new AST node is added, the code for traversing it only needs to be added to the CRTP-based RAV.

The DRAV is based on the principle that no template instantiation (of the RAV) should be necessary to use it (otherwise, that wouldn’t really solve the problem as the problem is all the template instantiation). To that end, the DRAV is customised using virtual functions instead.

The inner workings of the DRAV are a bit arcane (at least it took me a while to figure out a way to get this to work; I’ll be adding some more comments for this soon. Also, if anyone can think of a better approach, please let me know): The .cpp file for the DRAV contains a struct called Impl that derives from the RAV like usual using CTRP. This Impl struct contains a pointer to an instance of (a derived class of) the DRAV and overrides every member function of the RAV that is marked virtual in the DRAV to delegate to it instead via a virtual call.

This will then either execute the implementation of that function in the derived class; if the base class function of DRAV itself is called (either by the derived class or because the derived class doesn’t override it), then it will statically create a new Impl instance, passing in itself (which is a no-op because it has a single reference member and the constructor does nothing), and then call the RAV’s base implementation of the corresponding function (which then executes the code in the old RAV and so on).

Results

To see how effective this would be, a lot of visitors all over the codebase have been migrated to use the dynamic visitor. The results of this refactor can be viewed here (if you look at the branch in llvm-compile-time-tracker and see that there are a few commits where everything got magically faster; those are merge commits, so please ignore those; the link here compares the current state of this branch to the commit on trunk that it is based on to eliminate any effects that unrelated changes might have had).

However, it is worth noting that using virtual functions for this does seem to be measurably slower than using CRTP. I have more or less indiscriminately migrated visitors regardless of how often they are used to see how much of a slowdown this would incur. Furthermore, I also checked how often each visitor is being instantiated when compiling Clang itself. The results of that are shown below.

I’ve only checked how often a few select functions were being called; reverting one of the most commonly used visitors (by any of the three tables below) doesn’t really seem to change anything, though (the only thing it does is make Clang’s binary size go up). If anyone has a better idea as to how to determine what visitors are causing (most of) the performance loss, please let me know.

Visitor Instances created
MarkReferencedDecls 142,088,373
FindCXXThisExpr 82,500,552
CollectUnexpandedParameterPacksVisitor 67,408,408
FallthroughMapper 30,640,329
ImmediateCallVisitor 2,779,953
LocalTypedefNameReferencer 841,018
MarkUsedTemplateParameterVisitor 2,448
Visitor Calls to TraverseStmt()
FallthroughMapper 426,518,146
CollectUnexpandedParameterPacksVisitor 31,363,671
ImmediateCallVisitor 8,163,769
FindCXXThisExpr 3,776,540
MarkReferencedDecls 105,877
LocalTypedefNameReferencer 2,529
MarkUsedTemplateParameterVisitor 2,448
Visitor Calls to TraverseType()
MarkReferencedDecls 232,417,799
CollectUnexpandedParameterPacksVisitor 21,681,491
FallthroughMapper 14,102,937
LocalTypedefNameReferencer 2,163,902
ImmediateCallVisitor 935,887
FindCXXThisExpr 36
MarkUsedTemplateParameterVisitor 0

@Sirraide Sirraide added the clang Clang issues not falling into any other category label Aug 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 20, 2024

@llvm/pr-subscribers-clang

Author: None (Sirraide)

Changes

First, please see #93462 if you’re not already familiar w/ what this is about.

This introduces a variant of RecursiveASTVisitor (RAV), called DynamicRecursiveASTVisitor (DRAV), that uses virtual functions instead of CRTP in an attempt to combat long compile times and increasing binary size incurred by stamping out dozens of RAV instantiations.

This pr may look big, but most of the changes are very repetitive (it’s mostly just adding override like 200 places or so and changing : RecursiveASTVisitor&lt;Foo&gt; to : DynamicRecursiveASTVisitor).

This pr does NOT remove the old RAV, because

  1. Downstream forks may be reliant on it, and
  2. There are some things that the DRAV currently does not support, mostly because they are either
    1. complicated or impossible to implement using virtual functions (post-order traversal); or
    2. basically no-one is using them (e.g. overriding WalkUpFromX or getStmtChildren(); there are about 5 or so visitors
      in the entire codebase that override those functions; also post-order traversal) and making them virtual might have a negative impact on performance; or
    3. I candidly have no idea what to do with them (LexicallyOrderedRecursiveASTVisitor; I’m still not sure I understand how that one works).

There is TableGen support for traversing attributes, but for the same reasons as 2ii and 2iii above, it isn’t currently supported.

Implementation

The DRAV is currently based on the RAV, mostly using a bunch of X macros. This means that, when a new AST node is added, the code for traversing it only needs to be added to the CRTP-based RAV.

The DRAV is based on the principle that no template instantiation (of the RAV) should be necessary to use it (otherwise, that wouldn’t really solve the problem as the problem is all the template instantiation). To that end, the DRAV is customised using virtual functions instead.

The inner workings of the DRAV are a bit arcane (at least it took me a while to figure out a way to get this to work; I’ll be adding some more comments for this soon. Also, if anyone can think of a better approach, please let me know): The .cpp file for the DRAV contains a struct called Impl that derives from the RAV like usual using CTRP. This Impl struct contains a pointer to an instance of (a derived class of) the DRAV and overrides every member function of the RAV that is marked virtual in the DRAV to delegate to it instead via a virtual call.

This will then either execute the implementation of that function in the derived class; if the base class function of DRAV itself is called (either by the derived class or because the derived class doesn’t override it), then it will statically create a new Impl instance, passing in itself (which is a no-op because it has a single reference member and the constructor does nothing), and then call the RAV’s base implementation of the corresponding function (which then executes the code in the old RAV and so on).

Results

To see how effective this would be, a lot of visitors all over the codebase have been migrated to use the dynamic visitor. The results of this refactor can be viewed here (if you look at the branch in llvm-compile-time-tracker and see that there are a few commits where everything got magically faster; those are merge commits, so please ignore those; the link here compares the current state of this branch to the commit on trunk that it is based on to eliminate any effects that unrelated changes might have had).

However, it is worth noting that using virtual functions for this does seem to be measurably slower than using CRTP. I have more or less indiscriminately migrated visitors regardless of how often they are used to see how much of a slowdown this would incur. Furthermore, I also checked how often each visitor is being instantiated when compiling Clang itself. The results of that are shown below.

I’ve only checked how often a few select functions were being called; reverting one of the most commonly used visitors (by any of the three tables below) doesn’t really seem to change anything, though (the only thing it does is make Clang’s binary size go up). If anyone has a better idea as to how to determine what visitors are causing (most of) the performance loss, please let me know.

Visitor Instances created
MarkReferencedDecls 142,088,373
FindCXXThisExpr 82,500,552
CollectUnexpandedParameterPacksVisitor 67,408,408
FallthroughMapper 30,640,329
ImmediateCallVisitor 2,779,953
LocalTypedefNameReferencer 841,018
MarkUsedTemplateParameterVisitor 2,448
Visitor Calls to TraverseStmt()
FallthroughMapper 426,518,146
CollectUnexpandedParameterPacksVisitor 31,363,671
ImmediateCallVisitor 8,163,769
FindCXXThisExpr 3,776,540
MarkReferencedDecls 105,877
LocalTypedefNameReferencer 2,529
MarkUsedTemplateParameterVisitor 2,448
Visitor Calls to TraverseType()
MarkReferencedDecls 232,417,799
CollectUnexpandedParameterPacksVisitor 21,681,491
FallthroughMapper 14,102,937
LocalTypedefNameReferencer 2,163,902
ImmediateCallVisitor 935,887
FindCXXThisExpr 36
MarkUsedTemplateParameterVisitor 0

Patch is 323.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/105195.diff

121 Files Affected:

  • (modified) clang/docs/RAVFrontendAction.rst (+8-9)
  • (modified) clang/examples/CallSuperAttribute/CallSuperAttrInfo.cpp (+7-5)
  • (modified) clang/examples/PrintFunctionNames/PrintFunctionNames.cpp (+4-3)
  • (added) clang/include/clang/AST/DynamicRecursiveASTVisitor.h (+252)
  • (modified) clang/include/clang/AST/RecursiveASTVisitor.h (+5-3)
  • (modified) clang/include/clang/Analysis/CallGraph.h (+6-9)
  • (modified) clang/include/clang/Analysis/FlowSensitive/ASTOps.h (+16-15)
  • (modified) clang/include/clang/InstallAPI/Visitor.h (+11-9)
  • (modified) clang/include/clang/Tooling/Refactoring/RecursiveSymbolVisitor.h (+13-18)
  • (modified) clang/lib/ARCMigrate/ObjCMT.cpp (+58-55)
  • (modified) clang/lib/ARCMigrate/TransAPIUses.cpp (+2-2)
  • (modified) clang/lib/ARCMigrate/TransARCAssign.cpp (+2-2)
  • (modified) clang/lib/ARCMigrate/TransAutoreleasePool.cpp (+6-7)
  • (modified) clang/lib/ARCMigrate/TransBlockObjCVariable.cpp (+8-10)
  • (modified) clang/lib/ARCMigrate/TransEmptyStatementsAndDealloc.cpp (+3-4)
  • (modified) clang/lib/ARCMigrate/TransGCAttrs.cpp (+7-9)
  • (modified) clang/lib/ARCMigrate/TransGCCalls.cpp (+3-5)
  • (modified) clang/lib/ARCMigrate/TransProtectedScope.cpp (+4-4)
  • (modified) clang/lib/ARCMigrate/TransZeroOutPropsInDealloc.cpp (+9-12)
  • (modified) clang/lib/ARCMigrate/Transforms.cpp (+24-25)
  • (modified) clang/lib/ARCMigrate/Transforms.h (+1)
  • (modified) clang/lib/AST/ASTImporterLookupTable.cpp (+10-10)
  • (modified) clang/lib/AST/CMakeLists.txt (+1)
  • (added) clang/lib/AST/DynamicRecursiveASTVisitor.cpp (+387)
  • (modified) clang/lib/AST/ParentMapContext.cpp (+19-23)
  • (modified) clang/lib/AST/StmtOpenACC.cpp (+4-4)
  • (modified) clang/lib/ASTMatchers/ASTMatchFinder.cpp (+77-89)
  • (modified) clang/lib/ASTMatchers/GtestMatchers.cpp (-1)
  • (modified) clang/lib/Analysis/CallGraph.cpp (+3)
  • (modified) clang/lib/Analysis/CalledOnceCheck.cpp (+4-4)
  • (modified) clang/lib/Analysis/FlowSensitive/ASTOps.cpp (+7-8)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+5-6)
  • (modified) clang/lib/Analysis/ReachableCode.cpp (+8-6)
  • (modified) clang/lib/Analysis/UnsafeBufferUsage.cpp (+22-28)
  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+10-11)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+10-11)
  • (modified) clang/lib/CodeGen/CodeGenPGO.cpp (+25-27)
  • (modified) clang/lib/CodeGen/ObjectFilePCHContainerWriter.cpp (+7-7)
  • (modified) clang/lib/Frontend/ASTConsumers.cpp (+12-14)
  • (modified) clang/lib/Frontend/InterfaceStubFunctionsConsumer.cpp (+3-3)
  • (modified) clang/lib/Index/IndexBody.cpp (+40-41)
  • (modified) clang/lib/Index/IndexTypeSourceInfo.cpp (+20-21)
  • (modified) clang/lib/InstallAPI/Visitor.cpp (+5-5)
  • (modified) clang/lib/Sema/AnalysisBasedWarnings.cpp (+16-18)
  • (modified) clang/lib/Sema/SemaAvailability.cpp (+24-20)
  • (modified) clang/lib/Sema/SemaCodeComplete.cpp (+8-8)
  • (modified) clang/lib/Sema/SemaConcept.cpp (-1)
  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+4-5)
  • (modified) clang/lib/Sema/SemaDeclCXX.cpp (+25-28)
  • (modified) clang/lib/Sema/SemaDeclObjC.cpp (+4-5)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+24-25)
  • (modified) clang/lib/Sema/SemaExprCXX.cpp (+3-3)
  • (modified) clang/lib/Sema/SemaHLSL.cpp (+5-6)
  • (modified) clang/lib/Sema/SemaStmt.cpp (+5-5)
  • (modified) clang/lib/Sema/SemaTemplate.cpp (+16-18)
  • (modified) clang/lib/Sema/SemaTemplateDeduction.cpp (+6-8)
  • (modified) clang/lib/Sema/SemaTemplateDeductionGuide.cpp (+6-7)
  • (modified) clang/lib/Sema/SemaTemplateVariadic.cpp (+203-205)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CastToStructChecker.cpp (+4-4)
  • (modified) clang/lib/StaticAnalyzer/Checkers/DeadStoresChecker.cpp (+8-8)
  • (modified) clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp (+4-4)
  • (modified) clang/lib/StaticAnalyzer/Checkers/IdenticalExprChecker.cpp (+8-9)
  • (modified) clang/lib/StaticAnalyzer/Checkers/LocalizationChecker.cpp (+14-14)
  • (modified) clang/lib/StaticAnalyzer/Checkers/ObjCMissingSuperCallChecker.cpp (+3-3)
  • (modified) clang/lib/StaticAnalyzer/Checkers/PaddingChecker.cpp (+8-7)
  • (modified) clang/lib/StaticAnalyzer/Checkers/WebKit/NoUncountedMembersChecker.cpp (+5-6)
  • (modified) clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp (+5-6)
  • (modified) clang/lib/StaticAnalyzer/Checkers/WebKit/UncountedCallArgsChecker.cpp (+7-9)
  • (modified) clang/lib/StaticAnalyzer/Checkers/WebKit/UncountedLambdaCapturesChecker.cpp (+5-6)
  • (modified) clang/lib/StaticAnalyzer/Checkers/WebKit/UncountedLocalVarsChecker.cpp (+16-19)
  • (modified) clang/lib/StaticAnalyzer/Core/BugSuppression.cpp (+4-4)
  • (modified) clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp (+10-10)
  • (modified) clang/lib/Tooling/ASTDiff/ASTDiff.cpp (+10-9)
  • (modified) clang/lib/Tooling/Refactoring/Rename/USRFinder.cpp (+5-7)
  • (modified) clang/lib/Tooling/Refactoring/Rename/USRFindingAction.cpp (+6-5)
  • (modified) clang/lib/Tooling/Refactoring/Rename/USRLocFinder.cpp (+11-12)
  • (modified) clang/unittests/AST/EvaluateAsRValueTest.cpp (+5-5)
  • (modified) clang/unittests/Analysis/CloneDetectionTest.cpp (+3-4)
  • (modified) clang/unittests/Frontend/FrontendActionTest.cpp (+3-3)
  • (modified) clang/unittests/Tooling/ASTSelectionTest.cpp (+2-2)
  • (added) clang/unittests/Tooling/CRTPTestVisitor.h (+42)
  • (modified) clang/unittests/Tooling/CastExprTest.cpp (+3-3)
  • (modified) clang/unittests/Tooling/CommentHandlerTest.cpp (+3-5)
  • (modified) clang/unittests/Tooling/ExecutionTest.cpp (+4-6)
  • (modified) clang/unittests/Tooling/LexicallyOrderedRecursiveASTVisitorTest.cpp (+3-2)
  • (modified) clang/unittests/Tooling/LookupTest.cpp (+5-5)
  • (modified) clang/unittests/Tooling/QualTypeNamesTest.cpp (+2-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTestDeclVisitor.cpp (+9-9)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTestPostOrderVisitor.cpp (+8-4)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTestTypeLocVisitor.cpp (+2-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/Attr.cpp (+2-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/BitfieldInitializer.cpp (+2-3)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/CXXBoolLiteralExpr.cpp (+2-3)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/CXXMemberCall.cpp (+2-3)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/CXXMethodDecl.cpp (+7-11)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/CXXOperatorCallExprTraverser.cpp (+3-5)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/CallbacksCommon.h (+2-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/Class.cpp (+3-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/Concept.cpp (+61-53)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/ConstructExpr.cpp (+6-14)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/DeclRefExpr.cpp (+5-12)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/DeductionGuide.cpp (+7-10)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/ImplicitCtor.cpp (+2-5)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/ImplicitCtorInitializer.cpp (+6-12)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/InitListExprPostOrder.cpp (+2-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/InitListExprPostOrderNoQueue.cpp (+3-3)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/InitListExprPreOrder.cpp (+5-10)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/InitListExprPreOrderNoQueue.cpp (+3-4)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/IntegerLiteral.cpp (+2-3)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/LambdaDefaultCapture.cpp (+2-3)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/LambdaExpr.cpp (+10-9)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/LambdaTemplateParams.cpp (+7-6)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/MemberPointerTypeLoc.cpp (+3-4)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/NestedNameSpecifiers.cpp (+3-4)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/ParenExpr.cpp (+2-2)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/TemplateArgumentLocTraverser.cpp (+3-5)
  • (modified) clang/unittests/Tooling/RecursiveASTVisitorTests/TraversalScope.cpp (+3-3)
  • (modified) clang/unittests/Tooling/RefactoringTest.cpp (+9-12)
  • (modified) clang/unittests/Tooling/SourceCodeTest.cpp (+29-26)
  • (modified) clang/unittests/Tooling/TestVisitor.h (+86-64)
  • (modified) clang/utils/TableGen/ClangAttrEmitter.cpp (+38-4)
diff --git a/clang/docs/RAVFrontendAction.rst b/clang/docs/RAVFrontendAction.rst
index 2e387b4b339d6c..28a99d1744b7b5 100644
--- a/clang/docs/RAVFrontendAction.rst
+++ b/clang/docs/RAVFrontendAction.rst
@@ -70,10 +70,9 @@ CXXRecordDecl's.
 
 ::
 
-      class FindNamedClassVisitor
-        : public RecursiveASTVisitor<FindNamedClassVisitor> {
+      class FindNamedClassVisitor : public DynamicRecursiveASTVisitor {
       public:
-        bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) {
+        bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) override {
           // For debugging, dumping the AST nodes will show which nodes are already
           // being visited.
           Declaration->dump();
@@ -91,7 +90,7 @@ can check for a specific qualified name:
 
 ::
 
-      bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) {
+      bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) override {
         if (Declaration->getQualifiedNameAsString() == "n::m::C")
           Declaration->dump();
         return true;
@@ -122,7 +121,7 @@ locations:
 
 ::
 
-      bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) {
+      bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) override {
         if (Declaration->getQualifiedNameAsString() == "n::m::C") {
           // getFullLoc uses the ASTContext's SourceManager to resolve the source
           // location and break it up into its line and column parts.
@@ -143,20 +142,20 @@ Now we can combine all of the above into a small example program:
 ::
 
       #include "clang/AST/ASTConsumer.h"
-      #include "clang/AST/RecursiveASTVisitor.h"
+      #include "clang/AST/DynamicRecursiveASTVisitor.h"
+      #include "clang/AST/DeclCXX.h"
       #include "clang/Frontend/CompilerInstance.h"
       #include "clang/Frontend/FrontendAction.h"
       #include "clang/Tooling/Tooling.h"
 
       using namespace clang;
 
-      class FindNamedClassVisitor
-        : public RecursiveASTVisitor<FindNamedClassVisitor> {
+      class FindNamedClassVisitor final : public DynamicRecursiveASTVisitor {
       public:
         explicit FindNamedClassVisitor(ASTContext *Context)
           : Context(Context) {}
 
-        bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) {
+        bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) override {
           if (Declaration->getQualifiedNameAsString() == "n::m::C") {
             FullSourceLoc FullLocation = Context->getFullLoc(Declaration->getBeginLoc());
             if (FullLocation.isValid())
diff --git a/clang/examples/CallSuperAttribute/CallSuperAttrInfo.cpp b/clang/examples/CallSuperAttribute/CallSuperAttrInfo.cpp
index 12d4c311586e6f..036c54e593dab4 100644
--- a/clang/examples/CallSuperAttribute/CallSuperAttrInfo.cpp
+++ b/clang/examples/CallSuperAttribute/CallSuperAttrInfo.cpp
@@ -25,7 +25,7 @@
 
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Attr.h"
-#include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/DynamicRecursiveASTVisitor.h"
 #include "clang/Frontend/FrontendPluginRegistry.h"
 #include "clang/Sema/ParsedAttr.h"
 #include "clang/Sema/Sema.h"
@@ -41,13 +41,14 @@ bool isMarkedAsCallSuper(const CXXMethodDecl *D) {
   return MarkedMethods.contains(D);
 }
 
-class MethodUsageVisitor : public RecursiveASTVisitor<MethodUsageVisitor> {
+class MethodUsageVisitor final : public DynamicRecursiveASTVisitor {
 public:
   bool IsOverriddenUsed = false;
   explicit MethodUsageVisitor(
       llvm::SmallPtrSet<const CXXMethodDecl *, 16> &MustCalledMethods)
       : MustCalledMethods(MustCalledMethods) {}
-  bool VisitCallExpr(CallExpr *CallExpr) {
+
+  bool VisitCallExpr(CallExpr *CallExpr) override {
     const CXXMethodDecl *Callee = nullptr;
     for (const auto &MustCalled : MustCalledMethods) {
       if (CallExpr->getCalleeDecl() == MustCalled) {
@@ -67,7 +68,7 @@ class MethodUsageVisitor : public RecursiveASTVisitor<MethodUsageVisitor> {
   llvm::SmallPtrSet<const CXXMethodDecl *, 16> &MustCalledMethods;
 };
 
-class CallSuperVisitor : public RecursiveASTVisitor<CallSuperVisitor> {
+class CallSuperVisitor final : public DynamicRecursiveASTVisitor {
 public:
   CallSuperVisitor(DiagnosticsEngine &Diags) : Diags(Diags) {
     WarningSuperNotCalled = Diags.getCustomDiagID(
@@ -77,7 +78,8 @@ class CallSuperVisitor : public RecursiveASTVisitor<CallSuperVisitor> {
     NotePreviousCallSuperDeclaration = Diags.getCustomDiagID(
         DiagnosticsEngine::Note, "function marked 'call_super' here");
   }
-  bool VisitCXXMethodDecl(CXXMethodDecl *MethodDecl) {
+
+  bool VisitCXXMethodDecl(CXXMethodDecl *MethodDecl) override {
     if (MethodDecl->isThisDeclarationADefinition() && MethodDecl->hasBody()) {
       // First find out which overridden methods are marked as 'call_super'
       llvm::SmallPtrSet<const CXXMethodDecl *, 16> OverriddenMarkedMethods;
diff --git a/clang/examples/PrintFunctionNames/PrintFunctionNames.cpp b/clang/examples/PrintFunctionNames/PrintFunctionNames.cpp
index b2b785b87c25cf..248e3e159b329a 100644
--- a/clang/examples/PrintFunctionNames/PrintFunctionNames.cpp
+++ b/clang/examples/PrintFunctionNames/PrintFunctionNames.cpp
@@ -14,7 +14,7 @@
 #include "clang/Frontend/FrontendPluginRegistry.h"
 #include "clang/AST/AST.h"
 #include "clang/AST/ASTConsumer.h"
-#include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/DynamicRecursiveASTVisitor.h"
 #include "clang/Frontend/CompilerInstance.h"
 #include "clang/Sema/Sema.h"
 #include "llvm/Support/raw_ostream.h"
@@ -52,11 +52,12 @@ class PrintFunctionsConsumer : public ASTConsumer {
     // The advantage of doing this in HandleTranslationUnit() is that all
     // codegen (when using -add-plugin) is completely finished and this can't
     // affect the compiler output.
-    struct Visitor : public RecursiveASTVisitor<Visitor> {
+    struct Visitor final : DynamicRecursiveASTVisitor {
       const std::set<std::string> &ParsedTemplates;
       Visitor(const std::set<std::string> &ParsedTemplates)
           : ParsedTemplates(ParsedTemplates) {}
-      bool VisitFunctionDecl(FunctionDecl *FD) {
+
+      bool VisitFunctionDecl(FunctionDecl *FD) override {
         if (FD->isLateTemplateParsed() &&
             ParsedTemplates.count(FD->getNameAsString()))
           LateParsedDecls.insert(FD);
diff --git a/clang/include/clang/AST/DynamicRecursiveASTVisitor.h b/clang/include/clang/AST/DynamicRecursiveASTVisitor.h
new file mode 100644
index 00000000000000..1390977c09ea79
--- /dev/null
+++ b/clang/include/clang/AST/DynamicRecursiveASTVisitor.h
@@ -0,0 +1,252 @@
+#ifndef LLVM_CLANG_AST_DYNAMIC_RECURSIVE_AST_VISITOR_H
+#define LLVM_CLANG_AST_DYNAMIC_RECURSIVE_AST_VISITOR_H
+
+#include "clang/AST/Attr.h"
+#include "clang/AST/ExprConcepts.h"
+#include "clang/AST/TypeLoc.h"
+
+namespace clang {
+class ASTContext;
+
+/// Recursive AST visitor that supports extension via dynamic dispatch.
+///
+/// This only supports some of the more common visitation operations; in
+/// particular, it does not support overriding WalkUpFromX or post-order
+/// traversal.
+///
+/// Features that are currently not supported:
+///
+///   - Visiting attributes
+///   - Post-order traversal
+///   - Overriding WalkUpFromX
+///   - Overriding getStmtChildren()
+///
+/// \see RecursiveASTVisitor
+class DynamicRecursiveASTVisitor {
+public:
+  /// Whether this visitor should recurse into template instantiations.
+  bool ShouldVisitTemplateInstantiations = false;
+
+  /// Whether this visitor should recurse into the types of TypeLocs.
+  bool ShouldWalkTypesOfTypeLocs = true;
+
+  /// Whether this visitor should recurse into implicit code, e.g.
+  /// implicit constructors and destructors.
+  bool ShouldVisitImplicitCode = false;
+
+  /// Whether this visitor should recurse into lambda body
+  bool ShouldVisitLambdaBody = true;
+
+protected:
+  DynamicRecursiveASTVisitor() = default;
+
+public:
+  virtual void anchor();
+
+  // Copying/moving a polymorphic type is a bad idea.
+  DynamicRecursiveASTVisitor(DynamicRecursiveASTVisitor &&) = delete;
+  DynamicRecursiveASTVisitor(const DynamicRecursiveASTVisitor &) = delete;
+  DynamicRecursiveASTVisitor &operator=(DynamicRecursiveASTVisitor &&) = delete;
+  DynamicRecursiveASTVisitor &
+  operator=(const DynamicRecursiveASTVisitor &) = delete;
+  virtual ~DynamicRecursiveASTVisitor() = default;
+
+  /// Recursively visits an entire AST, starting from the TranslationUnitDecl.
+  /// \returns false if visitation was terminated early.
+  virtual bool TraverseAST(ASTContext &AST);
+
+  /// Recursively visit an attribute, by dispatching to
+  /// Traverse*Attr() based on the argument's dynamic type.
+  ///
+  /// \returns false if the visitation was terminated early, true
+  /// otherwise (including when the argument is a Null type location).
+  virtual bool TraverseAttr(Attr *At);
+
+  /// Recursively visit a constructor initializer.  This
+  /// automatically dispatches to another visitor for the initializer
+  /// expression, but not for the name of the initializer, so may
+  /// be overridden for clients that need access to the name.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseConstructorInitializer(CXXCtorInitializer *Init);
+
+  /// Recursively visit a base specifier. This can be overridden by a
+  /// subclass.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseCXXBaseSpecifier(const CXXBaseSpecifier &Base);
+
+  /// Recursively visit a declaration, by dispatching to
+  /// Traverse*Decl() based on the argument's dynamic type.
+  ///
+  /// \returns false if the visitation was terminated early, true
+  /// otherwise (including when the argument is NULL).
+  virtual bool TraverseDecl(Decl *D);
+
+  /// Recursively visit a name with its location information.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseDeclarationNameInfo(DeclarationNameInfo NameInfo);
+
+  /// Recursively visit a lambda capture. \c Init is the expression that
+  /// will be used to initialize the capture.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseLambdaCapture(LambdaExpr *LE, const LambdaCapture *C,
+                                     Expr *Init);
+
+  /// Recursively visit a C++ nested-name-specifier.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseNestedNameSpecifier(NestedNameSpecifier *NNS);
+
+  /// Recursively visit a C++ nested-name-specifier with location
+  /// information.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseNestedNameSpecifierLoc(NestedNameSpecifierLoc NNS);
+
+  /// Recursively visit a statement or expression, by
+  /// dispatching to Traverse*() based on the argument's dynamic type.
+  ///
+  /// \returns false if the visitation was terminated early, true
+  /// otherwise (including when the argument is nullptr).
+  virtual bool TraverseStmt(Stmt *S);
+
+  /// Recursively visit a template argument and dispatch to the
+  /// appropriate method for the argument type.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  // FIXME: migrate callers to TemplateArgumentLoc instead.
+  virtual bool TraverseTemplateArgument(const TemplateArgument &Arg);
+
+  /// Recursively visit a template argument location and dispatch to the
+  /// appropriate method for the argument type.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseTemplateArgumentLoc(const TemplateArgumentLoc &ArgLoc);
+
+  /// Recursively visit a set of template arguments.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  // FIXME: take a TemplateArgumentLoc* (or TemplateArgumentListInfo) instead.
+  bool
+  TraverseTemplateArguments(ArrayRef<TemplateArgument> Args); // NOT virtual
+
+  /// Recursively visit a template name and dispatch to the
+  /// appropriate method.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseTemplateName(TemplateName Template);
+
+  /// Recursively visit a type, by dispatching to
+  /// Traverse*Type() based on the argument's getTypeClass() property.
+  ///
+  /// \returns false if the visitation was terminated early, true
+  /// otherwise (including when the argument is a Null type).
+  virtual bool TraverseType(QualType T);
+
+  /// Recursively visit a type with location, by dispatching to
+  /// Traverse*TypeLoc() based on the argument type's getTypeClass() property.
+  ///
+  /// \returns false if the visitation was terminated early, true
+  /// otherwise (including when the argument is a Null type location).
+  virtual bool TraverseTypeLoc(TypeLoc TL);
+
+  /// Recursively visit an Objective-C protocol reference with location
+  /// information.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool TraverseObjCProtocolLoc(ObjCProtocolLoc ProtocolLoc);
+
+  /// Traverse a concept (requirement).
+  virtual bool TraverseTypeConstraint(const TypeConstraint *C);
+  virtual bool TraverseConceptRequirement(concepts::Requirement *R);
+  virtual bool TraverseConceptTypeRequirement(concepts::TypeRequirement *R);
+  virtual bool TraverseConceptExprRequirement(concepts::ExprRequirement *R);
+  virtual bool TraverseConceptNestedRequirement(concepts::NestedRequirement *R);
+  virtual bool TraverseConceptReference(ConceptReference *CR);
+  virtual bool VisitConceptReference(ConceptReference *CR) { return true; }
+
+  /// Visit a node.
+  virtual bool VisitAttr(Attr *A) { return true; }
+  virtual bool VisitDecl(Decl *D) { return true; }
+  virtual bool VisitStmt(Stmt *S) { return true; }
+  virtual bool VisitType(Type *T) { return true; }
+  virtual bool VisitTypeLoc(TypeLoc TL) { return true; }
+
+  /// Walk up from a node.
+  bool WalkUpFromDecl(Decl *D) { return VisitDecl(D); }
+  bool WalkUpFromStmt(Stmt *S) { return VisitStmt(S); }
+  bool WalkUpFromType(Type *T) { return VisitType(T); }
+  bool WalkUpFromTypeLoc(TypeLoc TL) { return VisitTypeLoc(TL); }
+
+  /// Invoked before visiting a statement or expression via data recursion.
+  ///
+  /// \returns false to skip visiting the node, true otherwise.
+  virtual bool dataTraverseStmtPre(Stmt *S) { return true; }
+
+  /// Invoked after visiting a statement or expression via data recursion.
+  /// This is not invoked if the previously invoked \c dataTraverseStmtPre
+  /// returned false.
+  ///
+  /// \returns false if the visitation was terminated early, true otherwise.
+  virtual bool dataTraverseStmtPost(Stmt *S) { return true; }
+  virtual bool dataTraverseNode(Stmt *S);
+
+  /*// Declare Traverse*() and friends for attributes.
+#define DYNAMIC_ATTR_VISITOR_DECLS
+#include "clang/AST/AttrVisitor.inc"
+#undef DYNAMIC_ATTR_VISITOR_DECLS*/
+
+  // Not virtual for now because no-one overrides them.
+#define DEF_TRAVERSE_TMPL_INST(kind)                                           \
+  virtual bool TraverseTemplateInstantiations(kind##TemplateDecl *D);
+  DEF_TRAVERSE_TMPL_INST(Class)
+  DEF_TRAVERSE_TMPL_INST(Var)
+  DEF_TRAVERSE_TMPL_INST(Function)
+#undef DEF_TRAVERSE_TMPL_INST
+
+  // Declare Traverse*() for and friends all concrete Decl classes.
+#define ABSTRACT_DECL(DECL)
+#define DECL(CLASS, BASE) virtual bool Traverse##CLASS##Decl(CLASS##Decl *D);
+#include "clang/AST/DeclNodes.inc"
+
+#define DECL(CLASS, BASE)                                                      \
+  bool WalkUpFrom##CLASS##Decl(CLASS##Decl *D);                                \
+  virtual bool Visit##CLASS##Decl(CLASS##Decl *D) { return true; }
+#include "clang/AST/DeclNodes.inc"
+
+  // Declare Traverse*() and friends for all concrete Stmt classes.
+#define ABSTRACT_STMT(STMT)
+#define STMT(CLASS, PARENT) virtual bool Traverse##CLASS(CLASS *S);
+#include "clang/AST/StmtNodes.inc"
+
+#define STMT(CLASS, PARENT)                                                    \
+  bool WalkUpFrom##CLASS(CLASS *S);                                            \
+  virtual bool Visit##CLASS(CLASS *S) { return true; }
+#include "clang/AST/StmtNodes.inc"
+
+  // Declare Traverse*() and friends for all concrete Type classes.
+#define ABSTRACT_TYPE(CLASS, BASE)
+#define TYPE(CLASS, BASE) virtual bool Traverse##CLASS##Type(CLASS##Type *T);
+#include "clang/AST/TypeNodes.inc"
+
+#define TYPE(CLASS, BASE)                                                      \
+  bool WalkUpFrom##CLASS##Type(CLASS##Type *T);                                \
+  virtual bool Visit##CLASS##Type(CLASS##Type *T) { return true; }
+#include "clang/AST/TypeNodes.inc"
+
+#define ABSTRACT_TYPELOC(CLASS, BASE)
+#define TYPELOC(CLASS, BASE)                                                   \
+  virtual bool Traverse##CLASS##TypeLoc(CLASS##TypeLoc TL);
+#include "clang/AST/TypeLocNodes.def"
+
+#define TYPELOC(CLASS, BASE)                                                   \
+  bool WalkUpFrom##CLASS##TypeLoc(CLASS##TypeLoc TL);                          \
+  virtual bool Visit##CLASS##TypeLoc(CLASS##TypeLoc TL) { return true; }
+#include "clang/AST/TypeLocNodes.def"
+};
+} // namespace clang
+
+#endif // LLVM_CLANG_AST_DYNAMIC_RECURSIVE_AST_VISITOR_H
diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h
index 2b35997bd539ac..8c0ed24c01149e 100644
--- a/clang/include/clang/AST/RecursiveASTVisitor.h
+++ b/clang/include/clang/AST/RecursiveASTVisitor.h
@@ -327,11 +327,11 @@ template <typename Derived> class RecursiveASTVisitor {
   bool VisitAttr(Attr *A) { return true; }
 
 // Declare Traverse* and empty Visit* for all Attr classes.
-#define ATTR_VISITOR_DECLS_ONLY
+#define ATTR_VISITOR_DECLS
 #include "clang/AST/AttrVisitor.inc"
-#undef ATTR_VISITOR_DECLS_ONLY
+#undef ATTR_VISITOR_DECLS
 
-// ---- Methods on Stmts ----
+  // ---- Methods on Stmts ----
 
   Stmt::child_range getStmtChildren(Stmt *S) { return S->children(); }
 
@@ -727,9 +727,11 @@ bool RecursiveASTVisitor<Derived>::TraverseTypeLoc(TypeLoc TL) {
 }
 
 // Define the Traverse*Attr(Attr* A) methods
+#define ATTR_VISITOR_IMPL
 #define VISITORCLASS RecursiveASTVisitor
 #include "clang/AST/AttrVisitor.inc"
 #undef VISITORCLASS
+#undef ATTR_VISITOR_IMPL
 
 template <typename Derived>
 bool RecursiveASTVisitor<Derived>::TraverseDecl(Decl *D) {
diff --git a/clang/include/clang/Analysis/CallGraph.h b/clang/include/clang/Analysis/CallGraph.h
index 78f8d115550178..4266ebfd2f7829 100644
--- a/clang/include/clang/Analysis/CallGraph.h
+++ b/clang/include/clang/Analysis/CallGraph.h
@@ -18,7 +18,8 @@
 #define LLVM_CLANG_ANALYSIS_CALLGRAPH_H
 
 #include "clang/AST/Decl.h"
-#include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/DeclObjC.h"
+#include "clang/AST/DynamicRecursiveASTVisitor.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/GraphTraits.h"
 #include "llvm/ADT/STLExtras.h"
@@ -39,7 +40,7 @@ class Stmt;
 /// The call graph extends itself with the given declarations by implementing
 /// the recursive AST visitor, which constructs the graph by visiting the given
 /// declarations.
-class CallGraph : public RecursiveASTVisitor<CallGraph> {
+class CallGraph final : public DynamicRecursiveASTVisitor {
   friend class CallGraphNode;
 
   using FunctionMapTy =
@@ -109,7 +110,7 @@ class CallGraph : public RecursiveASTVisitor<CallGraph> {
 
   /// Part of recursive declaration visitation. We recursively visit all the
   /// declarations to collect the root functions.
-  bool VisitFunctionDecl(FunctionDecl *FD) {
+  bool VisitFunctionDecl(FunctionDecl *FD) override {
     // We skip function template definitions, as their semantics is
     // only determined when they are instantiated.
     if (includeInGraph(FD) && FD->isThisDeclarationADefinition()) {
@@ -124,7 +125,7 @@ class CallGraph : public RecursiveASTVisitor<CallGraph> {
   }
 
   /// Part of recursive declaration visitation.
-  bool VisitObjCMethodDecl(ObjCMethodDecl *MD) {
+  bool VisitObjCMethodDecl(ObjCMethodDecl *MD) override {
     if (includeInGraph(MD)) {
       addNodesForBlocks(MD);
       addNodeForDecl(MD, true);
@@ -133,11 +134,7 @@ class CallGraph : public RecursiveASTVisitor<CallGraph> {
   }
 
   // We are only collecting the declarations, so do not step into the bodies.
-  bool TraverseStmt(Stmt *S...
[truncated]

@Sirraide Sirraide marked this pull request as ready for review August 20, 2024 18:10
@Sirraide Sirraide requested a review from cyndyishida as a code owner August 20, 2024 18:10
Copy link

github-actions bot commented Aug 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@Sirraide
Copy link
Member Author

Btw, if that’s preferred then I can also split this into smaller prs (one that introduces the DRAV, etc.).

@cor3ntin
Copy link
Contributor

cor3ntin commented Aug 21, 2024

This is a very impressive piece of work!
I think we are looking at a non-trivial choice:

  • Do we make clang faster to compile for ourselves, or slightly faster for everyone? My initial thinking is that clang is run orders of magnitude more often than it is compile, and so improving compile times might not be the best tradeoff?
    It would certainly be nice for ourselves.

We should also look at how PGO/BOLT affect both of these options. (Note sure how one does that though
#65010).

Edit: I am jumping the shark with bolt. What about just LTO?

Copy link
Contributor

@zyn0217 zyn0217 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -0,0 +1,252 @@
#ifndef LLVM_CLANG_AST_DYNAMIC_RECURSIVE_AST_VISITOR_H
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget the license header :)

Comment on lines +46 to +51
// Copying/moving a polymorphic type is a bad idea.
DynamicRecursiveASTVisitor(DynamicRecursiveASTVisitor &&) = delete;
DynamicRecursiveASTVisitor(const DynamicRecursiveASTVisitor &) = delete;
DynamicRecursiveASTVisitor &operator=(DynamicRecursiveASTVisitor &&) = delete;
DynamicRecursiveASTVisitor &
operator=(const DynamicRecursiveASTVisitor &) = delete;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being dense, but can you explain why this is a bad idea? We allow copying/moving an RAV, right? (I might be wrong, because I hardly remember I've copied/moved an RAV)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaik, you’re generally not supposed to make polymorphic types copyable or movable because of object slicing issues. It would be possible if you want to require everyone who uses it to be careful with it, but I don’t think we ever really need to copy or move an AST visitor, so I figured it would be easier to just disallow it.

@Endilll
Copy link
Contributor

Endilll commented Aug 21, 2024

I wonder if we can reduce binary size by simply marking RAV as visibility("hidden")

@Sirraide
Copy link
Member Author

I wonder if we can reduce binary size by simply marking RAV as visibility("hidden")

Hmm, I haven’t thought about that, but I’m not convinced it would help: the main issue w/ the RAV is that to traverse every possible AST that you might end up handing it, you need an instantiation of the entire visitor; I don’t think there is a good way to avoid that, so I don’t really see how that would help, but I’m also not really too familiar w/ how visibility("hidden") works (iirc it’s makes it so the symbol is not exported, but I think a lot of the visitors are already declared in an anonymous namespace or in a function).

@Endilll
Copy link
Contributor

Endilll commented Aug 21, 2024

but I think a lot of the visitors are already declared in an anonymous namespace or in a function

So you're saying that most of the binary size savings come from less inlining and instantiating? Yeah, that might be the case.

@Sirraide
Copy link
Member Author

Edit: I am jumping the shark with bolt. What about just LTO?

I’m worried my system might explode if I attempt to build Clang w/ LTO enabled, but I can give it a try (I already need to limit the memory using systemd so the linker doesn’t crash if I’m compiling in debug mode; I can try release mode though).

@Sirraide
Copy link
Member Author

So you're saying that most of the binary size savings come from less inlining and instantiating? Yeah, that might be the case.

Yeah, pretty much. I think stamping out a few 10000 lines worth of templates every time a RAV is declared is what’s causing the increased binary size (because even if it does inline a bunch of things, traversing the AST still requires a considerable amount of code...)

@Sirraide
Copy link
Member Author

Looks like some of the clang-tidy tests are failing; I don’t think I’ve touched any visitors in there, but I might be wrong. I also didn’t run the clang-tidy tests admittedly to speed up the process of working on this, but I’ll do that next.

@nikic
Copy link
Contributor

nikic commented Sep 4, 2024

To see how effective this would be, a lot of visitors all over the codebase have been migrated to use the dynamic visitor. The results of this refactor can be viewed here (if you look at the branch in llvm-compile-time-tracker and see that there are a few commits where everything got magically faster; those are merge commits, so please ignore those; the link here compares the current state of this branch to the commit on trunk that it is based on to eliminate any effects that unrelated changes might have had).

Impressive results! I think the code size improvement is still underestimated by a good bit because the compile-time tracker doesn't build the StaticAnalyzer and ARCMigrate functionality. The build time improvement is also underestimated because unit tests aren't built -- and some of those are very visitor heavy.

However, it is worth noting that using virtual functions for this does seem to be measurably slower than using CRTP. I have more or less indiscriminately migrated visitors regardless of how often they are used to see how much of a slowdown this would incur. Furthermore, I also checked how often each visitor is being instantiated when compiling Clang itself. The results of that are shown below.

The compile-time impact here looks pretty low -- clang fairly routinely accepts compile-time regressions of that magnitude.

Something I noticed looking at the per-file stats (https://llvm-compile-time-tracker.com/compare.php?from=27e5f505e5eee3da27c1515d6ed95d66fbe543ea&to=8f7d61923b414085d5d1e419cebd46500cef2662&stat=instructions%3Au&details=on) is that the impact is primarily on very small objects, and linking is also impacted, which is not something we would usually expect from a clang frontend change.

My suspicion is thus that this actually marginally regresses binary loading time. An obvious guess would be that this introduces many large vtables, and accordingly also many dynamic relocations that need to be resolved by the dynamic linker. (This is the problem that -fexperimental-relative-c++-abi-vtables solves, but well, experimental.)

So if there is some way to reduce the number of virtual methods, that would be good, but otherwise this just seems like the price of doing business...


In terms of landing this, I'd split this up in at least three parts, which is 1. the implementation (which is the part that will need detailed review), 2. converting unit tests (that should be a no-brainer) and 3. migrating production visitors, possibly further split down (e.g. something like ARCMigrate is self-contained).

@Sirraide
Copy link
Member Author

My suspicion is thus that this actually marginally regresses binary loading time.

Huh, that’s interesting; I didn’t even consider that as a possibility, but that would definitely explain why reverting some of the most commonly used visitors back to using CRTP didn’t do anything at all (except increase Clang’s binary size and make compiling it slower again...).

In terms of landing this, I'd split this up in at least three parts, which is 1. the implementation (which is the part that will need detailed review), 2. converting unit tests (that should be a no-brainer) and 3. migrating production visitors, possibly further split down (e.g. something like ARCMigrate is self-contained).

Yeah, that stgm. This pr was mostly just to show ‘this is what it’s like right now’. I’ll start by splitting the DynamicRecursiveASTVisitor implementation into a separate pr.

@Sirraide
Copy link
Member Author

Sirraide commented Sep 24, 2024

So if there is some way to reduce the number of virtual methods, that would be good, but otherwise this just seems like the price of doing business...

I’ve already tried to remove as much functionality as possible, because some of the things that the RAV can do are only used by like, 4 visitors in the entire codebase (e.g. overriding WalkUpFromX), but more details on that on the pr for the DRAV impl.

@Sirraide
Copy link
Member Author

Closing this to split it up into multiple prs.

@Sirraide Sirraide closed this Sep 25, 2024
Sirraide added a commit that referenced this pull request Nov 5, 2024
See #105195 as well as the big comment in DynamicRecursiveASTVisitor.cpp
for more context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants