-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C++20] [Modules] Introduce reduced BMI #75894
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-modules Author: Chuanqi Xu (ChuanqiXu9) ChangesClose #71034 See This patch introduces reduced BMI, which doesn't contain the definitions of functions and variables if its definitions won't contribute to the ABI. Testing is a big part of the patch. We want to make sure the reduced BMI contains the same behavior with the existing and relatively stable fatBMI. This is pretty helpful for further reduction. The user interfaces part it left to following patches to ease the reviewing. Patch is 108.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75894.diff 108 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 1b02087425b751..c8f7675c6170ad 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -7302,7 +7302,9 @@ def ast_view : Flag<["-"], "ast-view">,
def emit_module : Flag<["-"], "emit-module">,
HelpText<"Generate pre-compiled module file from a module map">;
def emit_module_interface : Flag<["-"], "emit-module-interface">,
- HelpText<"Generate pre-compiled module file from a C++ module interface">;
+ HelpText<"Generate pre-compiled module file from a standard C++ module interface unit">;
+def emit_reduced_module_interface : Flag<["-"], "emit-reduced-module-interface">,
+ HelpText<"Generate reduced prebuilt module interface from a standard C++ module interface unit">;
def emit_header_unit : Flag<["-"], "emit-header-unit">,
HelpText<"Generate C++20 header units from header files">;
def emit_pch : Flag<["-"], "emit-pch">,
diff --git a/clang/include/clang/Frontend/FrontendActions.h b/clang/include/clang/Frontend/FrontendActions.h
index fcce31ac0590ff..00a118d51f58a4 100644
--- a/clang/include/clang/Frontend/FrontendActions.h
+++ b/clang/include/clang/Frontend/FrontendActions.h
@@ -118,6 +118,9 @@ class GenerateModuleAction : public ASTFrontendAction {
CreateOutputFile(CompilerInstance &CI, StringRef InFile) = 0;
protected:
+ std::vector<std::unique_ptr<ASTConsumer>>
+ CreateMultiplexConsumer(CompilerInstance &CI, StringRef InFile);
+
std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
StringRef InFile) override;
@@ -147,8 +150,10 @@ class GenerateModuleFromModuleMapAction : public GenerateModuleAction {
CreateOutputFile(CompilerInstance &CI, StringRef InFile) override;
};
+/// Generates fatBMI (which contains full information to generate the object
+/// files) for C++20 Named Modules.
class GenerateModuleInterfaceAction : public GenerateModuleAction {
-private:
+protected:
bool BeginSourceFileAction(CompilerInstance &CI) override;
std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
@@ -158,6 +163,13 @@ class GenerateModuleInterfaceAction : public GenerateModuleAction {
CreateOutputFile(CompilerInstance &CI, StringRef InFile) override;
};
+/// Only generates the reduced BMI. This action is mainly used by tests.
+class GenerateThinModuleInterfaceAction : public GenerateModuleInterfaceAction {
+private:
+ std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
+ StringRef InFile) override;
+};
+
class GenerateHeaderUnitAction : public GenerateModuleAction {
private:
diff --git a/clang/include/clang/Frontend/FrontendOptions.h b/clang/include/clang/Frontend/FrontendOptions.h
index 53a8681cfdbba0..2ee342154f8cbf 100644
--- a/clang/include/clang/Frontend/FrontendOptions.h
+++ b/clang/include/clang/Frontend/FrontendOptions.h
@@ -85,9 +85,13 @@ enum ActionKind {
/// Generate pre-compiled module from a module map.
GenerateModule,
- /// Generate pre-compiled module from a C++ module interface file.
+ /// Generate pre-compiled module from a standard C++ module interface unit.
GenerateModuleInterface,
+ /// Generate reduced module interface for a standard C++ module interface
+ /// unit.
+ GenerateThinModuleInterface,
+
/// Generate a C++20 header unit module from a header file.
GenerateHeaderUnit,
diff --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index a56929ef0245ee..87cc31a72d95f2 100644
--- a/clang/include/clang/Serialization/ASTWriter.h
+++ b/clang/include/clang/Serialization/ASTWriter.h
@@ -166,6 +166,10 @@ class ASTWriter : public ASTDeserializationListener,
/// Indicates that the AST contained compiler errors.
bool ASTHasCompilerErrors = false;
+ /// Indicates that we're going to generate the reduced BMI for C++20
+ /// named modules.
+ bool GeneratingReducedBMI = false;
+
/// Mapping from input file entries to the index into the
/// offset table where information about that input file is stored.
llvm::DenseMap<const FileEntry *, uint32_t> InputFileIDs;
@@ -582,7 +586,8 @@ class ASTWriter : public ASTDeserializationListener,
ASTWriter(llvm::BitstreamWriter &Stream, SmallVectorImpl<char> &Buffer,
InMemoryModuleCache &ModuleCache,
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
- bool IncludeTimestamps = true, bool BuildingImplicitModule = false);
+ bool IncludeTimestamps = true, bool BuildingImplicitModule = false,
+ bool GeneratingReducedBMI = false);
~ASTWriter() override;
ASTContext &getASTContext() const {
@@ -813,6 +818,13 @@ class PCHGenerator : public SemaConsumer {
const ASTWriter &getWriter() const { return Writer; }
SmallVectorImpl<char> &getPCH() const { return Buffer->Data; }
+ bool isComplete() const { return Buffer->IsComplete; }
+ PCHBuffer *getBufferPtr() { return Buffer.get(); }
+ StringRef getOutputFile() const { return OutputFile; }
+ DiagnosticsEngine &getDiagnostics() const {
+ return SemaPtr->getDiagnostics();
+ }
+
public:
PCHGenerator(const Preprocessor &PP, InMemoryModuleCache &ModuleCache,
StringRef OutputFile, StringRef isysroot,
@@ -820,7 +832,8 @@ class PCHGenerator : public SemaConsumer {
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
bool AllowASTWithErrors = false, bool IncludeTimestamps = true,
bool BuildingImplicitModule = false,
- bool ShouldCacheASTInMemory = false);
+ bool ShouldCacheASTInMemory = false,
+ bool GeneratingReducedBMI = false);
~PCHGenerator() override;
void InitializeSema(Sema &S) override { SemaPtr = &S; }
@@ -830,6 +843,19 @@ class PCHGenerator : public SemaConsumer {
bool hasEmittedPCH() const { return Buffer->IsComplete; }
};
+class ReducedBMIGenerator : public PCHGenerator {
+public:
+ ReducedBMIGenerator(const Preprocessor &PP, InMemoryModuleCache &ModuleCache,
+ StringRef OutputFile, std::shared_ptr<PCHBuffer> Buffer,
+ bool IncludeTimestamps);
+
+ void HandleTranslationUnit(ASTContext &Ctx) override;
+};
+
+/// If the definition may impact the ABI. If yes, we're allowed to eliminate
+/// the definition of D in reduced BMI.
+bool MayDefAffectABI(const Decl *D);
+
/// A simple helper class to pack several bits in order into (a) 32 bit
/// integer(s).
class BitsPacker {
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 11f3f2c2d6425c..ffaf3db5fdc076 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2567,6 +2567,7 @@ static const auto &getFrontendActionTable() {
{frontend::GenerateModule, OPT_emit_module},
{frontend::GenerateModuleInterface, OPT_emit_module_interface},
+ {frontend::GenerateThinModuleInterface, OPT_emit_reduced_module_interface},
{frontend::GenerateHeaderUnit, OPT_emit_header_unit},
{frontend::GeneratePCH, OPT_emit_pch},
{frontend::GenerateInterfaceStubs, OPT_emit_interface_stubs},
@@ -4274,6 +4275,7 @@ static bool isStrictlyPreprocessorAction(frontend::ActionKind Action) {
case frontend::FixIt:
case frontend::GenerateModule:
case frontend::GenerateModuleInterface:
+ case frontend::GenerateThinModuleInterface:
case frontend::GenerateHeaderUnit:
case frontend::GeneratePCH:
case frontend::GenerateInterfaceStubs:
diff --git a/clang/lib/Frontend/FrontendActions.cpp b/clang/lib/Frontend/FrontendActions.cpp
index c1d6e71455365c..c9a0b6858ae8ad 100644
--- a/clang/lib/Frontend/FrontendActions.cpp
+++ b/clang/lib/Frontend/FrontendActions.cpp
@@ -184,12 +184,12 @@ bool GeneratePCHAction::BeginSourceFileAction(CompilerInstance &CI) {
return true;
}
-std::unique_ptr<ASTConsumer>
-GenerateModuleAction::CreateASTConsumer(CompilerInstance &CI,
- StringRef InFile) {
+std::vector<std::unique_ptr<ASTConsumer>>
+GenerateModuleAction::CreateMultiplexConsumer(CompilerInstance &CI,
+ StringRef InFile) {
std::unique_ptr<raw_pwrite_stream> OS = CreateOutputFile(CI, InFile);
if (!OS)
- return nullptr;
+ return {};
std::string OutputFile = CI.getFrontendOpts().OutputFile;
std::string Sysroot;
@@ -210,6 +210,17 @@ GenerateModuleAction::CreateASTConsumer(CompilerInstance &CI,
+CI.getFrontendOpts().BuildingImplicitModule));
Consumers.push_back(CI.getPCHContainerWriter().CreatePCHContainerGenerator(
CI, std::string(InFile), OutputFile, std::move(OS), Buffer));
+ return std::move(Consumers);
+}
+
+std::unique_ptr<ASTConsumer>
+GenerateModuleAction::CreateASTConsumer(CompilerInstance &CI,
+ StringRef InFile) {
+ std::vector<std::unique_ptr<ASTConsumer>> Consumers =
+ CreateMultiplexConsumer(CI, InFile);
+ if (Consumers.empty())
+ return nullptr;
+
return std::make_unique<MultiplexConsumer>(std::move(Consumers));
}
@@ -265,7 +276,12 @@ GenerateModuleInterfaceAction::CreateASTConsumer(CompilerInstance &CI,
CI.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;
CI.getHeaderSearchOpts().ModulesSkipPragmaDiagnosticMappings = true;
- return GenerateModuleAction::CreateASTConsumer(CI, InFile);
+ std::vector<std::unique_ptr<ASTConsumer>> Consumers =
+ CreateMultiplexConsumer(CI, InFile);
+ if (Consumers.empty())
+ return nullptr;
+
+ return std::make_unique<MultiplexConsumer>(std::move(Consumers));
}
std::unique_ptr<raw_pwrite_stream>
@@ -274,6 +290,15 @@ GenerateModuleInterfaceAction::CreateOutputFile(CompilerInstance &CI,
return CI.createDefaultOutputFile(/*Binary=*/true, InFile, "pcm");
}
+std::unique_ptr<ASTConsumer>
+GenerateThinModuleInterfaceAction::CreateASTConsumer(CompilerInstance &CI,
+ StringRef InFile) {
+ auto Buffer = std::make_shared<PCHBuffer>();
+ return std::make_unique<ReducedBMIGenerator>(
+ CI.getPreprocessor(), CI.getModuleCache(), CI.getFrontendOpts().OutputFile, Buffer,
+ /*IncludeTimestamps=*/+CI.getFrontendOpts().IncludeTimestamps);
+}
+
bool GenerateHeaderUnitAction::BeginSourceFileAction(CompilerInstance &CI) {
if (!CI.getLangOpts().CPlusPlusModules) {
CI.getDiagnostics().Report(diag::err_module_interface_requires_cpp_modules);
@@ -840,7 +865,6 @@ void DumpModuleInfoAction::ExecuteAction() {
const LangOptions &LO = getCurrentASTUnit().getLangOpts();
if (LO.CPlusPlusModules && !LO.CurrentModule.empty()) {
-
ASTReader *R = getCurrentASTUnit().getASTReader().get();
unsigned SubModuleCount = R->getTotalNumSubmodules();
serialization::ModuleFile &MF = R->getModuleManager().getPrimaryModule();
diff --git a/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp b/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp
index b280a1359d2f27..59f7f955db5097 100644
--- a/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp
+++ b/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp
@@ -65,6 +65,8 @@ CreateFrontendBaseAction(CompilerInstance &CI) {
return std::make_unique<GenerateModuleFromModuleMapAction>();
case GenerateModuleInterface:
return std::make_unique<GenerateModuleInterfaceAction>();
+ case GenerateThinModuleInterface:
+ return std::make_unique<GenerateThinModuleInterfaceAction>();
case GenerateHeaderUnit:
return std::make_unique<GenerateHeaderUnitAction>();
case GeneratePCH: return std::make_unique<GeneratePCHAction>();
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 91eb2af8f8ad6a..71d1a577eefcc4 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -4595,10 +4595,12 @@ ASTWriter::ASTWriter(llvm::BitstreamWriter &Stream,
SmallVectorImpl<char> &Buffer,
InMemoryModuleCache &ModuleCache,
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
- bool IncludeTimestamps, bool BuildingImplicitModule)
+ bool IncludeTimestamps, bool BuildingImplicitModule,
+ bool GeneratingReducedBMI)
: Stream(Stream), Buffer(Buffer), ModuleCache(ModuleCache),
IncludeTimestamps(IncludeTimestamps),
- BuildingImplicitModule(BuildingImplicitModule) {
+ BuildingImplicitModule(BuildingImplicitModule),
+ GeneratingReducedBMI(GeneratingReducedBMI) {
for (const auto &Ext : Extensions) {
if (auto Writer = Ext->createExtensionWriter(*this))
ModuleFileExtensionWriters.push_back(std::move(Writer));
@@ -5405,18 +5407,20 @@ void ASTWriter::WriteDeclUpdatesBlocks(RecordDataImpl &OffsetsRecord) {
// Add a trailing update record, if any. These must go last because we
// lazily load their attached statement.
- if (HasUpdatedBody) {
- const auto *Def = cast<FunctionDecl>(D);
- Record.push_back(UPD_CXX_ADDED_FUNCTION_DEFINITION);
- Record.push_back(Def->isInlined());
- Record.AddSourceLocation(Def->getInnerLocStart());
- Record.AddFunctionDefinition(Def);
- } else if (HasAddedVarDefinition) {
- const auto *VD = cast<VarDecl>(D);
- Record.push_back(UPD_CXX_ADDED_VAR_DEFINITION);
- Record.push_back(VD->isInline());
- Record.push_back(VD->isInlineSpecified());
- Record.AddVarDeclInit(VD);
+ if (!GeneratingReducedBMI || MayDefAffectABI(D)) {
+ if (HasUpdatedBody) {
+ const auto *Def = cast<FunctionDecl>(D);
+ Record.push_back(UPD_CXX_ADDED_FUNCTION_DEFINITION);
+ Record.push_back(Def->isInlined());
+ Record.AddSourceLocation(Def->getInnerLocStart());
+ Record.AddFunctionDefinition(Def);
+ } else if (HasAddedVarDefinition) {
+ const auto *VD = cast<VarDecl>(D);
+ Record.push_back(UPD_CXX_ADDED_VAR_DEFINITION);
+ Record.push_back(VD->isInline());
+ Record.push_back(VD->isInlineSpecified());
+ Record.AddVarDeclInit(VD);
+ }
}
OffsetsRecord.push_back(GetDeclRef(D));
diff --git a/clang/lib/Serialization/ASTWriterDecl.cpp b/clang/lib/Serialization/ASTWriterDecl.cpp
index 43169b2befc687..b7a1562d0422ac 100644
--- a/clang/lib/Serialization/ASTWriterDecl.cpp
+++ b/clang/lib/Serialization/ASTWriterDecl.cpp
@@ -16,6 +16,7 @@
#include "clang/AST/DeclTemplate.h"
#include "clang/AST/DeclVisitor.h"
#include "clang/AST/Expr.h"
+#include "clang/AST/ODRHash.h"
#include "clang/AST/OpenMPClause.h"
#include "clang/AST/PrettyDeclStackTrace.h"
#include "clang/Basic/SourceManager.h"
@@ -40,11 +41,14 @@ namespace clang {
serialization::DeclCode Code;
unsigned AbbrevToUse;
+ bool GeneratingReducedBMI = false;
+
public:
ASTDeclWriter(ASTWriter &Writer, ASTContext &Context,
- ASTWriter::RecordDataImpl &Record)
+ ASTWriter::RecordDataImpl &Record, bool GeneratingReducedBMI)
: Writer(Writer), Context(Context), Record(Writer, Record),
- Code((serialization::DeclCode)0), AbbrevToUse(0) {}
+ Code((serialization::DeclCode)0), AbbrevToUse(0),
+ GeneratingReducedBMI(GeneratingReducedBMI) {}
uint64_t Emit(Decl *D) {
if (!Code)
@@ -270,6 +274,27 @@ namespace clang {
};
}
+bool clang::MayDefAffectABI(const Decl *D) {
+ if (auto *FD = dyn_cast<FunctionDecl>(D)) {
+ if (FD->isInlined() || FD->isConstexpr())
+ return true;
+
+ if (FD->isDependentContext())
+ return true;
+ }
+
+ if (auto *VD = dyn_cast<VarDecl>(D)) {
+ if (!VD->getDeclContext()->getRedeclContext()->isFileContext() ||
+ VD->isInline() || VD->isConstexpr() || isa<ParmVarDecl>(VD))
+ return true;
+
+ if (VD->getTemplateSpecializationKind() == TSK_ImplicitInstantiation)
+ return true;
+ }
+
+ return false;
+}
+
void ASTDeclWriter::Visit(Decl *D) {
DeclVisitor<ASTDeclWriter>::Visit(D);
@@ -285,9 +310,12 @@ void ASTDeclWriter::Visit(Decl *D) {
// have been written. We want it last because we will not read it back when
// retrieving it from the AST, we'll just lazily set the offset.
if (auto *FD = dyn_cast<FunctionDecl>(D)) {
- Record.push_back(FD->doesThisDeclarationHaveABody());
- if (FD->doesThisDeclarationHaveABody())
- Record.AddFunctionDefinition(FD);
+ if (!GeneratingReducedBMI || MayDefAffectABI(FD)) {
+ Record.push_back(FD->doesThisDeclarationHaveABody());
+ if (FD->doesThisDeclarationHaveABody())
+ Record.AddFunctionDefinition(FD);
+ } else
+ Record.push_back(0);
}
// Similar to FunctionDecls, handle VarDecl's initializer here and write it
@@ -295,7 +323,10 @@ void ASTDeclWriter::Visit(Decl *D) {
// we have finished recursive deserialization, because it can recursively
// refer back to the variable.
if (auto *VD = dyn_cast<VarDecl>(D)) {
- Record.AddVarDeclInit(VD);
+ if (!GeneratingReducedBMI || MayDefAffectABI(VD))
+ Record.AddVarDeclInit(VD);
+ else
+ Record.push_back(0);
}
// And similarly for FieldDecls. We already serialized whether there is a
@@ -2474,7 +2505,7 @@ void ASTWriter::WriteDecl(ASTContext &Context, Decl *D) {
assert(ID >= FirstDeclID && "invalid decl ID");
RecordData Record;
- ASTDeclWriter W(*this, Context, Record);
+ ASTDeclWriter W(*this, Context, Record, GeneratingReducedBMI);
// Build a record for this declaration
W.Visit(D);
diff --git a/clang/lib/Serialization/GeneratePCH.cpp b/clang/lib/Serialization/GeneratePCH.cpp
index cf8084333811f1..c4c2ac98b3e9d5 100644
--- a/clang/lib/Serialization/GeneratePCH.cpp
+++ b/clang/lib/Serialization/GeneratePCH.cpp
@@ -12,9 +12,11 @@
//===----------------------------------------------------------------------===//
#include "clang/AST/ASTContext.h"
+#include "clang/Frontend/FrontendDiagnostic.h"
#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/Preprocessor.h"
#include "clang/Sema/SemaConsumer.h"
+#include "clang/Serialization/ASTReader.h"
#include "clang/Serialization/ASTWriter.h"
#include "llvm/Bitstream/BitstreamWriter.h"
@@ -25,11 +27,12 @@ PCHGenerator::PCHGenerator(
StringRef OutputFile, StringRef isysroot, std::shared_ptr<PCHBuffer> Buffer,
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
bool AllowASTWithErrors, bool IncludeTimestamps,
- bool BuildingImplicitModule, bool ShouldCacheASTInMemory)
+ bool BuildingImplicitModule, bool ShouldCacheASTInMemory,
+ bool GeneratingReducedBMI)
: PP(PP), OutputFile(OutputFile), isysroot(isysroot.str()),
SemaPtr(nullptr), Buffer(std::move(Buffer)), Stream(this->Buffer->Data),
Writer(Stream, this->Buffer->Data, ModuleCache, Extensions,
- IncludeTimestamps, BuildingImplicitModule),
+ IncludeTimestamps, BuildingImplicitModule, GeneratingReducedBMI),
AllowASTWithErrors(AllowASTWithErrors),
ShouldCacheASTInMemory(ShouldCacheASTInMemory) {
this->Buffer->IsComplete = false;
@@ -78,3 +81,33 @@ ASTMutationListener *PCHGenerator::GetASTMutationListener() {
ASTDeserializationListener *PCHGenerator::GetASTDeserializationListener() {
return &Writer;
}
+
+ReducedBMIGenerator::ReducedBMIGenerator(const Preprocessor &PP,
+ InMemoryModuleCache &ModuleCache,
+ StringRef OutputFile,
+ std::shared_ptr<PCHBuffer> Buffer,
+ bool IncludeTimestamps)
+ : PCHGenerator(
+ PP, ModuleCache, OutputFile, llvm::StringRef(), Buffer,
+ /*Extensions=*/ArrayRef<std::shared_ptr<ModuleFileExtension>>(),
+ /*AllowASTWithErrors*/ false, /*IncludeTimestamps=*/IncludeTimestamps,
+ /*BuildingImplicitModule=*/false, /*ShouldCacheASTInMemory=*/false,
+ /*GeneratingReducedBMI=*/true) {}
+
+void ReducedBMIGenerator::HandleTranslationUnit(ASTContext &Ctx) {
+ PCHGenerator::HandleTranslationUnit(Ctx);
+
+ if (!isComplete())
+ return;
+
+ std::error_code EC;
+ auto OS = std::make_unique<llvm::raw_fd_ostream>(getOutputFile(), EC);
+ if (EC) {
+ getDiagnostics().Report(diag::err_fe_unable_...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
My impression to the feedbacks is that every one of us loves the direction, while we may need more agreement on the user interfaces. To make it easier to review, I split all the user interfaces related part to following patches. So that the current patch won't affect users. This is almost a NFC : ) This patch introduced a CC1 option I feel the patch good and hope we can land this quickly so that we can start the work to help users actually. |
ed0d5f6
to
b5bc1bc
Compare
@iains @dwblaikie ping~ |
@ChuanqiXu9 very sorry for the slow review. It would help me if the design was described in the commit message instead of trying to deduce it from the patch (maybe it's in a thread somewhere - so a cross-reference would help). two immediate questions and one observation:
so something like
As I understand the patch you are combining the transform with the output? |
hi @iains , sorry for the confusion. It may be hard to balance how detail the message it is. So it is no problem at all to ask questions as long as it need. This should be the point of the review process too. For this patch itself, the design is to introduce a CC1 command line Like I said in the commit message, this patch itself doesn't involve anything relevant to user interfaces. I left it to the latter patches.
In fact, I am not objecting your design due to multiplex consumer. I am objecting your design in implementing it in
Yes, this is the goal. I've already had a patch in the downstream to do that. But I am still wondering if I can implement it better.
I am not sure if I understand this. What does it mean "require a codegen back end to be instantiated to allow the reduced BMI "? Do you mean to not touch CodeGen part or to not touch the CodeGen action? My local patch touched the code gen action without touching any real CodeGen related things.
For
Yeah, it should be less concerned. BTW, currently the simpler serializer/deserializer should be ASTRecordWriter/ASTRecordReader. And the current ASTReader/ASTWriter takes some semantical job.
On the one hand, the current patch doesn't do that. The current patch is almost a NFC patch. It belongs to the following patch. On the other hand, the answer may be yes. Probably I did the Given all of us loves reduced BMI, I suggest we can focus on current patch then discuss user interfaces related things in the next patch after this got landed. |
Are you in a position to post the next patch (at least as a draft)? That would help me see the direction.
OK, that answers my concern (which was that we might have to add the code-gen backend to a --precompile if that was the mechanism used to do the BMI reduction).
... and, on the reader side, that already gives us some big problems (as I say, I am less concerned on the writer side, but who can see the whole future?).
Maybe not for the short-term relatively simple tasks - but we should also take a view on the medium and longer term (for example, GMF decl elision is likely to be helpful to users in reducing both the size of the BMI and the number of decls that need merging on input). We need the AST in this path to be mutable (including removal of decls); that way each transform can be maintained as a separate entity - I think that if we end up doing "many transforms" as part of the output, it will become very confusing. (although, to be clear, in the short-term we might agree to do the work in the output - I really do think it would be bad to make that the long term mechanism).
We do all want to produce the reduced BMI, I agree; but we also always have limited resources to do the work, so that it would be good to try and pick an implementation that will be smooth for the future work too. I understand the purpose of the current patch better now - and will try to take a more detailed look over the next few days - as noted above, it would help very much to have a preview of the next patch in the series. |
This is draft for the user interfaces for llvm#75894 and required by @iains to get a feeling about the proposed future direction. Note that this is still in an early stage and nothing is decided.
I post it here ChuanqiXu9@efcc7e8 since I didn't get how to send stacked pr in github yet. But I guess it might be sufficient since it is really in an early phase.
I guess you're referring the process how we decide a declaration is visible?
For GMF decl elision, I posted a patch to implement it in ASTWriter and I reposted it in #76930. The big problem is that this is not formal. (Just for sharing, I am not proposing this now)
While the model sounds good, I am pessimistic for making it correctly, completely, and efficiently.
Not against, I just think it is not so bad. There already many optimizations in the current serializations.
Understood. I just think it won't be too bad. Or it is not easy for us to get a much better solution in the resources we have. I prefer the style that don't make perfect the enemy of better.
I don't intend to land this in clang18. So we don't need to be hurry.
Sent. But I just feel it is not so helpful for reviewing this patch : ) |
@iains would you like to take a look on this. I feel OK to not land this in 18 but I think it is really important to land this in 19. Although it is still early to that, there are other works to do (e.g., the next patch and more testing). So it may be better to start it earlier. |
The process seems to me to be either one of:
It still feels to me to be better to have clear separation of this work from the work of streaming - but if we can make clear layers within the streaming, then maybe the maintenance will not be too hard. |
Yeah. Great to see we have some progress finally. I think this is really important since I see more and more peope complaninig the performance for modules. I feel this series patch is key to to improve the user's impression that named modules are just purely wrappers for PCH.
After looking into the codes more, I changed my mind for this. I feel it is the most efficient and the most natural way to omit declarations in ASTWriter. The idea is, previously, we'll start to write declarations in the current TU from the first declarations. And in the C++20 named modules world, we would start to write declarations from the first declarations in the global module fragment. And we can implement GMF decl elision naturally by writing declarations from the first declaration in the module purview and only write the referenced decl from the GMF during the writing process. This is super natural and efficient.
Now I doubt both of the method to be not efficiency and it adds additional burdens to the developers and the users.
I think the maintainance may not be too hard in that way. ASTWriter is not such a devil :) Probably we need some refactoration in the serialization to make codes more clear. But the point is that we must get the ASTWriter/ASTReader involved. It may not be a good idea to leave the ASTWriter/ASTReader as is and add another layer... For example, it should be better to not deserialize the declaration from the very beginning instead of deserializing the declaration and judge that it is not wanted/visible. And to achieve this, we must touch the serialization layer deeply. |
Do you expect to make any changes to type streaming? |
I don't expect to do that explicitly. The number of types deserialized can be decreased naturally after we avoid emitting declarations during the writing. |
I am going to land this in the week later if no objections come in. I think it is necessary to land the series of patches (to reduce the contents of BMI) for clang19. And of course, the functionality will be opt in for one~two releases for experimental. |
I have no further comments so LGTM if there are no objections from the other reviewers this week. |
Can we add a release note and documentation for this? |
The current patch is transparent to users and it is only part of the series patches. I'd like to document that after I made the series of patches. |
Close llvm#71034 See https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755 This patch introduces reduced BMI, which doesn't contain the definitions of functions and variables if its definitions won't contribute to the ABI. Testing is a big part of the patch. We want to make sure the reduced BMI contains the same behavior with the existing and relatively stable fatBMI. This is pretty helpful for further reduction. The user interfaces part it left to following patches to ease the reviewing.
This is the driver part of #75894. This patch introduces '-fexperimental-modules-reduced-bmi' to enable generating the reduced BMI. This patch did: - When `-fexperimental-modules-reduced-bmi` is specified but `--precompile` is not specified for a module unit, we'll skip the precompile phase to avoid unnecessary two-phase compilation phases. Then if `-c` is specified, we will generate the reduced BMI in CodeGenAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified and `--precompile` is specified, we will generate the reduced BMI in GenerateModuleInterfaceAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified for a non-module unit. We don't do anything nor try to give a warn. This is more user friendly so that the end users can try to test and experiment with the feature without asking help from the build systems. The core design idea is that users should be able to enable this easily with the existing cmake mechanisms. The future plan for the flag is: - Add this to clang19 and make it opt-in for 1~2 releases. It depends on the testing feedback to decide how long we like to make it opt-in. - Then we can announce the existing BMI generating may be deprecated and suggesting people (end users or build systems) to enable this for 1~2 releases. - Finally we will enable this by default. When that time comes, the term `BMI` will refer to the reduced BMI today and the existing BMI will only be meaningful to build systems which loves to support two phase compilations. I'll send release notes and document in seperate commits after this get landed.
…85050) This is the driver part of llvm#75894. This patch introduces '-fexperimental-modules-reduced-bmi' to enable generating the reduced BMI. This patch did: - When `-fexperimental-modules-reduced-bmi` is specified but `--precompile` is not specified for a module unit, we'll skip the precompile phase to avoid unnecessary two-phase compilation phases. Then if `-c` is specified, we will generate the reduced BMI in CodeGenAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified and `--precompile` is specified, we will generate the reduced BMI in GenerateModuleInterfaceAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified for a non-module unit. We don't do anything nor try to give a warn. This is more user friendly so that the end users can try to test and experiment with the feature without asking help from the build systems. The core design idea is that users should be able to enable this easily with the existing cmake mechanisms. The future plan for the flag is: - Add this to clang19 and make it opt-in for 1~2 releases. It depends on the testing feedback to decide how long we like to make it opt-in. - Then we can announce the existing BMI generating may be deprecated and suggesting people (end users or build systems) to enable this for 1~2 releases. - Finally we will enable this by default. When that time comes, the term `BMI` will refer to the reduced BMI today and the existing BMI will only be meaningful to build systems which loves to support two phase compilations. I'll send release notes and document in seperate commits after this get landed.
…85050) This is the driver part of llvm#75894. This patch introduces '-fexperimental-modules-reduced-bmi' to enable generating the reduced BMI. This patch did: - When `-fexperimental-modules-reduced-bmi` is specified but `--precompile` is not specified for a module unit, we'll skip the precompile phase to avoid unnecessary two-phase compilation phases. Then if `-c` is specified, we will generate the reduced BMI in CodeGenAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified and `--precompile` is specified, we will generate the reduced BMI in GenerateModuleInterfaceAction as a by-product. - When `-fexperimental-modules-reduced-bmi` is specified for a non-module unit. We don't do anything nor try to give a warn. This is more user friendly so that the end users can try to test and experiment with the feature without asking help from the build systems. The core design idea is that users should be able to enable this easily with the existing cmake mechanisms. The future plan for the flag is: - Add this to clang19 and make it opt-in for 1~2 releases. It depends on the testing feedback to decide how long we like to make it opt-in. - Then we can announce the existing BMI generating may be deprecated and suggesting people (end users or build systems) to enable this for 1~2 releases. - Finally we will enable this by default. When that time comes, the term `BMI` will refer to the reduced BMI today and the existing BMI will only be meaningful to build systems which loves to support two phase compilations. I'll send release notes and document in seperate commits after this get landed.
Close #71034
See
https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755
This patch introduces reduced BMI, which doesn't contain the definitions of functions and variables if its definitions won't contribute to the ABI.
Testing is a big part of the patch. We want to make sure the reduced BMI contains the same behavior with the existing and relatively stable fatBMI. This is pretty helpful for further reduction.
The user interfaces part it left to following patches to ease the reviewing.