-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Modules] No transitive source location change #86912
[Modules] No transitive source location change #86912
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
9046d21
to
8d4e349
Compare
@llvm/pr-subscribers-clang-modules @llvm/pr-subscribers-clang Author: Chuanqi Xu (ChuanqiXu9) ChangesThis is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for:
Almost every time A.cppm changes, we need to recompile Motivation ExampleThis patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test.
Here the only difference between However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from Internal perspective of status quoTo fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as:
As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for,
Assuming the offset of a source location (let's name the location as And when we want to write module Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is:
In Then we're almost done about how we handle the offset of source locations in serializers. The high level design of current patchFrom the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 208M from 200M. I think the trade-off is pretty fair. Some low level detailsI didn't use another slot to record the module file index. I tried to use the higher bits of the existing source location encodings to store that information. This design may be safe. Since we use Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. I don't remove optimizations as CorrectnessThe mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it may be correct. Future PlansI'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. Patch is 42.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/86912.diff 15 Files Affected:
diff --git a/clang/include/clang/Basic/SourceLocation.h b/clang/include/clang/Basic/SourceLocation.h
index 00b1e0fa855b7a..7a0f5ba8d1270b 100644
--- a/clang/include/clang/Basic/SourceLocation.h
+++ b/clang/include/clang/Basic/SourceLocation.h
@@ -90,6 +90,7 @@ class SourceLocation {
friend class ASTWriter;
friend class SourceManager;
friend struct llvm::FoldingSetTrait<SourceLocation, void>;
+ friend class SourceLocationEncoding;
public:
using UIntTy = uint32_t;
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index f31efa5117f0d1..628ce03572fea6 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -22,6 +22,7 @@
#include "clang/Basic/IdentifierTable.h"
#include "clang/Basic/OperatorKinds.h"
#include "clang/Basic/SourceLocation.h"
+#include "clang/Serialization/SourceLocationEncoding.h"
#include "llvm/ADT/DenseMapInfo.h"
#include "llvm/Bitstream/BitCodes.h"
#include <cassert>
@@ -175,45 +176,38 @@ const unsigned int NUM_PREDEF_SUBMODULE_IDS = 1;
/// Source range/offset of a preprocessed entity.
struct PPEntityOffset {
+ using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
/// Raw source location of beginning of range.
- SourceLocation::UIntTy Begin;
+ RawLocEncoding Begin;
/// Raw source location of end of range.
- SourceLocation::UIntTy End;
+ RawLocEncoding End;
/// Offset in the AST file relative to ModuleFile::MacroOffsetsBase.
uint32_t BitOffset;
- PPEntityOffset(SourceRange R, uint32_t BitOffset)
- : Begin(R.getBegin().getRawEncoding()), End(R.getEnd().getRawEncoding()),
- BitOffset(BitOffset) {}
-
- SourceLocation getBegin() const {
- return SourceLocation::getFromRawEncoding(Begin);
- }
+ PPEntityOffset(RawLocEncoding Begin, RawLocEncoding End, uint32_t BitOffset)
+ : Begin(Begin), End(End), BitOffset(BitOffset) {}
- SourceLocation getEnd() const {
- return SourceLocation::getFromRawEncoding(End);
- }
+ RawLocEncoding getBegin() const { return Begin; }
+ RawLocEncoding getEnd() const { return End; }
};
/// Source range of a skipped preprocessor region
struct PPSkippedRange {
+ using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
/// Raw source location of beginning of range.
- SourceLocation::UIntTy Begin;
+ RawLocEncoding Begin;
/// Raw source location of end of range.
- SourceLocation::UIntTy End;
+ RawLocEncoding End;
- PPSkippedRange(SourceRange R)
- : Begin(R.getBegin().getRawEncoding()), End(R.getEnd().getRawEncoding()) {
- }
+ PPSkippedRange(RawLocEncoding Begin, RawLocEncoding End)
+ : Begin(Begin), End(End) {}
- SourceLocation getBegin() const {
- return SourceLocation::getFromRawEncoding(Begin);
- }
- SourceLocation getEnd() const {
- return SourceLocation::getFromRawEncoding(End);
- }
+ RawLocEncoding getBegin() const { return Begin; }
+ RawLocEncoding getEnd() const { return End; }
};
/// Offset in the AST file. Use splitted 64-bit integer into low/high
@@ -239,8 +233,10 @@ struct UnderalignedInt64 {
/// Source location and bit offset of a declaration.
struct DeclOffset {
+ using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
/// Raw source location.
- SourceLocation::UIntTy Loc = 0;
+ RawLocEncoding RawLoc = 0;
/// Offset relative to the start of the DECLTYPES_BLOCK block. Keep
/// structure alignment 32-bit and avoid padding gap because undefined
@@ -248,17 +244,15 @@ struct DeclOffset {
UnderalignedInt64 BitOffset;
DeclOffset() = default;
- DeclOffset(SourceLocation Loc, uint64_t BitOffset,
- uint64_t DeclTypesBlockStartOffset) {
- setLocation(Loc);
+ DeclOffset(RawLocEncoding RawLoc, uint64_t BitOffset,
+ uint64_t DeclTypesBlockStartOffset)
+ : RawLoc(RawLoc) {
setBitOffset(BitOffset, DeclTypesBlockStartOffset);
}
- void setLocation(SourceLocation L) { Loc = L.getRawEncoding(); }
+ void setRawLoc(RawLocEncoding Loc) { RawLoc = Loc; }
- SourceLocation getLocation() const {
- return SourceLocation::getFromRawEncoding(Loc);
- }
+ RawLocEncoding getRawLoc() const { return RawLoc; }
void setBitOffset(uint64_t Offset, const uint64_t DeclTypesBlockStartOffset) {
BitOffset.setBitOffset(Offset - DeclTypesBlockStartOffset);
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 370d8037a4da17..017c6b76a91495 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -696,7 +696,7 @@ class ASTReader
/// Mapping from global submodule IDs to the module file in which the
/// submodule resides along with the offset that should be added to the
/// global submodule ID to produce a local ID.
- GlobalSubmoduleMapType GlobalSubmoduleMap;
+ mutable GlobalSubmoduleMapType GlobalSubmoduleMap;
/// A set of hidden declarations.
using HiddenNames = SmallVector<Decl *, 2>;
@@ -942,6 +942,12 @@ class ASTReader
/// Sema tracks these to emit deferred diags.
llvm::SmallSetVector<serialization::DeclID, 4> DeclsToCheckForDeferredDiags;
+ /// The module files imported by different module files. Indirectly imported
+ /// module files are included too. The information comes from
+ /// ReadModuleOffsetMap(ModuleFile&).
+ mutable llvm::DenseMap<ModuleFile *, llvm::SmallVector<ModuleFile *>>
+ ImportedModuleFiles;
+
private:
struct ImportedSubmodule {
serialization::SubmoduleID ID;
@@ -1761,6 +1767,7 @@ class ASTReader
/// Retrieve the module manager.
ModuleManager &getModuleManager() { return ModuleMgr; }
+ const ModuleManager &getModuleManager() const { return ModuleMgr; }
/// Retrieve the preprocessor.
Preprocessor &getPreprocessor() const { return PP; }
@@ -2170,8 +2177,8 @@ class ASTReader
/// Retrieve the global submodule ID given a module and its local ID
/// number.
- serialization::SubmoduleID
- getGlobalSubmoduleID(ModuleFile &M, unsigned LocalID);
+ serialization::SubmoduleID getGlobalSubmoduleID(ModuleFile &M,
+ unsigned LocalID) const;
/// Retrieve the submodule that corresponds to a global submodule ID.
///
@@ -2184,7 +2191,7 @@ class ASTReader
/// Retrieve the module file with a given local ID within the specified
/// ModuleFile.
- ModuleFile *getLocalModuleFile(ModuleFile &M, unsigned ID);
+ ModuleFile *getLocalModuleFile(ModuleFile &M, unsigned ID) const;
/// Get an ID for the given module file.
unsigned getModuleFileID(ModuleFile *M);
@@ -2220,40 +2227,47 @@ class ASTReader
return Sema::AlignPackInfo::getFromRawEncoding(Raw);
}
+ using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
/// Read a source location from raw form and return it in its
/// originating module file's source location space.
- SourceLocation ReadUntranslatedSourceLocation(SourceLocation::UIntTy Raw,
- LocSeq *Seq = nullptr) const {
+ std::pair<SourceLocation, unsigned>
+ ReadUntranslatedSourceLocation(RawLocEncoding Raw,
+ LocSeq *Seq = nullptr) const {
return SourceLocationEncoding::decode(Raw, Seq);
}
/// Read a source location from raw form.
- SourceLocation ReadSourceLocation(ModuleFile &ModuleFile,
- SourceLocation::UIntTy Raw,
- LocSeq *Seq = nullptr) const {
- SourceLocation Loc = ReadUntranslatedSourceLocation(Raw, Seq);
- return TranslateSourceLocation(ModuleFile, Loc);
+ SourceLocation ReadRawSourceLocation(ModuleFile &MF, RawLocEncoding Raw,
+ LocSeq *Seq = nullptr) const {
+ if (!MF.ModuleOffsetMap.empty())
+ ReadModuleOffsetMap(MF);
+
+ auto [Loc, ModuleFileIndex] = ReadUntranslatedSourceLocation(Raw, Seq);
+ ModuleFile *ModuleFileHomingLoc =
+ ModuleFileIndex ? ImportedModuleFiles[&MF][ModuleFileIndex - 1] : &MF;
+ return TranslateSourceLocation(*ModuleFileHomingLoc, Loc);
}
/// Translate a source location from another module file's source
/// location space into ours.
SourceLocation TranslateSourceLocation(ModuleFile &ModuleFile,
SourceLocation Loc) const {
- if (!ModuleFile.ModuleOffsetMap.empty())
- ReadModuleOffsetMap(ModuleFile);
- assert(ModuleFile.SLocRemap.find(Loc.getOffset()) !=
- ModuleFile.SLocRemap.end() &&
- "Cannot find offset to remap.");
- SourceLocation::IntTy Remap =
- ModuleFile.SLocRemap.find(Loc.getOffset())->second;
- return Loc.getLocWithOffset(Remap);
+ if (Loc.isInvalid())
+ return Loc;
+
+ // It implies that the Loc is already translated.
+ if (SourceMgr.isLoadedSourceLocation(Loc))
+ return Loc;
+
+ return Loc.getLocWithOffset(ModuleFile.SLocEntryBaseOffset - 2);
}
/// Read a source location.
SourceLocation ReadSourceLocation(ModuleFile &ModuleFile,
const RecordDataImpl &Record, unsigned &Idx,
LocSeq *Seq = nullptr) {
- return ReadSourceLocation(ModuleFile, Record[Idx++], Seq);
+ return ReadRawSourceLocation(ModuleFile, Record[Idx++], Seq);
}
/// Read a FileID.
diff --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index 3ed9803fa3745b..70bf204ed598ef 100644
--- a/clang/include/clang/Serialization/ASTWriter.h
+++ b/clang/include/clang/Serialization/ASTWriter.h
@@ -648,6 +648,10 @@ class ASTWriter : public ASTDeserializationListener,
void AddSourceLocation(SourceLocation Loc, RecordDataImpl &Record,
LocSeq *Seq = nullptr);
+ /// Return the raw encodings for source locations.
+ SourceLocationEncoding::RawLocEncoding
+ getRawSourceLocationEncoding(SourceLocation Loc, LocSeq *Seq = nullptr);
+
/// Emit a source range.
void AddSourceRange(SourceRange Range, RecordDataImpl &Record,
LocSeq *Seq = nullptr);
diff --git a/clang/include/clang/Serialization/ModuleFile.h b/clang/include/clang/Serialization/ModuleFile.h
index bc0aa89966c2b4..ea24b44e5e411b 100644
--- a/clang/include/clang/Serialization/ModuleFile.h
+++ b/clang/include/clang/Serialization/ModuleFile.h
@@ -295,10 +295,6 @@ class ModuleFile {
/// AST file.
const uint32_t *SLocEntryOffsets = nullptr;
- /// Remapping table for source locations in this module.
- ContinuousRangeMap<SourceLocation::UIntTy, SourceLocation::IntTy, 2>
- SLocRemap;
-
// === Identifiers ===
/// The number of identifiers in this AST file.
diff --git a/clang/include/clang/Serialization/SourceLocationEncoding.h b/clang/include/clang/Serialization/SourceLocationEncoding.h
index 9bb0dbe2e4d6f6..3925b08c1837ed 100644
--- a/clang/include/clang/Serialization/SourceLocationEncoding.h
+++ b/clang/include/clang/Serialization/SourceLocationEncoding.h
@@ -6,23 +6,26 @@
//
//===----------------------------------------------------------------------===//
//
-// Source locations are stored pervasively in the AST, making up a third of
-// the size of typical serialized files. Storing them efficiently is important.
+// We wish to encode the SourceLocation from other module file not dependent
+// on the other module file. So that the source location changes from other
+// module file may not affect the contents of the current module file. Then the
+// users don't need to recompile the whole project due to a new line in a module
+// unit in the root of the dependency graph.
//
-// We use integers optimized by VBR-encoding, because:
-// - when abbreviations cannot be used, VBR6 encoding is our only choice
-// - in the worst case a SourceLocation can be ~any 32-bit number, but in
-// practice they are highly predictable
+// To achieve this, we need to encode the index of the module file into the
+// encoding of the source location. The encoding of the source location may be:
//
-// We encode the integer so that likely values encode as small numbers that
-// turn into few VBR chunks:
-// - the invalid sentinel location is a very common value: it encodes as 0
-// - the "macro or not" bit is stored at the bottom of the integer
-// (rather than at the top, as in memory), so macro locations can have
-// small representations.
-// - related locations (e.g. of a left and right paren pair) are usually
-// similar, so when encoding a sequence of locations we store only
-// differences between successive elements.
+// |-----------------------|-----------------------|
+// | A | B | C |
+//
+// * A: 32 bit. The index of the module file in the module manager + 1. The +1
+// here
+// is necessary since we wish 0 stands for the current module file.
+// * B: 31 bit. The offset of the source location to the module file containing
+// it.
+// * C: The macro bit. We rotate it to the lowest bit so that we can save some
+// space
+// in case the index of the module file is 0.
//
//===----------------------------------------------------------------------===//
@@ -52,11 +55,20 @@ class SourceLocationEncoding {
friend SourceLocationSequence;
public:
- static uint64_t encode(SourceLocation Loc,
- SourceLocationSequence * = nullptr);
- static SourceLocation decode(uint64_t, SourceLocationSequence * = nullptr);
+ using RawLocEncoding = uint64_t;
+
+ static RawLocEncoding encode(SourceLocation Loc, UIntTy BaseOffset,
+ unsigned BaseModuleFileIndex,
+ SourceLocationSequence * = nullptr);
+ static std::pair<SourceLocation, unsigned>
+ decode(RawLocEncoding, SourceLocationSequence * = nullptr);
};
+/// TODO: Remove SourceLocationSequence since it is not used now.
+/// Since we will put the index for ModuleFile in the high bits in the encodings
+/// for source locations, it is meaningless to reduce the size of source
+/// locations.
+///
/// Serialized encoding of a sequence of SourceLocations.
///
/// Optimized to produce small values when locations with the sequence are
@@ -149,14 +161,30 @@ class SourceLocationSequence::State {
operator SourceLocationSequence *() { return &Seq; }
};
-inline uint64_t SourceLocationEncoding::encode(SourceLocation Loc,
- SourceLocationSequence *Seq) {
- return Seq ? Seq->encode(Loc) : encodeRaw(Loc.getRawEncoding());
+inline SourceLocationEncoding::RawLocEncoding
+SourceLocationEncoding::encode(SourceLocation Loc, UIntTy BaseOffset,
+ unsigned BaseModuleFileIndex,
+ SourceLocationSequence *Seq) {
+ if (Loc.isInvalid())
+ return 0;
+
+ assert(Loc.getOffset() >= BaseOffset);
+ Loc = Loc.getLocWithOffset(-BaseOffset);
+ RawLocEncoding Encoded = encodeRaw(Loc.getRawEncoding());
+ assert(Encoded < ((RawLocEncoding)1 << 32));
+
+ assert(BaseModuleFileIndex < ((RawLocEncoding)1 << 32));
+ Encoded |= (RawLocEncoding)BaseModuleFileIndex << 32;
+ return Encoded;
}
-inline SourceLocation
-SourceLocationEncoding::decode(uint64_t Encoded, SourceLocationSequence *Seq) {
- return Seq ? Seq->decode(Encoded)
- : SourceLocation::getFromRawEncoding(decodeRaw(Encoded));
+inline std::pair<SourceLocation, unsigned>
+SourceLocationEncoding::decode(RawLocEncoding Encoded,
+ SourceLocationSequence *Seq) {
+ unsigned ModuleFileIndex = Encoded >> 32;
+ Encoded &= ((RawLocEncoding)1 << 33) - 1;
+ SourceLocation Loc = SourceLocation::getFromRawEncoding(decodeRaw(Encoded));
+
+ return {Loc, ModuleFileIndex};
}
} // namespace clang
diff --git a/clang/lib/Frontend/ASTUnit.cpp b/clang/lib/Frontend/ASTUnit.cpp
index 3610a08831e79a..1c655260b09eb5 100644
--- a/clang/lib/Frontend/ASTUnit.cpp
+++ b/clang/lib/Frontend/ASTUnit.cpp
@@ -2373,8 +2373,6 @@ bool ASTUnit::serialize(raw_ostream &OS) {
return serializeUnit(Writer, Buffer, getSema(), OS);
}
-using SLocRemap = ContinuousRangeMap<unsigned, int, 2>;
-
void ASTUnit::TranslateStoredDiagnostics(
FileManager &FileMgr,
SourceManager &SrcMgr,
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 28e8d27fef08c6..625666ead0fb49 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -1645,7 +1645,7 @@ bool ASTReader::ReadSLocEntry(int ID) {
if (!File)
return true;
- SourceLocation IncludeLoc = ReadSourceLocation(*F, Record[1]);
+ SourceLocation IncludeLoc = ReadRawSourceLocation(*F, Record[1]);
if (IncludeLoc.isInvalid() && F->Kind != MK_MainFile) {
// This is the module's main file.
IncludeLoc = getImportLocation(F);
@@ -1687,7 +1687,7 @@ bool ASTReader::ReadSLocEntry(int ID) {
unsigned Offset = Record[0];
SrcMgr::CharacteristicKind
FileCharacter = (SrcMgr::CharacteristicKind)Record[2];
- SourceLocation IncludeLoc = ReadSourceLocation(*F, Record[1]);
+ SourceLocation IncludeLoc = ReadRawSourceLocation(*F, Record[1]);
if (IncludeLoc.isInvalid() && F->isModule()) {
IncludeLoc = getImportLocation(F);
}
@@ -1707,9 +1707,9 @@ bool ASTReader::ReadSLocEntry(int ID) {
case SM_SLOC_EXPANSION_ENTRY: {
LocSeq::State Seq;
- SourceLocation SpellingLoc = ReadSourceLocation(*F, Record[1], Seq);
- SourceLocation ExpansionBegin = ReadSourceLocation(*F, Record[2], Seq);
- SourceLocation ExpansionEnd = ReadSourceLocation(*F, Record[3], Seq);
+ SourceLocation SpellingLoc = ReadRawSourceLocation(*F, Record[1], Seq);
+ SourceLocation ExpansionBegin = ReadRawSourceLocation(*F, Record[2], Seq);
+ SourceLocation ExpansionEnd = ReadRawSourceLocation(*F, Record[3], Seq);
SourceMgr.createExpansionLoc(SpellingLoc, ExpansionBegin, ExpansionEnd,
Record[5], Record[4], ID,
BaseOffset + Record[0]);
@@ -3038,8 +3038,10 @@ ASTReader::ReadControlBlock(ModuleFile &F,
// The import location will be the local one for now; we will adjust
// all import locations of module imports after the global source
// location info are setup, in ReadAST.
- SourceLocation ImportLoc =
+ auto [ImportLoc, ImportModuleFileIndex] =
ReadUntranslatedSourceLocation(Record[Idx++]);
+ // The import location must belong to the current module file itself.
+ assert(ImportModuleFileIndex == 0);
off_t StoredSize = !IsImportingStdCXXModule ? (off_t)Record[Idx++] : 0;
time_t StoredModTime =
!IsImportingStdCXXModule ? (time_t)Record[Idx++] : 0;
@@ -3658,13 +3660,6 @@ llvm::Error ASTReader::ReadASTBlock(ModuleFile &F,
std::make_pair(SourceManager::MaxLoadedOffset - F.SLocEntryBaseOffset
- SLocSpaceSize,&F));
- // Initialize the remapping table.
- // Invalid stays invalid.
- F.SLocRemap.insertOrReplace(std::make_pair(0U, 0));
- // This module. Base was 2 when being compiled.
- F.SLocRemap.insertOrReplace(std::make_pair(
- 2U, static_cast<SourceLocation::IntTy>(F.SLocEntryBaseOffset - 2)));
-
TotalNumSLocEntries += F.LocalNumSLocEntries;
break;
}
@@ -3941,7 +3936,7 @@ llvm::Error ASTReader::ReadASTBlock(ModuleFile &F,
if (Record.size() != 1)
return llvm::createStringError(std::errc::illegal_byte_sequence,
"invalid pragma optimize record");
- OptimizeOffPragmaLocation = ReadSourceLocation(F, Record[0]);
+ OptimizeOffPragmaLocation = ReadRawSourceLocation(F, Record[0]);
break;
case MSSTRUCT_PRAGMA_OPTIONS:
@@ -3957,7 +3952,7 @@ llvm::Error ASTReader::ReadASTBlock(ModuleFile &F,
std::errc::illegal_byte_sequence,
"invalid pragma pointers to members record");
PragmaMSPointersToMembersState = Record[0];
- PointersToMembersPragmaLocation = ReadSourceLocation(F, Record[1]);
+ PointersToMembersPragmaLocation = ReadRawSourceLocation(F, Record[1]);
break;
case UNUSED_LOCAL_TYPEDEF_NAME_CA...
[truncated]
|
BTW, after this patch, with reduced BMI (#85050), we can already do something more interesting than reformating:
Now the B.pcm will keep unchanged with |
+1 on the high-level plan. Switching from a linear offset to a {local-module-index, offset-within-module} pair sounds great! |
By default, |
Yes, I explained this in The only trade-off I saw about this change is that it may increase the size of on-disk .pcm files due to we use VBR6 format to decrease the size of small numbers. But on the one side, we still need to pay for more spaces if we want to use |
Thank you, using 64 bits in the serialization format makes sense! This also means that whenever Clang is configured with 64 bit
Sorry, I don't quite understand. Are you saying you did or did not try to encode this as two separate 32bit values? |
If Clang is configured with 64 bit Or otherwise we can use less bits for module file index (32 bits seems to be too much honestly), then we can use higher 16 bits to store the module file index, and leave the lower 48 bits to store the source location. In this case, the assumption becomes to "the offset of the source location may not large than 2^48". But it is slightly hard to believe we can reach such extreme cases.
I tried to encode this as two separate 32bit values. But it will break too many codes. Since a lot of places assume that we can encode the source location as an uint64_t. What I mean is, with VBR6 format (https://llvm.org/docs/BitCodeFormat.html#variable-width-integer), we can save more space for small integers in on-disk .pcm files (the memory representation should be the same). For example, for a 64 bits unsigned int |
Let's see if @statham-arm (who introduced the
Right. My thinking was that single 64bit value with the module file index in the upper 32 bits would basically disable VBR6 encoding for the lower 32 bits. If we split this thing into two separate 32bit values, we are more likely to VBR6 encode both of them. But this would actually increase size for (what I assume is the most common case) local source locations. Still, I think having a rough idea of how alternative implementations compare would be great. Do you have any data on how much recompilation this can save for real world projects? |
Yes, this is the trade offs.
I don't have a specific data though. But I think it is understandable that the feature is super important. And we can save countless unnecessary compilations after this feature. After the patch, for reduced BMI, we're already able to avoid changing the BMI if we only change the definitions of non-inline function bodies in the module unit. Further more, based on this patch, we can do further optimizations. e.g., if we split declaration ID like we did in this patch, we may be able to avoid recompilations if a unreferenced module unit adds or deletes declarations. This may be pretty common in practice:
Then if the user only changes module unit I see this is a killer feature for C++20 modules. I think it is significantly valuable. So I really want to make this (including later optimizations) into clang19. |
I'm afraid my knowledge of C++ modules is very close to zero. They were mentioned in a training course I did last year, but not in much detail. On 64-bit SourceLocation in general: our patch series to implement those as an option in clang was never fully landed, because the second half of it stalled in review. I'd still like to see it finished off, though I'm sure it would need some vigorous rebasing and retesting by now. The 32-bit SourceLocation limit is a problem for at least some of our users, apparently because there's a library of header files that break the limit all by themselves. (I'm not sure how; I haven't seen them. Maybe by including each other multiple times with different But I have no idea whether the same considerations would apply to modules, because I don't really know enough about modules, sorry! |
Hi Statham, thanks for quick reply. I guess the abstract question beyond modules may be: is it safe to use 48 bit to store the source location? Then we can use the higher 16 bit to represent module specific informations, e.g., module file index. |
Yes, we think that should be safe – we don't think any of our users is generating 2^48 bytes of preprocessed output. Parsing it afterwards would take prohibitively long if they did! Probably they aren't breaking the 2^32 barrier by much. |
Thanks. And that makes sense. I never heard compilation crash due to too long source fils before : ) |
8d4e349
to
8e07cbd
Compare
Rebase with main and enabling the sequence optimization when the module file index is 0. Now the size of the PCMs goes to 204M (originally it is 200M). It looks better. @jansvoboda11 @Bigcheese ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the general approach makes sense. I'll take a closer look at the specific changes.
assert(Loc.getOffset() >= BaseOffset); | ||
Loc = Loc.getLocWithOffset(-BaseOffset); | ||
RawLocEncoding Encoded = encodeRaw(Loc.getRawEncoding()); | ||
assert(Encoded < ((RawLocEncoding)1 << 32)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My compiler says:
warning: result of comparison of constant 4294967296 with expression of type 'unsigned int' is always true [-Wtautological-constant-out-of-range-compare]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you may refer to the next line and that one is indeed comparing unsigned with (1<<32)
, and I fixed that. This line should be fine since Encoded has type RawLocEncoding
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm referring to this line. I assume the compiler sees through RawLocEncoding
and knows it's being initialized from SourceLocation::UIntTy
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, the compiler may be too smart : )
My intention was like, if someday we turned 64 bit source location on, this assertion can make us remember to update here.
But given the warning, I don't know how to write the assertion here. So I just removed it.
mutable llvm::DenseMap<ModuleFile *, llvm::SmallVector<ModuleFile *>> | ||
ImportedModuleFiles; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this live in ASTReader
? If we moved it to ModuleFile
, we could remove mutable
and the map lookups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in the new commit.
There was a similar member Imports
in ModuleFile
, which records the directly imported module files from the coding. But, yes, it is good to remove the map lookups and let's try to handle the ambiguous by commenting.
// 0 means the location is not loaded. So we need to add 1 to the index to | ||
// make it clear. | ||
ModuleFileIndex = F->Index + 1; | ||
assert(&getChain()->getModuleManager()[F->Index] == F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit tautological.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but I'd like to remain it. The assertions make me feel better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think ASTWriter
should be re-checking ModuleManager
invariants either way.
Reverted. It looks like the configuration isn't in our bots actually. I can't open that site. I need to take another look at the code though. |
Oh, maybe I found the reason. It is because my patch breaks the alignments of llvm-project/clang/include/clang/Serialization/ASTBitCodes.h Lines 237 to 240 in 8d28e58
then it explains why it work well in some platforms but not in other platforms. I'll fix this. |
This is certainly a case of unaligned access. In a local build, I've run the first failing
this reveals
The |
thanks, it is pretty helpful. |
This relands 6c31104. The patch was reverted due to incorrectly introduced alignment. And the patch was re-commited after fixing the alignment issue. Following off are the original message: This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. * The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file * containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save * some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.
I've relanded this in 947b062. |
into the current module Following of #86912. After #86912, with reduced BMI, the BMI can keep unchange if the dependent modules only changes the implementation (without introduing new decls). However, this is not strictly correct. For example: ``` // a.cppm export module a; export inline int a() { ... } // b.cppm export module b; import a; export inline int b() { return a(); } ``` Since both `a()` and `b()` are inline, we need to make sure the BMI of `b.pcm` will change after the implementation of `a()` changes. We can't get that naturally since we won't record the body of `a()` during the writing process. We can't reuse ODRHash here since ODRHash won't calculate the called function recursively. So ODRHash will be problematic if `a()` calls other inline functions. Probably we can solve this by a new hash mechanism. But the safety and efficiency may a problem too. Here we just combine the hash value of the used modules conservatively.
Following of #86912 #### Motivation Example The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. #### Design details The design of the patch is similar to #86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. #### Overhead As #86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of #86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to #86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As #86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of #86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to #86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As #86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of llvm#86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to llvm#86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As llvm#86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of #86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to #86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As #86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of llvm#86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to llvm#86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As llvm#86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of #86912 The motivation of the patch series is that, for a module interface unit `X`, when the dependent modules of `X` changes, if the changes is not relevant with `X`, we hope the BMI of `X` won't change. For the specific patch, we hope if the changes was about irrelevant declaration changes, we hope the BMI of `X` won't change. **However**, I found the patch itself is not very useful in practice, since the adding or removing declarations, will change the state of identifiers and types in most cases. That said, for the most simple example, ``` // partA.cppm export module m:partA; // partA.v1.cppm export module m:partA; export void a() {} // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as: ``` // partA.cppm export module m:partA; export int getA() { ... } export int getA2(int) { ... } // partA.v1.cppm export module m:partA; export int getA() { ... } export int getA(int) { ... } export int getA2(int) { ... } // partB.cppm export module m:partB; export void b() {} // m.cppm export module m; export import :partA; export import :partB; // onlyUseB; export module onlyUseB; import m; export inline void onluUseB() { b(); } ``` so that the new introduced declaration `int getA(int)` doesn't introduce new identifiers and types, then the BMI of `onlyUseB` can keep unchanged. While it looks not so great, the patch should be the base of the patch to erase the transitive change for identifiers and types since I don't know how can we introduce new types and identifiers without introducing new declarations. Given how tightly the relationship between declarations, types and identifiers, I think we can only reach the ideal state after we made the series for all of the three entties. The design of the patch is similar to #86912, which extends the 32-bit DeclID to 64-bit and use the higher bits to store the module file index and the lower bits to store the Local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID since we try to use the higher 16 bits to store the module ID in the prefix of Decl class. Previously, we use 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space so I tried to make the additional space the same as 64 bits. An potential interesting thing here is about the relationship between the module ID and the module file index. I feel we can get the module file index by the module ID. But I didn't prove it or implement it. Since I want to make the patch itself as small as possible. We can make it in the future if we want. Another change in the patch is the new concept Decl Index, which means the index of the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration is simply the Decl ID minus PREDEFINED_DECL_NUMs. So there are some places they got used ambiguously. But this patch tried to split these two concepts. As #86912 did, the change will increase the on-disk PCM file sizes. As the declaration ID may be the most IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change will bring 6.6% increase of the on-disk PCM size. No compile-time performance regression observed. Given the benefits in the motivation example, I think the cost is worthwhile.
Following of #92083 The motivation is still cutting of the unnecessary change in the dependency chain. See the above link (recursively) for details. After this patch, (and the above patch), we can already do something pretty interesting. For example, #### Motivation example ``` //--- m-partA.cppm export module m:partA; export inline int getA() { return 43; } export class A { public: int getMem(); }; export template <typename T> class ATempl { public: T getT(); }; //--- m-partA.v1.cppm export module m:partA; export inline int getA() { return 43; } // Now we add a new declaration without introducing a new type. // The consuming module which didn't use m:partA completely is expected to be // not changed. export inline int getA2() { return 88; } export class A { public: int getMem(); // Now we add a new declaration without introducing a new type. // The consuming module which didn't use m:partA completely is expected to be // not changed. int getMem2(); }; export template <typename T> class ATempl { public: T getT(); // Add a new declaration without introducing a new type. T getT2(); }; //--- m-partB.cppm export module m:partB; export inline int getB() { return 430; } //--- m.cppm export module m; export import :partA; export import :partB; //--- useBOnly.cppm export module useBOnly; import m; export inline int get() { return getB(); } ``` In this example, module `m` exports two partitions `:partA` and `:partB`. And a consumer `useBOnly` only consumes the entities from `:partB`. So we don't hope the BMI of `useBOnly` changes if only `:partA` changes. After this patch, we can make it if the change of `:partA` doesn't introduce new types. (And we can get rid of this if we make no-transitive-type-change). As the example shows, when we change the implementation of `:partA` from `m-partA.cppm` to `m-partA.v1.cppm`, we add new function declaration `getA2()` at the global namespace, add a new member function `getMem2()` to class `A` and add a new member function to `getT2()` to class template `ATempl`. And since `:partA` is not used by `useBOnly` completely, the BMI of `useBOnly` won't change after we made above changes. #### Design details Method used in this patch is similar with #92083 and #86912. It extends the 32 bit IdentifierID to 64 bits and use the higher 32 bits to store the module file index. So that the encoding of the identifier won't get affected by other modules. #### Overhead Similar with #92083 and #86912. The change is only expected to increase the size of the on-disk .pcm files and not affect the compile-time performances. And from my experiment, the size of the on-disk change only increase 1%+ and observe no compile-time impacts. #### Future Plans I'll try to do the same thing for type ids. IIRC, it won't change the dependency graph if we add a new type in an unused units. I do think this is a significant win. And this will be a pretty good answer to "why modules are better than headers."
Following of llvm#92083 The motivation is still cutting of the unnecessary change in the dependency chain. See the above link (recursively) for details. After this patch, (and the above patch), we can already do something pretty interesting. For example, #### Motivation example ``` //--- m-partA.cppm export module m:partA; export inline int getA() { return 43; } export class A { public: int getMem(); }; export template <typename T> class ATempl { public: T getT(); }; //--- m-partA.v1.cppm export module m:partA; export inline int getA() { return 43; } // Now we add a new declaration without introducing a new type. // The consuming module which didn't use m:partA completely is expected to be // not changed. export inline int getA2() { return 88; } export class A { public: int getMem(); // Now we add a new declaration without introducing a new type. // The consuming module which didn't use m:partA completely is expected to be // not changed. int getMem2(); }; export template <typename T> class ATempl { public: T getT(); // Add a new declaration without introducing a new type. T getT2(); }; //--- m-partB.cppm export module m:partB; export inline int getB() { return 430; } //--- m.cppm export module m; export import :partA; export import :partB; //--- useBOnly.cppm export module useBOnly; import m; export inline int get() { return getB(); } ``` In this example, module `m` exports two partitions `:partA` and `:partB`. And a consumer `useBOnly` only consumes the entities from `:partB`. So we don't hope the BMI of `useBOnly` changes if only `:partA` changes. After this patch, we can make it if the change of `:partA` doesn't introduce new types. (And we can get rid of this if we make no-transitive-type-change). As the example shows, when we change the implementation of `:partA` from `m-partA.cppm` to `m-partA.v1.cppm`, we add new function declaration `getA2()` at the global namespace, add a new member function `getMem2()` to class `A` and add a new member function to `getT2()` to class template `ATempl`. And since `:partA` is not used by `useBOnly` completely, the BMI of `useBOnly` won't change after we made above changes. #### Design details Method used in this patch is similar with llvm#92083 and llvm#86912. It extends the 32 bit IdentifierID to 64 bits and use the higher 32 bits to store the module file index. So that the encoding of the identifier won't get affected by other modules. #### Overhead Similar with llvm#92083 and llvm#86912. The change is only expected to increase the size of the on-disk .pcm files and not affect the compile-time performances. And from my experiment, the size of the on-disk change only increase 1%+ and observe no compile-time impacts. #### Future Plans I'll try to do the same thing for type ids. IIRC, it won't change the dependency graph if we add a new type in an unused units. I do think this is a significant win. And this will be a pretty good answer to "why modules are better than headers."
(Some backgrounds, not required to read: https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755) This is the document part for the no-transitive-change (#86912, #92083, #92085, #92511) to provide the ability for build system to skip some unnecessary recompilations. See the patch for examples.
(Some backgrounds, not required to read: https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755) This is the document part for the no-transitive-change (llvm#86912, llvm#92083, llvm#92085, llvm#92511) to provide the ability for build system to skip some unnecessary recompilations. See the patch for examples.
Summary: (Some backgrounds, not required to read: https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755) This is the document part for the no-transitive-change (#86912, #92083, #92085, #92511) to provide the ability for build system to skip some unnecessary recompilations. See the patch for examples. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251509
This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting.
The idea comes from @jyknight posted on LLVM discourse. That for:
Almost every time A.cppm changes, we need to recompile
B
. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompilingC
if the change fromA
wouldn't change the BMI of B.Motivation Example
This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test.
Here the only difference between
A.cppm
andA.v1.cppm
is thatA.v1.cppm
has an additional blank line. Then the test shows that two BMI ofB.cppm
, one specified-fmodule-file=A=A.pcm
and the other specified-fmodule-file=A=A.v1.pcm
, should have the bit-wise same contents.However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from
A
andA.v1
, so it is expected that the BMIC.pcm
andC.v1.pcm
can and should differ.Internal perspective of status quo
To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them.
For source locations, we encoded them as:
As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset.
For example, for,
Assuming the offset of a source location (let's name the location as
S
) in a.cppm is 45 and we will record the value45
into the BMIa.pcm
. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space forS
, we can get it simply by adding45
to90
to135
. Finally we can get the source location forS
in module B as135
.And when we want to write module
b
, we would also write the source location ofS
as135
directly in the BMI. And to clarify the locationS
comes from modulea
, we also need to record the base offset of modulea
, 90 in the BMI ofb
.Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve.
Let's continue with the existing design to understand what's going on. Another interesting case is:
In
c.cppm
, when we importa
, we still need to allocate a base location offset for it, let's say the value becomes to200
somehow. Then when we reach the locationS
recorded in moduleb
, we need to translate it into the current source location space. The solution is quite simple, we can get it by135 + (200 - 90) = 245
. In another word, the offset of a source location in current module can be computed asRecorded Offset + Base Offset of the its module file - Recorded Base Offset
.Then we're almost done about how we handle the offset of source locations in serializers.
The high level design of current patch
From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location.
So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in
b.pcm
, the source location ofS
will be recorded as135
directly. And in the new design, the source location ofS
will be recorded as<1, 45>
. Here1
stands for the module file index ofa
in moduleb
. And45
means the offset ofS
to the base offset of modulea
.So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too.
And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair.
Some low level details
I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use
unsigned
to store source locations but we use uint64_t in serialization. And generallyunsigned
is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be:(The B and C is the existing raw encoding for source locations)
Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty.
Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the
SourceLocationSequence
optimization.This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy.
Correctness
The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location.
For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct.
Future Plans
I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing.
The release nots and documentation will be added seperately.