Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Port VN and loop hoisting to new loop representation #95589

Merged
merged 26 commits into from
Dec 7, 2023

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Dec 4, 2023

Switch VN to use the new loop representation for the limited amount of
stuff it does. This is mainly computing loop side effects and using it
for some more precision, in addition to storing some memory dependencies
around loops.
It requires us to have a block -> loop mapping, so add a
BlockToNaturalLoopMap class to do this. We really do not need this
functionality for much, so we may consider seeing if we can remove it in
the future (and remove BasicBlock::bbNatLoopNum).

In loop hoisting move a few members out of LoopDsc and into
LoopHoistContext; in particular the counts of hoisted variables, which
we use for profitability checks and which gets updated while hoisting is
going on for a single loop. We do not need to refer to the information
from other loops.

Record separate postorder numbers for the old and new DFS since we need
to use both simultaneously after loop unrolling. We will be able to get
rid of this again soon.

A small number of diffs are expected because the loop side effects
computation is now more precise, since the old loop code includes some
blocks in loops that are not actually part of the loop. For example,
blocks that always throw can be in the lexical range and would
previously cause the side effect computation to believe there was a
memory havoc. Also, the side effect computation does some limited value
tracking of assignment, which is more precise now since it is running in
RPO instead of based on loop order before.

Run the new loop recognition at the same time as old loop recognition
and cross validate that the information matches the old loop
information.

Implement induction analysis on the new representation. Validate that it
matches the old loop induction analysis. Add a few quirks to ensure
this.

Port loop cloning to run on the new loop representation. Add a number of
quirks to keep the logic close to the old cloning. One notable
difference is that the old loop recognition considers some blocks to be
part of the loops that actually are not; for example, blocks that return
or break out from the loop are not part of the loop's SCC, but can be
part of the lexical range. This means that the old loop cloning will
clone those blocks and apply static optimizations to them, while the new
loop cloning won't. However, this does not seem to cause a significant
number of regressions.

A few diffs are expected due to considering fewer blocks to be part of
the loop and thus cloning fewer blocks. Also, throughput regressions are
expected since we are doing both loop identifications for now. I
expect to incrementally work towards replacing everything with the new
representation, at which point we can get rid of the old loop
identification and hopefully most of the reachability/dominator
computations. I expect us to make back the TP at that point.
Switch VN to use the new loop representation for the limited amount of
stuff it does. This is mainly computing loop side effects and using it
for some more precision, in addition to storing some memory dependencies
around loops.
It requires us to have a block -> loop mapping, so add a
`BlockToNaturalLoopMap` class to do this. We really do not need this
functionality for much, so we may consider seeing if we can remove it in
the future (and remove `BasicBlock::bbNatLoopNum`).

In loop hoisting move a few members out of `LoopDsc` and into
`LoopHoistContext`; in particular the counts of hoisted variables, which
we use for profitability checks and which gets updated while hoisting is
going on for a single loop. We do not need to refer to the information
from other loops.

Record separate postorder numbers for the old and new DFS since we need
to use both simultaneously after loop unrolling. We will be able to get
rid of this again soon.

A small number of diffs are expected because the loop side effects
computation is now more precise, since the old loop code includes some
blocks in loops that are not actually part of the loop. For example,
blocks that always throw can be in the lexical range and would
previously cause the side effect computation to believe there was a
memory havoc. Also, the side effect computation does some limited value
tracking of assignment, which is more precise now since it is running in
RPO instead of based on loop order before.
@ghost ghost assigned jakobbotsch Dec 4, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 4, 2023
@ghost
Copy link

ghost commented Dec 4, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Switch VN to use the new loop representation for the limited amount of
stuff it does. This is mainly computing loop side effects and using it
for some more precision, in addition to storing some memory dependencies
around loops.
It requires us to have a block -> loop mapping, so add a
BlockToNaturalLoopMap class to do this. We really do not need this
functionality for much, so we may consider seeing if we can remove it in
the future (and remove BasicBlock::bbNatLoopNum).

In loop hoisting move a few members out of LoopDsc and into
LoopHoistContext; in particular the counts of hoisted variables, which
we use for profitability checks and which gets updated while hoisting is
going on for a single loop. We do not need to refer to the information
from other loops.

Record separate postorder numbers for the old and new DFS since we need
to use both simultaneously after loop unrolling. We will be able to get
rid of this again soon.

A small number of diffs are expected because the loop side effects
computation is now more precise, since the old loop code includes some
blocks in loops that are not actually part of the loop. For example,
blocks that always throw can be in the lexical range and would
previously cause the side effect computation to believe there was a
memory havoc. Also, the side effect computation does some limited value
tracking of assignment, which is more precise now since it is running in
RPO instead of based on loop order before.

Based on #95326.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jakobbotsch
Copy link
Member Author

TP regressions:

Base: 115669398234, Diff: 115827060902, +0.1363%

60716940  : NA        : 15.70% : +0.0525% : public: unsigned int __cdecl ValueNumStore::VNForExpr(struct BasicBlock *, enum var_types)                                                                                                                                                   
53395839  : NA        : 13.80% : +0.0462% : private: void __cdecl Compiler::optComputeLoopSideEffectsOfBlock(struct BasicBlock *, class FlowGraphNaturalLoop *)                                                                                                                          
33827472  : NA        : 8.74%  : +0.0292% : protected: void __cdecl Compiler::AddContainsCallAllContainingLoops(class FlowGraphNaturalLoop *)                                                                                                                                            
24380889  : +596.76%  : 6.30%  : +0.0211% : protected: void __cdecl Compiler::optComputeLoopSideEffects(void)                                                                                                                                                                            
20125631  : NA        : 5.20%  : +0.0174% : protected: bool __cdecl Compiler::optVNIsLoopInvariant(unsigned int, class FlowGraphNaturalLoop *, class JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior> *)
10564925  : NA        : 2.73%  : +0.0091% : public: static class BlockToNaturalLoopMap * __cdecl BlockToNaturalLoopMap::Build(class FlowGraphNaturalLoops *)                                                                                                                             
8714325   : +4.76%    : 2.25%  : +0.0075% : public: static struct BasicBlock * __cdecl BasicBlock::New(class Compiler *)                                                                                                                                                                 
8096425   : +1.19%    : 2.09%  : +0.0070% : public: void __cdecl Compiler::fgValueNumberTree(struct GenTree *)                                                                                                                                                                           
7373669   : NA        : 1.91%  : +0.0064% : public: unsigned int __cdecl ValueNumStore::VNForMapStore(unsigned int, unsigned int, unsigned int)                                                                                                                                          
7322133   : +38.09%   : 1.89%  : +0.0063% : public: unsigned int __cdecl ValueNumStore::VNForMapSelectInner(enum ValueNumKind, enum var_types, unsigned int, unsigned int)                                                                                                               
5431101   : +0.61%    : 1.40%  : +0.0047% : memset                                                                                                                                                                                                                                       
4280844   : +1.48%    : 1.11%  : +0.0037% : public: void __cdecl Compiler::fgValueNumberBlock(struct BasicBlock *)                                                                                                                                                                       
4204838   : NA        : 1.09%  : +0.0036% : protected: void __cdecl Compiler::optHoistLoopBlocks(class FlowGraphNaturalLoop *, class ArrayStack<struct BasicBlock *> *, struct Compiler::LoopHoistContext *)                                                                             
4124325   : +2.51%    : 1.07%  : +0.0036% : public: void __cdecl Compiler::compInit(class ArenaAllocator *, struct CORINFO_METHOD_STRUCT_*, class ICorJitInfo *, struct CORINFO_METHOD_INFO *, struct InlineInfo *)                                                                      
3321557   : NA        : 0.86%  : +0.0029% : protected: bool __cdecl Compiler::optHoistThisLoop(class FlowGraphNaturalLoop *, struct Compiler::LoopHoistContext *)                                                                                                                        
2852281   : NA        : 0.74%  : +0.0025% : public: void __cdecl Compiler::optFindNewLoops(void)                                                                                                                                                                                         
2337736   : +1170.99% : 0.60%  : +0.0020% : public: bool __cdecl FlowGraphNaturalLoop::ContainsBlock(struct BasicBlock *)                                                                                                                                                                
2197805   : +0.09%    : 0.57%  : +0.0019% : public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)                                                                                                                                                                      
1423528   : +2.27%    : 0.37%  : +0.0012% : public: static class FlowGraphNaturalLoops * __cdecl FlowGraphNaturalLoops::Find(class FlowGraphDfsTree const *)                                                                                                                             
1129028   : +0.34%    : 0.29%  : +0.0010% : private: void __cdecl SsaBuilder::BlockRenameVariables(struct BasicBlock *)                                                                                                                                                                  
1127779   : NA        : 0.29%  : +0.0010% : public: unsigned int __cdecl Compiler::fgMemoryVNForLoopSideEffects(enum MemoryKind, struct BasicBlock *, class FlowGraphNaturalLoop *)                                                                                                      
982049    : +0.36%    : 0.25%  : +0.0008% : `Compiler::fgComputeDfs'::`2'::<lambda_1>::operator()                                                                                                                                                                                        
863690    : NA        : 0.22%  : +0.0007% : protected: void __cdecl Compiler::optHoistCandidate(struct GenTree *, struct BasicBlock *, class FlowGraphNaturalLoop *, struct Compiler::LoopHoistContext *)                                                                                
639173    : NA        : 0.17%  : +0.0006% : private: void __cdecl Compiler::optPerformHoistExpr(struct GenTree *, struct BasicBlock *, class FlowGraphNaturalLoop *)                                                                                                                     
503193    : +69.49%   : 0.13%  : +0.0004% : public: void __cdecl Compiler::optRecordLoopMemoryDependence(struct GenTree *, struct BasicBlock *, unsigned int)                                                                                                                            
-475395   : -15.05%   : 0.12%  : -0.0004% : public: void __cdecl Compiler::fgValueNumberArrayElemLoad(struct GenTree *, struct VNFuncApp *)                                                                                                                                              
-492997   : -100.00%  : 0.13%  : -0.0004% : protected: void __cdecl Compiler::AddModifiedFieldAllContainingLoops(unsigned int, struct CORINFO_FIELD_STRUCT_*, enum Compiler::FieldKindForVN)                                                                                             
-543196   : -28.05%   : 0.14%  : -0.0005% : public: void __cdecl Compiler::fgValueNumberArrayElemStore(struct GenTree *, struct VNFuncApp *, unsigned int, unsigned int)                                                                                                                 
-662618   : -100.00%  : 0.17%  : -0.0006% : private: void __cdecl Compiler::optPerformHoistExpr(struct GenTree *, struct BasicBlock *, unsigned int)                                                                                                                                     
-925938   : -100.00%  : 0.24%  : -0.0008% : protected: void __cdecl Compiler::optHoistCandidate(struct GenTree *, struct BasicBlock *, unsigned int, struct Compiler::LoopHoistContext *)                                                                                                
-935385   : -1.85%    : 0.24%  : -0.0008% : `Compiler::optHoistLoopBlocks'::`2'::HoistVisitor::PostOrderVisit                                                                                                                                                                            
-1028653  : -14.12%   : 0.27%  : -0.0009% : public: void __cdecl Compiler::fgValueNumberFieldStore(struct GenTree *, struct GenTree *, class FieldSeq *, __int64, unsigned int, unsigned int)                                                                                            
-1279178  : -100.00%  : 0.33%  : -0.0011% : protected: void __cdecl Compiler::AddContainsCallAllContainingLoops(unsigned int)                                                                                                                                                            
-1914799  : -100.00%  : 0.49%  : -0.0017% : public: unsigned int __cdecl Compiler::fgMemoryVNForLoopSideEffects(enum MemoryKind, struct BasicBlock *, unsigned int)                                                                                                                      
-2128688  : -100.00%  : 0.55%  : -0.0018% : private: void __cdecl Compiler::optRecordLoopNestsMemoryHavoc(unsigned int, unsigned int)                                                                                                                                                    
-2249218  : -77.42%   : 0.58%  : -0.0019% : public: enum PhaseStatus __cdecl Compiler::optFindLoopsPhase(void)                                                                                                                                                                           
-2348262  : -11.82%   : 0.61%  : -0.0020% : public: void __cdecl Compiler::fgValueNumberFieldLoad(struct GenTree *, struct GenTree *, class FieldSeq *, __int64)                                                                                                                         
-2733873  : -14.67%   : 0.71%  : -0.0024% : public: void __cdecl Compiler::fgValueNumberCall(struct GenTreeCall *)                                                                                                                                                                       
-3219484  : -100.00%  : 0.83%  : -0.0028% : protected: bool __cdecl Compiler::optHoistThisLoop(unsigned int, struct Compiler::LoopHoistContext *)                                                                                                                                        
-4207077  : -100.00%  : 1.09%  : -0.0036% : protected: void __cdecl Compiler::optHoistLoopBlocks(unsigned int, class ArrayStack<struct BasicBlock *> *, struct Compiler::LoopHoistContext *)                                                                                             
-4276316  : -31.33%   : 1.11%  : -0.0037% : public: void __cdecl Compiler::fgMutateGcHeap(struct GenTree *)                                                                                                                                                                              
-13184191 : -100.00%  : 3.41%  : -0.0114% : public: void __cdecl Compiler::recordGcHeapStore(struct GenTree *, unsigned int)                                                                                                                                                             
-20002426 : -100.00%  : 5.17%  : -0.0173% : protected: bool __cdecl Compiler::optVNIsLoopInvariant(unsigned int, unsigned int, class JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior> *)                
-50043441 : -100.00%  : 12.94% : -0.0433% : private: bool __cdecl Compiler::optComputeLoopSideEffectsOfBlock(struct BasicBlock *)                                                                                                                                                        

So (1) In VNForExpr it is a bit more expensive to extract the loop containing the block; (2) AddContainsCallAllContainingLoops is more expensive; it should be switched to use FlowGraphNaturalLoop::GetChild(); (3) I switched optComputeLoopSideEffects to iterate over all blocks, whereas it previously iterated top-level loops and then their loop blocks. Will switch this back.

Also looks like there is a related jitstress failure, so will look at that.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Dec 6, 2023

TP regression is a bit smaller now (with local changes I haven't pushed yet), but I think we'll have to live with it. It mainly comes from being a bit more expensive to do the loop number <-> block queries.

Base: 115669398234, Diff: 115791952713, +0.1060%

60288105  : NA        : 17.15% : +0.0521% : public: unsigned int __cdecl ValueNumStore::VNForExpr(struct BasicBlock *, enum var_types)                                                                                                                                                   
53395839  : NA        : 15.19% : +0.0462% : private: void __cdecl Compiler::optComputeLoopSideEffectsOfBlock(struct BasicBlock *, class FlowGraphNaturalLoop *)                                                                                                                          
20125631  : NA        : 5.73%  : +0.0174% : protected: bool __cdecl Compiler::optVNIsLoopInvariant(unsigned int, class FlowGraphNaturalLoop *, class JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior> *)
14648945  : NA        : 4.17%  : +0.0127% : public: static class BlockToNaturalLoopMap * __cdecl BlockToNaturalLoopMap::Build(class FlowGraphNaturalLoops *)                                                                                                                             
14159157  : NA        : 4.03%  : +0.0122% : BitSetOps<unsigned __int64 *,1,BitVecTraits *,BitVecTraits>::VisitBits<`FlowGraphNaturalLoop::VisitLoopBlocksReversePostOrder<`Compiler::optComputeLoopSideEffects'::`10'::<lambda_1> >'::`2'::<lambda_1> >                                  
8714325   : +4.76%    : 2.48%  : +0.0075% : public: static struct BasicBlock * __cdecl BasicBlock::New(class Compiler *)                                                                                                                                                                 
8096425   : +1.19%    : 2.30%  : +0.0070% : public: void __cdecl Compiler::fgValueNumberTree(struct GenTree *)                                                                                                                                                                           
7114922   : +37.01%   : 2.02%  : +0.0062% : public: unsigned int __cdecl ValueNumStore::VNForMapSelectInner(enum ValueNumKind, enum var_types, unsigned int, unsigned int)                                                                                                               
7066803   : NA        : 2.01%  : +0.0061% : public: unsigned int __cdecl ValueNumStore::VNForMapStore(unsigned int, unsigned int, unsigned int)                                                                                                                                          
6795126   : NA        : 1.93%  : +0.0059% : protected: void __cdecl Compiler::AddContainsCallAllContainingLoops(class FlowGraphNaturalLoop *)                                                                                                                                            
4204838   : NA        : 1.20%  : +0.0036% : protected: void __cdecl Compiler::optHoistLoopBlocks(class FlowGraphNaturalLoop *, class ArrayStack<struct BasicBlock *> *, struct Compiler::LoopHoistContext *)                                                                             
4141024   : +1.43%    : 1.18%  : +0.0036% : public: void __cdecl Compiler::fgValueNumberBlock(struct BasicBlock *)                                                                                                                                                                       
4124325   : +2.51%    : 1.17%  : +0.0036% : public: void __cdecl Compiler::compInit(class ArenaAllocator *, struct CORINFO_METHOD_STRUCT_*, class ICorJitInfo *, struct CORINFO_METHOD_INFO *, struct InlineInfo *)                                                                      
3908060   : +0.44%    : 1.11%  : +0.0034% : memset                                                                                                                                                                                                                                       
3321557   : NA        : 0.94%  : +0.0029% : protected: bool __cdecl Compiler::optHoistThisLoop(class FlowGraphNaturalLoop *, struct Compiler::LoopHoistContext *)                                                                                                                        
2852281   : NA        : 0.81%  : +0.0025% : public: void __cdecl Compiler::optFindNewLoops(void)                                                                                                                                                                                         
2337736   : +1170.99% : 0.67%  : +0.0020% : public: bool __cdecl FlowGraphNaturalLoop::ContainsBlock(struct BasicBlock *)                                                                                                                                                                
2198509   : +0.09%    : 0.63%  : +0.0019% : public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)                                                                                                                                                                      
1423528   : +2.27%    : 0.40%  : +0.0012% : public: static class FlowGraphNaturalLoops * __cdecl FlowGraphNaturalLoops::Find(class FlowGraphDfsTree const *)                                                                                                                             
1129028   : +0.34%    : 0.32%  : +0.0010% : private: void __cdecl SsaBuilder::BlockRenameVariables(struct BasicBlock *)                                                                                                                                                                  
1127779   : NA        : 0.32%  : +0.0010% : public: unsigned int __cdecl Compiler::fgMemoryVNForLoopSideEffects(enum MemoryKind, struct BasicBlock *, class FlowGraphNaturalLoop *)                                                                                                      
982049    : +0.36%    : 0.28%  : +0.0008% : `Compiler::fgComputeDfs'::`2'::<lambda_1>::operator()                                                                                                                                                                                        
863690    : NA        : 0.25%  : +0.0007% : protected: void __cdecl Compiler::optHoistCandidate(struct GenTree *, struct BasicBlock *, class FlowGraphNaturalLoop *, struct Compiler::LoopHoistContext *)                                                                                
639229    : NA        : 0.18%  : +0.0006% : private: void __cdecl Compiler::optPerformHoistExpr(struct GenTree *, struct BasicBlock *, class FlowGraphNaturalLoop *)                                                                                                                     
503013    : +69.46%   : 0.14%  : +0.0004% : public: void __cdecl Compiler::optRecordLoopMemoryDependence(struct GenTree *, struct BasicBlock *, unsigned int)                                                                                                                            
-475395   : -15.05%   : 0.14%  : -0.0004% : public: void __cdecl Compiler::fgValueNumberArrayElemLoad(struct GenTree *, struct VNFuncApp *)                                                                                                                                              
-492997   : -100.00%  : 0.14%  : -0.0004% : protected: void __cdecl Compiler::AddModifiedFieldAllContainingLoops(unsigned int, struct CORINFO_FIELD_STRUCT_*, enum Compiler::FieldKindForVN)                                                                                             
-543196   : -28.05%   : 0.15%  : -0.0005% : public: void __cdecl Compiler::fgValueNumberArrayElemStore(struct GenTree *, struct VNFuncApp *, unsigned int, unsigned int)                                                                                                                 
-662618   : -100.00%  : 0.19%  : -0.0006% : private: void __cdecl Compiler::optPerformHoistExpr(struct GenTree *, struct BasicBlock *, unsigned int)                                                                                                                                     
-925938   : -100.00%  : 0.26%  : -0.0008% : protected: void __cdecl Compiler::optHoistCandidate(struct GenTree *, struct BasicBlock *, unsigned int, struct Compiler::LoopHoistContext *)                                                                                                
-935385   : -1.85%    : 0.27%  : -0.0008% : `Compiler::optHoistLoopBlocks'::`2'::HoistVisitor::PostOrderVisit                                                                                                                                                                            
-1028653  : -14.12%   : 0.29%  : -0.0009% : public: void __cdecl Compiler::fgValueNumberFieldStore(struct GenTree *, struct GenTree *, class FieldSeq *, __int64, unsigned int, unsigned int)                                                                                            
-1279178  : -100.00%  : 0.36%  : -0.0011% : protected: void __cdecl Compiler::AddContainsCallAllContainingLoops(unsigned int)                                                                                                                                                            
-1914799  : -100.00%  : 0.54%  : -0.0017% : public: unsigned int __cdecl Compiler::fgMemoryVNForLoopSideEffects(enum MemoryKind, struct BasicBlock *, unsigned int)                                                                                                                      
-2128688  : -100.00%  : 0.61%  : -0.0018% : private: void __cdecl Compiler::optRecordLoopNestsMemoryHavoc(unsigned int, unsigned int)                                                                                                                                                    
-2249218  : -77.42%   : 0.64%  : -0.0019% : public: enum PhaseStatus __cdecl Compiler::optFindLoopsPhase(void)                                                                                                                                                                           
-2348262  : -11.82%   : 0.67%  : -0.0020% : public: void __cdecl Compiler::fgValueNumberFieldLoad(struct GenTree *, struct GenTree *, class FieldSeq *, __int64)                                                                                                                         
-2733873  : -14.67%   : 0.78%  : -0.0024% : public: void __cdecl Compiler::fgValueNumberCall(struct GenTreeCall *)                                                                                                                                                                       
-3219484  : -100.00%  : 0.92%  : -0.0028% : protected: bool __cdecl Compiler::optHoistThisLoop(unsigned int, struct Compiler::LoopHoistContext *)                                                                                                                                        
-4207077  : -100.00%  : 1.20%  : -0.0036% : protected: void __cdecl Compiler::optHoistLoopBlocks(unsigned int, class ArrayStack<struct BasicBlock *> *, struct Compiler::LoopHoistContext *)                                                                                             
-4276316  : -31.33%   : 1.22%  : -0.0037% : public: void __cdecl Compiler::fgMutateGcHeap(struct GenTree *)                                                                                                                                                                              
-13184191 : -100.00%  : 3.75%  : -0.0114% : public: void __cdecl Compiler::recordGcHeapStore(struct GenTree *, unsigned int)                                                                                                                                                             
-20002426 : -100.00%  : 5.69%  : -0.0173% : protected: bool __cdecl Compiler::optVNIsLoopInvariant(unsigned int, unsigned int, class JitHashTable<unsigned int, struct JitSmallPrimitiveKeyFuncs<unsigned int>, bool, class CompAllocator, class JitHashTableBehavior> *)                
-50043441 : -100.00%  : 14.24% : -0.0433% : private: bool __cdecl Compiler::optComputeLoopSideEffectsOfBlock(struct BasicBlock *)                                                                                                                                                        

I wanted to share this anyway because it shows the cost of switching from the lexical LoopDsc::LoopBlocks() iteration to the bit vector based VisitLoopBlocks; in the side effect code the cost of that appears to be about 0.0122% TP, so that's probably not a place we need to optimize significantly.

@jakobbotsch
Copy link
Member Author

jitstress failure was in a test with DOTNET_JitOptRepeat=*. I'd forgotten to recompute the new loop information in RecomputeLoopInfo.

@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jakobbotsch jakobbotsch marked this pull request as ready for review December 6, 2023 15:53
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @BruceForstall

Diffs. See above for why.

This passed jitstress and libraries-jitstress on the previous commit (before I added a quirk).

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for all the renamings.

Comment on lines 4905 to 4906
struct LoopSideEffects* m_loopSideEffects;
struct HoistCounts* m_loopHoistCounts;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you put forward declarations at the top of this file with the others

Suggested change
struct LoopSideEffects* m_loopSideEffects;
struct HoistCounts* m_loopHoistCounts;
LoopSideEffects* m_loopSideEffects;
HoistCounts* m_loopHoistCounts;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved LoopSideEffects above Compiler so no forward declaration is needed, and removed m_loopHoistCounts entirely (I added that before I realized we could move those fields inside the HoistContext)

Copy link
Member Author

@jakobbotsch jakobbotsch Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoopDsc is a nested class defined below this field so it needs the nested forward declaration, but we will be able to get rid of that soon anyway.

@@ -12200,6 +12237,36 @@ class StringPrinter
void Append(char chr);
};

typedef JitHashTable<CORINFO_FIELD_HANDLE, JitPtrKeyFuncs<struct CORINFO_FIELD_STRUCT_>, FieldKindForVN> FieldHandleSet;

typedef JitHashTable<CORINFO_CLASS_HANDLE, JitPtrKeyFuncs<struct CORINFO_CLASS_STRUCT_>, bool> ClassHandleSet;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It always seems like we could use a dedicated "set" type instead of overloading JitHashTable like this... (no request here; and not unique to this PR)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this seems wasteful.

@@ -6723,6 +6773,8 @@ class Compiler
PhaseStatus optFindLoopsPhase(); // Finds loops and records them in the loop table

void optFindLoops();
void optFindNewLoops();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is unfortunately confusing: when not in context, I thought it meant "find newly created loops, after some initial loop finding pass".

Maybe this is temporary and we'll get rid of "old" at some point and rename "new" to remove the word "new" everywhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we should be able to get rid of these old/new distinctions soon.


BasicBlock* preheader = loop->EntryEdges()[0]->getSourceBlock();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit, but I wonder if it would be nice to have convenience functions to get a particular edge/etc. from a vector, e.g.

Suggested change
BasicBlock* preheader = loop->EntryEdges()[0]->getSourceBlock();
BasicBlock* preheader = loop->EntryEdge(0)->getSourceBlock();
FlowEdge* EntryEdge(unsigned i)
{
return m_entryEdges[i];
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@@ -7129,11 +7111,62 @@ void Compiler::optCopyLoopMemoryDependence(GenTree* fromTree, GenTree* toTree)
}
}

//------------------------------------------------------------------------
// AddVariableLiveness: Adds the variable liveness information for 'blk' to 'this' LoopDsc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no LoopDsc here

Suggested change
// AddVariableLiveness: Adds the variable liveness information for 'blk' to 'this' LoopDsc
// AddVariableLiveness: Adds the variable liveness information for 'blk' to 'this' loop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this and a few other instances of "lnum" I had missed.

@jakobbotsch jakobbotsch closed this Dec 7, 2023
@jakobbotsch
Copy link
Member Author

Closing and reopening to pick up #95718

@jakobbotsch jakobbotsch reopened this Dec 7, 2023
@jakobbotsch jakobbotsch merged commit 03c2d25 into dotnet:main Dec 7, 2023
127 of 129 checks passed
@jakobbotsch jakobbotsch deleted the vn-hoisting-new-representation branch December 7, 2023 10:59
@github-actions github-actions bot locked and limited conversation to collaborators Jan 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants