Regressions from loop alignment fixes #71646

Closed
performanceautofiler bot opened this issue Jun 28, 2022 · 16 comments · Fixed by #71868
Labels: arch-x64, area-CodeGen-coreclr, os-linux, runtime-coreclr

@performanceautofiler

### Run Information

| Architecture | x64 |
| --- | --- |
| OS | ubuntu 18.04 |
| Baseline | 79f6709eded51a0cfd8bcfcdf501dd42e4358113 |
| Compare | 18d7e3926db5ff7238f2d93364d39ac300c00ee4 |
| Diff | Diff |

Regressions in Benchstone.BenchI.HeapSort

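| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test - Duration of single invocation | 213.62 μs | 236.83 μs | 1.11 | 0.01 | False | | | | | |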

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'Benchstone.BenchI.HeapSort*'

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 236.83366619318178 > 224.52101449324323.
IsChangePoint: Marked as a change because one of 5/11/2022 7:02:26 PM, 6/2/2022 5:58:32 PM, 6/21/2022 12:11:10 PM, 6/28/2022 4:38:49 AM falls between 6/19/2022 1:08:59 PM and 6/28/2022 4:38:49 AM.
IsRegressionStdDev: Marked as regression because -79.44625167509122 (T) = (213960.5193028112 - 236968.62503618616) / Math.Sqrt((500056.9408471677 / (31)) + (2032221.7978173392 / (30))) is less than -2.000995378087428 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (30) - 2, .025) and -0.10753435170351378 = (213960.5193028112 - 236968.62503618616) / 213960.5193028112 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

### Run Information

| Architecture | x64 |
| --- | --- |
| OS | ubuntu 18.04 |
| Baseline | 79f6709eded51a0cfd8bcfcdf501dd42e4358113 |
| Compare | 18d7e3926db5ff7238f2d93364d39ac300c00ee4 |
| Diff | Diff |

Regressions in Benchstone.BenchF.MatInv4

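| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test - Duration of single invocation | 1.48 ms | 1.61 ms | 1.09 | 0.02 | False | | | | | |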

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'Benchstone.BenchF.MatInv4*'

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 1.6114536046875 > 1.5493830323863635.
IsChangePoint: Marked as a change because one of 5/11/2022 7:02:26 PM, 6/2/2022 5:58:32 PM, 6/21/2022 12:11:10 PM, 6/28/2022 4:38:49 AM falls between 6/19/2022 1:08:59 PM and 6/28/2022 4:38:49 AM.
IsRegressionStdDev: Marked as regression because -20.622338989238603 (T) = (1482823.3462333078 - 1593560.3587474185) / Math.Sqrt((42164305.5343299 / (31)) + (824225530.567459 / (30))) is less than -2.000995378087428 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (30) - 2, .025) and -0.0746798415302852 = (1482823.3462333078 - 1593560.3587474185) / 1482823.3462333078 is less than -0.05.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
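
The T values in the two IsRegressionStdDev entries can be reproduced from the other numbers on those lines if the numerator is taken to be the baseline mean minus the compare mean. The following is a minimal Python sketch of that check, not the autofiler's actual code. The function and variable names are hypothetical, the two quantities divided by 31 and 30 are assumed to be sample variances, the first of them is assumed to belong to the baseline, and the critical value is copied from the report instead of being recomputed.

```python
import math

# Critical value and threshold copied from the report text.
T_CRITICAL = -2.000995378087428   # StudentT.InvCDF(0, 1, 31 + 30 - 2, .025)
REL_THRESHOLD = -0.05

def t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n):
    """Unpooled two-sample t statistic: (base - compare) / sqrt(var_b/n_b + var_c/n_c)."""
    return (base_mean - cmp_mean) / math.sqrt(base_var / base_n + cmp_var / cmp_n)

def relative_change(base_mean, cmp_mean):
    """Relative change of the compare mean, negative when the compare is slower."""
    return (base_mean - cmp_mean) / base_mean

def is_regression(sample):
    """Both conditions named in the IsRegressionStdDev line must hold."""
    t = t_statistic(**sample)
    rel = relative_change(sample["base_mean"], sample["cmp_mean"])
    return t < T_CRITICAL and rel < REL_THRESHOLD

# Numbers taken from the two IsRegressionStdDev lines (assumed to be nanoseconds).
HEAPSORT = dict(base_mean=213960.5193028112, base_var=500056.9408471677, base_n=31,
                cmp_mean=236968.62503618616, cmp_var=2032221.7978173392, cmp_n=30)
MATINV4 = dict(base_mean=1482823.3462333078, base_var=42164305.5343299, base_n=31,
               cmp_mean=1593560.3587474185, cmp_var=824225530.567459, cmp_n=30)

for name, sample in (("HeapSort", HEAPSORT), ("MatInv4", MATINV4)):
    print(name, t_statistic(**sample), is_regression(sample))
```

With these inputs both benchmarks satisfy the two conditions, which agrees with the "Marked as regression" verdicts in the report.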

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

performanceautofiler bot added the CoreClr and untriaged labels on Jun 28, 2022
kunalspathak changed the title from "[Perf] Changes at 6/21/2022 5:13:43 PM" to "Regressions from loop alignment fixes" on Jul 5, 2022
kunalspathak transferred this issue from dotnet/perf-autofiling-issues on Jul 5, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@kunalspathak
Member

Coming from #70936

jeffschwMSFT added the area-CodeGen-coreclr label on Jul 6, 2022
@ghost

ghost commented Jul 6, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author: performanceautofiler[bot]
Assignees: AndyAyersMS, kunalspathak
Labels: area-CodeGen-coreclr, untriaged, refs/heads/main, ubuntu 18.04, RunKind=micro, Regression, CoreClr, x64
Milestone: -

JulieLeeMSFT removed the untriaged label on Jul 7, 2022
JulieLeeMSFT added this to the 7.0.0 milestone on Jul 7, 2022
@kunalspathak
Member

kunalspathak commented Jul 8, 2022

I investigated HeapSort. After #70936, we stop aligning a hot loop, and that is why we see the regression:

image

image

Investigating further, the reason we end up with a loop block appearing before the loop top is that at one point we compact two blocks: BB25, which is part of L00, and BB07, which is part of L01 and has the align bit set (meaning we would like to align L01). After compacting, however, we simply propagate the BBF_LOOP_ALIGN flag without checking that we are now propagating it from one loop to another.

I can think of 3 possible ways to solve this problem:

  1. I am wondering if we should even compact such blocks to begin with, because the compacted block (BB25 in this case) gets marked as part of L00 although it has some traces of BB07, which was part of L01.
  2. Alternatively, we could compact the blocks (if there is a strong reason for it), but then not propagate BBF_LOOP_ALIGN if the lpLoopNum of the 2 blocks differs. This would keep a hot loop from getting aligned, though.
  3. Do not compact blocks if one of them has the BBF_LOOP_ALIGN flag.

JitDump extract
*************** In fgUpdateFlowGraph()
Before updating the flow graph:

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..004)-> BB16 ( cond )                     i LIR 
BB02 [0020]  1       BB01                  1       [???..???)-> BB11 ( cond )                     internal LIR 
BB04 [0028]  1       BB02                  1       [???..???)-> BB11 ( cond )                     internal LIR 
BB05 [0029]  1       BB04                  1       [???..???)-> BB11 ( cond )                     internal idxlen LIR 
BB06 [0001]  2       BB05,BB09             3.96  0 [004..010)-> BB09 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB25 [0030]  1       BB06                  3.96  0 [010..???)                                     internal idxlen LoopPH LIR 
BB07 [0002]  2       BB08,BB25            15.84  1 [010..016)-> BB09 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR align 
BB08 [0003]  1       BB07                 15.84  1 [016..026)-> BB07 ( cond )                     i idxlen bwd LIR 
BB09 [0005]  3       BB06,BB07,BB08        3.96  0 [026..032)-> BB06 ( cond )                     i idxlen bwd LIR 
BB10 [0021]  1       BB09                  1       [???..???)-> BB16 (always)                     internal LIR 
BB11 [0026]  4       BB02,BB04,BB05,BB15   0.04    [004..010)-> BB15 ( cond )                     i idxlen bwd LIR 
BB13 [0023]  2       BB11,BB14             0.16    [010..016)-> BB15 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB14 [0024]  1       BB13                  0.16    [016..026)-> BB13 ( cond )                     i idxlen bwd LIR 
BB15 [0025]  3       BB11,BB13,BB14        0.04    [026..032)-> BB11 ( cond )                     i idxlen bwd LIR 
BB16 [0007]  3       BB01,BB10,BB15        1       [032..034)                                     i idxlen LoopPH LIR 
BB17 [0008]  2       BB16,BB23             8     2 [034..044)-> BB23 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB18 [0009]  2       BB17,BB22            16     3 [044..04A)-> BB21 ( cond )                     i Loop Loop0 bwd bwd-target LIR align 
BB19 [0010]  1       BB18                  8     3 [04A..054)-> BB21 ( cond )                     i idxlen bwd LIR 
BB20 [0011]  1       BB19                  8     3 [054..058)                                     i bwd LIR 
BB21 [0012]  3       BB18,BB19,BB20       16     3 [058..05E)-> BB23 ( cond )                     i idxlen bwd LIR 
BB22 [0013]  1       BB21                 16     3 [05E..06E)-> BB18 ( cond )                     i idxlen bwd LIR 
BB23 [0015]  3       BB17,BB21,BB22        8     2 [06E..07A)-> BB17 ( cond )                     i idxlen bwd bwd-src LIR 
BB24 [0016]  1       BB23                  1       [07A..07B)        (return)                     i LIR 
BB26 [0031]  0                             0       [???..???)        (throw )                     keep i internal rare LIR 
-----------------------------------------------------------------------------------------------------------------------------------------


Compacting blocks BB25 and BB07:
Second block has multiple incoming edges
Setting edge weights for BB08 -> BB25 to [0 .. 3.402823e+38]
Propagating LOOP_ALIGN flag from BB07 to BB25 during compacting.
*************** In fgDebugCheckBBlist

After updating the flow graph:

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..004)-> BB16 ( cond )                     i LIR 
BB02 [0020]  1       BB01                  1       [???..???)-> BB11 ( cond )                     internal LIR 
BB04 [0028]  1       BB02                  1       [???..???)-> BB11 ( cond )                     internal LIR 
BB05 [0029]  1       BB04                  1       [???..???)-> BB11 ( cond )                     internal idxlen LIR 
BB06 [0001]  2       BB05,BB09             3.96  0 [004..010)-> BB09 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB25 [0030]  2       BB06,BB08            15.84  0 [010..016)-> BB09 ( cond )                     i idxlen bwd LIR align 
BB08 [0003]  1       BB25                 15.84  1 [016..026)-> BB25 ( cond )                     i idxlen bwd LIR 
BB09 [0005]  3       BB06,BB08,BB25        3.96  0 [026..032)-> BB06 ( cond )                     i idxlen bwd LIR 
BB10 [0021]  1       BB09                  1       [???..???)-> BB16 (always)                     internal LIR 
BB11 [0026]  4       BB02,BB04,BB05,BB15   0.04    [004..010)-> BB15 ( cond )                     i idxlen bwd LIR 
BB13 [0023]  2       BB11,BB14             0.16    [010..016)-> BB15 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB14 [0024]  1       BB13                  0.16    [016..026)-> BB13 ( cond )                     i idxlen bwd LIR 
BB15 [0025]  3       BB11,BB13,BB14        0.04    [026..032)-> BB11 ( cond )                     i idxlen bwd LIR 
BB16 [0007]  3       BB01,BB10,BB15        1       [032..034)                                     i idxlen LoopPH LIR 
BB17 [0008]  2       BB16,BB23             8     2 [034..044)-> BB23 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB18 [0009]  2       BB17,BB22            16     3 [044..04A)-> BB21 ( cond )                     i Loop Loop0 bwd bwd-target LIR align 
BB19 [0010]  1       BB18                  8     3 [04A..054)-> BB21 ( cond )                     i idxlen bwd LIR 
BB20 [0011]  1       BB19                  8     3 [054..058)                                     i bwd LIR 
BB21 [0012]  3       BB18,BB19,BB20       16     3 [058..05E)-> BB23 ( cond )                     i idxlen bwd LIR 
BB22 [0013]  1       BB21                 16     3 [05E..06E)-> BB18 ( cond )                     i idxlen bwd LIR 
BB23 [0015]  3       BB17,BB21,BB22        8     2 [06E..07A)-> BB17 ( cond )                     i idxlen bwd bwd-src LIR 
BB24 [0016]  1       BB23                  1       [07A..07B)        (return)                     i LIR 
BB26 [0031]  0                             0       [???..???)        (throw )                     keep i internal rare LIR 
-----------------------------------------------------------------------------------------------------------------------------------------
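
The effect visible in the two block tables above can be reproduced with a toy model. This is a minimal Python sketch, assuming a simplified block record; it is not the JIT's compaction code. The `Block` class, `compact`, and `align_crossed_loops` are hypothetical names, `loop` stands in for the "lp" column, and `align` stands in for BBF_LOOP_ALIGN. The weight rule (the merged block takes the second block's weight) is only what the dump shows for this one case.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Block:
    name: str
    loop: int            # value of the "lp" column in the dump
    weight: float
    preds: List[str]
    align: bool = False  # stands in for BBF_LOOP_ALIGN

def compact(first: Block, second: Block) -> Block:
    """Toy model of the compaction shown in the dump (not the JIT's code)."""
    # Predecessors of the second block, other than the first block itself,
    # become predecessors of the merged block.
    preds = first.preds + [p for p in second.preds if p != first.name]
    return Block(
        name=first.name,
        loop=first.loop,                    # merged block keeps the first block's loop number
        weight=second.weight,               # the dump shows BB25 ending up with BB07's weight
        preds=sorted(preds),
        align=first.align or second.align,  # align flag is propagated unconditionally
    )

def align_crossed_loops(first: Block, second: Block) -> bool:
    """True when an align flag would move onto a block from a different loop."""
    return second.align and first.loop != second.loop

# Rows copied from the "before" table.
bb25 = Block("BB25", loop=0, weight=3.96, preds=["BB06"])
bb07 = Block("BB07", loop=1, weight=15.84, preds=["BB08", "BB25"], align=True)

merged = compact(bb25, bb07)
print(merged)
print("align flag moved across loops:", align_crossed_loops(bb25, bb07))
```

The merged record matches the BB25 row of the "after" table: loop number 0, weight 15.84, predecessors BB06 and BB08, and the align flag set even though the flag was originally requested for L01.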

@AndyAyersMS
Member

I was just about to investigate this (seems like we're both assigned). From the dump:

Compacting blocks BB25 and BB07:
Second block has multiple incoming edges

That should tell us that we're compacting blocks that potentially have quite different execution frequencies (note that in the "before" picture BB07 has a much higher weight than BB25) and that we need to tread carefully.

Some thoughts:

  • In this case the new block arguably should become part of L01 (we're probably missing an lpTop update, but by this point we may have given up on maintaining the loop table). We could pursue a fix along these lines, but I suspect it would be messy.
  • We could block compaction if the second block has the align flag AND the second block has multiple predecessors. That seems more surgical.
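
The second suggestion amounts to a two-part condition. A minimal Python sketch of that predicate follows, assuming the align flag and predecessor count are already known for the second block; `should_block_compaction` is a hypothetical name and this is not code from the JIT.

```python
def should_block_compaction(second_has_align: bool, second_pred_count: int) -> bool:
    """Refuse compaction only when the second block is aligned AND has multiple preds."""
    return second_has_align and second_pred_count > 1

# BB07 in the dump: align flag set, predecessors BB08 and BB25.
print(should_block_compaction(True, 2))

# A hypothetical aligned block with a single predecessor would still be compacted.
print(should_block_compaction(True, 1))
```

Requiring both conditions keeps the check narrow: blocks without the align flag, or with only one predecessor, are compacted as before.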

@kunalspathak
Member

> We could block compaction if the second block has the align flag AND the second block has multiple predecessors. That seems more surgical.

Sure, but shouldn't the compacted block then have the lpLoopNum of the second block?

@AndyAyersMS
Member

? If we block compaction, there is no compacted block; BB07 lives on as is and says it belongs to L01, like it does now.

@kunalspathak
Member

Ah, you meant "block" as a verb, and I read it as a noun :)

@AndyAyersMS
Member

Right, we could block (== not allow) compaction of these blocks.

@AndyAyersMS
Member

Are you going to work on a fix? If so, I'll unassign myself.

If not, I'm happy to take this over.

@kunalspathak
Member

I can do it.

@kunalspathak
Member

kunalspathak commented Jul 9, 2022

I just think that we should prohibit compacting if the 2 blocks are part of different loops? fgUpdateFlowGraph gets called multiple times from various phases, and I think we should keep the bbNatLoopNum information correct. Currently, we just update the 2nd block's bbNatLoopNum to that of the 1st block.

Edit: Or make sure to retain the bbNatLoopNum of 2nd block.

JITDump

*************** In fgUpdateFlowGraph()
Before updating the flow graph:

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..00A)-> BB28 ( cond )                     i idxlen LIR 
BB02 [0001]  2       BB01,BB27             4     0 [00A..00E)-> BB27 ( cond )                     i Loop Loop0 bwd bwd-target LIR 
BB03 [0013]  1       BB02                  4     0 [???..???)-> BB23 ( cond )                     internal LIR 
BB06 [0021]  1       BB03                  4     0 [???..???)-> BB23 ( cond )                     internal LIR 
BB07 [0022]  1       BB06                  4     0 [???..???)-> BB23 ( cond )                     internal idxlen LIR 
BB08 [0023]  1       BB07                  4     0 [???..???)-> BB23 ( cond )                     internal idxlen LIR 
BB30 [0035]  1       BB08                  4     0 [00E..???)                                     internal idxlen LoopPH LIR 
BB09 [0002]  2       BB21,BB30            15.84  1 [00E..02A)-> BB21 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB10 [0024]  1       BB09                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB12 [0029]  1       BB10                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB13 [0030]  1       BB12                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB14 [0031]  1       BB13                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB15 [0032]  1       BB14                 15.84  1 [???..???)-> BB19 ( cond )                     internal idxlen LIR 
BB16 [0033]  1       BB15                 15.84  1 [???..???)-> BB19 ( cond )                     internal idxlen LIR 
BB29 [0034]  1       BB16                 15.84  1 [02A..???)                                     internal idxlen LoopPH LIR 
BB17 [0003]  2       BB17,BB29            62.73  2 [02A..045)-> BB17 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR align 
BB18 [0025]  1       BB17                 15.84  1 [???..???)-> BB21 (always)                     internal LIR 
BB19 [0027]  7       BB10,BB12,BB13,BB14,BB15,BB16,BB19   0.63  1 [02A..045)-> BB19 ( cond )                     i idxlen bwd LIR 
BB21 [0005]  3       BB09,BB18,BB19       15.84  1 [045..051)-> BB09 ( cond )                     i idxlen bwd LIR 
BB22 [0014]  1       BB21                  4     0 [???..???)-> BB27 (always)                     internal LIR 
BB23 [0018]  5       BB03,BB06,BB07,BB08,BB26   0.16  0 [00E..02A)-> BB26 ( cond )                     i idxlen bwd LIR 
BB25 [0016]  2       BB23,BB25             0.64  0 [02A..045)-> BB25 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB26 [0017]  2       BB23,BB25             0.16  0 [045..051)-> BB23 ( cond )                     i idxlen bwd LIR 
BB27 [0007]  3       BB02,BB22,BB26        4     0 [051..05A)-> BB02 ( cond )                     i bwd LIR 
BB28 [0009]  2       BB01,BB27             1       [05A..05B)        (return)                     i LIR 
BB31 [0036]  0                             0       [???..???)        (throw )                     keep i internal rare LIR 
-----------------------------------------------------------------------------------------------------------------------------------------


Compacting blocks BB30 and BB09:
Second block has multiple incoming edges
Setting edge weights for BB21 -> BB30 to [0 .. 3.402823e+38]
*************** In fgDebugCheckBBlist

After updating the flow graph:

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..00A)-> BB28 ( cond )                     i idxlen LIR 
BB02 [0001]  2       BB01,BB27             4     0 [00A..00E)-> BB27 ( cond )                     i Loop Loop0 bwd bwd-target LIR 
BB03 [0013]  1       BB02                  4     0 [???..???)-> BB23 ( cond )                     internal LIR 
BB06 [0021]  1       BB03                  4     0 [???..???)-> BB23 ( cond )                     internal LIR 
BB07 [0022]  1       BB06                  4     0 [???..???)-> BB23 ( cond )                     internal idxlen LIR 
BB08 [0023]  1       BB07                  4     0 [???..???)-> BB23 ( cond )                     internal idxlen LIR 
BB30 [0035]  2       BB08,BB21            15.84  0 [00E..02A)-> BB21 ( cond )                     i idxlen bwd LIR 
BB10 [0024]  1       BB30                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB12 [0029]  1       BB10                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB13 [0030]  1       BB12                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB14 [0031]  1       BB13                 15.84  1 [???..???)-> BB19 ( cond )                     internal LIR 
BB15 [0032]  1       BB14                 15.84  1 [???..???)-> BB19 ( cond )                     internal idxlen LIR 
BB16 [0033]  1       BB15                 15.84  1 [???..???)-> BB19 ( cond )                     internal idxlen LIR 
BB29 [0034]  1       BB16                 15.84  1 [02A..???)                                     internal idxlen LoopPH LIR 
BB17 [0003]  2       BB17,BB29            62.73  2 [02A..045)-> BB17 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR align 
BB18 [0025]  1       BB17                 15.84  1 [???..???)-> BB21 (always)                     internal LIR 
BB19 [0027]  7       BB10,BB12,BB13,BB14,BB15,BB16,BB19   0.63  1 [02A..045)-> BB19 ( cond )                     i idxlen bwd LIR 
BB21 [0005]  3       BB18,BB19,BB30       15.84  1 [045..051)-> BB30 ( cond )                     i idxlen bwd LIR 
BB22 [0014]  1       BB21                  4     0 [???..???)-> BB27 (always)                     internal LIR 
BB23 [0018]  5       BB03,BB06,BB07,BB08,BB26   0.16  0 [00E..02A)-> BB26 ( cond )                     i idxlen bwd LIR 
BB25 [0016]  2       BB23,BB25             0.64  0 [02A..045)-> BB25 ( cond )                     i Loop Loop0 idxlen bwd bwd-target LIR 
BB26 [0017]  2       BB23,BB25             0.16  0 [045..051)-> BB23 ( cond )                     i idxlen bwd LIR 
BB27 [0007]  3       BB02,BB22,BB26        4     0 [051..05A)-> BB02 ( cond )                     i bwd LIR 
BB28 [0009]  2       BB01,BB27             1       [05A..05B)        (return)                     i LIR 
BB31 [0036]  0                             0       [???..???)        (throw )                     keep i internal rare LIR 
-----------------------------------------------------------------------------------------------------------------------------------------
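
The two options in this comment (refuse to compact across loops, or compact but keep the second block's loop number) can be compared on the block pairs from both dumps. This is a minimal Python sketch under simplifying assumptions, not the change that was eventually merged. `Block`, `may_compact`, and `merged_loop_retaining_second` are hypothetical names, and `loop` stands in for bbNatLoopNum as shown in the "lp" column.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    loop: int  # stands in for bbNatLoopNum (the "lp" column)

def may_compact(first: Block, second: Block) -> bool:
    """Option 1: only compact blocks that belong to the same loop."""
    return first.loop == second.loop

def merged_loop_retaining_second(first: Block, second: Block) -> int:
    """Option 2 (the "Edit"): compact, but keep the second block's loop number."""
    return second.loop

# Loop numbers copied from the "before" tables of the two dumps.
pairs = {
    "first dump": (Block("BB25", 0), Block("BB07", 1)),
    "second dump": (Block("BB30", 0), Block("BB09", 1)),
}

for label, (first, second) in pairs.items():
    print(label,
          "may compact:", may_compact(first, second),
          "loop if second's number is retained:", merged_loop_retaining_second(first, second))
```

Both pairs straddle loops 0 and 1, so option 1 rejects both compactions. In the second dump BB09 does not carry the align flag, so a check based on the align flag alone would not cover that pair.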

@kunalspathak
Member

By the way, changing just that condition would remove the 1st empty block, which changes the bbNum; that upsets LSRA, and we see a lot of diffs around it.

@kunalspathak
Member

> I just think that we should prohibit compacting if the 2 blocks are part of different loops?

That works fine because we will eventually remove the empty block too.

image

ghost added the in-pr label on Jul 9, 2022
@AndyAyersMS
Member

> By the way, changing just that condition would remove the 1st empty block, which changes the bbNum; that upsets LSRA, and we see a lot of diffs around it.

I'm not sure what "that condition" refers to...

Your proposed change is probably fine, but I trust the pred count and align flag a bit more than I trust the loop index.

@kunalspathak
Member

> I'm not sure what "that condition" refers to...

I meant not allowing block compaction if the 2nd block has the alignment flag.

> but I trust the pred count and align flag a bit more than I trust the loop index

True, but unfortunately placeLoopAlignInstructions() depends on the loop index.

ghost removed the in-pr label on Jul 10, 2022
ghost locked as resolved and limited conversation to collaborators on Aug 9, 2022
jeffhandley added the runtime-coreclr, os-linux, and arch-x64 labels and removed the CoreClr label on Dec 28, 2022