This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
[tfs-changeset: 1408043]
richlander
added a commit
to richlander/coreclr
that referenced
this pull request
Apr 8, 2015
Add instructions for using this repo
Merged
mikem8361
added a commit
that referenced
this pull request
Aug 4, 2015
…p where the system ids (TIDs) are wrong.

First, find the managed thread's OS id:

    (lldb) sos Threads
                                                                                                         Lock
           ID OSID ThreadOBJ        State GC Mode    GC Alloc Context                  Domain           Count Apt Exception
       1    1 3787 00000000006547F8 20220 Preemptive 00007FFFCC0145D0:00007FFFCC015FD0 00000000006357F8     0 Ukn
       6    2 3790 0000000000678FB8 21220 Preemptive 0000000000000000:0000000000000000 00000000006357F8     0 Ukn (Finalizer)

    (lldb) thread list
    Process 0 stopped
    * thread #1: tid = 0x0000, 0x00007f01fe64d267 libc.so.6`__GI_raise(sig=6) + 55 at raise.c:55, name = 'corerun', stop reason = signal SIGABRT
      thread #2: tid = 0x0001, 0x00007f01fe7138dd libc.so.6, stop reason = signal SIGABRT
      thread #3: tid = 0x0002, 0x00007f01fd27dda0 libpthread.so.0`__pthread_cond_wait + 192, stop reason = signal SIGABRT
      thread #4: tid = 0x0003, 0x00007f01fd27e149 libpthread.so.0`__pthread_cond_timedwait + 297, stop reason = signal SIGABRT
      thread #5: tid = 0x0004, 0x00007f01fe70f28d libc.so.6, stop reason = signal SIGABRT
      thread #6: tid = 0x0005, 0x00007f01fe70f49d libc.so.6, stop reason = signal SIGABRT

Then use the new command "setsostid" to set the current thread using the "OSID" from above:

    (lldb) setsostid 3790 6
    Set sos thread os id to 0x3790 which maps to lldb thread index 6

Now ClrStack should dump that managed thread:

    (lldb) sos ClrStack

To undo the effect of this command:

    (lldb) setsostid

Added the setclrpath command, which allows the path that sos/dac/dbi are loaded from to be changed instead of using the coreclr path. This may be needed when loading a core dump if the debugger binaries are in a different directory than the one the dump records for coreclr's path.
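The override described above can be pictured as a single remembered pair: the managed thread's OS id and the lldb thread index it should map to. The following is a hypothetical Python model only, not the SOS plugin's code; the class and method names are invented for illustration, and the only behavior taken from the message is that the OSID is read as hex and that calling the command with no arguments clears the override.

```python
class SosThreadOverride:
    """Hypothetical model of the 'setsostid' override (names are illustrative)."""

    def __init__(self):
        self.os_id = None       # OS thread id as reported in the "OSID" column
        self.lldb_index = None  # lldb thread index it should map to

    def set(self, os_id_hex=None, lldb_index=None):
        if os_id_hex is None:
            # 'setsostid' with no arguments undoes the override.
            self.os_id = None
            self.lldb_index = None
            return "Cleared sos thread os id"  # wording assumed, not from the source
        # The OSID column is hexadecimal, so "3790" means 0x3790.
        self.os_id = int(os_id_hex, 16)
        self.lldb_index = int(lldb_index)
        return (f"Set sos thread os id to 0x{self.os_id:x} "
                f"which maps to lldb thread index {self.lldb_index}")


override = SosThreadOverride()
message = override.set("3790", "6")
```

Here `message` reproduces the confirmation line shown in the session above, and `override.set()` returns the model to its unset state.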
eerhardt
pushed a commit
to eerhardt/coreclr
that referenced
this pull request
Aug 25, 2015
base locale and formatting for linux
kyulee1
added a commit
to kyulee1/coreclr
that referenced
this pull request
Mar 14, 2016
When zero-initializing locals, the JIT emits the wrong instruction sequence, e.g., for the 28-byte zero-initialization shown below. The issue was that the JIT passed the wrong arguments to emitIns_R_R_I.

Before (Fail)
```
stp xzr, xzr, [x2],#16
str xzr, [x2,#2]   --> just two byte offset (no x2 post-increment)
str wzr, [x2]
```
After (Pass)
```
stp xzr, xzr, [x2],#16
str xzr, [x2],#8
str wzr, [x2]
```
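To see why the "Before" sequence fails, it can help to track which bytes each store touches. The sketch below is a simplified Python model, assuming only the addressing semantics (post-indexed stores write at `[x2]` and then advance `x2`; offset stores write at `[x2 + imm]` and leave `x2` alone). It ignores alignment and encoding details, and the function name is made up for illustration.

```python
def run_stores(stores):
    """Return the sorted byte offsets zeroed by a sequence of stores.

    Each store is (width_in_bytes, mode, imm):
      "post"   -> write at [x2], then x2 += imm  (post-indexed)
      "offset" -> write at [x2 + imm], x2 unchanged
    x2 starts at offset 0 of the area being zeroed.
    """
    x2 = 0
    written = set()
    for width, mode, imm in stores:
        addr = x2 + imm if mode == "offset" else x2
        written.update(range(addr, addr + width))
        if mode == "post":
            x2 += imm
    return sorted(written)


# Before: stp xzr, xzr, [x2],#16 ; str xzr, [x2,#2] ; str wzr, [x2]
before = run_stores([(16, "post", 16), (8, "offset", 2), (4, "offset", 0)])

# After:  stp xzr, xzr, [x2],#16 ; str xzr, [x2],#8 ; str wzr, [x2]
after = run_stores([(16, "post", 16), (8, "post", 8), (4, "offset", 0)])

print(len(before), len(after))  # prints "26 28"
```

In this model the fixed sequence covers bytes 0 through 27 (16 + 8 + 4 = 28), while the broken one overlaps its last two stores and leaves bytes 26 and 27 untouched.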
kyulee1
added a commit
to kyulee1/coreclr
that referenced
this pull request
May 11, 2016
Fixes https://github.com/dotnet/coreclr/issues/3332

To validate various addressing in dotnet#4896, I just enable this. Previously, we only allowed a load operation on JIT data (`ldr` or `IF_LARGELDC`). For switch expansion, the jump table is also recorded in JIT data. In this case, we only get the address of the jump table head and load the right entry after computing the offset. So `adr` or `IF_LARGEADR` is used not only to load a label within code but also to refer to the location of JIT data.

The typical code sequence for switch expansion looks like this:
```
adr x8, [@rwd00]            // load address of jump table head
ldr w8, [x8, x0, LSL #2]    // load jump entry from table addr + x0 * 4
adr x9, [G_M56320_IG02]     // load address of current basic block
add x8, x8, x9              // Add them to compute the final target
br  x8                      // Indirectly jump to the target
```
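The address arithmetic in that sequence can be sketched in Python. This is an illustrative model, assuming the jump table holds 32-bit little-endian entries that are offsets relative to the basic block address loaded into `x9`; the function names and the sample values are hypothetical.

```python
import struct


def load_entry(table_bytes, index):
    """Model 'ldr w8, [x8, x0, LSL #2]': read the 32-bit entry at index * 4."""
    (entry,) = struct.unpack_from("<I", table_bytes, index << 2)
    return entry


def switch_target(table_bytes, index, block_address):
    """Model 'add x8, x8, x9': table entry plus the basic block address."""
    return block_address + load_entry(table_bytes, index)


# Three made-up entries and a made-up block address, for illustration only.
table = struct.pack("<3I", 0x10, 0x24, 0x40)
target = switch_target(table, 1, 0x1000)
print(hex(target))  # prints "0x1024"
```

The `index << 2` mirrors the `LSL #2` scaling: each table entry is four bytes wide, so case `n` lives at byte offset `n * 4` from the table head.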
kyulee1
added a commit
to kyulee1/coreclr
that referenced
this pull request
May 12, 2016
Fixes #3332

To validate various addressing in dotnet#4896, I just enable this. Previously, we only allowed a load operation on JIT data (`ldr` or `IF_LARGELDC`). For switch expansion, the jump table is also recorded in JIT data. In this case, we only get the address of the jump table head and load the right entry after computing the offset. So `adr` or `IF_LARGEADR` is used not only to load a label within code but also to refer to the location of JIT data.

The typical code sequence for switch expansion looks like this:
```
adr x8, [@rwd00]            // load address of jump table head
ldr w8, [x8, x0, LSL #2]    // load jump entry from table addr + x0 * 4
adr x9, [G_M56320_IG02]     // load address of current basic block
add x8, x8, x9              // Add them to compute the final target
br  x8                      // Indirectly jump to the target
```
dotnet-bot
pushed a commit
to dotnet-bot/coreclr
that referenced
this pull request
May 18, 2016
…Forms.dll

====================
007551: Merge pull request dotnet#1241 from mikedn/modopt

Extend the DIV/MOD dividend into RDX:RAX only if needed
====================

Assert failure(PID 33656 [0x00008378], Thread: 17792 [0x4580]): Assertion failed 'addr->OperIsAddrMode() || (addr->IsCnsIntOrI() && addr->isContained()) || !addr->isContained()' in 'System.Windows.Forms.CheckedListBox:OnDrawItem(ref):this' (IL size 1216)

    File: e:\dd\projectk\src\ndp\clr\src\jit\emitxarch.cpp Line: 2698
    Image: C:\Windows\Microsoft.NET\Framework64\v4.0.rb1605209\mscorsvw.exe

The tree:

    ***** BB41, stmt 82 (embedded)
    ( 6, 8) [003723] ------------ * stmtExpr void (embedded) (IL 0x109... ???)
    N1045 ( 3, 2) [000115] ------------ | /--* lclVar ref V00 this u:2 REG rcx $80
    N1047 ( 1, 4) [002642] ------------ | +--* const long 344 field offset Fseq[idealCheckSize] REG NA $10b
    N1049 ( 4, 6) [002643] -------N---- | /--* + byref REG NA $356
    N1051 ( 6, 8) [000116] ----GO------ | /--* indir int REG rcx <l:$685, c:$2ef>
    N1053 ( 6, 8) [003669] DA--GO------ \--* st.lclVar int V172 cse1 rcx REG rcx RV

During codegen:

    Generating BB41, stmt 71
    Holding variables: [rbx rsi rdi r12-r15]
    Generating: N1043 ( 3, 2) [000114] ------------ * lclVar int V05 loc3 u:3 r12 (last use) REG r12 RV $31a
    Generating: N1045 ( 3, 2) [000115] ------------ * lclVar ref V00 this u:2 REG rcx $80
    IN00db: mov rcx, gword ptr [V00 rbp+10H]
    GC regs: 00000040 {rsi} => 00000042 {rcx rsi}
    Generating: N1047 ( 1, 4) [002642] ------------ * const long 344 field offset Fseq[idealCheckSize] REG NA $10b
    Generating: N1049 ( 4, 6) [002643] -------N---- * + byref REG NA $356
    Generating: N1051 ( 6, 8) [000116] ----GO------ * indir int REG rcx <l:$685, c:$2ef>
    ... assert ...

(This is rollback #2: the TFS/GitHub mirror unfortunately undid rollback CS#1605814 with CS#1605840. This change should avoid that problem.)

[tfs-changeset: 1605917]
BruceForstall
pushed a commit
to BruceForstall/coreclr
that referenced
this pull request
Jul 28, 2016
Merge coreclr/master.
dotnet-bot
pushed a commit
to dotnet-bot/coreclr
that referenced
this pull request
Aug 19, 2016
OVERVIEW
========

This directory contains the SuperPMI tool used for testing the .NET just-in-time (JIT) compiler.

SuperPMI has two uses:
1. Verification that a JIT code change doesn't cause any asserts.
2. Finding test code where two JIT compilers generate different code, or verifying that the two compilers generate the same code.

Case #1 is useful for doing quick regression checking when making a source code change to the JIT compiler. The process is: (a) make a JIT source code change, (b) run that newly built JIT through a SuperPMI run to verify no asserts have been introduced.

Case #2 is useful for generating assembly language diffs, to help analyze the impact of a JIT code change.

SuperPMI works in two phases: collection and playback. In the collection phase, the system is configured to collect SuperPMI data. Then, run any set of .NET managed programs. When these managed programs invoke the JIT compiler, SuperPMI gathers and captures all information passed between the JIT and its .NET host. In the playback phase, SuperPMI loads the JIT directly and causes it to compile all the functions that it previously compiled, but using the collected data to provide answers to various questions that the JIT needs to ask. The .NET execution engine (EE) is not invoked at all.

TOOLS
=====

There are two native executable tools: superpmi and mcs. There is a .NET Core C# program, built as part of the coreclr repo tests build, called superpmicollect.exe.

All will show a help screen if passed -?.

COLLECTION
==========

Set the following environment variables:

    SuperPMIShimLogPath=<full path to an empty temporary directory>
    SuperPMIShimPath=<full path to clrjit.dll, the "standalone" JIT>
    COMPlus_AltJit=*
    COMPlus_AltJitName=superpmi-shim-collector.dll

(On Linux, use libclrjit.so and libsuperpmi-shim-collector.so. On Mac, use libclrjit.dylib and libsuperpmi-shim-collector.dylib.)

Then, run some managed programs. When done running programs, un-set these variables.

Now, you will have a large number of .mc files. Merge these using the mcs tool:

    mcs -merge base.mch *.mc

One benefit of SuperPMI is the ability to remove duplicated compilations, so on replay only unique functions are compiled. Use the following to create a "unique" set of functions:

    mcs -removeDup -thin base.mch unique.mch

Note that -thin is not required. However, it will delete all the compilation results collected during the collection phase, which makes the resulting MCH file smaller. Those compilation results are not required for playback.

Use the superpmicollect.exe tool to automate and simplify this process.

PLAYBACK
========

Once you have a merged, de-duplicated MCH collection, you can play it back using:

    superpmi unique.mch clrjit.dll

You can do this much faster by utilizing all the processors on your machine, and replaying in parallel, using:

    superpmi -p unique.mch clrjit.dll

REMAINING WORK
==============

The basics of assembly diffing are there, using the "coredistools" package. The open source build needs to be altered to use this package to wire up the correct build steps.

[tfs-changeset: 1623347]
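The collection and playback steps above can be gathered into a small helper. This is a minimal Python sketch that only builds the environment and the command lines named in the text; it does not run anything. The function names, the `platform` keys, and the example paths are assumptions made for illustration.

```python
# Shim library names per platform, as listed in the COLLECTION section.
SHIM_NAMES = {
    "windows": "superpmi-shim-collector.dll",
    "linux": "libsuperpmi-shim-collector.so",
    "mac": "libsuperpmi-shim-collector.dylib",
}


def collection_env(log_dir, jit_path, platform="windows"):
    """Environment variables to set before running managed programs."""
    return {
        "SuperPMIShimLogPath": log_dir,   # empty temporary directory
        "SuperPMIShimPath": jit_path,     # full path to the standalone JIT
        "COMPlus_AltJit": "*",
        "COMPlus_AltJitName": SHIM_NAMES[platform],
    }


def playback_steps(jit="clrjit.dll", parallel=True):
    """Merge, de-duplicate, and replay commands, in the order described."""
    replay = ["superpmi"] + (["-p"] if parallel else []) + ["unique.mch", jit]
    return [
        ["mcs", "-merge", "base.mch", "*.mc"],
        ["mcs", "-removeDup", "-thin", "base.mch", "unique.mch"],
        replay,
    ]


env = collection_env("/tmp/spmi", "/opt/coreclr/libclrjit.so", platform="linux")
steps = playback_steps(jit="libclrjit.so")
```

Whether `*.mc` is expanded by the shell or by mcs itself depends on how the commands are launched, which this sketch leaves open.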
mjsabby
pushed a commit
to mjsabby/coreclr
that referenced
this pull request
Oct 18, 2016
# The first commit's message is:

Fix placeholder pdb file path in alpine nuget packages

# This is the commit message #2:

Add LCG JIT Compilation Profiler Callbacks

Methods that contain no metadata (example sources are IL Stubs, DynamicMethod, Expression Trees, etc.), also known as LCG methods, are not reported to profilers via the Profiling API.

LCG, introduced in the .NET 2.0 timeframe, is unique in that it doesn't require the method to be hosted in an assembly > module > type hierarchy and is GCable in and of itself.

This change adds new APIs that notify the profiler of such methods. Since there is no metadata to look up, it provides some useful pieces of information that the profiler author may want to expose to the profiler user.

In the compilation start method we provide a className (always dynamicClass) and a methodName, which can be one of a few predetermined names (like ILStub_COMToCLR, etc.) or, if the user has set a name for the LCG method, that name. For example, when using the Expression Trees API, the user can specify a friendly name, which would be returned here.

In the jit completed callback we provide the native code start address and size. This is particularly useful for more accurate accounting of what the (previously unidentified) code is. At the least, the user would know it is JITted code, and most likely more, such as what kind of stub it is.

Furthermore, since this is going to be a profiler callback, the profiler can initiate a stackwalk and give its users more contextual information, beyond what we can provide here, to identify what they're encountering.

Finally, there is also the case that today the profiling APIs underreport JITted code in the process. A considerable amount of LCG code can now be present in a program, and in security-sensitive environments where tracking JITted code is important, the profiling APIs fall short. In such environments there are also often restrictions on running with elevated privileges, so procuring this data through other means (like ETW) may pose a challenge.
litian2025
pushed a commit
to litian2025/coreclr
that referenced
this pull request
Dec 12, 2016
There are two kinds of transition penalties:

1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code, and containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse pinvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~3% to 1% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix #7240
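The placement rules in this message reduce to a small decision table driven by the two flags. The Python sketch below restates those rules only; it is not the JIT's implementation, and the class, field, and site names are hypothetical stand-ins for containsAVXInstruction and contains256bitAVXInstruction.

```python
from dataclasses import dataclass


@dataclass
class MethodFlags:
    contains_avx: bool         # method has 128-bit or 256-bit AVX code
    contains_256bit_avx: bool  # method has 256-bit AVX code


def vzeroupper_sites(flags):
    """Return where VZEROUPPER would be issued under the rules described."""
    sites = []
    if flags.contains_avx:
        # Guards against a legacy SSE to AVX penalty (e.g. reverse pinvoke).
        sites.append("prolog")
    if flags.contains_256bit_avx:
        # Guards against an AVX to legacy SSE penalty on the way out.
        sites.append("epilog")
        sites.append("before_user_pinvoke")
    return sites


print(vzeroupper_sites(MethodFlags(contains_avx=True, contains_256bit_avx=False)))
# prints "['prolog']"
```

Nothing is emitted after the PInvoke call in this model, matching the reasoning that the second penalty cannot occur once the first has been avoided.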
litian2025
pushed a commit
to litian2025/coreclr
that referenced
this pull request
Jan 8, 2017
There are two kinds of transition penalties:

1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code, and containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse pinvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~3% to 1% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix #7240

move setContainsAVX flags to lower, refactor to a smaller method

refactor, fix typo in comments

fix format error
manofstick
pushed a commit
to manofstick/coreclr
that referenced
this pull request
Jan 16, 2017
There are two kinds of transition penalties:

1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code, and containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse pinvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~3% to 1% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix #7240

move setContainsAVX flags to lower, refactor to a smaller method

refactor, fix typo in comments

fix format error
CarolEidt
referenced
this pull request
in CarolEidt/coreclr
Jun 23, 2017
# The first commit's message is:

Mark lvDoNotEnregister lclVars as contained

Even if a lclVar is tracked, it may not be a register candidate. When determining whether a lclVar should be contained or RegOptional, take that into account. In the interest of making this as accurate as possible, mark lclVars early as lvDoNotEnregister when they meet criteria that will later disqualify them from a register.

# This is the commit message #2:

Formatting
EgorBo
pushed a commit
to EgorBo/coreclr
that referenced
this pull request
Mar 5, 2019
…inalizers Remove calls to Console.WriteLine() from finalizers in tests.
franksinankaya
added a commit
to franksinankaya/coreclr
that referenced
this pull request
Sep 20, 2019
franksinankaya
added a commit
to franksinankaya/coreclr
that referenced
this pull request
Sep 25, 2019
sandreenko
pushed a commit
that referenced
this pull request
Sep 26, 2019
* find src/jit -type f -exec sed -i -e 's/->isVararg/->GetIsVararg()/g' {} \;
* Format patch #2
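For readers less familiar with sed, the substitution applied to each file is a plain global text replacement. A rough Python equivalent for a single string, with a hypothetical function name and a made-up sample line, looks like this:

```python
def rename_accessor(source):
    """Replace every '->isVararg' with '->GetIsVararg()', like the sed 's///g'."""
    return source.replace("->isVararg", "->GetIsVararg()")


sample = "if (call->isVararg) { flag = sig->isVararg; }"
print(rename_accessor(sample))
# prints "if (call->GetIsVararg()) { flag = sig->GetIsVararg(); }"
```

As with the sed expression, this matches the text wherever it appears, including as a prefix of a longer identifier, so the result still needs review and a formatting pass.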
franksinankaya
added a commit
to franksinankaya/coreclr
that referenced
this pull request
Jan 28, 2020
This was referenced Jan 31, 2020
jkotas
pushed a commit
to jkotas/coreclr
that referenced
this pull request
Aug 27, 2020
Co-Authored-By: Qiao Pengcheng <[email protected]> Co-Authored-By: Leslie Zhai <[email protected]> Co-Authored-By: Wang Haomin <[email protected]> Co-Authored-By: Ao Qi <[email protected]>
k15tfu
added a commit
to k15tfu/coreclr
that referenced
this pull request
Oct 23, 2020
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
When zero-initializing locals, the JIT emits the wrong instruction sequence, e.g., for the 28-byte zero-initialization shown below. The issue was that the JIT passed the wrong arguments to emitIns_R_R_I.

Before (Fail)
```
stp xzr, xzr, [x2],#16
str xzr, [x2,#2]   --> just two byte offset (no x2 post-increment)
str wzr, [x2]
```
After (Pass)
```
stp xzr, xzr, [x2],#16
str xzr, [x2],#8
str wzr, [x2]
```

Commit migrated from dotnet/coreclr@59ea856
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
Fixes dotnet/coreclr#3332

To validate various addressing in dotnet/coreclr#4896, I just enable this. Previously, we only allowed a load operation on JIT data (`ldr` or `IF_LARGELDC`). For switch expansion, the jump table is also recorded in JIT data. In this case, we only get the address of the jump table head and load the right entry after computing the offset. So `adr` or `IF_LARGEADR` is used not only to load a label within code but also to refer to the location of JIT data.

The typical code sequence for switch expansion looks like this:
```
adr x8, [@rwd00]            // load address of jump table head
ldr w8, [x8, x0, LSL #2]    // load jump entry from table addr + x0 * 4
adr x9, [G_M56320_IG02]     // load address of current basic block
add x8, x8, x9              // Add them to compute the final target
br  x8                      // Indirectly jump to the target
```

Commit migrated from dotnet/coreclr@a0c6144
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
…ystem.Windows.Forms.dll

====================
007551: Merge pull request dotnet/coreclr#1241 from mikedn/modopt

Extend the DIV/MOD dividend into RDX:RAX only if needed
====================

Assert failure(PID 33656 [0x00008378], Thread: 17792 [0x4580]): Assertion failed 'addr->OperIsAddrMode() || (addr->IsCnsIntOrI() && addr->isContained()) || !addr->isContained()' in 'System.Windows.Forms.CheckedListBox:OnDrawItem(ref):this' (IL size 1216)

    File: e:\dd\projectk\src\ndp\clr\src\jit\emitxarch.cpp Line: 2698
    Image: C:\Windows\Microsoft.NET\Framework64\v4.0.rb1605209\mscorsvw.exe

The tree:

    ***** BB41, stmt 82 (embedded)
    ( 6, 8) [003723] ------------ * stmtExpr void (embedded) (IL 0x109... ???)
    N1045 ( 3, 2) [000115] ------------ | /--* lclVar ref V00 this u:2 REG rcx $80
    N1047 ( 1, 4) [002642] ------------ | +--* const long 344 field offset Fseq[idealCheckSize] REG NA $10b
    N1049 ( 4, 6) [002643] -------N---- | /--* + byref REG NA $356
    N1051 ( 6, 8) [000116] ----GO------ | /--* indir int REG rcx <l:$685, c:$2ef>
    N1053 ( 6, 8) [003669] DA--GO------ \--* st.lclVar int V172 cse1 rcx REG rcx RV

During codegen:

    Generating BB41, stmt 71
    Holding variables: [rbx rsi rdi r12-r15]
    Generating: N1043 ( 3, 2) [000114] ------------ * lclVar int V05 loc3 u:3 r12 (last use) REG r12 RV $31a
    Generating: N1045 ( 3, 2) [000115] ------------ * lclVar ref V00 this u:2 REG rcx $80
    IN00db: mov rcx, gword ptr [V00 rbp+10H]
    GC regs: 00000040 {rsi} => 00000042 {rcx rsi}
    Generating: N1047 ( 1, 4) [002642] ------------ * const long 344 field offset Fseq[idealCheckSize] REG NA $10b
    Generating: N1049 ( 4, 6) [002643] -------N---- * + byref REG NA $356
    Generating: N1051 ( 6, 8) [000116] ----GO------ * indir int REG rcx <l:$685, c:$2ef>
    ... assert ...

(This is rollback #2: the TFS/GitHub mirror unfortunately undid rollback CS#1605814 with CS#1605840. This change should avoid that problem.)

[tfs-changeset: 1605917]

Commit migrated from dotnet/coreclr@ce8e7e3
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
OVERVIEW
========

This directory contains the SuperPMI tool used for testing the .NET just-in-time (JIT) compiler.

SuperPMI has two uses:
1. Verification that a JIT code change doesn't cause any asserts.
2. Finding test code where two JIT compilers generate different code, or verifying that the two compilers generate the same code.

Case #1 is useful for doing quick regression checking when making a source code change to the JIT compiler. The process is: (a) make a JIT source code change, (b) run that newly built JIT through a SuperPMI run to verify no asserts have been introduced.

Case #2 is useful for generating assembly language diffs, to help analyze the impact of a JIT code change.

SuperPMI works in two phases: collection and playback. In the collection phase, the system is configured to collect SuperPMI data. Then, run any set of .NET managed programs. When these managed programs invoke the JIT compiler, SuperPMI gathers and captures all information passed between the JIT and its .NET host. In the playback phase, SuperPMI loads the JIT directly and causes it to compile all the functions that it previously compiled, but using the collected data to provide answers to various questions that the JIT needs to ask. The .NET execution engine (EE) is not invoked at all.

TOOLS
=====

There are two native executable tools: superpmi and mcs. There is a .NET Core C# program, built as part of the coreclr repo tests build, called superpmicollect.exe.

All will show a help screen if passed -?.

COLLECTION
==========

Set the following environment variables:

    SuperPMIShimLogPath=<full path to an empty temporary directory>
    SuperPMIShimPath=<full path to clrjit.dll, the "standalone" JIT>
    COMPlus_AltJit=*
    COMPlus_AltJitName=superpmi-shim-collector.dll

(On Linux, use libclrjit.so and libsuperpmi-shim-collector.so. On Mac, use libclrjit.dylib and libsuperpmi-shim-collector.dylib.)

Then, run some managed programs. When done running programs, un-set these variables.

Now, you will have a large number of .mc files. Merge these using the mcs tool:

    mcs -merge base.mch *.mc

One benefit of SuperPMI is the ability to remove duplicated compilations, so on replay only unique functions are compiled. Use the following to create a "unique" set of functions:

    mcs -removeDup -thin base.mch unique.mch

Note that -thin is not required. However, it will delete all the compilation results collected during the collection phase, which makes the resulting MCH file smaller. Those compilation results are not required for playback.

Use the superpmicollect.exe tool to automate and simplify this process.

PLAYBACK
========

Once you have a merged, de-duplicated MCH collection, you can play it back using:

    superpmi unique.mch clrjit.dll

You can do this much faster by utilizing all the processors on your machine, and replaying in parallel, using:

    superpmi -p unique.mch clrjit.dll

REMAINING WORK
==============

The basics of assembly diffing are there, using the "coredistools" package. The open source build needs to be altered to use this package to wire up the correct build steps.

[tfs-changeset: 1623347]

Commit migrated from dotnet/coreclr@d85eb92
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
There are two kinds of transition penalties:

1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code, and containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse pinvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~3% to 1% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix dotnet/coreclr#7240

move setContainsAVX flags to lower, refactor to a smaller method

refactor, fix typo in comments

fix format error

Commit migrated from dotnet/coreclr@cc169ea
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
* Preliminary Changes
* Module Index Resolution
* Change infoModule encoding
* Change referencing module in R2R
* Pre-condition Check
* Virtual Method Module Resolution
* Remove Workarounds and add conditional import loading
* Add signature kind module override
* Add ELEMENT_TYPE_MODULE_ZAPSIG
* Add switch to enable large version bubble
* Cleanup
* Change Native header check
* Add large version bubble test
* Add Large Version Bubble Checks
* Cleanup
* Revert unnecessary check
* Change EncodeMethod Version Bubble Condition
* Add Large Version Bubble asserts
* Cleanup
* Add default argument to runtests.py
* Change test PreCommands
* Revert whitespace changes
* Change breaking conditional check
* Streamline Version Bubble test
* Address PR Feedback
* Address PR Feedback #2
* Remove dead code
* Add crossgen-time ifdef

Commit migrated from dotnet/coreclr@9fe3286
No description provided.