Sync with upstream main branch #15

Merged
joshpeterson merged 423 commits into unity-main from bot-upstream-main-merge-2021-12-04
Dec 7, 2021

joshpeterson
This is an automatically generated pull request to merge changes from the upstream main branch.

vargaz and others added 30 commits November 17, 2021 02:11
…d AOT. (dotnet#61627)

If a profile contained a method like List<AnInst>:.ctor () and the
'profile-only' option was used, the fully shared version of the method
needs to be added to the aot image.
…rrent namespace for evaluation (dotnet#61252)

* Using current namespace as the default place to search for the resolved class.

* Add tests for static class, static fields and pausing in async method.

* Added tests for class evaluation.

* Fixing support to the current namespace and adding tests for it

* Assuming that we search within the current assembly first. Removed tests that fail in Console App.

* Remove a test-duplicate that was not testing static class or static fields.

* Fixing indentation.

* Refixing indentation.

* Refix indentations again.

* Applied the advice about adding new blank lines.

* Changed the current assembly check.

* Extracting the check from the loop. One time check is enough.

* Simplifying multiple test cases into one call.

* Using local function as per review suggestion.

* Added test that was skipped by mistake.

* Added looking for the namespace in all assemblies because there is a chance it will be located outside the current assembly.

* Extracting value based on the current frame, not the top of stack location.

* Test for classes evaluated from different frames.

* Fixing indentation and spaces.

* Applied review comments for values evaluation.

* Compressed two tests into one with MemberData.

* Added test case of type without namespace (failing).

* Addressed Ankit's advice from the review.

Co-authored-by: DESKTOP-GEPIA6N\Thays <[email protected]>
* switch from FileStream to RandomAccess

* use Array.MaxLength as a limit for File.ReadAllBytes and fix an edge case bug for files: Array.MaxLength < Length < int.MaxValue

* there is no gain of using FileOptions.SequentialScan on Unix, as it requires an additional sys call

Co-authored-by: Dan Moseley <[email protected]>
…et#61562)

* Extend RegexCharClass.Canonicalize range inversion optimization

There's a simple optimization in RegexCharClass.Canonicalize that was added in .NET 5, with the goal of taking a set that's made up of exactly two ranges and seeing whether those ranges were leaving out exactly one character.  If they were, the set can instead be rewritten as that character negated, which is a normalized form used downstream and optimized.  We can extend this normalization ever so slightly to be for two ranges separated not just by a single character but by more than that as well.

* Update TODO comment

* Add some more reduction tests
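
The range inversion described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual RegexCharClass.Canonicalize code: the function name `invert_if_full_span`, the tuple representation of ranges, and the return shape are all assumptions made for the example.

```python
# Hypothetical sketch; names and data layout are illustrative only.
MIN_CHAR = 0x0000
MAX_CHAR = 0xFFFF  # a .NET char is one UTF-16 code unit


def invert_if_full_span(ranges):
    """ranges: sorted list of inclusive (lo, hi) code-unit pairs.

    If exactly two ranges cover everything except one contiguous gap,
    return (True, [(gap_lo, gap_hi)]), meaning "the negation of the gap".
    Otherwise return (False, ranges) unchanged.
    """
    if len(ranges) == 2:
        (lo1, hi1), (lo2, hi2) = ranges
        # The two ranges must start at the minimum, end at the maximum,
        # and leave at least one character uncovered between them.
        if lo1 == MIN_CHAR and hi2 == MAX_CHAR and hi1 + 1 < lo2:
            return True, [(hi1 + 1, lo2 - 1)]
    return False, ranges
```

With this sketch, a set equivalent to "everything except `a`" becomes the negated single range for `a`, and a set leaving out `a` through `c` becomes the negated range `a-c`, which is the extension the commit describes.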
* Remove some unnecessary slicing from generated Regex code

When we're outputting code to match a "multi" (a sequence of multiple characters), we're currently issuing a Slice for the known tracked offset even if that offset is 0.  We can skip that nop.

* Address PR feedback
* Ignore missing data result

* Move result = false only if error Code is other than 2/3
…otnet#61691)

During code review of my latest PR Bruce raised the concern that
hard-coding public key values and version ID for the xunit.core
reference will cause enormous maintenance pain if we decide to
upgrade to a newer version of the module in the future. As Jeremy
verified that the metadata is not really needed, I'm deleting it
from all tests I switched over last week.

Thanks

Tomas
…tChar (dotnet#61490)

* Factor out and improve the vectorization of RegexInterpreter.FindFirstChar

This change started with the "simple" goal of factoring out the FindFirstChar logic from RegexInterpreter and consuming it in SymbolicRegexMatcher.  The existing engines use FindFirstChar to quickly skip ahead to the next location that might possibly match, at which point they fall back to analyzing the whole pattern at that location.  SymbolicRegexMatcher (used by RegexOptions.NonBacktracking) had its own implementation for this, which it used any time it entered a start state.  This required non-trivial additional code to maintain, and there's no good reason it should be separate from the other engines.

However, what started out as a simple change grew due to regressions that resulted from differences in the implementations.  In particular, SymbolicRegexMatcher already works off of precomputed equivalence tables for casing, which gives it very different characteristics in this regard from the existing engines.  For example, SymbolicRegexMatcher's existing "skip ahead to the next possible match start location" logic already evaluated all the characters that could possibly start a match, which included variations of the same character when using IgnoreCase, but the existing RegexInterpreter logic didn't.  That discrepancy then results in a significant IgnoreCase regression for NonBacktracking due to losing the ability to use a vectorized search for the next starting location.  We already plan to shift the existing engines over to a plan where all of these equivalences are computed at construction time rather than using ToLower at both construction time and match time, so this PR takes some steps in that direction, doing so for most of ASCII.  This has added some temporary cruft, which we'll be able to delete once we fully shift the implementations over (which we should do in the near future).

Another difference was SymbolicRegexMatcher was enabling use of IndexOfAny for up to 5 characters, whereas RegexOptions.Compiled was only doing up to 3 characters, and RegexInterpreter wasn't doing it for any number.  The PR now uses 5 everywhere.

However, the more characters involved, the more overhead there is to IndexOfAny, and for some inputs, the higher the chances are that IndexOfAny will find a match sooner, which means its overhead compounds more.  To help with that, we now not only compute the possible characters that might match at the beginning of the pattern, but also characters that might match at a fixed offset from the beginning of the pattern (e.g. in \d{3}-\d{2}-\d{4}, it will find the '-' at offset 3 and be able to vectorize a search for that and then back off by the relevant distance).  That then also means we might end up with multiple sets to choose to search for, and this PR borrows an idea from Rust, which is to use some rough frequency analysis to determine which set should be targeted.  It's not perfect, and we can update the texts used to seed the analysis (right now I based it primarily on *.cs files in dotnet/runtime and some Project Gutenberg texts), but it's good enough for these purposes for now.
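
The fixed-offset idea in the paragraph above might look like this in a small Python sketch. It is not the RegexInterpreter or SymbolicRegexMatcher implementation: `find_first`, `ANCHOR_CHAR`, and `ANCHOR_OFFSET` are hypothetical names, and `str.find` merely stands in for a vectorized IndexOf.

```python
import re

# Illustrative only: the pattern from the commit message, with the
# rare '-' at a fixed distance of 3 from the start of any match.
PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")
ANCHOR_CHAR = "-"
ANCHOR_OFFSET = 3


def find_first(text, start=0):
    """Return the index of the first match at or after start, or -1."""
    pos = start + ANCHOR_OFFSET
    while True:
        # Search for the anchor character instead of any digit.
        hit = text.find(ANCHOR_CHAR, pos)
        if hit < 0:
            return -1
        # Back off by the fixed offset and verify the whole pattern there.
        candidate = hit - ANCHOR_OFFSET
        if PATTERN.match(text, candidate):
            return candidate
        pos = hit + 1
```

Searching for `'-'` rather than for any digit is the kind of choice the frequency analysis is meant to make: a dash is presumably rarer than a digit in typical input, so the fast search stops less often on positions that cannot match.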

We'd previously switched to using IndexOf for a case-sensitive prefix string, but still were using Boyer-Moore for case-insensitive.  Now that we're able to also vectorize a search for case-insensitive values (right now just ASCII letter, but that'll be fixed soon), we can just get rid of Boyer-Moore entirely.  This saves all the costs to do with constructing the Boyer-Moore tables and also avoids having to generate the Boyer-Moore implementations in RegexOptions.Compiled and the source generator.

The casing change also defeated some other optimizations already present.  For example, in .NET 5 we added an optimization whereby an alternation like `abcef|abcgh` would be transformed into `abc(?:ef|gh)`, and that would apply whether case-sensitive or case-insensitive.  But by transforming the expression at construction now for case-insensitive into `[Aa][Bb][Cc][Ee][Ff]|[Aa][Bb][Cc][Gg][Hh]`, that optimization was defeated.  I've added a new optimization pass for alternations that will detect common prefixes even if they're sets.
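
As a rough illustration of the common-prefix pass described above, here is a Python sketch that factors a shared prefix out of alternation branches even when each element is a set. The token-list representation and the name `factor_common_prefix` are assumptions for the example; the real pass works on RegexNode trees.

```python
def factor_common_prefix(branches):
    """branches: list of token lists, each token a literal char or a set.

    Returns a pattern string with the longest shared leading tokens
    hoisted out of a non-capturing alternation.
    """
    n = 0
    # Advance while every branch has a token at position n and all agree.
    while all(len(b) > n for b in branches) and len({b[n] for b in branches}) == 1:
        n += 1
    if n == 0 or len(branches) < 2:
        return "|".join("".join(b) for b in branches)
    prefix = "".join(branches[0][:n])
    tails = "|".join("".join(b[n:]) for b in branches)
    return f"{prefix}(?:{tails})"
```

For the literal case, `factor_common_prefix([list("abcef"), list("abcgh")])` yields `abc(?:ef|gh)`. For the case-insensitive form, passing the branches as lists of sets yields `[Aa][Bb][Cc](?:[Ee][Ff]|[Gg][Hh])`.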

The casing change also revealed some cosmetic issues.  As part of the change, when we encounter a "multi" (a multi-character string in the pattern), we convert that single case-insensitive RegexNode to instead be one case-sensitive RegexNode per character, with a set for all the equivalent characters that can match.  This then defeats some of the nice formatting we had for multis in the source generator, so as part of this change, the source generator has been augmented to output nicer code for concatenations.  And because sets like [Ee] are now way more common (since e.g. a case-insensitive 'e' will be transformed into such a set), we also special-case that in both the source generator and RegexOptions.Compiled, to spit out the equivalent of `(c | 0x20) == 'e'` rather than `(c == 'E'| c == 'e')`.
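
The `(c | 0x20) == 'e'` special case relies on the upper- and lower-case forms of an ASCII letter differing only in bit 0x20. A small Python sketch, with hypothetical function names, shows the two checks side by side:

```python
def is_e_two_compares(c):
    # Straightforward form: one comparison per case variant.
    return c == "E" or c == "e"


def is_e_or_trick(c):
    # Setting bit 0x20 maps 'E' (0x45) onto 'e' (0x65) and leaves 'e' as is,
    # so a single comparison covers both.
    return (ord(c) | 0x20) == ord("e")
```

The two functions agree for every UTF-16 code unit, since the only values that become 0x65 after OR-ing in 0x20 are 0x45 and 0x65. The trick applies to sets that are exactly such a two-character ASCII pair; it is not a general case-folding rule.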

Along the way, I cleaned up a few things as well, such as passing around a CultureInfo more rather than repeatedly calling CultureInfo.CurrentCulture, using CollectionsMarshal.GetValueRefOrAddDefault on a hot path to do with interning strings in a lookup table, tweaking SymbolicRegexRunnerFactory's Runner to itself be generic to avoid an extra layer of virtual dispatch per operation, and cleaning up code / comments in SymbolicRegexMatcher along the way.

For the most part the purpose of the change wasn't to improve perf, and in fact I was willing to accept some regressions in the name of consolidation.  There are a few regressions here, mostly small, and mostly for cases where we're simply paying an overhead for vectorization, e.g. where the current location is fine to match, or where the target character being searched for is very frequent.  Overall, though, there are some substantial improvements.

* Fix missing condition in RegexCompiler

* Try to fix mono failures and address comment feedback

* Delete more now dead code

* Fix dead code emitting after refactoring

Previously return statements while emitting anchors were short-circuiting the rest of the emitting code, but when I moved that code into a helper, the returns stopped having that impact, such that we'd end up emitting a return statement and then emit dead code after it.  Fix it.

* Remove some now dead code
* re-enable optimization when a token is replaced by corresponding type handle

* compute "isNested" only when we need it.

* no need to make more than 2 attempts at reading
* Add support for custom container for Linux Arm64

Update platform_matrix.yml to allow custom containers for Arm64 Linux runs.
This is required for our performance runs as we need a newer version of the
python runtime. This change follows the same pattern as https://github.com/dotnet/runtime/pull/59202/files

* Update perf_slow.yml with correct container values
Co-authored-by: Alexander Köplinger <[email protected]>
Co-authored-by: Jan Kotas <[email protected]>
Co-authored-by: Santiago Fernandez Madero <[email protected]>
Co-authored-by: Tomas <[email protected]>
Global usings no longer means "global breaking" after RC1.
* Temp change to disable align loop

* download specific artifacts

to squash:

* Upload .dasm files

* fix the build id

* Add ci_run and retainOnlyTopFiles

* Rename ci_run to retainOnlyTop

* Disable struct promo to test asmdiff

* Revert "Disable struct promo to test asmdiff"

This reverts commit 3ef7adb.

* fix the parameter retainOnlyTopFiles

* add missing -

* Revert "Temp change to disable align loop"

This reverts commit b1de5c4.

* Revert "download specific artifacts"

This reverts commit db3de57.

* Review comments
…implify the api (dotnet#61392)

* Make SMonoSdbHelper part of the execution context
* Hide align behind a jmp

fix the alignBytesRemoved

Some fixes and working model

Some fixes and redesign

Some more fixes

more fixes

fix

Add the check  for fgFirstBB

misc changes

code cleanup + JitHideAlignBehindJmp switch

validatePadding only if align are before the loop IG

More cleanup, remove commented code

jit format

* Fix a problem where curIG==0 and loop might be emitted in curIG, adjust the targetIG to prevIG

Add IGF_REMOVED_ALIGN flag for special scenarios

* Add stress mode to emit int3 for xarch

* Add stress mode to emit bkpt for arm64

* Add a loop align instruction placement phase

* review comments

* Change from unsigned short to unsigned

* review comments around cleanup

* emitForceNewIG

* Remove emitPrevIG

* Revert change to forceNewIG for align instruction

* Use loopAlignCandidates

* Use loopHeadIG reference

* jit format

* Remove unneeded method

* Misc changes

* Review feedback

* Do not include align behind Jmp in PerfScore calculation

* jit format and fix a bug

* fix the loopCandidates == 0 scenario

* Add unmarkLoopAlign(), add check for fgFirstBB

* merge conflict fix

* Add missing }

* Grammar nit

Co-authored-by: Bruce Forstall <[email protected]>
This change allows devs to manually kick off full test runs on the configurations that only execute smoke tests per PR. 

To kick things off, you can run /azp run runtime-manual
* Convert Crypto P/Invokes to GeneratedDllImport.
- Do not printout exceptions from failing task tests.
- Remove default MONO_LOG_MASK=gc from debug configuration.
…leted. (dotnet#60214)

* FileSystemEntry.Unix: ensure attributes are available when file is deleted.

When the file no longer exists, we create attributes based on what we know.

The test for this was passing because it cached the attributes before the
item was deleted due to enumerating with skipping FileAttributes.Hidden.

* GetLength: fix reading from uninitialized cache.
…#60160)

* FileStatus.Unix/Process.Unix: align implementation.

Process: remove the user identity caching and extend the logic
to avoid retrieving the identity in most cases by checking
if all x-bits are set or not set.

FileStatus: use same group check as Process.

FileStatus: cache the read only flag instead of caching the
identity.
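
The x-bit shortcut mentioned for Process above can be sketched in Python as follows. This is only an illustration of the idea that the mode bits often decide the answer on their own; the function name and the tri-state return value are assumptions, and the real implementation handles further cases (such as the caller's identity) that are omitted here.

```python
import stat

# All three execute bits: user, group, and other.
ALL_X = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def executable_without_identity(mode):
    """Return True or False when the mode bits alone decide, else None."""
    x_bits = mode & ALL_X
    if x_bits == ALL_X:
        return True   # every class may execute, identity is irrelevant
    if x_bits == 0:
        return False  # no class may execute, identity is irrelevant
    return None       # mixed bits: the caller must compare uid/gid
```

Only the `None` case requires looking up the user and group, which is why the identity no longer needs to be cached up front.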
* In the JIT, add support for dumping the precise debug info out through
  an environment variable `DOTNET_JitDumpPreciseDebugInfoFile` in a
  simple JSON format. This is a stopgap until we expose the extra
  information through ETW events.

* In dotnet-pgo, add an argument --precise-debug-info-file which can
  point to the file produced by the JIT. When used, dotnet-pgo will get
  native<->IL mappings from this file instead of through ETW events.

* In dotnet-pgo, add support for attributing samples to inlinees when
  that information is present. This changes the attribution process a
  bit: previously, we would group all LBR data/samples and then
  construct the profile from all the data. We now do it in a more
  streaming way where there is a SampleCorrelator that can handle
  individual LBR records and individual samples.

* In dotnet-pgo, add an argument --dump-worst-overlap-graphs-to which
  can be used in the compare-mibc command to dump out a .dot file
  containing the flow graph of the methods with the worst overlap
  measures, and showing the relative weight count on each basic block
  and edge for the two profiles being compared. This is particularly
  useful to find out where we are producing incorrect debug mappings, by
  comparing spgo.mibc and instrumented.mibc files.
tannergooding and others added 24 commits December 2, 2021 19:02
…sics (dotnet#61982)

* Ensure that we don't try to get the simdBaseJitType for scalar intrinsics

* Update a condition to also check for isScalarIsa
Rather than outputting an if block per unrolled iteration, just output a clause for each iteration as part of a single if block.  We already do this for concatenations, but we don't yet for standalone repeaters.
…runtime-assets (dotnet#62331)

* Update dependencies from https://github.com/dotnet/arcade build 20211202.3

Microsoft.DotNet.XUnitExtensions , Microsoft.DotNet.VersionTools.Tasks , Microsoft.DotNet.Build.Tasks.Workloads , Microsoft.DotNet.Build.Tasks.Templating , Microsoft.DotNet.Build.Tasks.TargetFramework.Sdk , Microsoft.DotNet.Build.Tasks.Packaging , Microsoft.DotNet.Build.Tasks.Installers , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Build.Tasks.Archives , Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.ApiCompat , Microsoft.DotNet.CodeAnalysis , Microsoft.DotNet.XUnitConsoleRunner , Microsoft.DotNet.GenFacades , Microsoft.DotNet.GenAPI , Microsoft.DotNet.RemoteExecutor , Microsoft.DotNet.PackageTesting , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SharedFramework.Sdk
 From Version 7.0.0-beta.21576.4 -> To Version 7.0.0-beta.21602.3

* Update dependencies from https://github.com/dotnet/xharness build 20211202.2

Microsoft.DotNet.XHarness.CLI , Microsoft.DotNet.XHarness.TestRunners.Xunit
 From Version 1.0.0-prerelease.21602.1 -> To Version 1.0.0-prerelease.21602.2

* Update dependencies from https://github.com/dotnet/runtime-assets build 20211202.1

Microsoft.DotNet.CilStrip.Sources , System.ComponentModel.TypeConverter.TestData , System.Drawing.Common.TestData , System.IO.Compression.TestData , System.IO.Packaging.TestData , System.Net.TestData , System.Private.Runtime.UnicodeData , System.Runtime.Numerics.TestData , System.Runtime.TimeZoneData , System.Security.Cryptography.X509Certificates.TestData , System.Windows.Extensions.TestData
 From Version 7.0.0-beta.21579.1 -> To Version 7.0.0-beta.21602.1

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
* Use separate ProxyAssemblies per ALC in DispatchProxyGenerator.
According to the ABI, they need to be returned by value.
* Update overlapped field test to conflict on 32-bit arches

   Fixes dotnet#62303

* [class-init] Setup fields of nested structs in layout check

   On AOT the field's class may not have been fully inited yet.

   Related to dotnet#62311
* new parser tests

* baseline

* Nim tests

* typos

* positive cases

* new parser tests

* change to \u
…ator (dotnet#62318)

* Delete old code generation approach from RegexCompiler / source generator

In .NET Framework and up through .NET Core 3.1, the code generated for RegexOptions.Compiled was effectively an unrolled version of what RegexInterpreter would process.  The RegexNode tree would be turned into a series of opcodes via RegexWriter; the interpreter would then sit in a loop processing those opcodes, and the RegexCompiler iterates through the opcodes generating code for each equivalent to what the interpreter would do but with some decisions made at compile-time rather than at run-time.  This approach, however, leads to complicated code that's not pay-for-play (e.g. a big backtracking jump table that all compilations go through even if there's no backtracking), that doesn't factor in the shape of the tree (e.g. it's difficult to add optimizations based on interactions between nodes in the graph), and that doesn't read well when emitted as C# instead of IL as part of the source generator.  In .NET 5, we started adding an alternative implementation that processed the RegexNode tree directly, addressing all of those cited issues; however, it only worked for a subset of expressions, namely those with little-to-no backtracking (e.g. non-atomic loops and alternations weren't supported).  Since then, we've improved it to the point where everything other than RegexOptions.RightToLeft (which implicitly means lookbehinds as well) is supported, and we've agreed it's ok to drop compilation for those constructs; if they ever become an issue, we can add support for them via the new compilation scheme.

As such, this PR:
- Deletes all of the code associated with the older code generation scheme
- Updates the Regex ctor to fall back to selecting the interpreter if the expression can't be compiled
- Updates the source generator to fall back to just emitting a cached use of Regex if the expression can't be compiled (and issuing a diagnostic in that case)
- Adds several tests that now pass with the new scheme that didn't with the old (and that still don't with the interpreter)

* Make the addition of more declarations a bit more robust

* Reduce backtracking code gen when nodes are atomic

Also added some comments and renamed a few methods for consistency between RegexCompiler and RegexGenerator.Emitter

* Fix tests on mono interpreter
Collect also does a replay, so needs all the replay arguments set to something.
Fixes this error: `'CoreclrArguments' object has no attribute 'compile'`
superpmi.exe has the concept of an exclusion list file which is automatically read
and processed when reading a .mch file. (I'm not sure if anyone actually uses it.)
So, when opening a `t.mch` file, it looks for an adjacent `t.mch.exc` and then `t.exc`
file.

There was a bug where it would also look for a `t` file (the comments say it takes t.exc.mch
and looks for t.exc, but it didn't check for that). In my case when I was testing, I actually
had a `t` directory (not file), which it found, but then emitted an error trying to load.

So, two fixes:
1. For `t.mch`, don't look for `t`.
2. Check all cases for being a directory, and fail if any name is a directory.
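
The lookup order after the two fixes can be sketched in Python. This is not superpmi's code: the function names are hypothetical, and the sketch only models the `t.mch` case spelled out in the message (it does not attempt the `t.exc.mch` variant mentioned in the comments).

```python
import os


def exclusion_candidates(mch_path):
    """Names to probe for an exclusion file, in order."""
    candidates = [mch_path + ".exc"]           # t.mch -> t.mch.exc
    root, ext = os.path.splitext(mch_path)
    if ext == ".mch":
        candidates.append(root + ".exc")       # t.mch -> t.exc
    # Fix 1: the bare root name ("t") is never a candidate.
    return candidates


def find_exclusion_file(mch_path):
    """Return the first existing exclusion file, or None if there is none."""
    for name in exclusion_candidates(mch_path):
        # Fix 2: a directory with a candidate name is an error, not a file.
        if os.path.isdir(name):
            raise IsADirectoryError(name)
        if os.path.isfile(name):
            return name
    return None
```

For `t.mch` the candidates are `t.mch.exc` and then `t.exc`, so a stray `t` directory next to the collection is simply ignored.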
* Use Spans on Emitter Go() methods

* Use Spans on Compiler Go() methods

* Rename localSpan to slice in order to avoid confusion with runtextSpan
Found when trying to enable OSR by default and prospecting for arm64 support.

* Explicitly initialize the OSR step variable.
* Prevent `fgOptimizeUncondBranchToSimpleCond` from changing the scratch entry
  BB to have conditional flow.
* Retain runtime supplied patchpoint info when cleaning up after an altjit failure.
Collaborator

@UnityAlex UnityAlex left a comment

Will need to verify once the build scripts land that nothing went backwards.

@joshpeterson
Author

> Will need to verify once the build scripts land that nothing went backwards.

Do you want to wait and land the macOS and Linux CI stuff first? We can rebase this branch on unity-main once that is in.

@joshpeterson joshpeterson merged commit 59469f5 into unity-main Dec 7, 2021
@joshpeterson joshpeterson deleted the bot-upstream-main-merge-2021-12-04 branch December 7, 2021 18:41
yamato-ci-bot pushed a commit that referenced this pull request Jan 15, 2022
…otnet#63598)

* Fix native frame unwind in syscall on arm64 for VS4Mac crash report.

Add arm64 version of StepWithCompactNoEncoding for syscall leaf node wrappers that have compact encoding of 0.

Fix ReadCompactEncodingRegister so it actually decrements the addr.

Change StepWithCompactEncodingArm64 to match what MacOS libunwind does for framed and frameless stepping.

arm64 can have frames with the same SP (but different IPs). Increment SP for this condition so createdump's unwind
loop doesn't break out on the "SP not increasing" check and the frames are added to the thread frame list in the
correct order.

Add getting the unwind info for tail called functions like this:

__ZL14PROCEndProcessPvji:
   36630:       f6 57 bd a9     stp     x22, x21, [sp, #-48]!
   36634:       f4 4f 01 a9     stp     x20, x19, [sp, #16]
   36638:       fd 7b 02 a9     stp     x29, x30, [sp, #32]
   3663c:       fd 83 00 91     add     x29, sp, #32
...
   367ac:       e9 01 80 52     mov     w9, #15
   367b0:       7f 3e 02 71     cmp     w19, #143
   367b4:       20 01 88 1a     csel    w0, w9, w8, eq
   367b8:       2e 00 00 94     bl      _PROCAbort
_TerminateProcess:
-> 367bc:       22 00 80 52     mov     w2, #1
   367c0:       9c ff ff 17     b       __ZL14PROCEndProcessPvji

The IP (367bc) returns the (incorrect) frameless encoding with nothing on the stack (uses an incorrect LR to unwind). To fix this
get the unwind info for PC -1 which points to PROCEndProcess with the correct unwind info. This matches how lldb unwinds this frame.

Always address module segment to IP lookup list instead of checking the module regions.

Strip pointer authentication bits on PC/LR.