Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a basic test for name poisoning #4572

Open
wants to merge 46 commits into
base: trunk
Choose a base branch
from

Conversation

bricknerb
Copy link
Contributor

Demonstrate where failures should happen.

Demonstrate where failures should happen.
jonmeow and others added 22 commits November 28, 2024 09:52
…arbon-language#4555)

This is for more clearly distinct names, and to make it a clearer
transition from `BuiltinInst` for name conflicts. `FloatType` is also an
instruction, and we have `Carbon::Error` (common/error.h). This avoids
affecting tests, although the name is embedded in the builtin test.

In `LegacyFloatType`, `Legacy` because I was having trouble coming up
with a more appropriate name. I'm not clear this is a `FloatLiteralType`
at present, it needs some work to mirror `IntLiteralType`.

In `ErrorInst`, the suffix `Inst` was discussed as good and similar to
`BuiltinInst` (although I'm trying to get rid of that).
These tests typically take 10-20s, but I'm seeing some timeouts
[here](https://github.com/carbon-language/carbon-lang/actions/runs/11899548036/job/33158400417).
This seemed particularly suspicious due to the _absence_ of output
(copied below). That got me looking, and maybe the subprocessing tickles
a cpu bottleneck, so proposing this approach to remove the exec. Even if
this doesn't solve the flakiness, I think it's a simpler implementation.

Note I believe this is intended to work. The `sh` rules rely on shebangs
(as noted at https://bazel.build/reference/be/shell#sh_test), and are
essentially just subprocessing to the input. Note this could've also had
`args` on a `cc_test` rule, but I'd expect the same args to be passed to
`run` where instead the benchmark behavior should be default (and I'm
assuming you'd rather not have args there). Fundamentally this becomes a
symlink:

```
bazel-bin/common/map_benchmark_test -> .../execroot/_main/bazel-out/k8-fastbuild/bin/common/map_benchmark
```

Copying snippet from timeout below:

```
==================== Test output for //common:map_benchmark_test:
      /private/var/tmp/_bazel_runner/e591f63ed099023de1f206992dfce127/execroot/_main/bazel-out/darwin_arm64-fastbuild/testlogs/common/map_benchmark_test/test.log
-- Test timed out at 2024-11-18 19:32:13 UTC --
INFO: From Testing //common:map_benchmark_test:
================================================================================
```
I'm looking at adding more significant logic to checking, particularly
for interop. But check.cpp is getting large, and I think just adding
more logic will make it harder to reason about, so I'm looking at
splitting it up. This moves out NodeIdTraversal and
DeferredDefinitionWorklist because they're already independent from the
other code, and are reasonably sized to have their own files.
…e#4556)

Adds per-builtin instructions, removing `BuiltinInst`. This collapses
`builtin_inst_kind.def` into `inst_kind.def` so that we have a single
place for all macro uses. I still want to remove `BuiltinInstKind`, but
it's something I think is better separated from the `BuiltinInst`
removal.

I'm collapsing the build targets `ids` and `inst_kind` into one because
they both have links to builtin kind information now. It's hard to
separate without a cycle. I'm using the `typed_insts` name because that
seems like the actual most significant thing there, and more interesting
relative to the `inst` target.
…4551)

This adds language to explain where the types of constraints can or can
not appear (attached-to an impl-as vs in-a type expression). And
describes the impact of using a rewrite vs same-type constraint inside
the body of the affected code, and thus why a rewrite is preferable when
the constraint is of a single facet type.
Replace the special case adding the base class to the list of extended
scopes with a fully general approach. We use an ImportRef to lazily
import the extended scopes on first lookup.
The new `FacetValue` instruction represents `C as I` for some type `C`
and facet type `I`. It is named `FacetValue` instead of just `Facet` to
parallel the `FacetType` instruction.

This PR uses this instruction represent the facet value `Self` in an
`impl` declaration. This instruction will be used in the future to also
support things like:

* `C as I` where `C` is a class; and
* forming a specific for a generic with a `T:! I` parameter where `T` is
being given a concrete value.

(Here `I` is an interface or other non-`type` facet type.)

Also do some renaming and add some comments to make things a bit more
clear.

* `FacetTypeAccess` -> `FacetAccessType` to clarify this is not access
of a facet type, but access of the type of a facet
* `.facet_id` -> `.facet_value_inst_id` to parallel the `FacetValue`
instruction

`FacetAccessWitness` will be in a future PR.

---------

Co-authored-by: Josh L <[email protected]>
Co-authored-by: Richard Smith <[email protected]>
I'm working on speeding up the benchmark tests and noticed they weren't
using our main, seemed worth cleaning that up.
This is stamping out the per-instruction structs, similar to what we do
elsewhere. `BuiltinInstKind::label` then finishes shifting to
`InstKind::ir_name`.
… updated for 2 years and link the Minutes folder instead (carbon-language#4554)

These docs do not seem to be the standard entries anymore given than
it's been 2 years since they were used.

---------

Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Carbon Infra Bot <[email protected]>
…gnosing that the Core package is not found (carbon-language#4571)

I believe this makes the error easier to understand.
…uage#4570)

After some poking, it would take a more significant change to
restructure the string generation to take less time when run under ASan,
and it's not worth it at the moment.

For future reference, nearly half the time here is in building the
global data structures of random string contents, not in the actual
benchmark functions. If/when we want to improve this, we should switch
to a growing pool of random strings similar to what `SourceGen` uses.
That lets it not allocate the full size of data when just testing that
the benchmark doesn't crash.

I thought about having these benchmarks switch to use `SourceGen`, but
I'd like to keep them stand-alone if easy, and there are some important
differences that would have to be adapted around which wouldn't be
trivial. I'd rather come back in with a better generation strategy than
re-use the source code one here.
…ge#4573)

clangd-17 reports includes as being unused when they are used, such as
in common/ostream.h. There it says the `<concepts>` include is unused
but `std::derived_from` is present in the same file (and clangd points
out that it is coming from `<concepts>` on hover).
…e#4563)

This is really a set of closely related changes:

1) We really shouldn't be creating a Check::Unit for _Lower_. This fixes
that by storing the diagnostic converter for reuse.
2) Rather than passing in node converters to Check as their own array,
pass diagnostic converters as part of Check::Unit.
3) To support creating the converters early, pass SemIR::File
pre-constructed.
4) Since SemIR::File construction was used to track "checked", add
`is_checked` for that.
5) Clarifies a subtle edge case around `input_filename_` use with `-`.

Note the key consequence of this change, where I actually started, is
that `Check` only has one array of `Check::Unit` instead of receiving
`NodeLocConverter` as a separate array.
This introduces `calling_convention_param_ids`, a single block that
consolidates all the information that was being used by consumers of
`param_refs` and `implicit_param_refs`, in a form that's easier to
produce and typically easier to consume.

See also [this Discord
discussion](https://discord.com/channels/655572317891461132/655578254970716160/1300545448909738125)
regarding the decision to keep the return slot last in the SemIR calling
convention, even though it goes first in the LLVM calling convention.

---------

Co-authored-by: Jon Ross-Perkins <[email protected]>
Trying to make the .gitignore a little more broad. Accidentally added in
carbon-language#4553, I'm guessing as part of the force-push.
Not sure if this is the most robust way to do it - I guess the
alternative is doing name lookup into the dest struct fields too?
The ConsoleDiagnosticConsumer and ErrorTrackingDiagnosticConsumer have
moved to new places.
…4586)

There is no `Set` method, so remove that, and include `AddDefaultValue`
instead.
…4587)

- Rename TypedInstArgsInfo references to InstLikeTypeInfo. The type was
renamed in 07efa02.
- Update and correct the link to search for ValueStore (and similar) in
the Context class. We must avoid the google-doc-style checks rewriting
`repo:` to `repository:` in the URL.
- Mention the existance of many types of Store collections now.
- Remove pre-C++20 idioms in Field detection. Correct the concept based
idioms to work.
- Fix code indenting consistency.

---------

Co-authored-by: Richard Smith <[email protected]>
josh11b and others added 20 commits November 28, 2024 09:52
…ge#4589)

Previously it would allow interface mismatches. We only need to support
the case where there is a single interface, though, which makes checking
much more straightforward.

Co-authored-by: Josh L <[email protected]>
Adds `FacetAccessWitness` instruction and uses it in `member_access.cpp`
to support accessing members of facets. Still to do: interface witness
access is producing runtime values when it should produce symbolic
values.

---------

Co-authored-by: Josh L <[email protected]>
…nguage#4584)

In preparation for further refactoring, switch away from member
functions for most of `ImportRefResolver`.

Split `ImportRefResolver` into a context class that exposes the value
stores for the source and destination files, and a derived class that
maintains a worklist. The idea is to statically enforce that functions
that take `ImportContext` cannot accidentally add new work, because they
don't have access to the work queue.
* Change format:
  * previously there was a single link per talk, the video if available
  * now there are separate "video" and "slide" links
* Adds link to video for 2024 CppNorth talk
* Adds link to slides for 2024 LLVM Developers' Meeting
* Restores links to slides from past years

---------

Co-authored-by: Josh L <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
Co-authored-by: Chandler Carruth <[email protected]>
I had removed most but not all of the hashtable prefetching during
development because I wasn't confident in the benchmarking results.
However, I never revisited this once the benchmarking infrastructure
improved and there were solid and stable results.

This factors the two interesting prefetch patterns I've seen for this
style of hashtable into helpers that are always called, and provides
macros that can be used during the build to configure exactly which
prefetch strategies are enabled.

Benchmarking these and gaining confidence is very frustrating -- even
now with the improved infrastructure, the noise is much higher than I
would like. But it seems clear that *some* prefetching is a significant
win. It also seems like enabling both results in too much prefetch
traffic. And the entry group prefetch appears to be significantly more
effective, both for the most interesting of the microbenchmarks and
maybe most importantly for our compilation benchmarks. There, AMD is
helped substantially and M1 seems to be helped some (although harder to
measure).

AMD server benchmark numbers:
```
name                                              old cpu/op   new cpu/op   delta
BM_CompileAPIFileDenseDecls<Phase::Lex>/256       35.0µs ± 2%  34.2µs ± 2%  -2.40%  (p=0.000 n=20+19)
BM_CompileAPIFileDenseDecls<Phase::Lex>/1024       156µs ± 2%   151µs ± 2%  -3.18%  (p=0.000 n=20+19)
BM_CompileAPIFileDenseDecls<Phase::Lex>/4096       625µs ± 1%   605µs ± 1%  -3.22%  (p=0.000 n=19+18)
BM_CompileAPIFileDenseDecls<Phase::Lex>/16384     2.79ms ± 1%  2.69ms ± 2%  -3.67%  (p=0.000 n=17+19)
BM_CompileAPIFileDenseDecls<Phase::Lex>/65536     12.1ms ± 1%  11.6ms ± 1%  -4.30%  (p=0.000 n=17+18)
BM_CompileAPIFileDenseDecls<Phase::Lex>/262144    56.6ms ± 1%  53.8ms ± 1%  -5.00%  (p=0.000 n=18+17)
BM_CompileAPIFileDenseDecls<Phase::Parse>/256     61.1µs ± 2%  61.7µs ± 1%  +0.87%  (p=0.000 n=19+19)
BM_CompileAPIFileDenseDecls<Phase::Parse>/1024     288µs ± 1%   290µs ± 1%  +0.55%  (p=0.004 n=20+20)
BM_CompileAPIFileDenseDecls<Phase::Parse>/4096    1.16ms ± 1%  1.16ms ± 1%  -0.54%  (p=0.000 n=17+19)
BM_CompileAPIFileDenseDecls<Phase::Parse>/16384   4.98ms ± 1%  4.91ms ± 1%  -1.39%  (p=0.000 n=20+19)
BM_CompileAPIFileDenseDecls<Phase::Parse>/65536   20.9ms ± 1%  20.5ms ± 1%  -1.86%  (p=0.000 n=20+19)
BM_CompileAPIFileDenseDecls<Phase::Parse>/262144  92.1ms ± 1%  90.2ms ± 1%  -2.12%  (p=0.000 n=18+20)
BM_CompileAPIFileDenseDecls<Phase::Check>/256     1.16ms ± 2%  1.16ms ± 1%    ~     (p=0.931 n=19+19)
BM_CompileAPIFileDenseDecls<Phase::Check>/1024    2.17ms ± 2%  2.16ms ± 1%    ~     (p=0.247 n=20+19)
BM_CompileAPIFileDenseDecls<Phase::Check>/4096    6.07ms ± 1%  6.04ms ± 1%  -0.48%  (p=0.007 n=19+19)
BM_CompileAPIFileDenseDecls<Phase::Check>/16384   22.4ms ± 1%  22.2ms ± 1%  -0.99%  (p=0.000 n=20+19)
BM_CompileAPIFileDenseDecls<Phase::Check>/65536   93.3ms ± 1%  92.2ms ± 1%  -1.23%  (p=0.000 n=20+18)
BM_CompileAPIFileDenseDecls<Phase::Check>/262144   400ms ± 1%   391ms ± 1%  -2.15%  (p=0.000 n=20+18)
```
…nguage#4592)

The switch defaults to true, so it's displayed as `--no-debug-info` in
the help. The description should be agnostic about whether the flag will
enable or disable the behaviour.
We considered a couple of other options for this:
* carbon-language#4515 Keep the
`ElementIndex` numbering vptr-ignorant, and do +1 offsets as needed -
seems subtle/easy to miss
* carbon-language#4517 Always have a
zeroth element in the object representation, make it zero-size in the
case of no-vptr - @zygoloid was concerned this would add overhead
especially to stateless objects used in type-trait-like things.

But currently moving forward with this direction - of initializing field
indexes with an invalid value until the end of the class definition,
then assigning field indexes during construction of the class's object
representation struct type. This direction might reinforce/help avoid
premature access to the object representation before the class is
complete, and give a single place where class layout is done (at class
completion) if we want to add more options there, such as class layout
optimizations, etc.

This patch still has problems with object initialization (that carbon-language#4515
does not have/does address) but does address normal `obj.member` access
correctly.

---------

Co-authored-by: Richard Smith <[email protected]>
…age#4568)

This changes to an `Error` return to let the driver do the "error: "
prefix, except for one case with `help` that needs more work to change
(I'm not planning on picking up that TODO). It also changes
capitalization, backtick use, and a few minor punctuation things to try
to better match the diagnostic style.

This also adds `Error` matchers so that the changes to command line
testing are clearer.
Co-authored-by: Josh L <[email protected]>
Co-authored-by: Richard Smith <[email protected]>
* Make the step stack into a class
* Make the operations on the stack (pushing, popping, test for done)
into methods on the stack class.
* Add more kinds of steps (array bound and name).
* Rewrite cases to use the new kinds of steps. Afterward, none use
`Step::Next()` or the step index, so those get removed.
* Add another convenience method `PushTypeId`.
* Remove the `SemIR::File&` member from the steps, since it doesn't
change.

Hopefully using `step_stack.Push`... calls makes it clear that they are
resolved in the reverse order they are executed.

---------

Co-authored-by: Josh L <[email protected]>
Get a compiler error when a case isn't handled instead of a runtime
CHECK failure.
…on lexical scope over a lexical scope (carbon-language#4591)

This adds missing test coverage.
See discussion in
carbon-language#4574.
…uage#4564)

Represent the type as an `InstId` rather than as a `TypeId` to preserve
how it was written and better support tracking its value in a generic.
Add accessors to `Class` to get the base and adapted type to reduce code
duplication, and add `TypeStore::GetObjectRepr` to make it easier to map
from a type to its possibly-adapted object representation type. In
passing, also move `GetIntTypeInfo` and `GetUnqualifiedType` into
`TypeStore`.

This fixes specifics of generic adapters to properly look at the
specific adapted type, and also fixes importing of adapters.

---------

Co-authored-by: Jon Ross-Perkins <[email protected]>
…arbon-language#4599)

This seems slightly redundant for a locally-defined class, where there
will be a complete_type_witness instruction earlier in the class, but is
important for imported classes, where we're currently doing the wrong
thing in a way that's invisible in formatted SemIR.
…on-language#4600)

We need to create an instruction on import to attach the generic
constant value to.
auto-merge was automatically disabled November 28, 2024 08:52

Head branch was pushed to by a user without write access

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants