Skip to content

Commit

Permalink
Update instructions for adding a SemIR instruction. (#4348)
Browse files Browse the repository at this point in the history
Co-authored-by: Josh L <[email protected]>
Co-authored-by: Jon Ross-Perkins <[email protected]>
  • Loading branch information
3 people authored Oct 10, 2024
1 parent 1a1bfd2 commit 9e5e330
Show file tree
Hide file tree
Showing 2 changed files with 83 additions and 16 deletions.
2 changes: 1 addition & 1 deletion toolchain/check/context.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1099,7 +1099,7 @@ class TypeCompleter {
auto BuildValueRepr(SemIR::TypeId type_id, SemIR::Inst inst) const
-> SemIR::ValueRepr {
// Use overload resolution to select the implementation, producing compile
// errors when BuildTypeForInst isn't defined for a given instruction.
// errors when BuildValueReprForInst isn't defined for a given instruction.
CARBON_KIND_SWITCH(inst) {
#define CARBON_SEM_IR_INST_KIND(Name) \
case CARBON_KIND(SemIR::Name typed_inst): { \
Expand Down
97 changes: 82 additions & 15 deletions toolchain/docs/adding_features.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Parse](#parse)
- [Typed parse node metadata implementation](#typed-parse-node-metadata-implementation)
- [Check](#check)
- [Adding a new SemIR instruction](#adding-a-new-semir-instruction)
- [SemIR typed instruction metadata implementation](#semir-typed-instruction-metadata-implementation)
- [Lower](#lower)
- [Tests and debugging](#tests-and-debugging)
Expand Down Expand Up @@ -232,42 +233,108 @@ Note: this is broadly similar to
Each parse node kind requires adding a `Handle<kind>` function in a
`check/handle_*.cpp` file.

### Adding a new SemIR instruction

If the resulting SemIR needs a new instruction:

- add a new kind to [sem_ir/inst_kind.def](/toolchain/sem_ir/inst_kind.def)
- Add a new kind to [sem_ir/inst_kind.def](/toolchain/sem_ir/inst_kind.def).

- Add a `CARBON_SEM_IR_INST_KIND(NewInstKindName)` line in alphabetical
order
- a new struct definition to

- Add a new struct definition to
[sem_ir/typed_insts.h](/toolchain/sem_ir/typed_insts.h), such as:

```cpp
struct NewInstKindName {
static constexpr auto Kind = InstKind::NewInstKindName.Define(
// the name used in textual IR
"new_inst_kind_name"
// Optional: , TerminatorKind::KindOfTerminator
static constexpr auto Kind =
// `Parse::SomeId` should be one of:
// - A node ID from `parse/node_ids.h`,
// specifying the kind of parse nodes for this instruction.
// This could be a node kind from `parse/node_kind.def`
// suffixed by `Id`, or one of the `Any`...`Id` alias
// declarations that match multiple kinds of parse nodes.
// - `Parse::NodeId` if it can be any kind of parse node.
// - `Parse::InvalidNodeId` if no associated parse node.
InstKind::NewInstKindName.Define<Parse::SomeId>(
// The name used in textual IR:
{.ir_name = "new_inst_kind_name"}
// Other parameters have defaults.
);

// Optional: omit if not associated with a parse node.
Parse::Node parse_node;

// Optional: omit if this sem_ir instruction does not produce a value.
// Optional: Include if this instruction produces a value used in
// an expression.
TypeId type_id;

// 0-2 id fields, with types from sem_ir/ids.h or sem_ir/builtin_kind.h
// For example, fields would look like:
// 0-2 id fields, with types from sem_ir/ids.h or
// sem_ir/builtin_kind.h. For example, fields would look like:
StringId name_id;
InstId value_id;
};
```

Adding an instruction will also require a handler in the Lower step.
- [`sem_ir/inst_kind.h`](/toolchain/sem_ir/inst_kind.h) documents the
different options when defining a new instruction, as well as their
defaults, see `InstKind::DefinitionInfo`.
- If an instruction always produces a type:

- Set `.is_type = InstIsType::Always` in its `Kind` definition.
- When constructing instructions of this kind, pass
`SemIR::TypeId::TypeType` in as the value of the `type_id` field, as
in:

```
SemIR::InstId inst_id = context.AddInst<SemIR::NewInstKindName>(
node_id, {.type_id = SemIR::TypeId::TypeType, ...});
```

- Although most instructions have distinct types represented by
instructions like `ClassType`, we also have builtin types for cases
where types don't need to be distinct per-entity. This is rare, but
used, for example, when an expression implicitly uses a value as part of
SemIR evaluation or as part of desugaring. We have builtin types for
bound methods, namespaces, witnesses, among others. These are defined in
[`sem_ir/builtin_inst_kind.def`](/toolchain/sem_ir/builtin_inst_kind.def).
To get a type id for one of these builtin types, use something like
`context.GetBuiltinType(SemIR::BuiltinInstKind::WitnessType)`, as in:

```
SemIR::TypeId witness_type_id =
context.GetBuiltinType(SemIR::BuiltinInstKind::WitnessType);
SemIR::InstId inst_id = context.AddInst<SemIR::NewInstKindName>(
node_id, {.type_id = witness_type_id, ...});
```

- Instructions without types may still be used as arguments to
instructions.

Once those are added, a rebuild will give errors showing what needs to be
updated. The updates needed, can depend on whether the instruction produces a
type. Look to the comments on those functions for instructions on what is
needed.

Instructions won't be given a name unless
[`InstNamer::CollectNamesInBlock](/toolchain/sem_ir/inst_namer.cpp) is called on
the `InstBlockId` they are a member of. As of this writing,
`InstNamer::CollectNamesInBlock` should only be called once per `InstBlockId`.
To accomplish this, there should be one instruction kind that "owns" the
instruction block, and will have a case in `InstNamer::CollectNamesInBlock` that
visits the `InstBlockId`. That instruction kind will typically use
`FormatTrailingBlock` in the `sem_ir/formatter.cpp` to list the instructions in
curly braces (`{`...`}`). Other instructions that reference that `InstBlockId`
will use the default rendering that has just the instruction names in parens
(`(`...`)`).

Adding an instruction will generally also require a handler in the Lower step.

Most new instructions will automatically be formatted reasonably by the SemIR
formatter.
formatter. If not, then add a `FormatInst` overload to
[`sem_ir/formatter.cpp`](/toolchain/sem_ir/formatter.cpp). If only the arguments
need custom formatting, then a `FormatInstRHS` overload can be implemented
instead.

If the resulting SemIR needs a new built-in, add it to
[builtin_inst_kind.def](/toolchain/sem_ir/builtin_inst_kind.def).
[`sem_ir/builtin_inst_kind.def`](/toolchain/sem_ir/builtin_inst_kind.def).

### SemIR typed instruction metadata implementation

Expand Down

0 comments on commit 9e5e330

Please sign in to comment.