Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tools,topgen] Enhance and simplify topgen #25933

Merged
merged 6 commits into from
Feb 7, 2025

Conversation

matutem
Copy link
Contributor

@matutem matutem commented Jan 17, 2025

This is a big change in topgen. Some salient changes:

  • Regenerates from scratch the ipgen in-memory configurations (as IpBlock objects),
    so it is truly incremental: the previous flow reused the pre-existing generated ipgen
    hjson if found.
  • Delays the actual ipgen file generation until the complete top config is generated.
    This is a step towards breaking topgen into smaller and mutually independent steps,
    thus enabling parallelizing them, and creating smaller and simpler tools to tackle each.
  • It converges with a single pass of _process_top.

In order to accomplish this some important changes are made:

  • Call some of the merge_top functions suitable for each specific ipgen prior to
    generating the in-memory hjson config. This means not all IpBlock objects are
    available, so these merge functions are instrumented to skip blocks missing.
  • The generation of ipgens are ordered according to dependencies. This order created
    from the expected block functionality, so is not derived from a constructed graph, but
    converging in one pass means it is correct.
  • Calling merge functions early means some of the objects in the in-memory top config
    are not plain dictionaries but are classes, so the code needs to handle either.

Part of #25920

@matutem matutem requested review from pamaury, a-will and Razer6 January 17, 2025 21:25
@matutem matutem force-pushed the topgen_split branch 4 times, most recently from 0161920 to e7d519a Compare January 29, 2025 20:22
@matutem matutem marked this pull request as ready for review January 29, 2025 20:22
@matutem matutem requested a review from msfschaffner as a code owner January 29, 2025 20:22
@matutem matutem marked this pull request as draft January 29, 2025 20:28
@matutem matutem changed the title [tools,topgen] DRAFT OF Simplify topgen [tools,topgen] Enhance and simplify topgen Jan 29, 2025
@matutem matutem marked this pull request as ready for review January 29, 2025 23:11
@matutem
Copy link
Contributor Author

matutem commented Jan 29, 2025

CHANGE AUTHORIZED: hw/top_earlgrey/data/top_earlgrey.hjson

@matutem matutem force-pushed the topgen_split branch 2 times, most recently from 3936969 to 953d787 Compare January 31, 2025 10:14
Copy link
Contributor

@a-will a-will left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are probably mostly useless comments and nits. I still have a lot more to look at. 😄

Comment on lines +628 to +629
if not isinstance(top['clocks'], Clocks):
top['clocks'] = Clocks(top['clocks'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a wee bit funky (a non-sequitur) that we modify the top argument here, in a function named extract_clocks(). Is the promotion to a class unable to happen until a specific point in the flow, and this happened to be a convenient place?

It's similar for amend_resets() below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior was not changed in this PR, rather this quirky behavior was exposed because we run merge operations more often. I agree it would be nice to keep the consisting of json objects only.

Comment on lines 302 to 319
reset_requests_optional = {
'int': ['s', 'internal request list'],
'debug': ['s', 'debug request list'],
'peripheral': ['s', 'peripheral request list'],
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an arbitrary breakdown. What's the difference between these three categories? Is there any different handling between "int" resets and "debug" resets?

This is mostly a documentation nudge. 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point: I extended the description.

@@ -1431,7 +1431,8 @@ end
tlul_socket_m1 #(
.HReqDepth (16'h0),
.HRspDepth (16'h0),
.DRspPass (1'b0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting this is a timing change (albeit what the hjson file asked for, hehe).

{
"name": "Esc",
"desc": "escalation reset request",
"module": "alert_handler"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, but englishbreakfast doesn't have an alert_handler. Does this field not actually do anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBD: notice the earlgrey and englishbreakfast use the same toplevel.sv.tpl, so there are plenty of questions about this.

'default': ['s', 'the default value of the parameter'],
}
param_optional = {
'expose': ['s', 'seems redundant TODO'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is whether the parameter bubbles up to the top of the generated toplevel.sv file. We probably need to redo some of the template stuff... but that's for another day!

'expose': ['s', 'seems redundant TODO'],
'local': ['s', 'whether it is a localparam, interpreted as boolean'],
'name_top': ['s', 'the name in the top-level'],
'randcount': ['d', 'TODO'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is apparently the number of bits requested for a random number inserted into toplevel_rnd_cnst_pkg.sv. If I'm not misreading the code, it looks like randwidth is added ...and is just equal to randcount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are not always the same, which is what triggered the TODO. I think we need a separate PR which can generate documentation about the format of the hjson files partly based on the key's documentation.

util/topgen/validate.py Outdated Show resolved Hide resolved
@matutem matutem marked this pull request as draft February 1, 2025 16:03
@matutem matutem force-pushed the topgen_split branch 2 times, most recently from b7c8e3d to d71541e Compare February 3, 2025 18:29
@matutem matutem marked this pull request as ready for review February 3, 2025 18:43
Copy link
Contributor

@a-will a-will left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit my eyes started to glaze over after awhile of reading the refactoring. Also, we probably could use a canonical config output order, so we don't get these huge diffs that simply move the same information around.

However, from what I have been able to read, LGTM

Comment on lines +86 to +85
ip_template = IpTemplate.from_template_path(IP_TEMPLATES_PATH /
template_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if at some point, we'll need to allow for multiple roots for IP templates, including out-of-tree roots. I'm not sure what that would look like, though, since topgen has embedded how to translate configuration in the toplevel.hjson file to the actual ipgen attributes.

A problem for another day, if it ever comes up.

Comment on lines 192 to 191
gencmd = (f"// util/topgen.py -t hw/{top_name}/data/{top_name}.hjson "
f"-o hw/{top_name}/\n\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to change, since it's not this PR's fault, but the paths here are kind of wrong. They assume in-tree paths for user files, and they should not. We are provided the path to the top-level hjson file, so we ought to use it. Not that it matters here, since this is just a comment.

topname = topcfg["name"]
module_name = module["type"]
module_instance_name = params.get("module_instance_name")
assert not module_instance_name or module_instance_name == module_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly lost--What is this assertion guarding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is checking we will always set module_instance_name to module["type"]. This is the case now, and I expect it will continue to be the case, since we intend to use the top config to templatize the individual core files due to the renaming under ip_autogen, since we will create ip_autogen/.hjson.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, understood. This is a sanity check that module["type"] got passed to the template's module_instance_name .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and it is important to keep that invariant.

Comment on lines +485 to +484
return {
"src_clks":
OrderedDict({name: vars(obj)
for name, obj in clocks.srcs.items()}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that the correct indentation for style / readability? When keys and values are on separate lines, the values don't get indented?

(just a question -- it feels a bit hard to read for me)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what the linters suggest and I think recommend it in the style guides, or else we will get into formatting flip-flops which are a waste of time.

Comment on lines -642 to -648
if reggen_only and alt_hjson_path:
hjson_path = Path(alt_hjson_path)
else:
hjson_path = ip_out_path / "data" / f"{ip}.hjson"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we should be killing this feature? I don't know who uses it, but there doesn't seem to be a great reason to remove it.

That said... does it even work anymore? It looks like this was broken by #25129, undoing the support that was completed for #8207.

In a world where the tops in the open-source repo are reference tops and Nuvoton's top hjson is out-of-tree, I guess we don't really need that argument anymore, do we? If an integrator wanted to work with partners using a sort of "redacted" version of their internal top, I guess they could provide a separate toplevel.hjson that has the proprietary IPs replaced with shareable models.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Since we now support multiple tops, I. think that may not be that beneficial and just adds complexity. We don't use that feature so happy to get rid of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag is not even sensible: imagine we had multiple reggens, then they would all use the same hjson file? I vaguely recall talking to someone else about this and agreeing we don't need it anyway.

Copy link
Member

@Razer6 Razer6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this. So far, this looks really good and simplifies a lot. I will also apply these changes downstream and check if they work with other tops as well.

topname = topcfg["name"]
module_name = module["type"]
module_instance_name = params.get("module_instance_name")
assert not module_instance_name or module_instance_name == module_name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the second part of the assertion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refer to this comment #25933 (comment)

"typed_clocks":
OrderedDict({ty: d
for ty, d in typed_clks.items() if d}),
"hint_names":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this wrapping of key and value really more readable? Personally, for me it's not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I let lintpy.py decide on formatting, this way we don't end up with changes that will later be undone to clean up lint. I encourage you to do the same, or we will have flip-flops on plain formatting.

# were expanded already.
# If racl_mappings is expanded the path is one of the fields
for if_name, mapping_path in racl_mappings.items():
if isinstance(mapping_path, dict):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be more of an assert? That's done once and should be there anymore, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amend_racl can be called multiple times, so this is simple and idempotent.

"lpg_map": lpg_map,
"top_pkg_vlnv": f"lowrisc:constants:top_{topname}_top_pkg",
}

ipgen_render("alert_handler", topname, params, out_path)

def generate_alert_handler(top: Dict[str, object], module: Dict[str, object],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get rid of the generate_xyc functions? They all do exactly the same (except RACL but the same check is done at the outside). These functions do not serve any business logic. I guess that can be wrapped in a single location. Maybe pass the generate_modules function just the params function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some subtleties: the get_params functions have slightly different signature so passing the get_params would be quirky; passing the params themselves won't work when and if we support different multiply-parameterized instances of ipgens. I'd rather punt on changing this in this PR.

Comment on lines -642 to -648
if reggen_only and alt_hjson_path:
hjson_path = Path(alt_hjson_path)
else:
hjson_path = ip_out_path / "data" / f"{ip}.hjson"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Since we now support multiple tops, I. think that may not be that beneficial and just adds complexity. We don't use that feature so happy to get rid of it.

util/topgen.py Outdated
insert_ip_attrs(ipgen_instances["ac_range_check"][0]["type"],
_get_ac_range_check_params(topcfg))
# Pinmux depends on flash_ctrl and otp_ctrl
amend_pinmux_io(topcfg, name_to_block)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this call to amend_pinmux_io should be within the if below. Not all tops may have a pinmux so don't add a requirement here on the necessary keys in the HJSON (needed by the amend function).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I guess for tops without a pinmux we may need an alternative flow to potentially connect to pads... though I imagine you may never connect pads directly to these models.

util/topgen.py Outdated
topcfg['incoming_interrupt'] = OrderedDict()

for m in topcfg['module']:
if m['type'] == 'alert_handler':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if m.get("template_type") == "alert_handler"):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

util/topgen.py Outdated
Path(args.topcfg).parent / alert_mappings_path)
for alert_group, alerts in mapping.items():
topcfg['incoming_alert'][alert_group] = alerts
elif m['type'] == 'rv_plic':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if m.get("template_type") == "rv_plic"):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right! At this point we support no more than 1 of each. By the way, we may want to flag an error if there are none, probably in another PR.

@@ -1211,8 +1469,6 @@ def main():
log.error('Seed "rnd_cnst_seed" not found in configuration HJSON.')
exit(1)

secure_prng.reseed(topcfg["rnd_cnst_seed"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason of this removal? Probably the follow up question is what was this used for previously?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to where it actually matters, which is prior to each loop of _process_top: without running each pass with the same seed we could not compare the output file for stability.

util/topgen.py Outdated

topname = topcfg["name"]
cfg_copy = deepcopy(topcfg)
for pass_idx in range(1, maximum_passes + 1):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the loop run like for pass_idx in range(maximum_passes )? May Matlab times are far behind where you start counting with 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to make f-strings be easier, but that is too much of a detail. Anyway, python range has three ways to be called, and with two arguments (a, b) it yields a ... b-1, and with three a, b, s it does as with two except s becomes the increment.

I undid it anyway and also noticed process_top doesn't need pass_idx.

@matutem
Copy link
Contributor Author

matutem commented Feb 6, 2025

@a-will regarding #25933 (review), I expect the new hjson files will be stable. The changes for this change in topgen are the cause of the present difference, but I expect going forward there should be no gratuitous diffs. If there are we can use OrderedDicts for problematic dicts.

Do not require the existence of "int", "debug", or even "peripheral"
resets.

Signed-off-by: Guillermo Maturana <[email protected]>
Do not require the existence of "int", "debug", or even "peripheral"
resets.

Signed-off-by: Guillermo Maturana <[email protected]>
The reset_requests top config entry should really be initialized from
the original hjson file rather than explicitly in merge.py, since it
ends up making assumptions about what reset_requests should have.

Part of lowRISC#25920

Signed-off-by: Guillermo Maturana <[email protected]>
This makes the complete top config hjson have the same names
for attributes. Previously the original name for this was "clk"
but the complete config renames it to "clock" because of the
Clocks class field names. The Clocks class is used quite a bit
more in topgen, so this rename creates less canges, and it is
arguably a better name.

Part of lowRISC#25920

Signed-off-by: Guillermo Maturana <[email protected]>
This is a big change in topgen. Some salient changes:
- It always regenerates from scratch the ipgen in-memory configurations
  (as IpBlock objects) so it is truly incremental: the previous flow
  reused the pre-existing generated ipgen hjson if found.
- It delays the actual ipgen file generation until the complete top config
  is generated. This is a step towards breaking topgen into smaller and
  mutually independent steps, thus enabling parallelizing them.
- On each pass it uses the most up-to-date top config, and detects
  convergence if the top configs from one pass and the previous one match.
- It converges with a single pass of process_top (though a second pass is
  done to enable the comparison).
- The generated RTL with and without these changes is identical.
In order to accomplish this some important changes are made:
- Call some of the merge_top functions suitable for each specific ipgen
  prior to generating the in-memory hjson config. This means not all
  IpBlock objects are available, so these merge functions are instrumented
  to skip missing blocks.
- The generation of ipgens are ordered according to dependencies. This
  order is created from the expected block functionality, so is not derived
  from a constructed graph, but converging in one pass means it is correct.
- Calling merge functions early means some of the objects in the in-memory
  top config are not plain dictionaries but are classes, so the code needs
  to handle either.

Part of lowRISC#25920

Signed-off-by: Guillermo Maturana <[email protected]>
Add support for added top config attributes.
Dive deeper into a few attributes.
Left a few TODOs for some attributes, most of these are not
yet populated.

Part of lowRISC#25920

Signed-off-by: Guillermo Maturana <[email protected]>
@Razer6
Copy link
Member

Razer6 commented Feb 6, 2025

CHANGE_AUTHORIZED: hw/top_earlgrey/data/top_earlgrey.hjson

@a-will
Copy link
Contributor

a-will commented Feb 6, 2025

CHANGE AUTHORIZED: hw/top_earlgrey/data/top_earlgrey.hjson

A property name changed, and some other properties were added. However, no functional changes to earlgrey were made (except potentially for the values of random numbers, but those aren't protected).

@matutem matutem merged commit 530603b into lowRISC:master Feb 7, 2025
39 checks passed
@matutem matutem deleted the topgen_split branch February 7, 2025 05:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants