Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple file read #476

Closed
podhrmic opened this issue May 3, 2022 · 3 comments
Closed

Simple file read #476

podhrmic opened this issue May 3, 2022 · 3 comments

Comments

@podhrmic
Copy link

podhrmic commented May 3, 2022

RegFileLoad.v does not seem to map to Embedded Block Ram blocks on Lattice ECP5 FPGA (see YosysHQ/prjtrellis#189 ). Is there a simpler memory construct in Bluespec that could be used instead of RegFileLoad?

@quark17
Copy link
Collaborator

quark17 commented May 4, 2022

There are a few ways to go here.

(1) Explicit instantiation

My experience is working with Xilinx FPGAs and tools. One option there is to directly instantiate a macro for the IP that I want -- write an import-BVI for the macro you want to instantiate. If your code only needs to work for a specific FPGA and if Yosys supports referencing a specific macro, then you could use this.

(2) Adjusting or annotating the Verilog to help the tool infer

Obviously, it would be more portable to use generic Verilog and have the synthesis tool infer the appropriate IP. In my experience, synthesis tools are picky about how the Verilog must be written for a BRAM to be inferred. To support this, BSC even includes alternate versions of some library primitives, to use with different synthesis tools. For example, in your BSC instantiation you'll notice that there is not only a Verilog directory, but also a Verilog.Vivado directory. The Verilog.Vivado directory contains alternate versions of some modules, to work better with Xilinx's Vivado synthesis tool.

If you notice, Verilog.Vivado contains a variation of RegFile.v which adds the following attribute to the register array (defined inside RegFile), to help Vivado infer correctly:

(* RAM_STYLE = "DISTRIBUTED" *)
reg [data_width - 1 : 0]    arr[lo:hi];

Perhaps we need to create a Verilog.Yosys directory, containing variations of RegFile.v and RegFileLoad.v? I don't know enough about Yosys, though. Is there documentation showing what kind of Verilog Yosys looks for when inferring BRAMs?

(3) Other modules

There is a BRAM library, which defines modules for a number of different BRAM configurations (one and two port, byte-enables, etc). I wouldn't say that these are "simpler" though -- just the opposite! I've not used the BRAM library myself, but if you're doing heavy-duty designs for FPGAs, then you may want to look into it. It is documented in the Library Reference Guide (/doc/libraries_ref_guide/).

But here, again, the Verilog.Vivado directory contains special versions of the many BRAM*.v modules, tweaked to infer correctly for Vivado. Something similar may be needed for Yosys.

Anyway...

I would have suspected that option 2 here is where you want to look. But there's this comment on the Yosys issue:

Those are very big, multi port memories with async reads so they won't in general map to EBR. You should do something with a synchronous (clocked) read instead.

That sounds like suggesting that you need option 3: modules that are written to use the underlying array in a specific way, to get inferred. So maybe try something from the BRAM library. If the default implementation in Verilog works, then great! But if not, then maybe a variation (to go in a new Verilog.Yosys directory) needs to be written. (In that case, you might want to reply on the Yosys Issue by pointing them to the BRAM*.v file you end up using, instead of the RegFileLoad.v file.)

I guess I should add that there's maybe a fourth (future) option:

(4) Use a target-agnostic backend to BSC, like FIRRTL

This doesn't exist, but was suggested in #468. My understanding is that FIRRTL is an intermediate representation to allow the same design to synthesize to different targets (FPGA, ASIC, Verilator). So that might make it easier for the synthesis tool to infer appropriate HW for the target.

@podhrmic
Copy link
Author

podhrmic commented May 5, 2022

@quark17 thank you for the suggestions - what was needed was to add an output register as suggested in Figure 7: RAM with Registered Output in Verilog in Lattice Synthesis Engine for Diamond User Guide.

Rather than making my own implementation of RegFileLoad I switched to BRAM library (specifically BRAM1Load - it already has an output register) and the synthesis tool understood it immediately.

However, now the reads are delayed be one clock cycle - is there an example how to deal with delayed memory loads/stores in a canonical way?

@podhrmic
Copy link
Author

podhrmic commented May 5, 2022

However, now the reads are delayed be one clock cycle - is there an example how to deal with delayed memory loads/stores in a canonical way?

Perhaps not surprisingly, a simple state machine was all that was needed.

@podhrmic podhrmic closed this as completed May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants