Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promote AXI-Read implementation to a complete AXI wrapper #1842

Merged
merged 82 commits into from
Jan 12, 2024
Merged

Conversation

nathanielnrn
Copy link
Contributor

This (large) PR removes the previous (#1820) axi-reads-calyx.futil read channel implementation in favor of a complete wrapper around the example vec-add program (with some slight modification from std_mems to seq_mems).

This calyx program successfully reads from cocotb-axi-rams, performs computations on this data, and writes back to the cocotb-axi-rams.

Before running this on actual Xilinx FPGAs, we will need to:

  1. Work on a subordinate control component based on all of the weird Xilinx specs and
  2. Figure out the correct interfacing with XML definitions and maybe other things I'm forgetting.

Sorry for the big PR. Happy to talk through stuff synchronously. The code is heavily commented as well, FWIW.

A few things that might stick out as possible issues that I have deprioritized because the way I see it, the goal is to eventually delete this hardcoded implementation from the repo in favor of an eventual .expect that Runt compares against.

  • Some code duplication in the cocotb testbench that is shared with the old xilinx cocotb testbench.
  • Probably some clean up to be done in the calyx itself. Hoping to polish things up while I transition to the generator.
  • Lack of a convenient way to simulate/execute things. I currently have a bash script that does this locally. Can add it but not sure how many people will end up using it before the generator is ready, at which point we could hopefully use fud[2] for more complete integration.

nathanielnrn and others added 30 commits December 22, 2023 11:56
TBD if this actually implements AXI correctly.

There are currently some hacks in place (marked with todos)
to get this to compile, namely some splicing
that doesn't consider what we actually want to splice
(it just takes [31:0]) as opposed to dynamically considering actual
bits we want.

A few other things that should be cleaned up eventually

Need to create a cocotb testbench to test correctness
Maybe this shouldn't be here, but for now (having deleted my
working directory earlier) putting it here
Simply run make from the cocotb directory and axi-read-tests
will be executed
We tie ARID low in our manager
Prefixes should not contain trailing "_"
Got rid of "assert_val" and "block_transfer" groups
and instead perform these things inside "do_ar_transfer", this is
required because we cant assert valid before we drive the data
correctly, so needs to happen in parallel.

Currently: This seems to write 16 times to same place, this is due to
hardcoding of 16 in ar transfer, not sure why address doesn't
increment this is tbd (and next TODO)
This is part of read channel control sequence
Also reduces data bus width to 32
@nathanielnrn nathanielnrn added the C: FPGA Changes for the FPGA backend label Jan 10, 2024
@sampsyo
Copy link
Contributor

sampsyo commented Jan 11, 2024

Looks awesome!! Actual code review coming next, but it is super exciting that this "wrapper" version works end-to-end against the standard cocotb component.

Lack of a convenient way to simulate/execute things. I currently have a bash script that does this locally. Can add it but not sure how many people will end up using it before the generator is ready, at which point we could hopefully use fud[2] for more complete integration.

I guess I would lean toward just including it, even if it's not all that useful? This could just offer a tiny bit of context to build on as we transition from the shell script to something more sophisticated.

This also honestly just makes me wonder a bit about what a fud2 op would look like that encapsulates "run program X using cocotb testbench Y." Could be convenient for many uses! I guess this is yet another thing to put on the ol' to-do list.

Copy link
Contributor

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing. This looks like a tremendous amount of work! Very very cool!!!

I think this is good to go; my comments mostly amount to things to think about for the future. Reading over this, and especially the main component, increases my confidence that the right next-ish step is generator-ization: that is, that we could reduce a lot of manual effort by using a generator here, which will in turn make it faster to add features and stuff.

👏

yxi/axi-calyx/axi-combined-calyx.futil Outdated Show resolved Hide resolved
yxi/axi-calyx/axi-combined-calyx.futil Outdated Show resolved Hide resolved
Comment on lines +231 to +235
group init_n_RLAST {
n_RLAST.in = 1'b1;
n_RLAST.write_en = 1'b1;
init_n_RLAST[done] = n_RLAST.done;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that simple register writes like this could be expressed with invokes:

invoke n_RLAST(in=1'b1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something about how invoke works. My understanding was that an invoke runs the control sequence of an instance. Because std_reg is defined as verilog what would invoking on a std_reg do?
Does @go write-en: 1 mean that the go signal is tied to write_en? If so it makes sense that invoking a reg would drive write_en and we'd only need to supply a data input.

I could also imagine a world where we would also need to drive write_en explicitly:

invoke n_RLASt(in=1'b1, write_en=1'b1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way, I will address this here & in the generator in a future "simplifications" PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, to slightly generalize that understanding of invoke: it delegates control to the invoked cell and lets it run to completion. That actually works for both types of components (primitives and non-primitives), but it means different things mechanically: for primitives, it means "follow the go/done protocol to do an invocation of the Verilog module"; for non-primitives, it means "run the Calyx component's control program".

Does @go write-en: 1 mean that the go signal is tied to write_en?

Yes, exactly! The @go and @done annotations can be attached to any signal; they then play that role in the go/done interface.

So you don't need to supply write_en in the invoke because that's implicit in the invocation "calling convention."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's very useful! I'll test these changes out and probably push them to #1846

yxi/axi-calyx/axi-combined-calyx.futil Show resolved Hide resolved

component vec_add() -> () {
cells {
//Modified to 64 width address because XRT expects 64 bit memory addresses
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense; I suppose some task for the future (after the generator-ization) will be adding the "narrowing" slices required to adapt address widths.

Comment on lines +783 to +784
//TODO(nathanielnrn): this is axi_wrapper, prefer to use @toplevel attribute but its not working
// See individual channel components for explanations of signals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's too bad; I suppose we should look into why @toplevel is not working!


//TODO(nathanielnrn): this is axi_wrapper, prefer to use @toplevel attribute but its not working
// See individual channel components for explanations of signals
component main(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that, hopefully, this can be a big win for generator-ization: one needn't write out each signal in the top-level wrapper by hand; instead, the generator can literally do something like:

IN_SIGNALS = ['ARESET', 'ARREADY', 'RVALID', ...]
for mem in memories:
    for signal in IN_SIGNALS:
        yield f'{mem}_{signal}'

@nathanielnrn
Copy link
Contributor Author

Added the bash & python script I've been using to test things with a single command. Namely bash sim.sh should do the trick. Merging!

@nathanielnrn nathanielnrn merged commit 4f46a40 into main Jan 12, 2024
7 checks passed
@nathanielnrn nathanielnrn deleted the axi-writes branch January 12, 2024 02:19
rachitnigam added a commit that referenced this pull request Feb 16, 2024
* init commit of hardcoded axi wrapper for a 'main' kernel

* add axi-reads-calix

* hook up inputs to channels in the wrapper. tbd if this works

* Working calyx verison of AR and R

TBD if this actually implements AXI correctly.

There are currently some hacks in place (marked with todos)
to get this to compile, namely some splicing
that doesn't consider what we actually want to splice
(it just takes [31:0]) as opposed to dynamically considering actual
bits we want.

A few other things that should be cleaned up eventually

Need to create a cocotb testbench to test correctness

* Track output of compiled calyx read channel

Maybe this shouldn't be here, but for now (having deleted my
working directory earlier) putting it here

* update gitignore to get rid of sim_build and other cocotb artifacts

* Working make files  for running cocotb tests

Simply run make from the cocotb directory and axi-read-tests
will be executed

* Add xID signals for cocotb compatability

We tie ARID low in our manager

* Fix prefix issue on cocotb axi test bench

Prefixes should not contain trailing "_"

* commit to repro 'make WAVES=1' cocotb error from axi-reads-calyx.futil

* axi-reads patch

* sync debug

* Add txn_len initialization to 16 in calyx program

* AXI Read fixed to get to read channel start

Got rid of "assert_val" and "block_transfer" groups
and instead perform these things inside "do_ar_transfer", this is
required because we cant assert valid before we drive the data
correctly, so needs to happen in parallel.

Currently: This seems to write 16 times to same place, this is due to
hardcoding of 16 in ar transfer, not sure why address doesn't
increment this is tbd (and next TODO)

* Add integer byte conversion for tests on Calyx AXI testharness

* WIP get reads to work. Add incr_curr_addr group

This is part of read channel control sequence

* remove .fst from tracking

* Add more data to testbench to make waveform viewing easier

* Reads seem to be terminating correctly at RLAST

* AR transfers seem to work, valid is high for 1 cycle

* Unreduced axi-reads-calyx.futil

Also reduces data bus width to 32

* Cocotb testbench now passes

* Formatted and passing axi-read-tests

* Reduce and comment axi-reads-calyx.futil

* remove axi-reads.v from being tracked

* add a todo

* add required ARPROT signal. This is hardcoded to be priviliged

* rename directories to yxi/axi-calyx

* initial commit of axi-writes-calyx, a copy of axi-reads-calyx

* WIP axi writes

* rename directories

* WIP imlpementing writes

* add testing for writes, note makefile is overwritten so now tests writes, not reads

* passing axi writes and testing

* init commit of hardcoded axi wrapper for a 'main' kernel

* add axi-reads-calix

* hook up inputs to channels in the wrapper. tbd if this works

* Working calyx verison of AR and R

TBD if this actually implements AXI correctly.

There are currently some hacks in place (marked with todos)
to get this to compile, namely some splicing
that doesn't consider what we actually want to splice
(it just takes [31:0]) as opposed to dynamically considering actual
bits we want.

A few other things that should be cleaned up eventually

Need to create a cocotb testbench to test correctness

* Track output of compiled calyx read channel

Maybe this shouldn't be here, but for now (having deleted my
working directory earlier) putting it here

* Working make files  for running cocotb tests

Simply run make from the cocotb directory and axi-read-tests
will be executed

* Add xID signals for cocotb compatability

We tie ARID low in our manager

* Fix prefix issue on cocotb axi test bench

Prefixes should not contain trailing "_"

* commit to repro 'make WAVES=1' cocotb error from axi-reads-calyx.futil

* axi-reads patch

* sync debug

* Add txn_len initialization to 16 in calyx program

* AXI Read fixed to get to read channel start

Got rid of "assert_val" and "block_transfer" groups
and instead perform these things inside "do_ar_transfer", this is
required because we cant assert valid before we drive the data
correctly, so needs to happen in parallel.

Currently: This seems to write 16 times to same place, this is due to
hardcoding of 16 in ar transfer, not sure why address doesn't
increment this is tbd (and next TODO)

* Add integer byte conversion for tests on Calyx AXI testharness

* WIP get reads to work. Add incr_curr_addr group

This is part of read channel control sequence

* remove .fst from tracking

* Add more data to testbench to make waveform viewing easier

* Reads seem to be terminating correctly at RLAST

* AR transfers seem to work, valid is high for 1 cycle

* Unreduced axi-reads-calyx.futil

Also reduces data bus width to 32

* Cocotb testbench now passes

* Formatted and passing axi-read-tests

* Reduce and comment axi-reads-calyx.futil

* remove axi-reads.v from being tracked

* add a todo

* add required ARPROT signal. This is hardcoded to be priviliged

* rename directories to yxi/axi-calyx

* initial commit of axi-writes-calyx, a copy of axi-reads-calyx

* WIP axi writes

* rename directories

* WIP imlpementing writes

* add testing for writes, note makefile is overwritten so now tests writes, not reads

* passing axi writes and testing

* Work on full AXI wrapper, reads and compute works

* cleaned up combined futil and tests

* delete axi-reads* which is subsumed by axi-combined

* add axi-combined-tests.py

* remove axi-writes as it is subsumed by axi-combined

* formatting

* Update yxi/axi-calyx/axi-combined-calyx.futil

Co-authored-by: Adrian Sampson <[email protected]>

* formatting

* add sim.sh which goes from calyx to running tests

* add python file that enables waveform (vcd/fst) generation

* formatting

---------

Co-authored-by: Rachit Nigam <[email protected]>
Co-authored-by: Adrian Sampson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: FPGA Changes for the FPGA backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants