AXI Memory Over TL Serial Link #812

Merged
merged 34 commits into from
Mar 23, 2021
Changes from 33 commits
Commits
79eccce
Small comments to Clocks.scala
abejgonzalez Feb 27, 2021
a3e22c7
First attempt at getting Offchip AXI port
abejgonzalez Feb 28, 2021
1d287be
Enlarge serial width | Bugfix loadmem disable | Add TracerV
abejgonzalez Mar 3, 2021
f850df7
General renaming / cleanup
abejgonzalez Mar 3, 2021
c52fce7
Fix FireChip compilation | Remove extra DefaultSerialTL in bridges
abejgonzalez Mar 3, 2021
3d96218
Cleanup | Fix BlockDevice clocking issues
abejgonzalez Mar 3, 2021
3d9cd61
Slightly cleaner implementation
abejgonzalez Mar 4, 2021
d2a6dd6
Add support for harness pll
abejgonzalez Mar 5, 2021
60a616e
1st pass at connecting to harness PLL | Put UART adapter on harnessCl…
abejgonzalez Mar 5, 2021
2b7e359
Cleanup config + fragments | Remove reference clk div/rst catch in ha…
abejgonzalez Mar 5, 2021
562d8e5
Distinguish between implicit clock/reset and reference harnessClock/R…
abejgonzalez Mar 6, 2021
6ab8f8f
Update FireSim to support harness clocks | Small config renaming
abejgonzalez Mar 8, 2021
e4ccfe1
Renaming updates | Have FireSim clocks request frequency by default
abejgonzalez Mar 8, 2021
ade8457
First doc pass (no updated imgs) [ci skip]
abejgonzalez Mar 9, 2021
d5d547d
Update doc images [ci skip]
abejgonzalez Mar 9, 2021
ed6d10a
Merge remote-tracking branch 'origin/dev' into offchip-axi-setup
abejgonzalez Mar 9, 2021
6e1b942
Fix docs harness binders reference
abejgonzalez Mar 9, 2021
d204ccd
Clean up the chip communication docs a bit more [ci skip]
abejgonzalez Mar 10, 2021
1ebc0f7
Allow the PLL to request the max freq
abejgonzalez Mar 11, 2021
30c9b63
More clarifications on harness clocks
abejgonzalez Mar 11, 2021
6476c7e
Small renaming/cleanup | Use LinkedHashMaps
abejgonzalez Mar 15, 2021
3439266
Small renaming in docs
abejgonzalez Mar 15, 2021
5301723
Use def instead of var Option for ref frequency
abejgonzalez Mar 17, 2021
7b7bcf7
Merge remote-tracking branch 'origin/dev' into offchip-axi-setup
abejgonzalez Mar 19, 2021
4a56508
Small spacing fixes
abejgonzalez Mar 19, 2021
0d6e971
Update docs/Advanced-Concepts/Chip-Communication.rst
abejgonzalez Mar 19, 2021
1e42113
Splitting up FireSim default frequencies into a separate config frag.
abejgonzalez Mar 20, 2021
b729a5f
Allow run-asm/bmark debug make targets to specify random seed
abejgonzalez Mar 20, 2021
87fa481
Fix TileResetCtrl so that tiles come out of reset after rest of uncore
abejgonzalez Mar 20, 2021
d24bd11
Merge branch 'offchip-axi-setup' of github.com:ucb-bar/chipyard into …
abejgonzalez Mar 20, 2021
f59a790
Bump testchipip
abejgonzalez Mar 20, 2021
5526397
Use async queue to connect serdesser + other components
abejgonzalez Mar 20, 2021
5ffad32
Bump testchipip
abejgonzalez Mar 21, 2021
09ef82c
Update harnessClk/Rst naming to buildtop | Small docs cleanup
abejgonzalez Mar 22, 2021
1 change: 1 addition & 0 deletions .circleci/config.yml
@@ -308,6 +308,7 @@ jobs:
tools-version: "esp-tools"
group-key: "group-accels"
project-key: "chipyard-hwacha"
timeout: "30m"
chipyard-gemmini-run-tests:
executor: main-env
steps:
169 changes: 126 additions & 43 deletions docs/Advanced-Concepts/Chip-Communication.rst
@@ -7,7 +7,7 @@ There are two types of DUTs that can be made: `tethered` or `standalone` DUTs.
A `tethered` DUT is one where a host computer (or just host) must send transactions to the DUT to bring up a program.
This differs from a `standalone` DUT that can bring itself up (has its own bootrom, loads programs itself, etc).
An example of a tethered DUT is a Chipyard simulation where the host loads the test program into the DUT's memory and signals to the DUT that the program is ready to run.
An example of a standalone DUT is a Chipyard simulation where a program can be loaded from an SDCard by default.
An example of a standalone DUT is a Chipyard simulation where a program can be loaded from an SDCard out of reset.
In this section, we mainly describe how to communicate to tethered DUTs.

There are two ways the host (otherwise known as the outside world) can communicate with a tethered Chipyard DUT:
@@ -45,33 +45,21 @@ Using the Tethered Serial Interface (TSI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, Chipyard uses the Tethered Serial Interface (TSI) to communicate with the DUT.
TSI protocol is an implementation of HTIF that is used to send commands to the
RISC-V DUT. These TSI commands are simple R/W commands
that are able to probe the DUT's memory space. During simulation, the host sends TSI commands to a
simulation stub called ``SimSerial`` (C++ class) that resides in a ``SimSerial`` Verilog module
(both are located in the ``generators/testchipip`` project). This ``SimSerial`` Verilog module then
sends the TSI command received by the simulation stub into the DUT which then converts the TSI
command into a TileLink request. This conversion is done by the ``SerialAdapter`` module
(located in the ``generators/testchipip`` project). In simulation, FESVR
resets the DUT, writes into memory the test program, and indicates to the DUT to start the program
through an interrupt (see :ref:`customization/Boot-Process:Chipyard Boot Process`). Using TSI is currently the fastest
mechanism to communicate with the DUT in simulation.

In the case of a chip tapeout bringup, TSI commands can be sent over a custom communication
medium to communicate with the chip. For example, some Berkeley tapeouts have a FPGA
with a RISC-V soft-core that runs FESVR. The FESVR on the soft-core sends TSI commands
to a TSI-to-TileLink converter living on the FPGA (i.e. ``SerialAdapter``). After the transaction is
converted to TileLink, the ``TLSerdesser`` (located in ``generators/testchipip``) serializes the
transaction and sends it to the chip (this ``TLSerdesser`` is sometimes also referred to as a
serial-link or serdes). Once the serialized transaction is received on the
chip, it is deserialized and masters a bus on the chip. The following image shows this flow:

.. image:: ../_static/images/chip-bringup.png

.. note::
The ``TLSerdesser`` can also be used as a slave (client), so it can sink memory requests from the chip
and connect to off-chip backing memory. Or in other words, ``TLSerdesser`` creates a bi-directional TileLink
interface.
TSI protocol is an implementation of HTIF that is used to send commands to the RISC-V DUT.
These TSI commands are simple R/W commands that are able to probe the DUT's memory space.
During simulation, the host sends TSI commands to a simulation stub in the test harness called ``SimSerial``
(C++ class) that resides in a ``SimSerial`` Verilog module (both are located in the ``generators/testchipip``
project).
This ``SimSerial`` Verilog module then sends the TSI command received by the simulation stub
to an adapter that converts the TSI command into a TileLink request.
This conversion is done by the ``SerialAdapter`` module (located in the ``generators/testchipip`` project).
After the transaction is converted to TileLink, the ``TLSerdesser`` (located in ``generators/testchipip``) serializes the
transaction and sends it to the chip (this ``TLSerdesser`` is sometimes also referred to as a digital serial-link or SerDes).
Once the serialized transaction is received on the chip, it is deserialized and masters a TileLink bus on the chip
which handles the request.
In simulation, FESVR resets the DUT, writes into memory the test program, and indicates to the DUT to start the program
through an interrupt (see :ref:`customization/Boot-Process:Chipyard Boot Process`).
Using TSI is currently the fastest mechanism to communicate with the DUT in simulation and is also used by FireSim.
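
To make the "simple R/W commands" idea concrete, here is a minimal plain-Scala sketch of a host writing a program image into a memory model and reading it back. It is only an illustration: ``TsiCommand``, ``TsiRead``, ``TsiWrite``, ``ToyMemory`` and ``TsiSketch`` are hypothetical names, the 4-byte word granularity is an assumption, and the real command encoding lives in ``generators/testchipip`` and is not reproduced here.

```scala
import scala.collection.mutable

// Hypothetical command types. The real TSI encoding is defined in testchipip
// and is NOT what is modelled here.
sealed trait TsiCommand
final case class TsiRead(addr: Long, numWords: Int) extends TsiCommand
final case class TsiWrite(addr: Long, words: Seq[Int]) extends TsiCommand

// Toy stand-in for the DUT memory space, addressed in 4-byte words (assumption).
final class ToyMemory {
  private val words = mutable.Map.empty[Long, Int].withDefaultValue(0)

  // Applies one command and returns the words read (empty for writes).
  def handle(cmd: TsiCommand): Seq[Int] = cmd match {
    case TsiWrite(addr, data) =>
      data.zipWithIndex.foreach { case (w, i) => words(addr + 4L * i) = w }
      Seq.empty
    case TsiRead(addr, n) =>
      (0 until n).map(i => words(addr + 4L * i))
  }
}

object TsiSketch {
  // Mimics the host flow described above: write a program image, then read it back.
  def loadAndVerify(mem: ToyMemory, base: Long, program: Seq[Int]): Boolean = {
    mem.handle(TsiWrite(base, program))
    mem.handle(TsiRead(base, program.length)) == program
  }
}
```

In the real flow the write would be followed by an interrupt that tells the DUT to start the program; that step is left out of this sketch.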

Using the Debug Module Interface (DMI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -90,26 +78,25 @@ command into a TileLink request. This conversion is done by the DTM named ``Debu
When the DTM receives the program to load, it starts to write the binary byte-wise into memory.
This is considerably slower than the TSI protocol communication pipeline (i.e. ``SimSerial``/``SerialAdapter``/TileLink)
which directly writes the program binary to memory.
Thus, Chipyard removes the DTM by default in favor of the TSI protocol for DUT communication.
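
As a rough back-of-the-envelope illustration of why byte-wise loading is slower, the sketch below counts write transactions for a given binary size. It ignores all protocol overhead, and both ``LoadCost`` and the 4-byte chunk size used in the comparison are assumptions made for the example, not values stated in this document.

```scala
object LoadCost {
  // Number of write transactions needed to load `bytes` bytes when each
  // transaction carries `bytesPerWrite` bytes (rounded up).
  def writes(bytes: Long, bytesPerWrite: Int): Long = {
    require(bytes >= 0, "bytes must be non-negative")
    require(bytesPerWrite > 0, "bytesPerWrite must be positive")
    (bytes + bytesPerWrite - 1) / bytesPerWrite
  }
}
```

Under these assumptions, a 1 KiB binary needs 1024 byte-wise writes but only 256 writes when each one carries 4 bytes.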

Starting the TSI or DMI Simulation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All default Chipyard configurations use TSI to communicate between the simulation and the simulated SoC/DUT. Hence, when running a
software RTL simulation, as is indicated in the :ref:`simulation/Software-RTL-Simulation:Software RTL Simulation` section, you are in-fact using TSI to communicate with the DUT. As a
reminder, to run a software RTL simulation, run:
All default Chipyard configurations use TSI to communicate between the simulation and the simulated SoC/DUT.
Hence, when running a software RTL simulation, as is indicated in the
:ref:`simulation/Software-RTL-Simulation:Software RTL Simulation` section, you are in fact using TSI to communicate with the DUT.
As a reminder, to run a software RTL simulation, run:

.. code-block:: bash

cd sims/verilator
# or
cd sims/vcs

make CONFIG=LargeBoomConfig run-asm-tests

FireSim FPGA-accelerated simulations use TSI by default as well.
make CONFIG=RocketConfig run-asm-tests

If you would like to build and simulate a Chipyard configuration with a DTM configured for DMI communication, then you must tie-off the TSI interface, and instantiate the `SimDTM`. Note that we use `WithTiedOffSerial ++ WithSimDebug` instead of `WithTiedOffDebug ++ WithSimSerial`.
If you would like to build and simulate a Chipyard configuration with a DTM configured for DMI communication,
then you must tie off the serial-link interface and instantiate the `SimDTM`.

.. literalinclude:: ../../generators/chipyard/src/main/scala/config/RocketConfigs.scala
:language: scala
@@ -129,14 +116,110 @@ Then you can run simulations with the new DMI-enabled top-level and test-harness
Using the JTAG Interface
------------------------

The main way to use JTAG with a Rocket Chip based system is to instantiate the Debug Transfer Module (DTM)
and configure it to use a JTAG interface. The default Chipyard designs instantiate the DTM and configure it
to use JTAG. You may attach OpenOCD and GDB to any of the default JTAG-enabled designs.
Another way to interface with the DUT is to use JTAG.
Similar to the :ref:`Advanced-Concepts/Chip-Communication:Using the Debug Module interface (DMI)` section, in order to use the JTAG protocol,
the DUT needs to contain a Debug Transfer Module (DTM) configured to use JTAG instead of DMI.
Once the JTAG port is exposed, the host can communicate over JTAG to the DUT through a simulation stub
called ``SimJTAG`` (C++ class) that resides in a ``SimJTAG`` Verilog module (both reside in the ``generators/rocket-chip`` project).
This simulation stub creates a socket that OpenOCD and GDB can connect to when the simulation is running.
The default Chipyard designs instantiate the DTM configured to use JTAG (i.e. ``RocketConfig``).

.. note::
As mentioned, default Chipyard designs are enabled with JTAG.
However, they also use TSI/Serialized-TL with FESVR in case the JTAG interface isn't used.
This allows users to choose how to communicate with the DUT (use TSI or JTAG).

Debugging with JTAG
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~

Roughly the steps to debug with JTAG in simulation are as follows:

1. Build a Chipyard JTAG-enabled RTL design. Remember default Chipyard designs are JTAG ready.

.. code-block:: bash

cd sims/verilator
# or
cd sims/vcs

make CONFIG=RocketConfig

2. Run the simulation with remote bit-bang enabled. Since we hope to load/run the binary using JTAG,
we can pass ``none`` as a binary (prevents FESVR from loading the program). (Adapted from: https://github.com/chipsalliance/rocket-chip#3-launch-the-emulator)

.. code-block:: bash

# note: this uses Chipyard make invocation to run the simulation to properly wrap the simulation args
make CONFIG=RocketConfig BINARY=none SIM_FLAGS="+jtag_rbb_enable=1 --rbb-port=9823" run-binary

Please refer to the following resources on how to debug with JTAG.
3. `Follow the instructions here to connect to the simulation using OpenOCD + GDB. <https://github.com/chipsalliance/rocket-chip#4-launch-openocd>`__

.. note::
This section was adapted from the instructions in Rocket Chip and riscv-isa-sim. For more information refer
to that documentation: `Rocket Chip GDB Docs <https://github.com/chipsalliance/rocket-chip#-debugging-with-gdb>`__,
`riscv-isa-sim GDB Docs <https://github.com/riscv/riscv-isa-sim#debugging-with-gdb>`__

Example Test Chip Bringup Communication
---------------------------------------

Intro to Typical Chipyard Test Chip
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Most, if not all, Chipyard configurations are tethered using TSI (over a serial-link) and have access
to external memory through an AXI port (backing AXI memory).
The following image shows the DUT with this set of default signals:

.. image:: ../_static/images/default-chipyard-config-communication.png

In this setup, the serial-link is connected to the TSI/FESVR peripherals while the AXI port is connected
to a simulated AXI memory.
However, AXI ports tend to have many signals associated with them so instead of creating an AXI port off the DUT,
one can send the memory transactions over the bi-directional serial-link (``TLSerdesser``) so that the main
interface to the DUT is the serial-link (which has comparatively fewer signals than an AXI port).
This new setup (shown below) is a typical Chipyard test chip setup:

.. image:: ../_static/images/bringup-chipyard-config-communication.png
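
The trade-off behind a narrow serial-link can be sketched in plain Scala: a wide value is split into fixed-width chunks, sent one chunk at a time, and reassembled on the other side. This is not the ``TLSerdesser`` implementation; ``SerdesSketch``, its method names, and the widths used are hypothetical and only show that fewer wires means more beats per transaction.

```scala
object SerdesSketch {
  // Splits the low `totalBits` of a non-negative `value` into `width`-bit
  // chunks, least significant chunk first.
  def serialize(value: BigInt, totalBits: Int, width: Int): Seq[BigInt] = {
    require(width > 0 && totalBits > 0, "widths must be positive")
    require(value >= 0, "value must be non-negative")
    val mask = (BigInt(1) << width) - 1
    val numChunks = (totalBits + width - 1) / width
    (0 until numChunks).map(i => (value >> (i * width)) & mask)
  }

  // Reassembles chunks produced by `serialize` with the same `width`.
  def deserialize(chunks: Seq[BigInt], width: Int): BigInt =
    chunks.zipWithIndex
      .map { case (c, i) => c << (i * width) }
      .foldLeft(BigInt(0))(_ | _)
}
```

For a 48-bit value, a 4-bit link needs 12 beats while a 32-bit link needs 2, which is the pin-count versus latency trade-off the serial-link makes relative to a full AXI port.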

Simulation Setup of the Example Test Chip
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To test this type of configuration (TSI/memory transactions over the serial-link), most of the same TSI collateral
would be used.
The main difference is that the TileLink-to-AXI converters and simulated AXI memory reside on the other side of the
serial-link.

.. image:: ../_static/images/chip-bringup-simulation.png

.. note::
Here the simulated AXI memory and the converters can be in a different clock domain in the test harness
than the reference clock of the DUT.
For example, the DUT can be clocked at 3.2GHz while the simulated AXI memory can be clocked at 1GHz.
This functionality is done in the harness binder that instantiates the TSI collateral, TL-to-AXI converters,
and simulated AXI memory.
See :ref:`Advanced-Concepts/Harness-Clocks:Creating Clocks in the Test Harness` on how to generate a clock
in a harness binder.

This type of simulation setup is done in the following multi-clock configuration:

.. literalinclude:: ../../generators/chipyard/src/main/scala/config/RocketConfigs.scala
:language: scala
:start-after: DOC include start: MulticlockAXIOverSerialConfig
:end-before: DOC include end: MulticlockAXIOverSerialConfig

Bringup Setup of the Example Test Chip after Tapeout
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assuming this example test chip is taped out and now ready to be tested, we can communicate with the chip using this serial-link.
For example, a common test setup used at Berkeley to evaluate Chipyard-based test-chips includes an FPGA running a RISC-V soft-core that is able to speak to the DUT (over an FMC).
This RISC-V soft-core would serve as the host of the test that will run on the DUT.
This is done by the RISC-V soft-core running FESVR, sending TSI commands to a ``SerialAdapter`` / ``TLSerdesser`` programmed on the FPGA.
Once the commands are converted to serialized TileLink, then they can be sent over some medium to the DUT
(like an FMC cable or a set of wires connecting FPGA outputs to the DUT board).
Similar to simulation, if the chip requests offchip memory, it can then send the transaction back over the serial-link.
Then the request can be serviced by the channel of FPGA DRAM.
The following image shows this flow:

.. image:: ../_static/images/chip-bringup.png

* https://github.com/chipsalliance/rocket-chip#-debugging-with-gdb
* https://github.com/riscv/riscv-isa-sim#debugging-with-gdb
In fact, this exact type of bringup setup is what the following section discusses:
:ref:`Prototyping/VCU118:Introduction to the Bringup Platform`.
38 changes: 38 additions & 0 deletions docs/Advanced-Concepts/Harness-Clocks.rst
@@ -0,0 +1,38 @@
.. _harness-clocks:

Creating Clocks in the Test Harness
===================================

Chipyard currently allows the SoC design (everything under ``ChipTop``) to
have independent clock domains through diplomacy.
This implies that some reference clock enters the ``ChipTop`` and then is divided down into
separate clock domains.
From the perspective of the ``TestHarness`` module, the ``ChipTop`` clock and reset are
provided from the harness clock and reset (called ``harnessClock`` and ``harnessReset``).
In the default case, ``harnessClock`` and ``harnessReset`` are directly wired to the
clock and reset IOs of the ``TestHarness`` module.
However, the ``TestHarness`` has the ability to generate a standalone clock and reset signal
that is separate from the reference clock/reset of ``ChipTop``.
This allows harness components (including harness binders) the ability to "request" a clock
for a new clock domain.
This is useful for simulating systems in which modules in the harness have independent clock domains
from the DUT.

Requests for a harness clock are handled by the ``HarnessClockInstantiator`` class in ``generators/chipyard/src/main/scala/TestHarness.scala``.
This class is accessed in harness components by referencing the Rocket Chip parameters key ``p(HarnessClockInstantiatorKey)``.
Then you can request a clock and synchronized reset at a particular frequency by invoking the ``requestClockBundle`` function.
Take the following example:

.. literalinclude:: ../../generators/chipyard/src/main/scala/HarnessBinders.scala
:language: scala
:start-after: DOC include start: HarnessClockInstantiatorEx
:end-before: DOC include end: HarnessClockInstantiatorEx

Here you can see the ``p(HarnessClockInstantiatorKey)`` is used to request a clock and reset at ``memFreq`` frequency.

.. note::
In the case that the reference clock entering ``ChipTop`` is not the overall reference clock of the simulation
(i.e. not the clock/reset coming into the ``TestHarness`` module), the ``harnessClock`` and ``harnessReset`` can
differ from the implicit ``TestHarness`` clock and reset. For example, if the ``ChipTop`` reference is 500MHz but an
extra harness clock is requested at 1GHz, the ``TestHarness`` implicit clock/reset will be at 1GHz while the ``harnessClock``
and ``harnessReset`` will be at 500MHz.
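
The relationship in the note above can be modelled with a small plain-Scala sketch. It is not the ``HarnessClockInstantiator`` itself: ``ToyClockInstantiator``, ``requestClock`` and ``implicitClockMHz`` are hypothetical names, and choosing the fastest requested frequency as the implicit clock is an assumption read off the 500MHz/1GHz example.

```scala
import scala.collection.mutable

// Plain-Scala stand-in for a harness clock instantiator. All names are hypothetical.
final class ToyClockInstantiator {
  // LinkedHashMap keeps requests in insertion order, so iteration is deterministic.
  private val requests = mutable.LinkedHashMap.empty[String, Double]

  // Records a request for a clock domain at `freqMHz` and returns the frequency
  // registered under `name` (the first request for a name wins).
  def requestClock(name: String, freqMHz: Double): Double = {
    require(freqMHz > 0.0, "frequency must be positive")
    requests.getOrElseUpdate(name, freqMHz)
  }

  // Assumption: the implicit TestHarness clock runs at the fastest frequency
  // among the ChipTop reference and every requested harness clock.
  def implicitClockMHz(chipTopRefMHz: Double): Double =
    requests.values.foldLeft(chipTopRefMHz)((a, b) => math.max(a, b))

  // Names of the requested clock domains, in request order.
  def requestedNames: Seq[String] = requests.keys.toSeq
}
```

With a 500MHz ``ChipTop`` reference and one extra request at 1000MHz, this model reports a 1000MHz implicit clock, matching the example in the note; with no extra requests it falls back to the 500MHz reference.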
1 change: 1 addition & 0 deletions docs/Advanced-Concepts/index.rst
@@ -14,4 +14,5 @@ They expect you to know about Chisel, Parameters, configs, etc.
Debugging-BOOM
Resources
CDEs
Harness-Clocks

Binary file added docs/_static/images/chip-bringup-simulation.png
Binary file modified docs/_static/images/chip-bringup.png
Binary file modified docs/_static/images/chip-communication.png
10 changes: 7 additions & 3 deletions generators/chipyard/src/main/scala/ChipTop.scala
@@ -15,6 +15,9 @@ import barstools.iocell.chisel._

case object BuildSystem extends Field[Parameters => LazyModule]((p: Parameters) => new DigitalTop()(p))

trait HasReferenceClockFreq {
def refClockFreqMHz: Double
}

/**
* The base class used for building chips. This constructor instantiates a module specified by the BuildSystem parameter,
@@ -24,15 +27,16 @@ case object BuildSystem extends Field[Parameters => LazyModule]((p: Parameters)
*/

class ChipTop(implicit p: Parameters) extends LazyModule with BindingScope
with HasTestHarnessFunctions with HasIOBinders {
with HasTestHarnessFunctions with HasReferenceClockFreq with HasIOBinders {
// The system module specified by BuildSystem
lazy val lazySystem = LazyModule(p(BuildSystem)(p)).suggestName("system")

// The implicitClockSinkNode provides the implicit clock and reset for the System
// The implicitClockSinkNode provides the implicit clock and reset for the system (connected by clocking scheme)
val implicitClockSinkNode = ClockSinkNode(Seq(ClockSinkParameters(name = Some("implicit_clock"))))

// Generate Clocks and Reset
p(ClockingSchemeKey)(this)
val mvRefClkFreq = p(ClockingSchemeKey)(this)
def refClockFreqMHz: Double = mvRefClkFreq.getWrappedValue

// NOTE: Making this a LazyRawModule is moderately dangerous, as anonymous children
// of ChipTop (ex: ClockGroup) do not receive clock or reset.