Merge pull request #344 from stnolting/wishbone_gating
[rtl] add Wishbone output "gating"
stnolting authored Jun 11, 2022
2 parents 16e7e49 + a5da45a commit 47d7c99
Showing 4 changed files with 74 additions and 61 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -32,6 +32,7 @@ mimpid = 0x01040312 => 01.04.03.12 => Version 01.04.03.12 => v1.4.3.12

| Date (*dd.mm.yyyy*) | Version | Comment |
|:----------:|:-------:|:--------|
| 10.06.2022 | 1.7.2.6 | **Wishbone** interface now _gates_ all outgoing signals (= signals remain stable if there is no active Wishbone access); [#344](https://github.com/stnolting/neorv32/pull/344) |
| 09.06.2022 | 1.7.2.5 | reworked **TWI** module fixing several interface timing issues; :warning: removed "START condition done interrupt" and "STOP condition done interrupt"; [#340](https://github.com/stnolting/neorv32/pull/340) |
| 06.06.2022 | 1.7.2.4 | split executable images into package and body; [#338](https://github.com/stnolting/neorv32/pull/338) |
| 04.06.2022 | 1.7.2.3 | :bug: fixed bug in **SPI** and **XIP** modules: phase offset between SPI clock and SPI data; [#336](https://github.com/stnolting/neorv32/pull/336) |
76 changes: 42 additions & 34 deletions docs/datasheet/soc_wishbone.adoc
@@ -6,7 +6,7 @@
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_wishbone.vhd |
| Software driver file(s): | none | _implicitly used_
| Software driver file(s): | none | _implicitly used_
| Top entity port: | `wb_tag_o` | request tag output (3-bit)
| | `wb_adr_o` | address output (32-bit)
| | `wb_dat_i` | data input (32-bit)
@@ -29,34 +29,37 @@


The external memory interface provides a Wishbone b4-compatible on-chip bus interface. The bus interface is
implemented when the _MEM_EXT_EN_ generic is _true_. This interface can be used to attach external memories,
custom hardware accelerators, additional IO devices or all other kinds of IP blocks.
implemented if the _MEM_EXT_EN_ generic is _true_. This interface can be used to attach external memories,
custom hardware accelerators, additional IO devices or all other kinds of IP blocks to the processor.

The external interface is _not_ mapped to a _specific_ address space region. Instead, all CPU memory accesses that
do not target a processor-internal module are delegated to the external memory interface. In summary, a CPU load/store
access is delegated to the external bus interface if...
access is delegated via the external bus interface if...

. it does not target the internal instruction memory IMEM (if implemented at all)
. **and** it does not target the internal data memory DMEM (if implemented at all)
. **and** it does not target the internal bootloader ROM or any of the IO devices - regardless if one or more of these components are
* it does not target the internal instruction memory IMEM (if implemented at all).
* **and** it does not target the internal data memory DMEM (if implemented at all).
* **and** it does not target the internal bootloader ROM or any of the IO devices - regardless if one or more of these components are
actually implemented or not.
[NOTE]
If the Execute In Place module (XIP) is implemented accesses targeting the XIP module are not forwarded to the
external memory interface. See section <<_execute_in_place_module_xip>> for more information.

.Address Space Layout
[TIP]
See section <<_address_space>> for more information.

.Execute-in-Place Module
[NOTE]
If the Execute In Place module (XIP) is implemented, accesses targeting the XIP memory-mapped region will not be forwarded to the
external memory interface. See section <<_execute_in_place_module_xip>> for more information.


**Wishbone Bus Protocol**

The external memory interface either uses the **standard** ("classic") Wishbone transaction protocol (default) or
**pipelined** Wishbone transaction protocol. The transaction protocol is configured via the <<_mem_ext_pipe_mode>> generic:
When _MEM_EXT_PIPE_MODE_ is _false_, all bus control signals including _STB_ are active and remain stable until the
transfer is acknowledged/terminated. If _MEM_EXT_PIPE_MODE_ is _true_, all bus control except _STB_ are active
and remain until the transfer is acknowledged/terminated. In this case, _STB_ is asserted only during the very
first bus clock cycle.
The external memory interface either uses the **standard** (also called "classic") Wishbone protocol (default) or
**pipelined** Wishbone protocol. The protocol to be used is configured via the <<_mem_ext_pipe_mode>> generic:

* If _MEM_EXT_PIPE_MODE_ is _false_, all bus control signals including `wb_stb_o` are active and remain stable until the
transfer is acknowledged/terminated.
* If _MEM_EXT_PIPE_MODE_ is _true_, all bus control signals except `wb_stb_o` are active and remain stable until the transfer is
acknowledged/terminated. In this case, `wb_stb_o` is asserted only during the very first bus clock cycle.
.Exemplary Wishbone bus accesses using "classic" and "pipelined" protocol
[cols="^2,^2"]
@@ -67,44 +70,45 @@ a| image::wishbone_pipelined_write.png[700,300]
| **Classic** Wishbone read access | **Pipelined** Wishbone write access
|=======================


.Wishbone Specs.
[TIP]
A detailed description of the implemented Wishbone bus protocol and the according interface signals
can be found in the data sheet "Wishbone B4 - WISHBONE System-on-Chip (SoC) Interconnection
Architecture for Portable IP Cores". A copy of this document can be found in the docs folder of this
Architecture for Portable IP Cores". A copy of this document can be found in the `docs` folder of this
project.


**Bus Access**

The NEORV32 Wishbone gateway does not support burst transfer yet, so there is always just one transfer in progress.
The NEORV32 Wishbone gateway does not support burst transfers yet, so there is always just a single transfer "in flight".
Hence, the Wishbone `STALL` signal is not implemented. An accessed Wishbone device does not have to respond immediately to a bus
request by sending an ACK. instead, there is a _time window_ where the device has to acknowledge the transfer. This time window
request by sending an ACK. Instead, there is a _time window_ where the device has to acknowledge the transfer. This time window
is configured by the _MEM_EXT_TIMEOUT_ top generic that defines the maximum time (in clock cycles) a bus access can be pending
before it is automatically terminated. If _MEM_EXT_TIMEOUT_ is set to zero, the timeout disabled an a bus access can take an
arbitrary number of cycles to complete.
before it is automatically terminated with an error condition. If _MEM_EXT_TIMEOUT_ is set to zero, the timeout is disabled
and a bus access can take an arbitrary number of cycles to complete (this is **not recommended**!).

When _MEM_EXT_TIMEOUT_ is greater than zero, the Wishbone gateway starts an internal countdown whenever the CPU
accesses a memory address via the external memory interface. If the accessed memory / device does not acknowledge (via `wb_ack_i`)
accesses an address via the external memory interface. If the accessed device does not acknowledge (via `wb_ack_i`)
or terminate (via `wb_err_i`) the transfer within _MEM_EXT_TIMEOUT_ clock cycles, the bus access is automatically canceled
by setting `wb_cyc_o` low again, and a CPU load/store/instruction fetch bus access fault exception is raised.
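
As a rough illustration of the device side of this handshake, the following sketch shows a single-register Wishbone slave that acknowledges every request one clock cycle after it sees the strobe, which is well within any non-zero _MEM_EXT_TIMEOUT_. The entity `wb_reg_slave`, its port names and the synchronous high-active reset are assumptions made for this example; none of them are part of the NEORV32 sources.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example device (not part of NEORV32): one 32-bit register
-- behind a Wishbone slave port.
entity wb_reg_slave is
  port (
    clk_i : in  std_ulogic;
    rst_i : in  std_ulogic;                     -- synchronous, high-active (assumption)
    adr_i : in  std_ulogic_vector(31 downto 0); -- unused: there is only one register
    dat_i : in  std_ulogic_vector(31 downto 0);
    dat_o : out std_ulogic_vector(31 downto 0);
    we_i  : in  std_ulogic;
    sel_i : in  std_ulogic_vector(3 downto 0);
    stb_i : in  std_ulogic;
    cyc_i : in  std_ulogic;
    ack_o : out std_ulogic
  );
end entity wb_reg_slave;

architecture rtl of wb_reg_slave is
  signal reg : std_ulogic_vector(31 downto 0);
  signal ack : std_ulogic;
begin

  access_logic: process(clk_i)
  begin
    if rising_edge(clk_i) then
      if (rst_i = '1') then
        reg <= (others => '0');
        ack <= '0';
      else
        -- acknowledge each request exactly once, one cycle after the strobe
        ack <= cyc_i and stb_i and (not ack);
        -- byte-wise write, performed in the cycle before the acknowledge
        if (cyc_i = '1') and (stb_i = '1') and (we_i = '1') and (ack = '0') then
          for i in 0 to 3 loop
            if (sel_i(i) = '1') then
              reg(8*i+7 downto 8*i) <= dat_i(8*i+7 downto 8*i);
            end if;
          end loop;
        end if;
      end if;
    end if;
  end process access_logic;

  ack_o <= ack;
  dat_o <= reg; -- read data is valid whenever ack_o is high

end architecture rtl;
```

The `not ack` term keeps the acknowledge to a single cycle in classic mode, where the strobe is still high while the acknowledge is being sampled. In pipelined mode the strobe is only one cycle wide, so the term has no effect there.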

.External "Address Space Holes"
[IMPORTANT]
Setting _MEM_EXT_TIMEOUT_ to zero will permanently stall the CPU if the targeted Wishbone device never responds. Hence,
_MEM_EXT_TIMEOUT_ should be always set to a value greater than zero. +
+
This feature can be used as a **safety guard** if the external memory system does not check for "address space holes". That means
that accesses to addresses that do not belong to any memory or device do not permanently stall the processor due to an
unacknowledged/unterminated bus access. If the external memory system can guarantee to access **any** bus access
(even it targets an unimplemented address) the timeout feature should be disabled (_MEM_EXT_TIMEOUT_ = 0).
unacknowledged/unterminated bus access. If the external memory system can guarantee to acknowledge **any** bus access
(even one targeting an unimplemented address), the timeout feature can be safely disabled (_MEM_EXT_TIMEOUT_ = 0).
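
One way an external bus system might close such address space holes is a default responder that terminates every access not claimed by a real device with a bus error. The following is a minimal sketch under stated assumptions: the entity `wb_default_responder`, its ports and in particular the `hit_i` input (expected to come from the interconnect's address decoder) are hypothetical and not part of the NEORV32 sources.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical default responder (not part of NEORV32): answers accesses
-- to unmapped addresses with an error instead of leaving them pending.
entity wb_default_responder is
  port (
    clk_i : in  std_ulogic;
    rst_i : in  std_ulogic; -- synchronous, high-active (assumption)
    cyc_i : in  std_ulogic;
    stb_i : in  std_ulogic;
    hit_i : in  std_ulogic; -- '1' if the address decoder selected a real device (assumption)
    err_o : out std_ulogic  -- to be combined with other error sources and fed to wb_err_i
  );
end entity wb_default_responder;

architecture rtl of wb_default_responder is
  signal err : std_ulogic;
begin

  responder: process(clk_i)
  begin
    if rising_edge(clk_i) then
      if (rst_i = '1') then
        err <= '0';
      else
        -- single-cycle error response for any request that hits no device
        err <= cyc_i and stb_i and (not hit_i) and (not err);
      end if;
    end if;
  end process responder;

  err_o <= err;

end architecture rtl;
```

With a responder like this in place every access is either acknowledged or terminated, which is the condition the text above names for disabling the timeout.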


**Wishbone Tag**

The 3-bit Wishbone `wb_tag_o` signal provides additional information regarding the access type. This signal
is compatible to the AXI4 _AxPROT_ signal.
is compatible with the AXI4 `AxPROT` signal.

* `wb_tag_o(0)` 1: privileged access (CPU is in machine mode); 0: unprivileged access
* `wb_tag_o(0)` 1: privileged access (CPU is in machine mode); 0: unprivileged access (CPU is not in machine mode)
* `wb_tag_o(1)` always zero (indicating "secure access")
* `wb_tag_o(2)` 1: instruction fetch access, 0: data access
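
The bit assignment listed above can be made explicit with a small decoder. This is only a sketch: the entity `wb_tag_decode` and its output names are made up for illustration, and the pass-through to `axprot_o` relies solely on the compatibility with `AxPROT` stated above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper (not part of NEORV32): decodes the 3-bit request tag.
entity wb_tag_decode is
  port (
    tag_i        : in  std_ulogic_vector(2 downto 0); -- connect to wb_tag_o
    privileged_o : out std_ulogic;                    -- '1': CPU is in machine mode
    ifetch_o     : out std_ulogic;                    -- '1': instruction fetch, '0': data access
    axprot_o     : out std_ulogic_vector(2 downto 0)  -- AXI4 AxPROT equivalent
  );
end entity wb_tag_decode;

architecture rtl of wb_tag_decode is
begin

  privileged_o <= tag_i(0);
  ifetch_o     <= tag_i(2);
  axprot_o     <= tag_i; -- same bit assignment, so no re-ordering is needed

end architecture rtl;
```

A peripheral could, for example, use `privileged_o` to reject writes that are not issued from machine mode.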
@@ -113,23 +117,27 @@ is compatible to the AXI4 _AxPROT_ signal.

The NEORV32 CPU and the Processor setup are *little-endian* architectures. To allow direct connection
to a big-endian memory system the external bus interface provides an _Endianness configuration_. The
Endianness (of the external memory interface) can be configured via the _MEM_EXT_BIG_ENDIAN_ generic.
By default, the external memory interface uses little-endian byte-order (like the rest of the processor / CPU).
Endianness of the external memory interface can be configured via the _MEM_EXT_BIG_ENDIAN_ generic.
By default, the external memory interface uses little-endian byte-order.

Application software can check the Endianness configuration of the external bus interface via the
SYSINFO module (see section <<_system_configuration_information_memory_sysinfo>> for more information).
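
When _MEM_EXT_BIG_ENDIAN_ is enabled, the gateway RTL in this commit passes the write data through `bswap32_f` and the byte enables through `bit_rev_f`. The package below sketches what such helpers could look like. The names `endian_sketch_pkg`, `byte_swap32` and `bit_reverse` are hypothetical, and the code is not the implementation from `neorv32_package.vhd`.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package (not the NEORV32 implementation).
package endian_sketch_pkg is
  -- reverse the byte order of a 32-bit word
  function byte_swap32(d : std_ulogic_vector(31 downto 0)) return std_ulogic_vector;
  -- reverse the bit order of a vector (used here for the 4-bit byte enable)
  function bit_reverse(v : std_ulogic_vector) return std_ulogic_vector;
end package endian_sketch_pkg;

package body endian_sketch_pkg is

  function byte_swap32(d : std_ulogic_vector(31 downto 0)) return std_ulogic_vector is
    variable r : std_ulogic_vector(31 downto 0);
  begin
    r(31 downto 24) := d( 7 downto  0);
    r(23 downto 16) := d(15 downto  8);
    r(15 downto  8) := d(23 downto 16);
    r( 7 downto  0) := d(31 downto 24);
    return r;
  end function byte_swap32;

  function bit_reverse(v : std_ulogic_vector) return std_ulogic_vector is
    variable r : std_ulogic_vector(v'range);
  begin
    -- mirror the vector around its center, independent of its direction
    for i in 0 to v'length-1 loop
      r(v'low + i) := v(v'high - i);
    end loop;
    return r;
  end function bit_reverse;

end package body endian_sketch_pkg;
```

With these definitions the word `x"11223344"` becomes `x"44332211"`, and a byte enable of `"0001"` becomes `"1000"`, so the enabled byte lane follows its data byte to the mirrored position.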


**Gateway Latency**
**Latency and Gating**

By default, the Wishbone gateway introduces two additional latency cycles: processor-outgoing ("TX") and
processor-incoming ("RX") signals are fully registered. Thus, any access from the CPU to a processor-external devices
By default, the Wishbone gateway introduces two additional latency cycles: processor-outgoing (`*_o`) and
processor-incoming (`*_i`) signals are fully registered. Thus, any access from the CPU to a processor-external device
via Wishbone requires 2 additional clock cycles (at least; depending on device's latency).

If the attached Wishbone network / peripheral already provides output registers or if the Wishbone network is not relevant
for timing closure, the default buffering of incoming ("RX") data within the gateway can be disabled by implementing an
for timing closure, the default buffering of incoming signals can be disabled by implementing an
"asynchronous" RX path. The configuration is done via the _MEM_EXT_ASYNC_RX_ generic.

All outgoing signals use a "gating mechanism" so they only change if there is an actual Wishbone transaction in
progress. This can reduce dynamic switching activity in the external bus system and also simplifies simulation-based
inspection of the Wishbone transactions.


**AXI4-Lite Connectivity**

2 changes: 1 addition & 1 deletion rtl/core/neorv32_package.vhd
@@ -68,7 +68,7 @@ package neorv32_package is
-- Architecture Constants (do not modify!) ------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant data_width_c : natural := 32; -- native data path width - do not change!
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01070205"; -- NEORV32 version - no touchy!
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01070206"; -- NEORV32 version - no touchy!
constant archid_c : natural := 19; -- official NEORV32 architecture ID - hands off!

-- Check if we're inside the Matrix -------------------------------------------------------
56 changes: 30 additions & 26 deletions rtl/core/neorv32_wishbone.vhd
@@ -2,12 +2,17 @@
-- # << NEORV32 - External Bus Interface (WISHBONE) >> #
-- # ********************************************************************************************* #
-- # All bus accesses from the CPU, which do not target the internal IO region / the internal #
-- # bootloader / the internal instruction or data memories (if implemented), are delegated via #
-- # this Wishbone gateway to the external bus interface. Accessed peripherals can have a response #
-- # latency of up to BUS_TIMEOUT - 1 cycles. #
-- # bootloader / the OCD system / the internal instruction or data memories (if implemented), are #
-- # delegated via this Wishbone gateway to the external bus interface. Wishbone accesses can have #
-- # a response latency of up to BUS_TIMEOUT - 1 cycles or an infinite response time if            #
-- # BUS_TIMEOUT = 0 (not recommended!) #
-- # #
-- # The Wishbone gateway registers all outgoing signals. These signals will remain stable (gated) #
-- # if there is no active Wishbone access. By default, the incoming signals are also registered.  #
-- # This can be disabled by setting ASYNC_RX = true.                                              #
-- # #
-- # Even when all processor-internal memories and IO devices are disabled, the EXTERNAL address #
-- # space ENDS at address 0xffff0000 (begin of internal BOOTROM address space). #
-- # space ENDS at address 0xffff0000 (begin of internal BOOTROM/OCD/IO address space). #
-- # ********************************************************************************************* #
-- # BSD 3-Clause License #
-- # #
@@ -177,11 +182,11 @@ begin
ctrl.we <= '0';
ctrl.adr <= (others => '0');
ctrl.wdat <= (others => '0');
ctrl.rdat <= (others => '0');
ctrl.rdat <= (others => '-');
ctrl.sel <= (others => '0');
ctrl.timeout <= (others => '0');
ctrl.ack <= '0';
ctrl.err <= '0';
ctrl.timeout <= (others => '-');
ctrl.ack <= '-';
ctrl.err <= '-';
ctrl.tmo <= '0';
ctrl.src <= '0';
ctrl.priv <= '0';
@@ -199,20 +204,19 @@ begin

when IDLE => -- waiting for host request
-- ------------------------------------------------------------
-- buffer all outgoing signals --
ctrl.we <= wren_i;
ctrl.adr <= addr_i;
if (BIG_ENDIAN = true) then -- big-endian
ctrl.wdat <= bswap32_f(data_i);
ctrl.sel <= bit_rev_f(ben_i);
else -- little-endian
ctrl.wdat <= data_i;
ctrl.sel <= ben_i;
end if;
ctrl.src <= src_i;
ctrl.priv <= priv_i;
-- valid new or buffered read/write request --
if ((xbus_access and (wren_i or rden_i)) = '1') then
if (xbus_access = '1') and ((wren_i or rden_i) = '1') then -- valid external request
-- buffer (and gate) all outgoing signals --
ctrl.we <= wren_i;
ctrl.adr <= addr_i;
ctrl.src <= src_i;
ctrl.priv <= priv_i;
if (BIG_ENDIAN = true) then -- big-endian
ctrl.wdat <= bswap32_f(data_i);
ctrl.sel <= bit_rev_f(ben_i);
else -- little-endian
ctrl.wdat <= data_i;
ctrl.sel <= ben_i;
end if;
ctrl.state <= BUSY;
end if;

@@ -243,7 +247,7 @@ begin
end process bus_arbiter;

-- host access --
ack_gated <= wb_ack_i when (ctrl.state = BUSY) else '0'; -- CPU ack gate for "async" RX
ack_gated <= wb_ack_i when (ctrl.state = BUSY) else '0'; -- CPU ACK gate for "async" RX
rdata_gated <= wb_dat_i when (ctrl.state = BUSY) else (others => '0'); -- CPU read data gate for "async" RX
rdata <= ctrl.rdat when (ASYNC_RX = false) else rdata_gated;

@@ -259,15 +263,15 @@ begin
wb_tag_o(1) <= '0'; -- 0 = secure, 1 = non-secure
wb_tag_o(2) <= ctrl.src; -- 0 = data access, 1 = instruction access

stb_int <= '1' when (ctrl.state = BUSY) and (ctrl.state_ff /= BUSY) else '0';
cyc_int <= '1' when (ctrl.state = BUSY) else '0';

wb_adr_o <= ctrl.adr;
wb_dat_o <= ctrl.wdat;
wb_we_o <= ctrl.we;
wb_sel_o <= ctrl.sel;
wb_stb_o <= stb_int when (PIPE_MODE = true) else cyc_int;
wb_cyc_o <= cyc_int;

stb_int <= '1' when (ctrl.state = BUSY) and (ctrl.state_ff /= BUSY) else '0';
cyc_int <= '1' when (ctrl.state = BUSY) else '0';


end neorv32_wishbone_rtl;
