Monday, August 2, 2021

AMBA Bus Architecture & Protocol Understanding - Part#1



The following diagram illustrates the evolution of the AMBA protocols alongside SoC design trends in the industry.





AMBA 1 specification (first version) defines two buses/interfaces:

1) Advanced System Bus (ASB)

2) Advanced Peripheral Bus (APB)



AMBA 2 specification defines three buses/interfaces:

1) Advanced High-performance Bus (AHB) - widely used on ARM7, ARM9 and ARM Cortex-M based designs

2) Advanced System Bus (ASB)

3) Advanced Peripheral Bus (APB2 or APB)




AMBA 3 specification defines four buses/interfaces:

1) Advanced eXtensible Interface (AXI3 or AXI v1.0) - widely used on ARM Cortex-A processors including Cortex-A9

2) AHB-Lite v1.0

3) APB3 v1.0

4) Advanced Trace Bus (ATB v1.0)

AMBA 4 specification defines the following buses/interfaces:

1) AXI Coherency Extensions (ACE) - widely used on the latest ARM Cortex-A processors including Cortex-A7 and Cortex-A15

2) ACE-Lite

3) AXI4

4) AXI4-Lite

5) AXI4-Stream v1.0

6) ATB v1.1

7) APB4 v2.0

AMBA 5 specification defines the following buses/interfaces:

1) AXI5, AXI5-Lite and ACE5

2) AHB5, AHB-Lite

3) Coherent Hub Interface (CHI)

4) Distributed Translation Interface (DTI)

5) Generic Flash Bus (GFB)

APB: Used for connecting low bandwidth peripherals, Non-pipelined, No burst data transfer

AHB: Used for connecting higher bandwidth peripherals, Burst data transfer

AHB-Lite: Single-master subset of AHB, with no arbitration, retry, or split transactions.

AXI: The Advanced eXtensible Interface (AXI) is suited to high-bandwidth, low-latency interconnects. It is a point-to-point interconnect and overcomes the limit a shared bus protocol places on the number of agents that can be connected. The protocol also improves on AHB by supporting multiple outstanding (pipelined) data transfers, burst data transfers, separate read and write paths, and different bus widths.

AXI Lite: No burst data transfer

AXI-stream: Only streaming of data from master to slave

ACE: The AXI Coherency Extensions (ACE) protocol is an extension to the AXI4 protocol. It emerged in the era when multiple CPU cores with coherent caches were being integrated on a single chip.

CHI (Coherent Hub Interface): The ACE protocol was developed as an extension to AXI to support coherent interconnects. ACE uses signal-level communication between master and slave, so its interconnects need a large number of wires, with added channels for snoops and responses. The CHI protocol instead uses layered, packet-based communication, with protocol, link, and physical layers, and also supports QoS-based flow control and retry mechanisms.

AHB Architecture :



AHB Arbitration:



A single-master system only requires a decoder and a multiplexor. A multi-master system requires an interconnect that provides arbitration and routes signals from the different masters to the appropriate slaves. This routing is required for the address, control, and write data signals. Different approaches are used for multi-master systems, such as single-layer or multi-layer interconnects.
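
As a rough illustration of what the decoder and multiplexor do in a single-master system, here is a small Python sketch. The address map (base addresses, sizes, and slave names) is invented for this example and is not taken from any specification, and the model ignores timing: a real read multiplexor is steered by the selection registered for the data phase, and unmapped addresses normally go to a default slave.

```python
# Hypothetical address map: (base, size, slave name). Values are illustrative only.
ADDRESS_MAP = [
    (0x0000_0000, 0x1000_0000, "flash"),
    (0x2000_0000, 0x0001_0000, "sram"),
    (0x4000_0000, 0x0000_1000, "apb_bridge"),
]


def decode(haddr):
    """Return {slave name: HSEL} with at most one HSEL set for the given HADDR."""
    return {name: base <= haddr < base + size for base, size, name in ADDRESS_MAP}


def read_mux(hsel, hrdata_by_slave):
    """Route HRDATA from the selected slave back to the master.

    Returns None when no slave is selected (this sketch has no default slave).
    """
    for name, selected in hsel.items():
        if selected:
            return hrdata_by_slave[name]
    return None
```

For example, `decode(0x2000_0010)` selects only the hypothetical "sram" slave, and `read_mux` then returns that slave's read data.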

The address phase of any transfer occurs during the data phase of the previous transfer. This overlapping of address and data is fundamental to the pipelined nature of the bus and enables high performance operation while still providing adequate time for a slave to provide the response to a transfer.
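
The overlap of address and data phases can be sketched with a small Python timeline model. This is only a simplified illustration: the function name and the cycle numbering are my own, and it assumes back-to-back transfers with the bus idle before the first one.

```python
def ahb_timeline(wait_states):
    """Build a cycle timeline for back-to-back AHB transfers.

    wait_states[i] is the number of cycles HREADY is low during the data
    phase of transfer i. For each transfer, the result gives the inclusive
    (first, last) cycle numbers of its address phase and its data phase.
    """
    timeline = []
    addr = (0, 0)  # first address phase lasts one cycle (bus assumed idle before)
    for waits in wait_states:
        data_first = addr[1] + 1          # data phase follows the address phase
        data = (data_first, data_first + waits)
        timeline.append({"addr": addr, "data": data})
        addr = data                       # next address phase overlaps this data phase
    return timeline
```

With `ahb_timeline([0, 1, 0])`, the address phase of the third transfer spans cycles 2 to 3, because it is stretched along with the second transfer's extended data phase.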

Bufferable: the transaction can be buffered on its way to the final destination, so its arrival there may be delayed (for example, when the final destination is a USB or network interface). If a master issues a write to a slave through a bus or fabric, the fabric may buffer the write and respond OKAY immediately. This means that the OKAY response does not come from the final destination; the fabric in turn forwards the write to the final destination. Bufferable writes are sometimes called posted writes (posted/non-posted are terms from the PCIe world), but the two differ: the issuer of a posted write gets no response, whereas the master of a bufferable write does get a response, just not from the final destination. A bufferable transaction is indicated by HPROT[2]: when it is '1', the transaction is bufferable.

Cacheable: the transaction that reaches the final destination need not match the transfer originally issued by the processor/master. Data from different transactions can also be combined (for example, to form a packet or frame). Not every AHB master supports this kind of transfer; where the attribute is provided, the slave can use it to decide how to handle the transfer.

AMBA AHB implements the features required for high-performance, high clock frequency systems including:

· Burst transfers.

· Single clock-edge operation.

· Non-tristate implementation.



· Wide data bus configurations, 64, 128, 256, 512, and 1024 bits.



Locked transfers: if the master requires locked accesses, it must also assert the HMASTLOCK signal. This signal indicates to any slave that the current transfer sequence is indivisible and must therefore be processed before any other transfers are processed. Typically, a locked transfer is used to maintain the integrity of a semaphore, by ensuring that the slave does not perform other operations between the read and write phases of a microprocessor SWP instruction.

Note: Many masters are not capable of generating accurate protection information. If a master is not capable of generating accurate protection information, the AHB5 specification recommends that:

· The master sets HPROT to 0b0011 to correspond to a Non-cacheable, Non-bufferable, privileged, data access.

· Slaves do not use HPROT unless absolutely necessary.
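
To see why 0b0011 corresponds to that description, the HPROT bits can be decoded with a short Python helper. This is a minimal sketch that assumes the original four-bit encoding (HPROT[0] data/opcode, HPROT[1] privileged/user, HPROT[2] bufferable, HPROT[3] cacheable); the function name and the dictionary keys are my own.

```python
def decode_hprot(hprot):
    """Decode a 4-bit HPROT value into its four protection attributes."""
    return {
        "access": "data" if hprot & 0b0001 else "opcode fetch",   # HPROT[0]
        "privilege": "privileged" if hprot & 0b0010 else "user",  # HPROT[1]
        "bufferable": bool(hprot & 0b0100),                       # HPROT[2]
        "cacheable": bool(hprot & 0b1000),                        # HPROT[3]
    }
```

Calling `decode_hprot(0b0011)` gives a privileged data access that is neither bufferable nor cacheable.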


Secure transfers - AHB5 defines the Secure_Transfers property. This property defines whether an interface supports the concept of Secure and Non-secure transfers. If this property is not defined then the interface does not support Secure transfers. An interface that supports Secure transfers has an additional signal, HNONSEC. This signal is asserted for a Non-secure transfer and deasserted for a Secure transfer. HNONSEC is an address phase signal and has the same validity constraints as HADDR.

Stable_Between_Clock - AHB5 defines the Stable_Between_Clock property. This property determines whether an interface guarantees that signals that are required to be stable remain stable between rising clock edges. If this property is True, it is guaranteed that signals that are required to be stable remain stable and glitch free between rising clock edges.

Exclusive_Transfers - AHB5 defines the Exclusive_Transfers property. This property defines whether an interface supports the concept of Exclusive Transfers. If this property is not defined then the interface does not support Exclusive Transfers. Exclusive Transfers provide a mechanism to support semaphore-type operations. An Exclusive access sequence is a sequence of Exclusive Transfers from a single master that operate using the following steps:

1. Perform an Exclusive Read transfer from an address.

2. Calculate a new data value to store to that address that is based on the data value obtained from the Exclusive Read.

3. Between the Exclusive Read and the Exclusive Write there can be other Non-exclusive transfers.

4. Perform an Exclusive Write transfer to the same address, with the new data value:

• If no other master has written to that location since the Exclusive Read transfer, the Exclusive Write transfer is successful and updates memory.

• If another master has written to that location since the Exclusive Read transfer, the Exclusive Write transfer fails and the memory location is not updated.

5. The response to the Exclusive Write transfer indicates if the transfer was successful or if it failed. This sequence ensures that the memory location is only updated if, at the point of the store to memory, the location still holds the same value that was used to calculate the new value to be written to the location.

If the Exclusive Write transfer fails, it is expected that the master will repeat the entire Exclusive access sequence.

It is IMPLEMENTATION DEFINED whether an update of the same, or overlapping, location by the same master after an Exclusive Read transfer will cause the associated Exclusive Write transfer to succeed or fail.

Note: An Exclusive Access Monitor is required to support an Exclusive access sequence, and this monitor must determine whether an Exclusive Write transfer succeeds or fails.
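
The exclusive access sequence and the role of the monitor can be sketched as a toy Python model. Everything here (the class, its method names, one reservation per master) is a hypothetical illustration, not how any particular monitor is built. For the IMPLEMENTATION DEFINED case, this sketch arbitrarily lets a non-exclusive write by the same master leave its reservation intact.

```python
class ExclusiveMonitor:
    """Toy exclusive access monitor in front of a simple memory (hypothetical model)."""

    def __init__(self):
        self.memory = {}
        self.reservations = {}  # master id -> address of its last Exclusive Read

    def exclusive_read(self, master, addr):
        """Step 1: read the location and record a reservation for this master."""
        self.reservations[master] = addr
        return self.memory.get(addr, 0)

    def write(self, master, addr, value):
        """Non-exclusive write: updates memory and clears other masters' reservations."""
        self.memory[addr] = value
        self._clear_others(master, addr)

    def exclusive_write(self, master, addr, value):
        """Step 4: update memory only if the master's reservation is still intact."""
        if self.reservations.get(master) != addr:
            return False  # another master wrote since the Exclusive Read
        self.memory[addr] = value
        del self.reservations[master]
        self._clear_others(master, addr)
        return True

    def _clear_others(self, master, addr):
        stale = [m for m, a in self.reservations.items() if a == addr and m != master]
        for other in stale:
            del self.reservations[other]


def atomic_increment(monitor, master, addr):
    """Repeat the entire exclusive sequence until the Exclusive Write succeeds."""
    while True:
        value = monitor.exclusive_read(master, addr)
        if monitor.exclusive_write(master, addr, value + 1):
            return value + 1
```

If a second master writes to the location between the Exclusive Read and the Exclusive Write, `exclusive_write` returns False and the memory keeps the second master's value, so the first master has to start again from the Exclusive Read.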

Atomicity:

Single-copy atomicity size: The single-copy atomicity size defines the number of data bytes that a transfer is guaranteed to update atomically.

Multi-copy atomicity: A system is defined as being multi-copy atomic if:

· Writes to the same location are observed in the same order by all agents.

· A write to a location that is observable by an agent, other than the issuer, is observable by all agents.

Multi-copy atomicity can be ensured by avoiding the use of forwarding buffers, which can make a transfer visible to some agents in a system, but not visible to all.

Note: Additional requirements exist to ensure multi-copy atomicity in systems that include some form of hardware cache coherency.



AHB Multi-Layered Bus:
A normal AHB bus has a significant disadvantage: only one master/slave pair can be active at a given time. If any master/slave pair is engaged in a transfer, all the other masters and slaves have no choice but to wait. The AHB multi-layered bus addresses this by moving the arbitration to the slave side. If, say, master M1 is interacting with slave S3, another master, say M2, can interact in parallel with another slave, say S5. In theory, all masters can operate in parallel as long as each one is interacting with a different slave.

HREADY as an output from the slave:

Ø HREADY is an output from the slave so that it can extend the data phase if it needs more time.

HREADY as an input to the slave:









Ø The arbiter grants the bus to a master on the last clock cycle of the previous master's data phase. It can do so because it keeps track of the number of transfers and knows the burst size and other characteristics.
Say an INCR4 burst from Master1 to Slave2 is in progress. The data phases take 4 clock cycles if HREADY from Slave2 is always high.
They take 5 clock cycles if, for example, Slave2 extends DP3 into two clock cycles, DP3_0 and DP3_1, with HREADY at '0' in DP3_0.
If HREADY were not an input to the slaves, another slave in the system, say Slave3, would see its HSEL go to '1' in DP3_0 and start responding in DP3_1, which would result in data corruption. Because HREADY is also available to Slave3, it knows that, even though its HSEL has gone to '1', the previous transfer has not finished, and it waits until HREADY is high before responding.
The recommended default value of HREADY is '1'.
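
The INCR4 example can be checked with a few lines of Python. This is a simplified sketch: the function names are hypothetical, the HREADY trace is a list with one sample per rising clock edge, and the data phases are assumed to be numbered DP1 to DP4.

```python
def data_phase_cycles(hready_trace, beats):
    """Count the clock cycles needed for `beats` data phases to complete.

    hready_trace holds the HREADY value (0 or 1) sampled at each rising clock
    edge; a data phase completes only in a cycle where HREADY is 1.
    """
    done = 0
    for cycle, hready in enumerate(hready_trace, start=1):
        if hready:
            done += 1
            if done == beats:
                return cycle
    raise ValueError("trace ended before the burst completed")


def address_accepted(hsel, hready_in):
    """A slave treats an address phase as valid only when it is selected AND
    the HREADY it receives as an input is high."""
    return bool(hsel and hready_in)
```

Here `data_phase_cycles([1, 1, 1, 1], 4)` gives 4 cycles and `data_phase_cycles([1, 1, 0, 1, 1], 4)` gives 5, while `address_accepted(1, 0)` shows why Slave3 holds off during DP3_0.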

When would a master sample 'hready'?

Ø Since 'hready' relates only to the data phase, the master samples 'hready' at the (N+1)th clock edge, where N is the clock edge at which it issued the data.

When a master is granted the bus and is performing a fixed-length burst, it is not necessary to continue to request the bus in order to complete the burst. The arbiter observes the progress of the burst and uses the HBURST[2:0] signals to determine how many transfers are required by the master. For undefined-length bursts, the master should continue to assert the request until it has started the last transfer, because the arbiter cannot predict when to change the arbitration at the end of an undefined-length burst.
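
How an arbiter might use HBURST[2:0] can be sketched in Python. The burst encodings in the table are the standard AHB ones; the function name, its arguments, and the decision rule are a simplified assumption for illustration, not a description of a real arbiter.

```python
# HBURST[2:0] encoding -> number of beats (None means undefined length).
HBURST_BEATS = {
    0b000: 1,     # SINGLE
    0b001: None,  # INCR (undefined length)
    0b010: 4,     # WRAP4
    0b011: 4,     # INCR4
    0b100: 8,     # WRAP8
    0b101: 8,     # INCR8
    0b110: 16,    # WRAP16
    0b111: 16,    # INCR16
}


def arbiter_can_regrant(hburst, beats_started, hbusreq):
    """Decide whether the arbiter may hand the bus to another master.

    Fixed-length burst: the arbiter counts beats and needs no request signal.
    Undefined-length burst: it relies on the master dropping its request
    once the last transfer has started.
    """
    beats = HBURST_BEATS[hburst]
    if beats is None:
        return not hbusreq
    return beats_started >= beats
```

For an INCR4 burst the request line is ignored and the decision depends only on the beat count, whereas for INCR the decision depends only on whether the master is still requesting the bus.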

Ø An AHB slave should not pull 'HREADY' low in response to an address phase with 'HTRANS' = NONSEQ, i.e. HTRANS = 2. This would be a protocol violation. By the spec, "The address cannot be extended, therefore all slaves must sample the address during this time." However, an address phase can be extended as a side effect if it coincides with the data phase of the previous transfer. Since HREADY indicates that a transfer is complete, it may then follow that a slave should never pull HREADY low while HTRANS is 0 (IDLE), because this would mean that the slave is trying to extend a NOP.

Ø Although the AHB spec itself places no restriction on a combinatorial path from the issuing master back to the same master, such a path should be avoided.

Ø A slave should not pull 'hready' low during an address phase if that address phase does not overlap a pre-existing data phase. If there is an overlap, then technically it is the data phase that is being extended, with the address phase extended as a side effect. When all masters in a system are IDLE, a slave pulling its hready low has no effect on any of the system masters; it is simply ignored. In fact, if there is an arbiter between master and slave, this low hready is not passed on to any master either.



AHB Multilayer:
Consider an AHB arbiter that connects M masters and S slaves. This arbiter, or AHB bus, arbitrates between the M masters and grants the bus to one of them, say Master 2 (M2). M2 then interacts with some slave, say S3.
While M2 is interacting with S3, no other master in the system can do anything. This is a lost opportunity, because there is no reason why, say, M4 could not interact with S1 at the same time. With a single-layered bus this is not possible. A multi-layered bus or arbiter enables exactly that, so that any master can interact with any slave as long as each master/slave pair is unique.
For example, while M2 is interacting with S3, M4 can interact with S2 and M1 can interact with S1 simultaneously.
How is this possible?
It is possible when the arbitration takes place at the slave port. Each slave has, say, M interfaces, so that M masters can connect to it. There is now a path from each master in the system to each slave, so simultaneous accesses can take place if the master/slave pairs are unique.

This multi-layer system is essentially AHB-Lite: there are no bus requests and no bus grants, because conceptually there are multiple parallel buses. In fact, there are as many buses in the system as there are slaves.
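
Arbitration at the slave port can be sketched in Python as follows. This is a minimal model under my own assumptions: each master requests at most one slave per cycle, and every slave port uses the same fixed priority order (the function and argument names are hypothetical; real interconnects may use other schemes such as round-robin).

```python
def arbitrate_multilayer(requests, priority):
    """Resolve one cycle of requests on a multi-layer interconnect.

    requests: {master: slave it wants to access}
    priority: list of masters, highest priority first
    Returns (granted, stalled): granted maps each slave to the master that
    won its port; stalled lists masters that must wait for a busy slave.
    """
    granted = {}
    stalled = []
    for master in priority:
        if master not in requests:
            continue
        slave = requests[master]
        if slave in granted:
            stalled.append(master)   # slave port already taken this cycle
        else:
            granted[slave] = master  # unique master/slave pair proceeds
    return granted, stalled
```

With requests M2 to S3, M4 to S2, and M1 to S1, all three are granted in the same cycle. If M1 and M2 both target S1, only the higher-priority master proceeds and the other one stalls.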




References: AHB5, AHB-Lite, and AXI specifications



----------------------------------------------------Happy Learning--------------------------------------








