AMBA Bus Architecture & Protocol Understanding - Part#1

 

🚀 AMBA Protocol Evolution and SoC Design Trends

📌 AMBA Protocol Evolution

The following diagram illustrates the evolution of the AMBA protocols alongside SoC design trends in the industry.

AMBA Protocol

🔹 AMBA 1 Specification (First version)

Defines two buses/interfaces:

  • Advanced System Bus (ASB)

  • Advanced Peripheral Bus (APB)

🔹 AMBA 2 Specification

Defines three buses/interfaces:

  • Advanced High-performance Bus (AHB) – widely used on ARM7, ARM9, and ARM Cortex-M designs

  • Advanced System Bus (ASB)

  • Advanced Peripheral Bus (APB2 or APB)

🔹 AMBA 3 Specification

Defines four buses/interfaces:

  • Advanced eXtensible Interface (AXI3 or AXI v1.0) – widely used on ARM Cortex-A processors including Cortex-A9

  • AHB-Lite v1.0

  • APB3 v1.0

  • Advanced Trace Bus (ATB v1.0)

🔹 AMBA 4 Specification

Defines:

  • AXI Coherency Extensions (ACE) – widely used on the latest ARM Cortex-A processors including Cortex-A7 and Cortex-A15

  • ACE-Lite

  • AXI4, AXI4-Lite, AXI4-Stream v1.0

  • ATB v1.1

  • APB4 v2.0

🔹 AMBA 5 Specification

Defines:

  • AXI5, AXI5-Lite, and ACE5

  • AHB5, AHB-Lite

  • CHI (Coherent Hub Interface)

  • Distributed Translation Interface (DTI)

  • Generic Flash Bus (GFB)


💡 Protocol Comparison

  • APB: Used for connecting low bandwidth peripherals; non-pipelined; no burst data transfer

  • AHB: Used for high-bandwidth peripherals; supports burst data

  • AHB-Lite: Single master support, no arbitration, retry, or split

  • AXI: High bandwidth, low latency, pipelined; supports multiple outstanding transactions and independent read/write channels. The Advanced eXtensible Interface (AXI) is useful for high-bandwidth, low-latency interconnects. It is a point-to-point interconnect, so it overcomes the limit that a shared bus protocol places on the number of agents that can be connected. The protocol was also an enhancement over AHB in supporting multiple outstanding (pipelined) data transfers, burst data transfers, separate read and write paths, and different bus widths.

  • AXI-Lite: No burst data transfer

  • AXI-Stream: Streaming of data in one direction only, from master to slave

  • ACE: The AXI Coherency Extensions protocol extends the AXI4 protocol. It evolved in the era when multiple CPU cores with coherent caches were being integrated on a single chip.

  • CHI (Coherent Hub Interface): The ACE protocol was developed as an extension to AXI to support coherent interconnects. ACE uses signal-level communication between master and slave, so interconnects need a large number of wires, with added channels for snoops and responses. CHI instead uses layered, packet-based communication with protocol, link, and physical layers, and it also supports QoS-based flow control and retry mechanisms.


🧱 AHB Architecture

AHB 

AHB Arbitration

  • Single Master System: Requires only a decoder and multiplexer

  • Multi-Master System: A multi-master system requires an interconnect that provides arbitration and routes signals from the different masters to the appropriate slaves. This routing is required for address, control, and write data signaling. Different approaches are used for multi-master systems, such as single-layer or multi-layer interconnects.
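
As a rough illustration of the single-master case, the decoder and the read-data multiplexer can be modeled in a few lines of Python. The memory map, the slave names, and the function names below are hypothetical and chosen only for this sketch; a real design would implement this logic in RTL.

```python
# Hypothetical memory map: (base address, size in bytes, slave name).
MEMORY_MAP = [
    (0x0000_0000, 0x1000, "rom"),
    (0x2000_0000, 0x1000, "sram"),
    (0x4000_0000, 0x1000, "apb_bridge"),
]


def decode(haddr):
    """Decoder: return the slave whose select line would be asserted,
    or None when the address hits no slave."""
    for base, size, name in MEMORY_MAP:
        if base <= haddr < base + size:
            return name
    return None


def read_mux(selected, hrdata_by_slave):
    """Multiplexer: route the read data of the selected slave back to the
    master. An unmapped access returns 0 in this sketch."""
    if selected is None:
        return 0
    return hrdata_by_slave[selected]


# Each slave drives its own read data; the mux picks one of them.
hrdata = {"rom": 0x11, "sram": 0x22, "apb_bridge": 0x33}
selected = decode(0x2000_0010)
value = read_mux(selected, hrdata)
```

Here the address 0x2000_0010 falls in the assumed SRAM range, so only that slave's read data reaches the master.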


Pipelined Transfers

  • Address phase overlaps with data phase of previous transfer – crucial for performance.

This overlapping of address and data is fundamental to the pipelined nature of the bus and enables high performance operation while still providing adequate time for a slave to provide the response to a transfer.
  • Bufferable Transactions: 

    • Can be posted; bus/fabric may respond before actual destination

    • It specifies that delivery of the current transfer to its final destination can be delayed. For example, if the processor writes to a USB or network device, the write can reach that device later, through the appropriate transfers.
    • A bufferable transaction can be buffered. For example, if a master issues a write to a slave via a bus or fabric, the bus/fabric may buffer the transaction and respond OKAY immediately. The OKAY response therefore does not come from the final destination; the bus/fabric in turn forwards the write to the final destination. Bufferable writes are sometimes called posted writes, although the terms are not identical: a posted write is one for which the issuer gets no response, whereas for a bufferable write the master does get a response, just not from the final destination. Posted/non-posted are terms from the PCIe world.
    • A bufferable transaction is indicated by HPROT[2]: when it is '1', the transaction is bufferable.

  • Cacheable Transactions:

    • Data at destination may differ from original or be merged across transactions
    • It specifies that the transaction, when it reaches the final destination, need not match the transfer originally issued by the processor/master. Data from different transactions can also be combined, for example to form a packet or frame. Not all AHB masters support this kind of transfer; a slave can make use of the attribute where appropriate.
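
To make the bufferable (posted) write behavior more concrete, here is a minimal Python sketch of a fabric that answers bufferable writes early. The class `BufferingFabric` and its methods are hypothetical names for this example. The choice to deliver older buffered writes before a non-bufferable one is also an assumption of the sketch, not something taken from the specification.

```python
from collections import deque


class BufferingFabric:
    """Toy fabric that accepts bufferable writes early (hypothetical model)."""

    def __init__(self):
        self.pending = deque()  # writes accepted but not yet delivered
        self.memory = {}        # stands in for the final destination

    def write(self, addr, data, hprot):
        bufferable = bool(hprot & 0b0100)  # HPROT[2] = 1 means bufferable
        if bufferable:
            # Respond now; the final destination is updated later.
            self.pending.append((addr, data))
        else:
            # Non-bufferable: respond only after the destination is updated.
            # Assumption: older buffered writes are delivered first.
            self.drain()
            self.memory[addr] = data
        return "OKAY"

    def drain(self):
        """Forward every buffered write to the final destination."""
        while self.pending:
            addr, data = self.pending.popleft()
            self.memory[addr] = data


fabric = BufferingFabric()
response = fabric.write(0x100, 0xAB, hprot=0b0111)
visible_before_drain = 0x100 in fabric.memory
fabric.drain()
visible_after_drain = fabric.memory.get(0x100)
```

The master sees the response before the data is visible at the destination, which is exactly the point of a bufferable write.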

Understanding AMBA AHB and Its Key Features in High-Performance Systems

AMBA AHB (Advanced High-performance Bus) is a backbone interconnect protocol designed for high-frequency, high-throughput systems. It is widely used in SoC (System on Chip) architectures. In this blog post, we’ll walk through its critical features, advanced signaling, and how AHB5 elevates communication efficiency with secure, atomic, and exclusive transactions.


Key Features of AMBA AHB

AMBA AHB supports the following advanced features to cater to high-performance requirements:

  • Burst Transfers – Efficient data transfer in groups to reduce overhead.

  • Single Clock-edge Operation – Simplifies timing and synchronization.

  • Non-tristate Implementation – Simplifies hardware design by using muxes instead of tristate buses.

  • Wide Data Bus Configurations – Supports 64, 128, 256, 512, and 1024-bit data buses.


Locked Transfers

When a master needs to ensure that a sequence of operations is indivisible (e.g., to maintain semaphore integrity), it uses locked transfers. This is done by asserting the HMASTLOCK signal. For example, in a read-modify-write sequence, asserting HMASTLOCK ensures that no other operations intervene.
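
A toy arbiter can show the effect of HMASTLOCK on bus ownership. This is only a sketch under simplifying assumptions: the function `next_owner`, the master names, and the fixed-priority request list are hypothetical, and the model assumes the locked master follows its sequence with one idle transfer in which HMASTLOCK is deasserted.

```python
def next_owner(current_owner, requests, hmastlock):
    """Pick the bus owner for the next transfer.

    current_owner: master that owns the bus now
    requests: requesting masters, highest priority first
    hmastlock: True while the current owner asserts HMASTLOCK
    """
    if hmastlock:
        # A locked sequence is indivisible: keep the current owner.
        return current_owner
    return requests[0] if requests else current_owner


# "cpu" performs a locked read-modify-write while "dma" keeps requesting
# the bus with higher priority.
owners = []
owner = "cpu"
for transfer, lock in [("read", True), ("write", True), ("idle", False)]:
    owners.append((transfer, owner))
    owner = next_owner(owner, ["dma", "cpu"], hmastlock=lock)
```

Although "dma" has higher priority in this example, it only gets the bus after the locked read and write have both completed.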

⚠️ Note: Many masters can't generate accurate protection info. In such cases, the AHB5 specification suggests:

  • Set HPROT to 0b0011 (Non-cacheable, Non-bufferable, Privileged, Data access).

  • Slaves should use HPROT only when absolutely necessary.
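
For reference, the recommended value can be decoded bit by bit. The sketch below assumes the basic 4-bit HPROT encoding; the function name `decode_hprot` and the dictionary keys are made up for illustration.

```python
def decode_hprot(hprot):
    """Split a 4-bit HPROT value into its protection attributes."""
    return {
        "data_access": bool(hprot & 0b0001),  # HPROT[0]: 1 = data, 0 = opcode fetch
        "privileged": bool(hprot & 0b0010),   # HPROT[1]: 1 = privileged, 0 = user
        "bufferable": bool(hprot & 0b0100),   # HPROT[2]: 1 = bufferable
        "cacheable": bool(hprot & 0b1000),    # HPROT[3]: 1 = cacheable
    }


DEFAULT_HPROT = 0b0011  # value recommended when a master has no accurate info
attrs = decode_hprot(DEFAULT_HPROT)
```

Decoding 0b0011 gives a privileged data access that is neither bufferable nor cacheable, matching the description above.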


Secure Transfers (HNONSEC)

AHB5 defines the Secure_Transfers property. This property defines whether an interface supports the concept of Secure and Non-secure transfers. If this property is not defined then the interface does not support Secure transfers. An interface that supports Secure transfers has an additional signal, HNONSEC. This signal is asserted for a Non-secure transfer and deasserted for a Secure transfer. HNONSEC is an address phase signal and has the same validity constraints as HADDR.

In other words, AHB5 introduces support for Secure and Non-secure transfers using the HNONSEC signal. If HNONSEC is high, it is a Non-secure transfer; if low, it is Secure. This signal is valid during the address phase and complements security-aware designs.
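
One way a security-aware slave or interconnect might use HNONSEC is to filter accesses by address range. The sketch below is an assumption-heavy illustration: the secure address range, the function name `access_allowed`, and the policy itself (Non-secure transfers may not touch secure regions) are hypothetical and not defined by the AHB5 specification.

```python
# Hypothetical secure-only address range (inclusive bounds).
SECURE_REGIONS = [(0x1000_0000, 0x1000_FFFF)]


def access_allowed(haddr, hnonsec):
    """Return True when the access may proceed under the example policy.

    hnonsec: 1 for a Non-secure transfer, 0 for a Secure transfer.
    Assumed policy: Non-secure transfers may not touch secure regions.
    """
    in_secure_region = any(lo <= haddr <= hi for lo, hi in SECURE_REGIONS)
    return not (in_secure_region and hnonsec == 1)


secure_ok = access_allowed(0x1000_0040, hnonsec=0)
nonsecure_blocked = access_allowed(0x1000_0040, hnonsec=1)
```

Because HNONSEC has the same validity constraints as HADDR, a check like this can be made in the address phase together with the address decode.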


Signal Stability: Stable_Between_Clock

If this property is set to True, it guarantees that critical signals remain glitch-free and stable between rising clock edges. This ensures predictable and reliable data transfers.


Exclusive Transfers (Read-Modify-Write)

Exclusive transfers allow a master to read a value, compute a new one, and write it back—but only if no other master modified that location in the meantime. Here's how it works:

  1. Master performs Exclusive Read.

  2. Calculates new value.

  3. May perform non-exclusive operations.

  4. Performs Exclusive Write:

    • Success: If no other master changed the data.

    • Fail: If data was modified by another master.

  5. If the Exclusive Write transfer fails, it is expected that the master will repeat the entire Exclusive access sequence.
  6. It is IMPLEMENTATION DEFINED whether an update of the same, or overlapping, location by the same master after an Exclusive Read transfer will cause the associated Exclusive Write transfer to succeed or fail.

The system uses an Exclusive Access Monitor to track these sequences and determine success or failure.

📌 Exclusive access ensures atomic updates in multi-master systems.

📌 Note: An Exclusive Access Monitor is required to support an Exclusive access sequence, and this monitor must determine whether an Exclusive Write transfer succeeds or fails.
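
The exclusive access sequence can be sketched with a small software model of the monitor. This is a simplified illustration, not the monitor from any real design: the class `ExclusiveMonitor`, its one-reservation-per-master bookkeeping, and the helper `atomic_increment` are hypothetical. For the IMPLEMENTATION DEFINED case, the sketch assumes that a write by the same master leaves its own reservation intact.

```python
class ExclusiveMonitor:
    """Toy Exclusive Access Monitor tracking one address per master."""

    def __init__(self):
        self.reservations = {}  # master id -> reserved address
        self.memory = {}

    def exclusive_read(self, master, addr):
        self.reservations[master] = addr
        return self.memory.get(addr, 0)

    def write(self, master, addr, data):
        """Ordinary write: clears other masters' reservations on addr."""
        self.memory[addr] = data
        for other, reserved in list(self.reservations.items()):
            # Assumption: a master's own reservation survives its own write.
            if other != master and reserved == addr:
                del self.reservations[other]

    def exclusive_write(self, master, addr, data):
        """Return True on success, False on failure (memory not updated)."""
        if self.reservations.get(master) != addr:
            return False
        del self.reservations[master]
        self.write(master, addr, data)
        return True


def atomic_increment(monitor, master, addr, interfere=None):
    """Repeat the whole exclusive sequence until the Exclusive Write succeeds."""
    attempts = 0
    while True:
        attempts += 1
        value = monitor.exclusive_read(master, addr)
        if interfere and attempts == 1:
            interfere()  # another master writes between the read and the write
        if monitor.exclusive_write(master, addr, value + 1):
            return attempts


mon = ExclusiveMonitor()
mon.write("m1", 0x40, 10)
attempts = atomic_increment(
    mon, "m0", 0x40, interfere=lambda: mon.write("m1", 0x40, 20)
)
```

In this run the first Exclusive Write fails because master "m1" modified the location in between, so "m0" repeats the sequence and succeeds on the second attempt.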


Atomicity

  • Single-Copy Atomicity: Guarantees that a defined number of bytes are updated atomically.

  • Multi-Copy Atomicity: 

    • Writes to the same location are observed in the same order by all agents.
    • A write to a location that is observable by an agent other than the issuer is observable by all agents.
    • It can be ensured by avoiding forwarding buffers, which can make a transfer visible to some agents in a system but not to all, and by using hardware coherency protocols.

📌 Note: Additional requirements exist to ensure multi-copy atomicity in systems that include some form of hardware cache coherency.

AHB Multilayer Bus Architecture

Why Move Beyond a Single Bus?

In traditional AHB architecture, only one master/slave pair can be active at a time. Others must wait, which leads to performance bottlenecks.

Solution: AHB Multilayered Bus

In a multilayered bus system:

  • Arbitration moves from a single central arbiter to the individual slave ports.

  • Multiple masters can access different slaves simultaneously.

For example:

  • M1 ↔ S3

  • M2 ↔ S5

Both transfers can happen at the same time, improving throughput and system efficiency.

✅ As long as no two masters target the same slave, all master-slave pairs can operate concurrently.
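
The idea of per-slave arbitration can be sketched as follows. The function `arbitrate` and the fixed-priority scheme are assumptions made for this example; a real multilayer interconnect may use a different arbitration policy.

```python
def arbitrate(requests, priority):
    """Per-slave arbitration for one cycle.

    requests: dict mapping master -> slave it wants to access
    priority: masters listed highest priority first (assumed fixed priority)
    Returns (granted, waiting), where granted maps slave -> master.
    """
    granted = {}
    waiting = []
    for master in priority:
        slave = requests.get(master)
        if slave is None:
            continue  # this master is not requesting anything
        if slave in granted:
            waiting.append(master)  # slave port already taken this cycle
        else:
            granted[slave] = master
    return granted, waiting


priority = ["M1", "M2", "M3"]
parallel = arbitrate({"M1": "S3", "M2": "S5"}, priority)
conflict = arbitrate({"M1": "S3", "M2": "S3"}, priority)
```

With different target slaves both masters are granted in the same cycle; when both target S3, only the higher-priority master proceeds and the other waits.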


Role of HREADY Signal

HREADY plays a vital role in controlling data transfer timing.

From Slave (Output)

  • Used by the slave to extend data phase if it needs more time.

  • If a slave pulls HREADY low, it indicates "not ready", delaying the next transfer.

To Slave (Input)

  • Prevents premature activation of other slaves.

  • Ensures no data corruption due to overlapping data/address phases.


Key Notes on HREADY Behavior:

  • For a transfer whose address phase is accepted at clock edge N, the master samples HREADY at clock edge N+1, during the data phase.

  • Slave must not pull HREADY low during the NONSEQ address phase (HTRANS = 0b10). This would be a protocol violation.

  • Slaves must be careful during overlaps between address and data phases. Improper handling can lead to data corruption.
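
To see how HREADY wait states interact with the overlapped address and data phases, the following sketch computes a cycle-by-cycle schedule for back-to-back transfers. The function `schedule` and its cycle numbering are assumptions for illustration: cycle 0 is the first address phase, and each wait state is one cycle in which the slave holds HREADY low.

```python
def schedule(wait_states):
    """Return (addr_start, data_start, data_end) cycle numbers per transfer.

    wait_states[i] is the number of cycles the slave holds HREADY low
    during the data phase of transfer i.
    """
    timeline = []
    for i, waits in enumerate(wait_states):
        if i == 0:
            addr_start = 0  # first address phase, nothing before it
            data_start = 1
        else:
            _, prev_data_start, prev_data_end = timeline[-1]
            addr_start = prev_data_start    # overlaps the previous data phase
            data_start = prev_data_end + 1  # starts once HREADY was sampled high
        data_end = data_start + waits       # each wait state adds one cycle
        timeline.append((addr_start, data_start, data_end))
    return timeline


timeline = schedule([0, 1, 0])
total_cycles = timeline[-1][2] + 1
```

In this example the second transfer has one wait state, so the third transfer's address phase is stretched over two cycles. All three transfers finish in 5 cycles, where a non-overlapped scheme with the same wait states would need 7.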


Understanding Arbitration and Burst Transfers

  • Arbitration takes place on the last cycle of the current data phase.

  • For fixed-length bursts (e.g., INCR4), the arbiter predicts and grants based on burst info.

  • For undefined bursts, the master continues to assert request until it starts the final transfer.
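
How an arbiter could know when a fixed-length burst is about to end can be sketched from the HBURST encoding. The table below uses the standard HBURST[2:0] values; the helper `beats_remaining` is a hypothetical name for this example.

```python
# HBURST[2:0] encodings and the number of beats each one implies.
# None marks INCR, the undefined-length burst.
HBURST_BEATS = {
    0b000: ("SINGLE", 1),
    0b001: ("INCR", None),
    0b010: ("WRAP4", 4),
    0b011: ("INCR4", 4),
    0b100: ("WRAP8", 8),
    0b101: ("INCR8", 8),
    0b110: ("WRAP16", 16),
    0b111: ("INCR16", 16),
}


def beats_remaining(hburst, beats_done):
    """Beats still to come, or None when the length is not known in advance."""
    _name, total = HBURST_BEATS[hburst]
    if total is None:
        return None
    return total - beats_done


last_beat_next = beats_remaining(0b011, beats_done=3) == 1
unknown_length = beats_remaining(0b001, beats_done=5)
```

For INCR4 the arbiter can count beats and prepare the next grant before the final transfer, while for an undefined-length INCR burst it has to rely on the master's request signal instead.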

🛑 Avoid combinational paths from a master back to the same master through the arbiter. Such paths can cause timing violations or unstable behavior.


Summary: Why AHB Multilayer Matters

In a traditional bus:

  • M2 ↔ S3 means other masters wait.

With AHB Multilayer:

  • M1 ↔ S1, M2 ↔ S3, M4 ↔ S2 → all can operate simultaneously.

This is possible because:

  • Arbitration happens at the slave port.

  • Each slave can support multiple master ports.

  • Conceptually, it's like having parallel AHB-Lite buses (one per slave).


Final Thoughts

AHB5 and the multilayer bus architecture significantly enhance system performance, scalability, and reliability. With features like secure and exclusive transfers, atomicity guarantees, and intelligent arbitration, AMBA AHB continues to be a powerful choice in complex SoC designs.


📚 References

  • AHB5, AHB-Lite, AXI Specification


✅ -------------------------------------Happy Learning--------------------------------------------

