| Component | Source File | Engineering Context |
|---|---|---|
| 🛠️ Engineering Log | ENGINEERING_LOG.md | Technical Journal. Detailed breakdown of the 2M-cycle regression results, the "Ghost Data" testbench fix, and the "Snoop & Serialize" hazard architecture. |
| 🧠 The RTL Core | rtl/axi4l_slave_bridge.sv | SystemVerilog implementation of the Hazard-Stalled FSM, featuring lookahead output registration to eliminate combinational paths strictly as per AXI4-Lite specifications. |
| ⚡ The Stress Test | tb/tb_axi4l_slave_ram_top.sv | Multi-threaded testbench engine capable of generating Back-to-Back (Zero-Delay) and randomized protocol traffic. |
| 🔍 The Auditor | scripts/integration_test_checker.py | Automated Python regression script that parses 2M+ transaction logs to verify data coherency and protocol compliance. |
This repository hosts a robust AXI4-Lite Slave implementation and a Directed Random Verification environment designed to validate protocol compliance under extreme stress.
The framework drives the design through 2,000,000 randomized transactions to validate the Hazard-Stall logic, which strictly serializes simultaneous Read/Write requests to the same address, preventing Port Collisions and ensuring Data Coherency.
The core is an AXI4-Lite Slave bridge, optimized for high-frequency operation, that interfaces with a Dual-Port RAM.
%%{
init: {
"flowchart": {
"curve": "basis",
"padding":4,
"nodeSpacing": 100,
"rankSpacing": 25
}
}
}%%
flowchart LR
subgraph MASTER_GRP [**Verification**]
TB[**Master / Test Bench**]
end
AXI_BUS[**AXI4-Lite Bus if**]
subgraph WRAPPER_GRP [**DUT**]
direction LR
subgraph BRIDGE_GRP [**AXI4-Lite Slave Bridge**]
direction TB
WFSM[**Write FSM**]
RFSM[**Read FSM**]
end
RAM_BUS[**RAM Bus if**]
subgraph RAM_GRP [**Backend**]
RAM[**Dual Port RAM**]
end
end
PHANTOM[ ]
WFSM -- Hazard Signal --> RFSM
TB ==> AXI_BUS
AXI_BUS ==> WFSM
AXI_BUS ==> RFSM
WFSM ==> RAM_BUS
RFSM ==> RAM_BUS
RAM_BUS ==> RAM
RAM_GRP ~~~ PHANTOM
classDef master fill:#2d1b4e,stroke:#bd93f9,stroke-width:2px,color:#fff
classDef axi fill:#003f5c,stroke:#4facfe,stroke-width:2px,color:#fff
classDef logic fill:#0d1117,stroke:#58a6ff,stroke-width:2px,color:#fff
classDef rambus fill:#4a1e00,stroke:#ffa726,stroke-width:2px,color:#fff
classDef ram fill:#3e2723,stroke:#ffcc80,stroke-width:2px,color:#fff
classDef container fill:none,stroke:#ffffff,stroke-width:2px,stroke-dasharray: 5 5,color:#fff
classDef container2 fill:none,stroke:#ffffff,stroke-width:1px,stroke-dasharray: 5 5,color:#fff
classDef phantom fill:none,stroke:none
class TB master
class AXI_BUS axi
class WFSM,RFSM logic
class RAM_BUS rambus
class RAM ram
class MASTER_GRP container
class WRAPPER_GRP container
class BRIDGE_GRP container2
class RAM_GRP container2
class PHANTOM phantom
linkStyle 1,2,3 stroke:#4facfe,stroke-width:3px,color:#fff
linkStyle 4,5,6 stroke:#ffa726,stroke-width:3px,color:#fff
linkStyle 0 stroke:#ff5555,stroke-width:2px,color:#ffffff
Key Design Characteristics:
- Strict AMBA® AXI4-Lite™ Compliance: The architecture adheres to the AMBA® AXI and ACE Protocol Specification (ARM IHI 0022E).
- Lookahead Registered FSM Outputs: The AXI specification prohibits combinational paths between Slave Inputs and Slave Outputs. To satisfy this, control signals are calculated from next-state logic and driven directly from flip-flops, so registering the outputs adds no extra cycle of latency. This prevents glitchy combinational loops and ensures a clean, protocol-compliant interface.
- Blocking Topology: The design operates as a Blocking FSM, completing one transaction (including its response) before accepting a new request, which ensures deterministic behavior.
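The lookahead idea can be illustrated with a small behavioral model. This is a Python sketch, not the RTL: the state names are borrowed from the Read FSM diagram, the stall state is left out for brevity, and `LookaheadReadFsm`, `next_state`, and `clock` are hypothetical names chosen for this example.

```python
from enum import Enum, auto

class RdState(Enum):
    IDLE = auto()
    LATENCY = auto()
    RESP = auto()

def next_state(state, arvalid, arready, rready):
    """Pure next-state logic (the combinational part in hardware)."""
    if state is RdState.IDLE:
        return RdState.LATENCY if (arvalid and arready) else RdState.IDLE
    if state is RdState.LATENCY:
        return RdState.RESP
    # RESP: hold the response until the master accepts it
    return RdState.IDLE if rready else RdState.RESP

class LookaheadReadFsm:
    """Outputs are registers loaded from the *next* state on each clock edge."""

    def __init__(self):
        self.state = RdState.IDLE
        self.arready = False  # registered output, de-asserted as under reset
        self.rvalid = False   # registered output

    def clock(self, arvalid, rready):
        nxt = next_state(self.state, arvalid, self.arready, rready)
        # Lookahead: decode the outputs from nxt, so they update on the same
        # edge as the state register and never depend on inputs directly.
        self.arready = nxt is RdState.IDLE
        self.rvalid = nxt is RdState.RESP
        self.state = nxt
```

Because the outputs are decoded from `nxt` instead of from `state`, `rvalid` rises on the same edge that moves the FSM into the response state, which is the "registered without added latency" property. The blocking behavior also shows up here: `arready` stays low from acceptance until the response handshake completes.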
Key Feature: Hazard-Stalled FSM (Collision Avoidance)
To prevent Port Collisions and ensure data coherency during simultaneous Read/Write operations to the same address (a known hazard in Dual-Port RAMs), the Read FSM implements a "Snoop & Stall" mechanism:
- Snoop: The Read FSM snoops the Write FSM's `write_active` status and the registered Write Address.
- Pre-Collision Stall: If a potential hazard is identified (Read Addr == Active Write Addr), the Read FSM immediately transitions to the `S_RD_STALL` state before the collision can occur, parking the read request until the write commits.
- Result: Guarantees deterministic behavior by strictly serializing conflicting requests (Write-Priority).
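The stall decision can be sketched as a small predicate. This is a Python model and not the bridge's RTL; the function and parameter names are hypothetical. The `write_waiting_for_addr` term is one reading of the "OR Wait_Addr?" label in the Read FSM diagram (write data has arrived, but no address is available to compare against) and should be treated as an assumption.

```python
def read_must_stall(rd_addr, write_active, wr_addr, write_waiting_for_addr=False):
    """Return True when the read should park in S_RD_STALL."""
    # Case 1: a write is in flight to the same address.
    same_addr_hazard = write_active and (rd_addr == wr_addr)
    # Case 2 (assumed): the write address is not known yet, so the
    # comparison is impossible and the read stalls conservatively.
    return same_addr_hazard or write_waiting_for_addr

def stall_cycles(rd_addr, write_trace):
    """Count the cycles a read spends stalled.

    write_trace: list of (write_active, wr_addr) tuples, one per clock cycle,
    starting with the cycle in which the read request is accepted.
    """
    it = iter(write_trace)
    write_active, wr_addr = next(it)
    if not read_must_stall(rd_addr, write_active, wr_addr):
        return 0
    cycles = 0
    for write_active, _ in it:
        cycles += 1
        if not write_active:  # the "Safe Now? !write_active" exit condition
            break
    return cycles
```

For example, a read to `0x10` that arrives while a write to `0x10` stays active for one more cycle spends two cycles in the stall state: one while the write is still active, and one in which it observes `write_active` low and leaves.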
graph TB
classDef state fill:#0d1117,stroke:#58a6ff,stroke-width:2px,color:#fff,rx:5,ry:5;
classDef decision fill:#21262d,stroke:#ff7b72,stroke-width:2px,color:#fff,shape:diamond;
classDef hazard fill:#330000,stroke:#ff5555,stroke-width:2px,color:#fff;
classDef action fill:#1f6feb,stroke:#fff,stroke-width:1px,color:#fff;
subgraph Read_FSM [**Read Channel FSM**]
direction TB
R_START((Reset))
R_IDLE["S_RD_IDLE<br/>Waiting for ARVALID"]
R_STALL["S_RD_STALL<br/>Wait for Write FSM"]
R_LAT["S_RD_LATENCY<br/>RAM Read Enable"]
R_RESP["S_RD_RESP<br/>Drive RVALID"]
R_DEC_REQ{{"Request?<br/>arvalid && ready"}}
R_DEC_COLL{{"HAZARD CHECK<br/>Active & Addr Match<br/>OR Wait_Addr?"}}
R_DEC_UNSTALL{{"Safe Now?<br/>!write_active"}}
R_DEC_RRDY{{"Master Ready?<br/>rready"}}
R_START --> R_IDLE
R_IDLE --> R_DEC_REQ
R_DEC_REQ -- No --> R_IDLE
R_DEC_REQ -- Yes --> R_DEC_COLL
R_DEC_COLL -- Collision Detected --> R_STALL
R_DEC_COLL -- Safe --> R_LAT
R_STALL --> R_DEC_UNSTALL
R_DEC_UNSTALL -- No (Still Active) --> R_STALL
R_DEC_UNSTALL -- Yes (Safe) --> R_LAT
R_LAT --> R_RESP
R_RESP --> R_DEC_RRDY
R_DEC_RRDY -- No --> R_RESP
R_DEC_RRDY -- Yes --> R_IDLE
class R_IDLE,R_STALL,R_LAT,R_RESP state
class R_DEC_REQ,R_DEC_UNSTALL,R_DEC_RRDY decision
class R_DEC_COLL hazard
class R_START action
end
subgraph Write_FSM [**Write Channel FSM**]
direction TB
W_START((Reset))
W_IDLE[S_WR_IDLE<br/>Ready for Req]
W_WAIT_D[S_WR_WAIT_DATA<br/>Latch Addr]
W_WAIT_A[S_WR_WAIT_ADDR<br/>Latch Data]
W_PRE[S_WR_PRE_EXE<br/>Collision Check Cycle]
W_EXE[S_WR_EXECUTE<br/>Drive RAM Enable]
W_BRESP[S_WR_BRESP<br/>Send Response]
W_DEC_ARR{{"Input Arrival?<br/>{aw_fire, w_fire}"}}
W_DEC_WD{{"Data Valid?<br/>(wvalid)"}}
W_DEC_WA{{"Addr Valid?<br/>(awvalid)"}}
W_DEC_BR{{"B-Ready?<br/>(bready)"}}
W_START --> W_IDLE
W_IDLE --> W_DEC_ARR
W_DEC_ARR -- "11 (Both)" --> W_PRE
W_DEC_ARR -- "10 (Addr Only)" --> W_WAIT_D
W_WAIT_D --> W_DEC_WD
W_DEC_WD -- No --> W_WAIT_D
W_DEC_WD -- Yes --> W_EXE
W_DEC_ARR -- "01 (Data Only)" --> W_WAIT_A
W_WAIT_A --> W_DEC_WA
W_DEC_WA -- No --> W_WAIT_A
W_DEC_WA -- Yes --> W_PRE
W_PRE --> W_EXE
W_EXE --> W_BRESP
W_BRESP --> W_DEC_BR
W_DEC_BR -- No --> W_BRESP
W_DEC_BR -- Yes --> W_IDLE
class W_IDLE,W_WAIT_D,W_WAIT_A,W_PRE,W_EXE,W_BRESP state;
class W_DEC_ARR,W_DEC_WD,W_DEC_WA,W_DEC_BR decision;
class W_START action;
end
linkStyle 2,4,7,11 stroke:#cc0000,stroke-width:2px,color:#fff
linkStyle 3,5,8,12 stroke:#00cc00,stroke-width:2px,color:#fff
linkStyle 0,1,6,9,10 stroke:#ffffff,stroke-width:2px,color:#fff
linkStyle 18,22,27 stroke:#cc0000,stroke-width:2px,color:#fff
linkStyle 19,23,28 stroke:#00cc00,stroke-width:2px,color:#fff
linkStyle 13,14,15,16,17,20,21,24,25,26 stroke:#ffffff,stroke-width:2px,color:#fff
style Write_FSM fill:none,stroke:none
style Read_FSM fill:none,stroke:none
The verification environment adopts a Directed Random methodology implemented in pure SystemVerilog. Designed for high-performance simulation, this framework bypasses the complexity of UVM in favor of a lightweight, multi-threaded architecture. This approach allows for aggressive randomization of protocol phases (Latency, Backpressure) while retaining deterministic control over transaction ordering to target specific corner cases.
%%{
init: {
"theme": "base",
"themeVariables": {
"fontSize": "12px"
},
"flowchart": {
"curve": "basis",
"padding": 4,
"nodeSpacing": 10,
"rankSpacing": 10
}
}
}%%
flowchart LR
subgraph BP_GRP [Backpressure Logic]
direction TB
BP_WR[Thread: Write Resp BP]
BP_RD[Thread: Read Resp BP]
end
subgraph GEN_GRP [Test Generator]
direction TB
GEN_WR[Thread: Write Gen Loop]
GEN_RD[Thread: Read Gen Loop]
end
subgraph WR_GRP [Write Task]
direction TB
WR_AW[Thread: Addr Channel]
WR_W[Thread: Data Channel]
WR_B[Thread: Resp Channel]
end
subgraph RD_GRP [Read Task]
direction TB
RD_AR[Thread: Addr Channel]
RD_R[Thread: Resp Channel]
end
%% Phantom node to protect against GitHub UI buttons
PHANTOM[ ]
RD_GRP ~~~ PHANTOM
GEN_WR ==> WR_AW
GEN_WR ==> WR_W
GEN_WR ==> WR_B
GEN_RD ==> RD_AR
GEN_RD ==> RD_R
classDef bp fill:#240046,stroke:#e0aaff,stroke-width:1px,color:#fff
classDef gen fill:#001d3d,stroke:#4cc9f0,stroke-width:1px,color:#fff
classDef write fill:#002500,stroke:#70e000,stroke-width:1px,color:#fff
classDef read fill:#3d0c02,stroke:#ff9e00,stroke-width:1px,color:#fff
classDef container fill:none,stroke:#ffffff,stroke-width:1px,stroke-dasharray: 5 5,color:#fff
classDef phantom stroke:none,fill:none,width:0px;
class BP_WR,BP_RD bp
class GEN_WR,GEN_RD gen
class WR_AW,WR_W,WR_B write
class RD_AR,RD_R read
class BP_GRP,GEN_GRP,WR_GRP,RD_GRP container
class PHANTOM phantom
linkStyle 1,2,3 stroke:#70e000,stroke-width:1px
linkStyle 4,5 stroke:#ff9e00,stroke-width:1px
The testbench implements a set of thread-safe components to manage the concurrent nature of the simulation:
- Golden Reference (`golden_mem`)
  - A standard SystemVerilog array serving as the "Source of Truth," initialized directly from the memory file.
  - It mimics the RAM contents and is updated strictly upon successful Write Response (`BVALID && BREADY`), ensuring it reflects only committed data.
- Semaphore-Locked Scoreboard
  - Architecture: To handle high concurrency without the complexity of UVM Analysis Ports, this testbench utilizes a Semaphore-based Locking Mechanism (`scoreboard_lock`).
  - Goal: This enforces Atomic Verification. It guarantees that the shared memory models [Committed (Golden) and In-Flight (Pending)] are updated as a single atomic unit, strictly preventing race conditions where a thread might read "half-written" data or an intermediate state.
  - Known Limitation: While this guarantees thread safety, this method introduces a rare issue. We accepted this specific edge case to maintain architectural simplicity; a comprehensive root-cause analysis is detailed in the Engineering Log.
- Parallel Backpressure Engines
  - Two independent "forever" threads run in the background, randomly toggling `BREADY` (Write Response) and `RREADY` (Read Data).
  - Goal: Forces the DUT to hold valid data on the bus for extended periods (0–30 cycles), validating protocol stability constraints.
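The locking pattern can be modeled outside the simulator. The sketch below is Python, not the SystemVerilog testbench: `golden_mem` and `scoreboard_lock` mirror the names above, while the `Scoreboard` class, the `pending` dictionary, and the method names are hypothetical. Accepting either the committed or the in-flight value for an overlapping read is an assumption of this sketch, not a documented testbench rule.

```python
import threading

class Scoreboard:
    def __init__(self, preload):
        self.golden_mem = dict(preload)  # committed data ("Source of Truth")
        self.pending = {}                # in-flight writes: addr -> data
        self.scoreboard_lock = threading.Lock()

    def write_issued(self, addr, data):
        with self.scoreboard_lock:
            self.pending[addr] = data

    def write_committed(self, addr):
        """Call on the write response handshake (BVALID && BREADY)."""
        with self.scoreboard_lock:
            # Remove from pending and update golden under one lock, so no
            # other thread can observe an intermediate state.
            self.golden_mem[addr] = self.pending.pop(addr)

    def check_read(self, addr, rdata):
        with self.scoreboard_lock:
            allowed = {self.golden_mem.get(addr)}
            if addr in self.pending:     # assumed: in-flight value is also legal
                allowed.add(self.pending[addr])
            return rdata in allowed
```

Holding one lock across both dictionaries is what makes the update atomic: a checker thread sees the write either as pending or as committed, never as both or neither.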
The verification environment ensures design robustness by cycling through four distinct high-stress scenarios:
| Scenario | Implementation Detail | Verification Goal |
|---|---|---|
| Concurrent R/W (Chaos Phase) | Write Task (3 threads) and Read Task (2 threads) run simultaneously with randomized inter-transaction gaps. | Forces random overlap of Read/Write operations to verify Channel Independence and validate the Hazard-Stall logic during RAM address collisions. |
| Decoupled Response Monitoring | Response threads (B & R channels) are completely independent of Request threads, monitoring the bus 100% of the time. | Ensures Zero Spurious Responses. Unlike sequential tests, this catches illegal BVALID/RVALID pulses that occur outside or before a valid transaction completes. |
| Random Backpressure | Background threads de-assert READY signals for random durations (0–30 cycles) independent of the driver. | Verifies that the FSM correctly "parks" and holds valid data stable during Master stalls without dropping transactions. |
| Sequential Back-to-Back (Burst) | A final directed loop forces Zero-Delay Transactions (Fast Mode) for both Writes and Reads. | Stress tests the FSM Recovery Time by slamming the DUT with a new request immediately after the previous handshake completes, ensuring no dead cycles or lockups. |
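As an illustration of the Random Backpressure scenario, the sketch below generates a per-cycle READY pattern. It is a Python model with hypothetical names (`ready_schedule`, `max_stall`), and it assumes one plausible shape for the stimulus: a random 0–30 low cycles followed by a single high cycle per handshake. The actual testbench threads may shape the waveform differently.

```python
import random

def ready_schedule(num_handshakes, max_stall=30, seed=None):
    """Yield per-cycle READY values for the requested number of handshakes."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    for _ in range(num_handshakes):
        # Hold READY low for a random number of cycles (may be zero).
        for _ in range(rng.randint(0, max_stall)):
            yield False
        # Then accept exactly one beat.
        yield True
```

Setting `max_stall=0` degenerates to READY held high every cycle, which corresponds to the zero-delay back-to-back case.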
To guarantee strict adherence to design constraints, Concurrent Assertions are embedded directly into the interface definition files (axi4l_bus_if and axi4l_slave_backend_if). These properties continuously validate the design against the following critical safety layers:
Embedded within the AXI Bus Interface, these assertions validate strict compliance with the AMBA specification:
- Handshake Stability: Verifies that once `VALID` is asserted, it remains high and stable until `READY` is received (no pulse/retract allowed).
- Signal Integrity: Continuously monitors control signals for Unknown (`X`) States during active operation.
- Reset Protection: Ensures all `VALID` and `READY` signals remain strictly de-asserted while `ARESETN` is low.
- Spurious Response Detection: Maintains active counters for Requests vs. Responses to instantly flag any 'Spurious' responses (responses without a corresponding request).
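Two of these checks translate naturally into trace checkers. The sketch below is a Python model of the rules, not the SVA properties in `axi4l_bus_if.sv`; the function names and the trace formats are hypothetical.

```python
def check_valid_stable(trace):
    """Return the cycle indices that violate handshake stability.

    trace: list of (valid, ready, payload) tuples, one per clock cycle.
    """
    violations = []
    for i in range(1, len(trace)):
        prev_valid, prev_ready, prev_payload = trace[i - 1]
        valid, _, payload = trace[i]
        if prev_valid and not prev_ready:  # handshake still pending
            # VALID must stay high and the payload must not change.
            if not valid or payload != prev_payload:
                violations.append(i)
    return violations

def find_spurious_responses(events):
    """Return indices of responses that have no outstanding request.

    events: ordered list of "req" / "resp" strings for one channel pair.
    """
    outstanding = 0
    spurious = []
    for i, ev in enumerate(events):
        if ev == "req":
            outstanding += 1
        elif ev == "resp":
            if outstanding == 0:
                spurious.append(i)
            else:
                outstanding -= 1
    return spurious
```

The second function is the counter idea from the last bullet: a response is legal only while the request count exceeds the response count.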
Embedded within the RAM Control Interface, these assertions protect the physical memory model:
- Collision Hazard Detection: Critical check that triggers if both Write Enable (`ena`) and Read Enable (`enb`) fire simultaneously on the same address, flagging an address collision.
- Control Logic Safety: Verifies that Enable signals are correctly disabled during idle states and resets.
- Data Validity: Enforces that no `X` states propagate to the memory inputs when not in reset.
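The collision and unknown-value rules can be written as simple checks over a recorded trace. This is a Python sketch, not the assertions in `axi4l_slave_backend_if.sv`: `ena` and `enb` follow the names above, while `addra`, `addrb`, and both function names are hypothetical, and `None` stands in for an `X` value.

```python
def find_port_collisions(cycles):
    """Return indices of cycles where both ports are enabled on one address.

    cycles: list of (ena, addra, enb, addrb) tuples, one per clock cycle.
    """
    return [i for i, (ena, addra, enb, addrb) in enumerate(cycles)
            if ena and enb and addra == addrb]

def has_unknown(*signals):
    """Model an X check: True if any sampled signal is None (unknown)."""
    return any(s is None for s in signals)
```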
For the exact property definitions and assertions, please refer to the source code in `axi4l_bus_if.sv` and `axi4l_slave_backend_if.sv`.
To support a robust verification workflow, the environment follows a two-stage regression strategy: first executing the RAM Unit Tests (verifying the memory model in isolation), followed by the full AXI4-Lite Integration Tests.
- Structured Logging: Both stages write simulation data (transactions, errors, status) to external log files using a machine-readable pipe-delimited format. The output file paths are configured directly within the corresponding testbench and interface source files.
- Automated Analysis: A companion Python Script parses these logs post-simulation to verify transaction counts, detect error patterns, and generate a final "Pass/Fail" summary, eliminating the need for manual waveform inspection.
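A post-simulation pass over a pipe-delimited log might look like the sketch below. The field layout (`time|kind|addr|data`), the record kinds, and the function name are hypothetical; the real format is defined by the testbench and by the scripts in `scripts/`, so treat this only as an outline of the approach.

```python
def summarize_log(lines):
    """Count record kinds in a pipe-delimited log and derive a verdict."""
    counts = {"WRITE": 0, "READ": 0, "ERROR": 0}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = [f.strip() for f in line.split("|")]
        # Assumed layout: time | kind | addr | data
        kind = fields[1] if len(fields) > 1 else ""
        if kind in counts:
            counts[kind] += 1
    transactions = counts["WRITE"] + counts["READ"]
    # An empty log is treated as a failure, so a run that logged nothing
    # cannot pass silently.
    verdict = "PASS" if counts["ERROR"] == 0 and transactions > 0 else "FAIL"
    return counts, verdict
```

Usage would be along the lines of `summarize_log(open(path))`, with `path` pointing at whichever log file the testbench was configured to write.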
To reproduce the verification results in Vivado, follow this three-stage workflow:
Stage 1: RAM Unit Test
- Goal: Verify the underlying memory model before attaching the bus.
- Setup: In the Vivado Sources window, right-click on `tb_unit_ram_dual_port_byte_en.sv` and `ram_dual_port_byte_en.sv`, then select Set as Top for each.
- Run: Launch Simulation. Ensure no memory errors occur.
Stage 2: AXI Integration Test (Full System)
- Goal: Verify the full AXI4-Lite Protocol, FSMs, and Concurrency.
- Setup: Right-click `tb_axi4l_slave_ram_top.sv` and `axi4l_slave_ram_top.sv`, then select Set as Top for each.
- Run: Launch Simulation. This runs the stress scenarios, including the Chaos Phase.
Stage 3: Automated Analysis
- Prerequisite: Ensure the `log_file` paths in the testbench and interfaces point to a valid directory on your machine, where the `.py` parser scripts are located (see the `scripts/` directory).
- Execution: After each simulation completes, run the corresponding Python parser script to validate transaction counts and check for error signatures.
.
├── assets/ # Images and diagrams for documentation
├── rtl/
│ ├── axi4l_slave_bridge.sv # Main Protocol FSM & Bridge Logic
│ ├── axi4l_slave_ram_top.sv # Top-Level Wrapper (DUT)
│ └── ram_dual_port_byte_en.sv # Physical Memory Model (Backend)
├── tb/
│ ├── axi4l_bus_if.sv # AXI Interface Definition & SVA
│ ├── axi4l_slave_backend_if.sv # RAM Interface Definition & SVA
│ ├── tb_axi4l_slave_ram_top.sv # Integration Testbench
│ ├── tb_unit_ram_dual_port_byte_en.sv # Unit Testbench for RAM
│ └── mem_preload.mem # Memory Initialization File
├── scripts/
│ ├── integration_test_checker.py # Automated Analysis for AXI System Test
│ └── unit_test_ram_checker.py # Automated Analysis for RAM Unit Test
├── ENGINEERING_LOG.md # Detailed Debugging & Design Journal
├── LICENSE
└── README.md # Project Documentation
Anish Dey
- Education: B.E. in Electronics & Telecommunication Engineering, Jadavpur University (2027)
- Interests: Digital & Analog VLSI
- Reach Out: LinkedIn