Connecting to the ILA using HW Server
After designing the overlay with a System ILA as described in part 1 of this series, we will now look at how to connect to it and start analyzing the AXI4-Stream channels.
Prepare the Environment
To get System ILA information, the board needs to be connected via the micro USB cable to the development machine where Vivado is installed. You will also need the *.ltx file (<proj_path>/<proj_name>.runs/impl_1/*.ltx). This file tells Vivado how the System ILA is configured and which signals are being monitored.
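If you are unsure where Vivado placed the probes file, a short script can search for it. This is only a minimal sketch that assumes the <proj_name>.runs/impl_1 layout mentioned above; the helper name find_ltx_files and the example arguments are hypothetical and not part of Vivado or PYNQ.

```python
from pathlib import Path


def find_ltx_files(proj_path, proj_name, run="impl_1"):
    """Return the *.ltx probes files of an implementation run, sorted by path."""
    run_dir = Path(proj_path) / f"{proj_name}.runs" / run
    if not run_dir.is_dir():
        # The run has not been generated yet, or the path is wrong
        return []
    return sorted(run_dir.glob("*.ltx"))


# Hypothetical project location, adjust to your own setup
for ltx in find_ltx_files("/path/to/project", "my_project"):
    print(ltx)
```

An empty result usually means that the implementation run has not finished or that the design contains no debug cores.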
NOTE: For Kria boards, it is recommended that you use the Debug Bridge instead of the micro USB cable. Check out how to Use an ILA without Physical micro USB cable.
Connect to the System ILA
Create a new notebook and execute the following to load the overlay.
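from pynq import Overlay, allocate
import numpy as np
# Download the bitstream and get handles to the DMA and its two channels
ol = Overlay('dma.bit')
dma = ol.axi_dma
dma_send = ol.axi_dma.sendchannel  # MM2S: memory to stream
dma_recv = ol.axi_dma.recvchannel  # S2MM: stream to memory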
In your Vivado instance, click Open Hardware Manager under PROGRAM AND DEBUG.
Click Open target on the green bar that just appeared, then Auto Connect.
This automatically connects to a local hw_server, which establishes the connection with the overlay running on the board.
If you do not see the AXI4-Stream channels in the Waveform window, it is highly likely that the *.ltx file was not loaded correctly. Only follow the steps below if the Waveform window is completely black, with no AXI4-Stream information.
In the Hardware window, select your device, xc7z020_1 for PYNQ-Z2. As you can see, it is already programmed. However, if you look at the Hardware Device Properties, you can see that the Probes file is empty. Click on the ... button and add the *.ltx file.
Once the *.ltx file is added, the Hardware Manager automatically refreshes the Waveform window and shows the AXI4-Stream channels that are being monitored.
Start Capturing with the System ILA
Now that you are fully connected to the System ILA, we can do some test captures so you can familiarize yourself with it.
The System ILA has many different options; I will only focus on some of the basics. For more information, check out Vivado Hardware Manager Dashboards.
The two buttons that we will use the most are Run Trigger for this ILA core and Run Trigger immediate for this ILA core.
Go ahead and run a capture. As there is no activity in the AXI4-Stream channels, the waveform shows that both the MM2S and S2MM channels are inactive.
Now we are ready to set a trigger, in other words, to control when the System ILA starts capturing based on an event on a signal of our choosing.
Drag the TVALID signal from the MM2S channel and drop it on the Trigger Setup - hw_ila_1 window, then set the trigger Value to R (0-to-1 transition), i.e., rising edge. I prefer R rather than 1 (logical one) as it ensures that the trigger fires on the transition; for this example, both options should work. Note that the Core status is Idle.
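The difference between the R and 1 compare values is easiest to see on a signal that is already high when the capture is armed. The following is a simplified software model of the two comparisons, not the ILA implementation; the function name and the sample lists are made up for illustration.

```python
def first_trigger(samples, mode):
    """Return the index of the first sample that satisfies the trigger, or None.

    samples: list of 0/1 values of one signal, one entry per clock cycle.
    mode: "1" for a level compare, "R" for a 0-to-1 transition.
    """
    for i, value in enumerate(samples):
        if mode == "1" and value == 1:
            return i
        # A rising edge needs a previous sample that was 0
        if mode == "R" and i > 0 and samples[i - 1] == 0 and value == 1:
            return i
    return None


idle_then_active = [0, 0, 1, 1, 1, 0]
already_active = [1, 1, 0, 0, 1, 1]

# Signal idle when armed: both modes fire on the same sample
print(first_trigger(idle_then_active, "1"), first_trigger(idle_then_active, "R"))  # 2 2
# Signal already high when armed: the level compare fires at once, R waits for the next edge
print(first_trigger(already_active, "1"), first_trigger(already_active, "R"))  # 0 4
```

In this tutorial TVALID is low before the transfer starts, which is why both options behave the same here.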
Start the trigger and note how the Core status changes to Waiting for Trigger. This means that the System ILA is now waiting for the trigger event before it collects the transactions and displays them in the waveform.
Go back to the JupyterLab notebook, allocate the buffers and only use the sendchannel of the DMA.
data_size = 16
# Contiguous buffers that the DMA can access
input_buffer = allocate(shape=(data_size,), dtype=np.uint32)
output_buffer = allocate(shape=(data_size,), dtype=np.uint32)
input_buffer[:] = np.arange(data_size, dtype=np.uint32)
# Start the MM2S transfer and block until it completes
dma_send.transfer(input_buffer)
dma_send.wait()
After you run this code, the System ILA should have triggered and updated the Waveform window. To simplify the visualization, let us change the radix of the TDATA signals to Unsigned Decimal.
Select both TDATA signals, then right-click on one of them, select Radix, and click Unsigned Decimal.
Let us zoom in around the red T vertical line, which is where the valid signal for the MM2S channel is asserted for the first time. I am highlighting in orange the TVALID and TREADY signals of both channels, and in yellow the TLAST signals. This is a good point in the blog series to point you to the AXI4-Stream Interface. The most important point is that a valid transaction only happens when both TVALID and TREADY are asserted at the same time. Also, the DMA IP works in what is called packet mode, which means that the TLAST signal must be asserted to indicate the end of a transfer. The DMA IP also expects the TKEEP signal.
Note: I will leave it up to the reader to become familiar with AXI4-Stream.
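As a quick illustration of the handshake rule, the sketch below walks over a list of sampled cycles and keeps only the beats where TVALID and TREADY are both high, closing a packet whenever TLAST is set on such a beat. It is a simplified software model under the assumption of one sample per clock cycle; the function name and the sample values are made up.

```python
def split_packets(samples):
    """Group transferred beats into packets.

    samples: iterable of (tvalid, tready, tlast, tdata) tuples, one per clock cycle.
    A beat is transferred only when TVALID and TREADY are both 1;
    TLAST on a transferred beat closes the current packet.
    """
    packets, current = [], []
    for tvalid, tready, tlast, tdata in samples:
        if not (tvalid and tready):
            continue  # no handshake in this cycle, nothing is transferred
        current.append(tdata)
        if tlast:
            packets.append(current)
            current = []
    return packets, current  # 'current' holds the beats of an unfinished packet


cycles = [
    (1, 0, 0, 10),  # valid but not ready: the source has to hold the data
    (1, 1, 0, 10),  # handshake, beat 10 is transferred
    (0, 1, 0, 0),   # ready but no valid data
    (1, 1, 1, 11),  # handshake with TLAST: the packet ends here
    (1, 1, 0, 12),  # first beat of a packet that never completes
]
print(split_packets(cycles))  # ([[10, 11]], [12])
```

The leftover beat in the second list is the situation to avoid with the DMA: without TLAST, the packet is never closed.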
Now, let’s look at the Waveform. The MM2S channel is active from sample 512 to 528, and you can see that all the transactions are valid as both TVALID and TREADY are asserted. In sample 527, TLAST is asserted, indicating the end of the transfer; this matches the size of the input_buffer in our PYNQ code. After this, TVALID goes to 0. In each of the samples, you can see how TDATA carries one element of the input_buffer array.
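The relation between the buffer size and the sample where TLAST appears can be checked with a few lines of Python. This sketch assumes a transfer without gaps and one buffer element per beat, as in the capture described above; the helper name is hypothetical and the first sample index (512) is taken from that capture.

```python
def expected_beats(values, first_sample):
    """List (sample, tdata, tlast) for a gap-free transfer, one element per beat."""
    last = len(values) - 1
    return [(first_sample + i, v, int(i == last)) for i, v in enumerate(values)]


# Same contents as input_buffer: 0, 1, ..., 15
beats = expected_beats(range(16), first_sample=512)
print(beats[0])   # (512, 0, 0)
print(beats[-1])  # (527, 15, 1)
```

With 16 elements starting at sample 512, the last beat lands on sample 512 + 16 - 1 = 527, which is where TLAST is asserted in the waveform.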
Turning our attention to slot_1, the S2MM channel, you can see that the stream goes active for 4 cycles, 515 to 519, and then TREADY goes to 0. This is because the DMA preemptively loads 4 transactions (stream beats).
How would you set up the trigger to capture the rest of the S2MM channel transactions when we run the PYNQ code? Stop for a few minutes to think.
I would add the S2MM channel (slot_1) TREADY signal to the Trigger Setup (Value R).
Then I would set the trigger condition to Global OR.
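To see why Global OR is the right choice here, consider that only the S2MM side will show activity in the next step, so the MM2S TVALID comparison never matches. The sketch below is a simplified model of combining several rising-edge comparisons; it is not how the ILA is implemented, and the function name and sample data are made up for illustration.

```python
def first_global_trigger(signals, combine=any):
    """Return the first sample index where the combined edge compare matches, or None.

    signals: dict mapping a signal name to a list of 0/1 samples (equal lengths).
    combine: any for a Global OR, all for a Global AND of the rising-edge compares.
    """
    length = len(next(iter(signals.values())))
    for i in range(1, length):
        edges = [s[i - 1] == 0 and s[i] == 1 for s in signals.values()]
        if combine(edges):
            return i
    return None


capture = {
    "mm2s_tvalid": [0, 0, 0, 0, 0, 0],  # no send transfer in this run
    "s2mm_tready": [0, 0, 0, 1, 1, 0],  # the receive channel becomes ready
}
print(first_global_trigger(capture))               # 3
print(first_global_trigger(capture, combine=all))  # None
```

With a Global AND of the same two comparisons, both edges would have to occur in the same cycle, so this capture would never trigger.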
Click to start the trigger.
Go back to the JupyterLab notebook and, reusing the buffers allocated earlier, only use the recvchannel of the DMA.
dma_recv.transfer(output_buffer)
dma_recv.wait()
print(f'Are buffers equal after DMA? {np.array_equal(output_buffer, input_buffer)}')
I will leave it up to the reader to analyze the waveform.
DMA Register Map
PYNQ allows you to read the DMA status via .register_map. This attribute shows all the DMA registers and their values. In JupyterLab, after completing the transfers in both the send and receive channels, run:
dma.register_map
For now, just focus on the MM2S and S2MM length registers. As you can see, both report 64; this value indicates the number of bytes requested to be transferred in each channel. Each array has 16 elements, and the datatype is np.uint32, which is represented by 4 bytes. So, 16 x 4 = 64. Both DMASR registers indicate that the channel is Idle.
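The two observations above can be reproduced offline. The sketch below computes the expected length value and decodes the two lowest bits of a DMASR value. The bit positions (Halted in bit 0, Idle in bit 1) follow the AXI DMA register description as I understand it, so double-check them against the product guide for your IP version; the helper names and the example status value are made up.

```python
def transfer_length_bytes(num_elements, itemsize):
    """Number of bytes requested for a buffer of num_elements items."""
    return num_elements * itemsize


def decode_dmasr(value):
    """Decode the two lowest status bits of an MM2S_DMASR or S2MM_DMASR value."""
    return {
        "halted": bool(value & 0x1),  # bit 0 (assumed position): channel is halted
        "idle": bool(value & 0x2),    # bit 1 (assumed position): channel is idle
    }


# 16 elements of np.uint32, which take 4 bytes each
print(transfer_length_bytes(16, 4))  # 64

# Made-up status value with the Idle bit set and the Halted bit clear
print(decode_dmasr(0x1002))  # {'halted': False, 'idle': True}
```

On the board, input_buffer.nbytes gives the same 64 without hard-coding the element size.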
This concludes the second part of this blog series. See debugging common DMA issues.
Please, use the comments section for questions related to the content of this blog. If you have questions about your own design or unrelated topics, please create a new topic in the forum.