PYNQ: PYTHON PRODUCTIVITY

Using m_axi with PYNQ-Z1

I am using Vitis HLS v2020.1 to implement a top-level function with the following signature:

void TrafficClassifier(input_t input[SIZE_IN], output_t output[SIZE_OUT]){
#pragma HLS INTERFACE s_axilite port=return bundle=CTRL_BUS
#pragma HLS INTERFACE m_axi offset=slave port=input 
#pragma HLS INTERFACE s_axilite port=input bundle=CTRL_BUS
#pragma HLS INTERFACE m_axi offset=slave port=output
#pragma HLS INTERFACE s_axilite port=output bundle=CTRL_BUS
...
}

As can be seen, I am using AXI master interfaces (m_axi) to access data located in off-chip DRAM.
After high-level synthesis, I used Vivado to integrate the system as shown:

(I had to enable a S_AXI_HP0 port on the PS to connect it with m_axi_gmem)

After generating the .bit, .tcl and .hwh files, I moved to the PYNQ-Z1 board to accelerate my function.
However, I noticed that the generated register map allocates 64 bits for each address:

-- ------------------------Address Info-------------------
-- 0x00 : Control signals
--        bit 0  - ap_start (Read/Write/COH)
--        bit 1  - ap_done (Read/COR)
--        bit 2  - ap_idle (Read)
--        bit 3  - ap_ready (Read)
--        bit 7  - auto_restart (Read/Write)
--        others - reserved
-- 0x04 : Global Interrupt Enable Register
--        bit 0  - Global Interrupt Enable (Read/Write)
--        others - reserved
-- 0x08 : IP Interrupt Enable Register (Read/Write)
--        bit 0  - enable ap_done interrupt (Read/Write)
--        bit 1  - enable ap_ready interrupt (Read/Write)
--        others - reserved
-- 0x0c : IP Interrupt Status Register (Read/TOW)
--        bit 0  - ap_done (COR/TOW)
--        bit 1  - ap_ready (COR/TOW)
--        others - reserved
-- 0x10 : Data signal of input_r
--        bit 31~0 - input_r[31:0] (Read/Write)
-- 0x14 : Data signal of input_r
--        bit 31~0 - input_r[63:32] (Read/Write)
-- 0x18 : reserved
-- 0x1c : Data signal of output_r
--        bit 31~0 - output_r[31:0] (Read/Write)
-- 0x20 : Data signal of output_r
--        bit 31~0 - output_r[63:32] (Read/Write)
-- 0x24 : reserved
-- (SC = Self Clear, COR = Clear on Read, TOW = Toggle on Write, COH = Clear on Handshake)
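For reference, the control word at 0x00 in the map above could be decoded on the Python side roughly as follows. This is only a sketch: `CTRL_BITS` and `decode_ctrl` are hypothetical names for illustration, not part of PYNQ or of the generated driver, and the bit positions are copied from the listing above.

```python
# Bit positions of the control register at offset 0x00,
# copied from the address map above.
CTRL_BITS = {
    "ap_start": 0,
    "ap_done": 1,
    "ap_idle": 2,
    "ap_ready": 3,
    "auto_restart": 7,
}


def decode_ctrl(value):
    """Hypothetical helper: map a raw 32-bit control word to named flags."""
    return {name: bool((value >> bit) & 1) for name, bit in CTRL_BITS.items()}


# Example: a control word in which only bit 2 (ap_idle) is set.
flags = decode_ctrl(0x04)
print(flags)
```

On the board, the raw value would come from something like ip.read(0x00) on the IP's driver object.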

My questions are:
1- Is my understanding correct that when I use both m_axi and s_axilite for an argument such as input, s_axilite carries the control signals while m_axi carries the data?
2- Am I missing something in the interface pragmas?
3- How do I access the input/output arrays correctly from the Python code on the PS? I’m confused because I only see address ranges for control signals, not for the data itself. Also, how should I handle the two 32-bit registers per argument (for instance, 0x10 and 0x14 for the input)?

Thanks in advance!

  1. Yes, you are correct. The registers you are seeing hold the address of the buffer; the data itself moves over m_axi.
  2. The interface pragmas look OK to me.
  3. You need to allocate a buffer for the accelerator to use. This is done through pynq.allocate.
  4. You can ignore the high bits if you’re on a Z1, since the Zynq-7000 uses 32-bit physical addresses. You can just do something like ip.write(0x10, buffer.device_address)
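
The steps above could be put together roughly as follows. This is a sketch under stated assumptions, not a tested driver: `run_accelerator` is a hypothetical helper, the register offsets are copied from the address map in the question, and the bitstream name, IP instance name, dtype and sizes in the trailing comments are placeholders that depend on the actual design.

```python
# Register offsets copied from the address map in the question.
CTRL = 0x00            # bit 0 = ap_start, bit 1 = ap_done
INPUT_ADDR_LO = 0x10   # input_r[31:0]
INPUT_ADDR_HI = 0x14   # input_r[63:32]
OUTPUT_ADDR_LO = 0x1C  # output_r[31:0]
OUTPUT_ADDR_HI = 0x20  # output_r[63:32]

AP_START = 0x1
AP_DONE = 0x2


def run_accelerator(ip, in_buf, out_buf):
    """Hypothetical helper: program buffer addresses, start the IP, wait for ap_done.

    `ip` needs read(offset) and write(offset, value), as a PYNQ IP driver provides.
    The buffers need device_address, flush() and invalidate(), as buffers
    returned by pynq.allocate do.
    """
    # Push cached input data out to DRAM before the IP reads it.
    in_buf.flush()

    # Split each physical address into low and high 32-bit words.
    # On a PYNQ-Z1 the high word is always 0.
    ip.write(INPUT_ADDR_LO, in_buf.device_address & 0xFFFFFFFF)
    ip.write(INPUT_ADDR_HI, in_buf.device_address >> 32)
    ip.write(OUTPUT_ADDR_LO, out_buf.device_address & 0xFFFFFFFF)
    ip.write(OUTPUT_ADDR_HI, out_buf.device_address >> 32)

    # Start the IP and poll the control register until ap_done is set.
    ip.write(CTRL, AP_START)
    while not ip.read(CTRL) & AP_DONE:
        pass

    # Drop stale cache lines so the CPU sees what the IP wrote to DRAM.
    out_buf.invalidate()
    return out_buf


# On the board (all names below are assumptions about the design):
#   import numpy as np
#   from pynq import Overlay, allocate
#   overlay = Overlay("traffic_classifier.bit")
#   ip = overlay.TrafficClassifier_0
#   in_buf = allocate(shape=(SIZE_IN,), dtype=np.int32)
#   out_buf = allocate(shape=(SIZE_OUT,), dtype=np.int32)
#   in_buf[:] = input_data
#   result = run_accelerator(ip, in_buf, out_buf)
```

The dtype passed to allocate has to match the bit width of input_t and output_t in the HLS code, otherwise the accelerator will read misaligned data.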

Peter
