PYNQ program hangs when trying to start an overlay on ZCU104

I am trying to run a very simple program on the ZCU104 to test the PYNQ design flow, but my program hangs when I try to start the overlay. Here are the details:

1. Evaluation Kit: EK-U1-ZCU104-G-ED

2. Detailed Information
2.1 In Vivado HLS, I wrote a basic program that only reads in a 28*28 matrix of 8-bit elements, changes the element values, and then returns the revised matrix.
Here is the HLS function code:

2.2 I packaged the HLS program as an IP and used Vivado to create an IP block design (this is where I think the problem is located).

When validating the block design, the memory addresses cannot be assigned correctly.
I have to choose “Auto Assign Address” in the Address Editor, but some ports are still excluded and critical warnings remain.

2.3 I put the overlay bitstream file (design_1_wrapper.bit) and the hardware handoff file (design_1.hwh) on the ZCU104 board.

3. The PYNQ code in the Jupyter notebook:

  1. My program hangs here:
    " if(ap_done):
    print(mmio.read(ADDR_AP_CTRL))
    break"
    because the value read from ADDR_AP_CTRL is always 0.

  2. I did a small test, writing values to different IP addresses.
    For the predefined registers, no matter what value I write, the value read back is always 0. I guess this is the direct cause of the problem, but I don't know where I should fix it.

For the address I defined myself, the value is written correctly (cc=0x81=129).
register test

Can anyone give me some advice? A thousand thanks.

I have some thoughts that might help:

  • Try to give the .bit and .hwh files the same names (i.e. NAME.bit and NAME.hwh).
  • Try to export the .tcl file from Vivado and put it on the board beside the .bit and .hwh files (also use the same name NAME.tcl).
  • From your Vivado integration diagram I don’t see that you’re using any interrupt. Try to comment out any part of your code that deals with GIE or IER … they’re probably causing some trouble.
  • In your StartStop_Ex(state) method, try to check whether the IP is idle before you ask it to execute … something like:
# Wait until the IP reports idle (ap_idle is bit 2 of the control register)
while True:
    bits = mmio.read(ADDR_AP_CTRL)
    ap_idle = (bits >> 2) & 0x1
    if ap_idle == 1:
        break
# IP is idle and ready to compute; do your business
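
A loop like the one above spins forever if the bit never changes, which is exactly the hang described at the top of the thread. A minimal sketch of a bounded wait follows. `wait_for_bit` and its parameters are hypothetical names, not part of PYNQ, and `read_reg` stands for any zero-argument callable such as `lambda: mmio.read(ADDR_AP_CTRL)`.

```python
import time


def wait_for_bit(read_reg, bit, timeout_s=1.0, poll_s=0.001):
    """Poll a register until the given bit reads 1 or the timeout expires.

    read_reg: zero-argument callable returning the register value as an int
              (hypothetical helper, e.g. lambda: mmio.read(ADDR_AP_CTRL)).
    bit:      bit position to test (2 for ap_idle, 1 for ap_done).
    Returns True if the bit was seen set, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if (read_reg() >> bit) & 0x1:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

On the board this could be called as `wait_for_bit(lambda: mmio.read(ADDR_AP_CTRL), 2)`. A `False` result then tells you the IP never went idle, instead of leaving the notebook cell hanging.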

These are my thoughts for now.

Do you really need 2G space for both ports? I think it might be good to start with a smaller range, something like 16MB or so.

Thank you Rock.
I changed the address range to 16MB and nothing changed.

Thank you for your reply rashedkoutayni,

  1. About NAME.bit versus NAME_Wrapper.bit: I checked, and Vivado does not generate a NAME.bit. Also, the guidance video I followed uses NAME_Wrapper.bit as well.
    - YouTube

  2. Check whether the IP is idle.
    Yes, I tried this bit check; however, ap_idle never becomes ‘1’.

Hope to get your further instructions.

Thank you,

It looks like there are 2 problems.

In your IPI diagram, it looks like you haven’t connected ap_ctrl, and it isn’t an AXI interface.

  1. As it isn’t an AXI interface, you can’t use MMIO.
  2. As it isn’t connected, you can’t control this IP at all

In your HLS, you haven’t specified anything for your control interface, so you get a default ap_ctrl port. This will generate an interface of wires that you need to connect to your design in some way.
It is better to use an AXI lite interface for the control port:

#pragma HLS INTERFACE s_axilite port=return
You can bundle the control and the input addresses for the arrays to the same AXI slave interface.

You then need to check the address offsets in your code for the updated design.

Cathal
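
One way to check the offsets mentioned above is to read them from the driver header that HLS generates alongside the IP, which lists the register offsets as `#define` lines. The sketch below parses such lines with a regular expression. The header text, macro names, and offsets in `SAMPLE_HEADER` are illustrative placeholders, not taken from the design in this thread, and `parse_offsets` is a hypothetical helper.

```python
import re

# Illustrative header text only: macro names and offsets are made up.
SAMPLE_HEADER = """
#define XTOP_CTRL_ADDR_AP_CTRL       0x00
#define XTOP_CTRL_ADDR_GIE           0x04
#define XTOP_CTRL_ADDR_IMG_8B_DATA   0x10
#define XTOP_CTRL_BITS_IMG_8B_DATA   32
#define XTOP_CTRL_ADDR_RES_DATA      0x18
"""

# Match only "#define <NAME containing _ADDR_> <hex value>" lines.
_DEFINE_RE = re.compile(
    r"^#define[ \t]+(\w*_ADDR_\w+)[ \t]+(0x[0-9a-fA-F]+)[ \t]*$",
    re.MULTILINE,
)


def parse_offsets(header_text):
    """Return {macro_name: offset} for every *_ADDR_* define in the text."""
    return {name: int(value, 16)
            for name, value in _DEFINE_RE.findall(header_text)}


offsets = parse_offsets(SAMPLE_HEADER)
for name, offset in offsets.items():
    print(f"{name} = 0x{offset:02x}")
```

The offsets found this way can then be compared with the constants used in the notebook (ADDR_AP_CTRL and so on) after every regeneration of the IP.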

Thank you.

  1. I added an s_axilite interface to the return port, so I now have 3 ports, and I assigned each port its own bundle:
    IP3_Interface

The entire HLS program is the following (only the interface pragmas changed):

  2. As far as I understand, the input port for the array requires an m_axi port, while the return port for CTRL uses an s_axilite port. How can I simply bundle them together?

  3. The updated block diagram is here:

I get these 4 critical warnings but don’t know what to do about them, so I just ignored them.

  4. In PYNQ, I updated the execution function with the ap_idle check.
    overlay_ex

Before execution, print(mmio.read(ADDR_AP_CTRL)) returns 4, which means ap_idle=1;
After execution, print(mmio.read(ADDR_AP_CTRL)) returns 131!!!
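
The two values above can be decoded bit by bit. The sketch below assumes the standard control register layout that HLS generates for an s_axilite control interface (ap_start in bit 0, ap_done in bit 1, ap_idle in bit 2, ap_ready in bit 3, auto_restart in bit 7). `decode_ap_ctrl` is a hypothetical helper name.

```python
# Bit positions in the control register at offset 0x00, assuming the
# standard HLS ap_ctrl_hs layout on an s_axilite interface.
AP_CTRL_BITS = {
    "ap_start": 0,
    "ap_done": 1,
    "ap_idle": 2,
    "ap_ready": 3,
    "auto_restart": 7,
}


def decode_ap_ctrl(value):
    """Return a dict mapping each control flag name to 0 or 1."""
    return {name: (value >> bit) & 0x1 for name, bit in AP_CTRL_BITS.items()}


for raw in (4, 131):
    flags = decode_ap_ctrl(raw)
    set_flags = [name for name, is_set in flags.items() if is_set]
    print(raw, bin(raw), set_flags)
# 4 0b100 ['ap_idle']
# 131 0b10000011 ['ap_start', 'ap_done', 'auto_restart']
```

Under that assumption, 131 means ap_start, ap_done, and auto_restart are set and ap_idle is clear, so the IP has finished at least one run and has been told to restart automatically.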

It seems that my overlay no longer gets stuck here!!! (In the past, ap_idle was never 1, so the overlay got stuck at this execution step.)

The new problem is that the output at ADDR_OUTBUFFER is never updated.

However, I don’t know which step went wrong. (Is the FPGA not working, or is only the output not being transferred to RAM?)
I tested it by changing the clock_wizard IP block’s output frequency from 100MHz to 10MHz.

I recorded the time before/after execution, as the figure in step 4 shows, and found that the execution time doesn’t change between 100MHz and 10MHz. Does this mean the FPGA is not running?

Sincerely thanks.

Hey, could you add 2 more pragmas like:

#pragma HLS INTERFACE s_axilite port=img_8b bundle=img8baddr
#pragma HLS INTERFACE s_axilite port=res bundle=resaddr

and in python:
mmio.write(s_axi_lite_img8baddr_port, some_physical_address)

You could take a look here: GitHub - bartokon/Eclypse-Z7-Notebooks: Some basic notebooks for Eclypse-Z7
I have made some (bad) designs, but they will show you how to use IPs with PYNQ and how to call them.
The source code is in the .zip and the notebooks… you know the rest
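
The `mmio.write(..., some_physical_address)` step suggested above could look roughly like the sketch below. `write_buffer_address` is a hypothetical helper, the register offsets must come from your own generated design, and whether the address register is 32 or 64 bits wide depends on how the IP was generated, so that is left as a parameter.

```python
def write_buffer_address(mmio, offset, phys_addr, addr_64bit=False):
    """Write a buffer's physical address into an IP's address register.

    mmio:       object with a write(offset, value) method, e.g. pynq.MMIO.
    offset:     register offset of the pointer argument (placeholder here;
                take the real value from your generated design).
    phys_addr:  physical address of the buffer, e.g. the physical_address
                attribute of a buffer returned by pynq.allocate.
    addr_64bit: assumption flag; set True only if the IP exposes a 64-bit
                address register (low word at offset, high word at offset+4).
    """
    if not addr_64bit and phys_addr > 0xFFFFFFFF:
        raise ValueError("address does not fit in a 32-bit register")
    mmio.write(offset, phys_addr & 0xFFFFFFFF)  # low 32 bits
    if addr_64bit:
        mmio.write(offset + 4, (phys_addr >> 32) & 0xFFFFFFFF)  # high 32 bits


# Usage on the board (not run here; ADDR_IMG is a placeholder offset):
#   from pynq import allocate
#   img = allocate(shape=(28, 28), dtype="u1")
#   write_buffer_address(mmio, ADDR_IMG, img.physical_address)
```

The buffer has to be physically contiguous memory, which is what `pynq.allocate` provides. An address taken from an ordinary NumPy array would not be usable by the IP.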