Problem with HLS Video Stream with AXI Master on Pynq-Z2

Hello everyone, I am trying to create an HLS design with an AXI stream interface and an AXI master buffer.
The module is a video passthrough that also stores a whole frame in a buffer.
The code is as follows:

 #include "hls_video.h"
 #include <ap_fixed.h>
 #include <ap_int.h>

 #define FRAME_WIDTH 1920
 #define FRAME_HEIGHT 1080

 #define MAX_DEPTH (FRAME_WIDTH*FRAME_HEIGHT)
 const int max_depth=MAX_DEPTH;

 typedef ap_uint<32> uint32;

 #define W 32
 typedef hls::stream<ap_axiu<W,1,1,1> >           AXI_STREAM;
 
void img_buffer(AXI_STREAM & streamIn, AXI_STREAM & streamOut, int * buffer, int rows, int cols)
{
 //#pragma HLS INTERFACE ap_ctrl_none port=return
 #pragma HLS INTERFACE s_axilite port=return
 #pragma HLS INTERFACE s_axilite port=rows
 #pragma HLS INTERFACE s_axilite port=cols
 #pragma HLS INTERFACE axis port=streamIn
 #pragma HLS INTERFACE axis port=streamOut
 #pragma HLS INTERFACE m_axi depth=max_depth port=buffer offset=slave


 ap_axiu<W,1,1,1> pixelIn;

 for (int y = 0; y < rows; y++) {
 #pragma HLS LOOP_TRIPCOUNT max=1080
     for (int x = 0; x < cols; x++) {
 #pragma HLS LOOP_TRIPCOUNT max=1920
 #pragma HLS PIPELINE II=1
 #pragma HLS LOOP_FLATTEN
         // Write pixel to buffer, then pass it through unchanged.
         pixelIn = streamIn.read();
         buffer[y*cols+x] = pixelIn.data;
         streamOut.write(pixelIn);
     }
 }
 }

I am connecting the HLS block to the HP0 as illustrated in the block design below.

To start the drivers, I am using the following jupyter notebook:

from pynq import Overlay
from pynq.lib.video import *
from pynq import DefaultIP

# configure HLS driver
class hls_img_bufferDriver(DefaultIP):
    def __init__(self, description):
        super().__init__(description=description)

    bindto = ['xilinx.com:hls:img_buffer:1.0']

    def img_buffer_func(self, rows, cols, buffer_ptr):
        self.write(0x00, 0x01)
        self.write(0x10, buffer_ptr)
        self.write(0x18, rows)
        self.write(0x20, cols)

overlay = Overlay("/home/xilinx/pynq/overlays/video/img_buffer.bit")

# allocate memory to AXI Master
from pynq import allocate

buffer = allocate(shape=1920*1080, dtype='int')
buffer_ptr = buffer.device_address

img_buffer = overlay.video.img_buffer_0
img_buffer.img_buffer_func(1080, 1920, buffer_ptr)

# configure HDMI drivers
hdmi_in = overlay.video.hdmi_in
hdmi_out = overlay.video.hdmi_out

hdmi_in.configure()
hdmi_out.configure(hdmi_in.mode)
hdmi_in.start()
hdmi_out.start()

hdmi_in.tie(hdmi_out)

However, I get no video output. I think I need to allocate memory before using the AXI master, but I am not sure what is wrong with my design.
I am lost for now. Could someone please help?

Thank you.

The only thing you might want to change in the code is moving the self.write(0x00, 0x01) line to the bottom of the function, so the accelerator doesn’t start until after you’ve programmed the registers. Other than that, things look OK.
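
As a minimal sketch of that reordering: the register offsets below are copied from the notebook in the question and should be checked against the HLS-generated driver header. RecordingIP is a hypothetical stand-in for the pynq DefaultIP object, used only so the write order can be inspected without a board.

```python
# Register offsets as used in the notebook above (assumed, not verified
# against the generated ximg_buffer_hw.h header).
CTRL = 0x00
BUFFER_PTR = 0x10
ROWS = 0x18
COLS = 0x20
AP_START = 0x01


def start_img_buffer(ip, rows, cols, buffer_ptr):
    """Program all arguments first, then start the accelerator."""
    ip.write(BUFFER_PTR, buffer_ptr)
    ip.write(ROWS, rows)
    ip.write(COLS, cols)
    ip.write(CTRL, AP_START)  # start only after every argument is set


class RecordingIP:
    """Hypothetical stand-in for a DefaultIP; it only records writes."""

    def __init__(self):
        self.writes = []

    def write(self, offset, value):
        self.writes.append((offset, value))


fake_ip = RecordingIP()
start_img_buffer(fake_ip, 1080, 1920, 0x10000000)  # address is made up
print(fake_ip.writes)
```

On the board, the body of start_img_buffer would replace the body of img_buffer_func, with self in place of ip.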

When you say that the output isn’t working are you getting a black screen or no output at all?

What board are you using? If you’re on a Z1 or Z2, it might be worth moving one of your AXI connections to HP2 or HP3: with all three masters at 1080p, you are right on the border of using all of the theoretical bandwidth on a single slave. A quick check would be to drop the resolution down to 720p and see if you get a signal back.
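
A rough back-of-the-envelope version of that bandwidth estimate is sketched below. Every number in it is an assumption rather than something measured on this design: 60 frames per second, 4 bytes per pixel for each of the three masters, and a 64-bit HP port clocked at a hypothetical 150 MHz. Substitute the values from your own block design.

```python
def stream_bandwidth(width, height, fps, bytes_per_pixel):
    """Bytes per second needed to move one video stream through memory."""
    return width * height * fps * bytes_per_pixel


def port_bandwidth(bus_bits, clock_hz):
    """Theoretical peak of an AXI port: one full-width beat per clock."""
    return (bus_bits // 8) * clock_hz


# Assumed values, not taken from the actual design.
MASTERS = 3
FPS = 60
BYTES_PER_PIXEL = 4
HP_PORT = port_bandwidth(bus_bits=64, clock_hz=150_000_000)

for name, (w, h) in {"1080p": (1920, 1080), "720p": (1280, 720)}.items():
    per_master = stream_bandwidth(w, h, FPS, BYTES_PER_PIXEL)
    total = MASTERS * per_master
    print(f"{name}: {per_master / 1e6:.1f} MB/s per master, "
          f"{total / 1e6:.1f} MB/s total, "
          f"{100 * total / HP_PORT:.0f}% of assumed port peak")
```

Under these assumptions the 1080p total is at or above the peak of a single port, while 720p leaves plenty of headroom, which is why dropping the resolution is a useful quick check.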

I’d also make sure that the output is working in isolation in case there is a timing problem - the HDMI outputs can be susceptible to timing violations.

Hope this provides a few things to try

Peter

Hi @PeterOgden,
thank you for the thorough suggestions.
I am using a Pynq Z2 board. The output is a black screen. When I print the buffer variable I can see numbers stored. I did not understand why I could not see a signal, since I am passing the input pixels straight to the output and only storing a copy of them in the buffer.
Anyway, I have a few things to try.
The implementation time is very long (more than 3 hours). If I remove the AXI master, the implementation takes just a few minutes.

Thank you

This is the video output after just moving the write(0x00, 0x01) to the bottom line of the function. I used write(0x00, 0x81) this time instead. It really looks like the first few lines of the frame.
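
For reference, the difference between the two control values can be decoded with a small sketch. It assumes the standard HLS s_axilite control register layout at offset 0x00 (ap_start in bit 0, ap_done in bit 1, ap_idle in bit 2, ap_ready in bit 3, auto_restart in bit 7); decode_control is a hypothetical helper, not part of pynq.

```python
# Bit positions of the HLS block-level control register (offset 0x00),
# assuming the default ap_ctrl_hs protocol on an s_axilite interface.
CONTROL_BITS = {
    "ap_start": 0,
    "ap_done": 1,
    "ap_idle": 2,
    "ap_ready": 3,
    "auto_restart": 7,
}


def decode_control(value):
    """Return which control bits are set in a control register value."""
    return {name: bool((value >> bit) & 1) for name, bit in CONTROL_BITS.items()}


for value in (0x01, 0x81):
    set_bits = [name for name, on in decode_control(value).items() if on]
    print(f"0x{value:02x}: {', '.join(set_bits)}")
```

So 0x01 starts the block for a single run over rows x cols pixels, while 0x81 also sets auto_restart so the block starts again each time it finishes.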

Hi @PeterOgden,
Connecting the AXI Master to an AXI Interconnect connected to HP2 did solve the problem.
Thank you very much indeed.