PYNQ-Z2 HLS OpenCV color conversion

Hi,
I'd like to send an RGB image from the PS to the PL, do some OpenCV processing (color conversion to grayscale for now), then transfer it back to the PS and display it. However, the code gets stuck at dma.recvchannel.wait().
Earlier I did a loopback using only the DMA and the transfer completed fine, but when I add my HLS core the image doesn't get transferred back.
What I can see is that the IP is idle on start, but AP_ready and AP_done are low. It exits its idle state when it receives the start signal (0x01 or 0x81 on addr 0x00). It doesn't clear the start signal, and it doesn't go back to its idle and ready state. I can also see that something does happen when starting the IP: if I don't start the IP, it stays stuck at dma.sendchannel.wait() instead.
Vivado +HLS 2019.2
pynq 2.5
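(Side note for anyone reproducing this: the values I'm reading at offset 0x00 follow the standard HLS s_axilite block-level control layout, where bit 0 is ap_start, bit 1 ap_done, bit 2 ap_idle, bit 3 ap_ready, and bit 7 auto_restart; 0x81 is therefore start + auto-restart. A small helper like this, hypothetical but matching that usual bit layout, makes the raw register reads easier to interpret:)

```python
# Decode the standard HLS block-level control register at offset 0x00.
# Bit layout: 0 = ap_start, 1 = ap_done, 2 = ap_idle, 3 = ap_ready, 7 = auto_restart.
def decode_ctrl(val):
    return {
        'ap_start':     bool(val & 0x01),
        'ap_done':      bool(val & 0x02),
        'ap_idle':      bool(val & 0x04),
        'ap_ready':     bool(val & 0x08),
        'auto_restart': bool(val & 0x80),
    }

# e.g. decode_ctrl(opencvtest.read(0x00))
```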

python code:

from pynq import Overlay
import numpy as np
import pynq.lib.dma
import time

from pynq import Xlnk
from PIL import Image
overlay = Overlay('/home/xilinx/pynq/overlays/made_by_benjamin/opencvtest/opencvtest.bit')
dma = overlay.axi_dma_0
#dma_recv = overlay.axi_dma_from_pl_to_ps
opencvtest = overlay.cvtcolour_0
xres_address=0x10
yres_address=0x18
image_path = '/home/xilinx/pynq/overlays/made_by_benjamin/opencvtest/aurora_background.jpg'
original_image = Image.open(image_path)
original_image.load()
input_array = np.array(original_image)
xres, yres = original_image.size
print(xres)
print(yres)
xlnk = Xlnk()
in_buffer = xlnk.cma_array(shape=(yres, xres, 3),
                           dtype=np.uint8, cacheable=1)
out_buffer = xlnk.cma_array(shape=(yres, xres, 3),
                            dtype=np.uint8, cacheable=1)
in_buffer[:]=input_array
buf_image=Image.fromarray(in_buffer)
display(buf_image)
print(opencvtest.read(0x00))

opencvtest.write(xres_address,xres)
opencvtest.write(yres_address,yres)
print(opencvtest.read(xres_address))
print(opencvtest.read(yres_address))
def run_kernel():
    dma.sendchannel.transfer(in_buffer)
    dma.recvchannel.transfer(out_buffer)
    print('run kernel')
    opencvtest.write(0x00, 0x01)
    dma.sendchannel.wait()
    print(opencvtest.read(0x00))
    dma.recvchannel.wait()
run_kernel()
buf_image2=Image.fromarray(out_buffer)
display(buf_image2)
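(To check whether the readback is at least functionally right, I also compare against a pure-NumPy model of the RGB -> gray -> RGB round trip. This is just a software sketch using the usual BT.601 luma weights, so the HLS result may differ by a little rounding:)

```python
import numpy as np

def gray_roundtrip(rgb):
    # Software model of RGB -> grayscale -> RGB using BT.601 luma weights.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = (rgb.astype(np.float32) @ weights).astype(np.uint8)
    # Replicate the single gray channel back into three channels.
    return np.repeat(gray[..., None], 3, axis=2)

# expected = gray_roundtrip(input_array)   # compare against out_buffer
```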


cvtcolour.cpp:

#include "cvtcolour.h"

void cvtcolour(int xres, int yres, axi_stream& img_in, axi_stream& img_out){
#pragma HLS INTERFACE axis port=img_in
#pragma HLS INTERFACE axis port=img_out
#pragma HLS interface s_axilite port=return
#pragma HLS DATAFLOW
#pragma HLS INTERFACE s_axilite port=yres
#pragma HLS INTERFACE s_axilite port=xres
RGB_image imginmat(yres,xres);
RGB_image imggray(yres,xres);
RGB_image imgoutmat(yres,xres);

hls::AXIvideo2Mat(img_in, imginmat);
hls::CvtColor<HLS_BGR2GRAY>(imginmat, imggray);
hls::CvtColor<HLS_GRAY2RGB>(imggray, imgoutmat);
hls::Mat2AXIvideo(imgoutmat,img_out);

}

cvtcolour.h:

#include "hls_video.h"
#include "ap_fixed.h"
#include "ap_int.h"
#include "hls_stream.h"
#include "stdint.h"

#define max_width 1920
#define max_height 1200

typedef ap_axiu<32,1,1,1> axi_pixel;
typedef hls::stream<axi_pixel> axi_stream;
typedef hls::Mat<max_height,max_width,HLS_8UC3> RGB_image;
typedef hls::Mat<max_width,max_height,HLS_8UC1> gray_image;

void cvtcolour(int xres, int yres, axi_stream& img_in, axi_stream& img_out);

Block design: (screenshot omitted)

Is the dimension correct?

That seems to be an error in the code indeed, but the gray_image type isn't used in the .cpp file.
I have also tried using images significantly smaller than the maximum resolution, so it shouldn't matter.
Or can I only use images at the maximum defined resolution?

Based on my experience, you will have to provide images with consistent resolution. Otherwise the DMA may wait for more data to be transferred.

The DMA is not the problem, I think: a simple loopback through a FIFO executes as it should. In the HLS IP the matrix dimensions are set by xres and yres, and I coded the IP according to this tutorial: Leveraging OpenCV and High Level Synthesis with Vivado (v2013.1) - YouTube.
Meanwhile, I have tried with a fixed resolution and it still fails.

If the HLS core is expecting more data, you cannot provide a smaller amount of data. You tested the DMA without connecting the HLS IP, and that worked fine; but once you connect the HLS IP, it has its own requirements on how much data you provide.

Is it possible that the data format is incorrect? For example, the input stream might deliver data in 3x8b format while the IP core replies in 1x32b format. That way the DMA could be waiting for the buffer to be filled in 3x8b format. If I don't write 0x01 to the control register, the core hangs at dma.sendchannel.wait(); if I do start the core, it waits on dma.recvchannel.wait(). This implies the core does something at least.

That's exactly what I was talking about. If the HLS core expects 100 x 32b = 3.2 kb of data and you only provide 100 x 3x8b = 2.4 kb, then since the HLS core is still expecting more data, it will not generate the TLAST signal on its output side. The DMA then hangs, assuming the transaction is not complete.
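A quick way to see this size mismatch on the host side, assuming the core's stream really is one pixel per 32-bit word (as the ap_axiu<32,1,1,1> typedef suggests). One workaround is to pad each 24-bit pixel to a 32-bit word before the transfer; the resolution and buffer names below are only illustrative:

```python
import numpy as np

xres, yres = 640, 480  # illustrative resolution

# Bytes the DMA sends for a packed 24-bit RGB frame (3 bytes per pixel)
rgb_bytes = xres * yres * 3
# Bytes a 32-bit-per-pixel stream consumes for the same frame
stream_bytes = xres * yres * 4
# rgb_bytes < stream_bytes, so the core keeps waiting for more data.

# Workaround sketch: pad each pixel to 4 bytes (RGBx) before the transfer
# and drop the padding channel again after the readback.
rgb = np.zeros((yres, xres, 3), dtype=np.uint8)   # stand-in input frame
rgbx = np.zeros((yres, xres, 4), dtype=np.uint8)
rgbx[..., :3] = rgb                               # pixel data in the low 3 bytes
assert rgbx.nbytes == stream_bytes
```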

I've been trying to make it work by changing the data types. I fed my IP core random data in all shapes and sizes and tried receiving it in all shapes and sizes, and none of it worked. Are there any good resources on the use of HDMI? I may try it with video data instead; many more people do it that way.

I am guessing it should not be that complicated - maybe it is just a small thing you overlooked.

From your HLS code, it looks like this (note that I have changed the gray_image dimensions):

typedef hls::Mat<max_height,max_width,HLS_8UC3> RGB_image;
typedef hls::Mat<max_height,max_width,HLS_8UC1> gray_image;

You have

RGB_image imggray(yres,xres);

which I think should be changed to:

gray_image imggray(yres,xres);

I don’t know why you converted the color space back to RGB again (so the HLS IP is doing RGB->gray->RGB?).

The reason you need to change from RGB_image to gray_image is that the type must be consistent with what hls::CvtColor<HLS_BGR2GRAY> produces.