I rebuilt the base overlay for the PYNQ-Z1 (image version 2.7) and added an HLS IP between the output of the VDMA (MM2S) and the HDMI_OUT IP (in_stream):
I have an HDMI camera connected to HDMI IN and a screen connected to HDMI OUT. I can see the images from the camera when I load the base overlay, but when I load the passthrough one there is no HDMI signal and the screen goes into sleep mode.
The HLS IP takes RGBA (32-bit) pixels, which is why I set the hdmi_in/hdmi_out mode to PIXEL_RGBA. Is the issue with the HLS code itself, or with where the IP is inserted in the block design?
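For reference, this is roughly the Python side of what I'm doing. It's only a sketch: it assumes the rebuilt overlay keeps the base overlay's `video` hierarchy, and `passthrough.bit` is just a placeholder name for my bitstream:

```python
from pynq import Overlay
from pynq.lib.video import PIXEL_RGBA

ol = Overlay("passthrough.bit")        # placeholder name for the rebuilt overlay
hdmi_in = ol.video.hdmi_in
hdmi_out = ol.video.hdmi_out

# 32-bit RGBA on both sides so the pixel format matches the HLS IP
hdmi_in.configure(PIXEL_RGBA)
hdmi_out.configure(hdmi_in.mode, PIXEL_RGBA)

hdmi_in.start()
hdmi_out.start()
```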
Using structs and arrays to work with streams is discouraged in Vitis HLS, and (I believe) it should not work. In addition, you are not generating the user signal, which indicates the start of a frame.
Thanks for your help. I added the user signal but it still doesn’t work: I still don’t get a “valid” HDMI signal and the screen goes into sleep mode.
I am not using Vitis but Vivado HLS 2019.2 to generate the IP, and then Vivado 2020.2 for the overlay.
I’ve noticed that the out_stream of the hdmi_in IP also has tkeep, but the in_stream of hdmi_out doesn’t. Should the input of my IP also have tkeep, or is that optional?
As you can see in the last image, the DMA has 32 bits for tdata and I set PIXEL_RGBA. I understand that I now get four 8-bit channels, which is what I get with the pynqbuffer. However, if I allocate buffers for the DMA as in the tutorials, input_buffer = allocate(shape=(5,), dtype=np.uint32) and output_buffer = allocate(shape=(5,), dtype=np.uint32), how can I make the pynqbuffer that receives the HDMI frame also have dtype=np.uint32, so that I can stream each frame coming from the VDMA through the DMA? How should I use inframe, which is a pynqbuffer from the VDMA, to stream the data to the DMA instead of using input_buffer? As far as I understand it is a datatype issue, isn’t it? I insist on this because the original IP (with only tlast) works with the DMA using pre-stored images; I now want to use it with a live feed from the HDMI.
The buffer you get from the VDMA is always going to have multiple channels per pixel, so you want to pack 4x 8-bit channels into a single 32-bit element. In essence, you want to go from 1920x1080x4 uint8 to 1920x1080x1 uint32.
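Roughly like this (just a sketch; `frame` is a stand-in for the buffer you read from the VDMA, and the byte order has to match how your HLS IP splits tdata into channels):

```python
import numpy as np

# Stand-in for a 1920x1080 RGBA frame from the VDMA: (rows, cols, channels), uint8
frame = np.zeros((1080, 1920, 4), dtype=np.uint8)

# Pack the 4 channels into one uint32 per pixel (channel 0 in the low byte here)
packed = (frame[:, :, 0].astype(np.uint32)
          | (frame[:, :, 1].astype(np.uint32) << 8)
          | (frame[:, :, 2].astype(np.uint32) << 16)
          | (frame[:, :, 3].astype(np.uint32) << 24))   # shape (1080, 1920), uint32
```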
You can try to pack the 4 channels into a single uint32, but it is going to be slow as it involves shifts and multiplications. You can also explore numpy frombuffer and tobytes to reinterpret the pixels.
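For example (again only a sketch with a stand-in frame; note that `tobytes()` still copies the data, so this avoids the per-channel arithmetic but not the copy):

```python
import numpy as np

frame = np.zeros((1080, 1920, 4), dtype=np.uint8)   # stand-in for the VDMA frame

# Reinterpret the same bytes as one uint32 per pixel
packed = np.frombuffer(frame.tobytes(), dtype=np.uint32).reshape(1080, 1920)
```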
Thanks again for your help and the “trick” of using the physical address. I tried the frombuffer and tobytes methods, but performance decreased quite a lot, as expected.
That’s a fair question. I wanted to do it that way because I am using a specific image-processing library and I didn’t want to modify it. The passthrough following your example worked:
Or you can directly reuse buf_vdma, as pynq.allocate((1920,1080,4), dtype=np.uint8) is equivalent to pynq.allocate((1920,1080), dtype=np.uint32) in physical memory.
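Something like this (a sketch continuing the setup earlier in the thread; `axi_dma_0` is just an assumed name for the AXI DMA instance in your overlay):

```python
# hdmi_in and hdmi_out configured and started as shown earlier in the thread;
# "axi_dma_0" is an assumed name for the AXI DMA instance in the overlay
dma = ol.axi_dma_0

buf_vdma = hdmi_in.readframe()       # contiguous pynq buffer straight from the VDMA
out_frame = hdmi_out.newframe()      # frame buffer owned by the HDMI-out side

# The frame is physically contiguous, so it can be streamed through the DMA as-is,
# without repacking channels or copying into a separate uint32 buffer
dma.sendchannel.transfer(buf_vdma)
dma.recvchannel.transfer(out_frame)
dma.sendchannel.wait()
dma.recvchannel.wait()

hdmi_out.writeframe(out_frame)
```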