Improve numpy / frame processing

Hi,

I am trying to copy an 800x600 frame captured from the HDMI input into a bigger frame for output, and it's working great. Here is the code (pretty simple):

out_frame = hdmi_out.newframe()                  # allocate the (larger) output frame once
for _ in range(numframes):
    f = hdmi_in.readframe()                      # grab the latest 800x600 input frame
    out_frame[0:600, 0:800] = f[0:600, 0:800]    # copy it into the top-left corner
    hdmi_out.writeframe(out_frame)               # send the composed frame to HDMI out

The thing is that it is not as fast as needed (PYNQ-Z2, ~20 fps).

My question is: How can I improve that?

  1. Maybe adding an IP that can do the copying in programmable logic? I think it is the best approach, but it is out of reach for my skills right now.

  2. Maybe an optimized way of copying the frame that I don't know about?

In the end, what I need is a configurable PiP where the main source is a Python-generated graphic and the overlay is the HDMI input.

I’d love to hear your thoughts.
Thanks!

Have you tried setting cacheable_frames=False as shown in cell 7 of the HDMI introduction notebook?

On ZYNQ-7000 we get limited by how quickly we can flush the cache, so if your computation is simple you might get a significant bump by disabling the cache for frames entirely.
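In case it helps, a minimal sketch of how that looks (untested, following the pattern of the HDMI introduction notebook; the output mode here is just an example, and the attribute is set before configure()/start() so the frame buffers are allocated non-cacheable):

from pynq.overlays.base import BaseOverlay
from pynq.lib.video import VideoMode

base = BaseOverlay("base.bit")
hdmi_in = base.video.hdmi_in
hdmi_out = base.video.hdmi_out

# Turn off cacheable frames before configuring/starting the pipelines so the
# driver allocates non-cacheable buffers and no per-frame cache flush is needed.
hdmi_in.cacheable_frames = False
hdmi_out.cacheable_frames = False

hdmi_in.configure()
hdmi_out.configure(VideoMode(1280, 720, 24))   # example 1280x720, 24 bpp output

hdmi_in.start()
hdmi_out.start()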

Peter

Thanks Peter!

It made a difference: 20 → 27 fps. That is a lot. I think I need to push this number a bit further, so I suppose I need to explore a hardware solution instead, don't I?

You might be able to do something somewhat hacky using the existing base overlay - note that this hasn't been tested; it's just a sketch of an idea.

The key is that the VideoMode class has a stride attribute which sets the byte offset between the start of each line. By default it's the number of bytes in a line, as the frame is contiguous, but that's not required.

If your output frame is smaller than your input frame (as seems to be implied in your post) you can set hdmi_out.mode to have the same stride as hdmi_in.mode. Then you can create an output frame referencing the same buffer - out_frame = in_frame[0:600, 0:800] - and pass that directly to hdmi_out.writeframe.
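Roughly something like this (again untested; it assumes VideoMode accepts stride as a constructor argument and that writeframe accepts a view into the input buffer):

from pynq.overlays.base import BaseOverlay
from pynq.lib.video import VideoMode

base = BaseOverlay("base.bit")
hdmi_in = base.video.hdmi_in
hdmi_out = base.video.hdmi_out

hdmi_in.configure()
in_mode = hdmi_in.mode

# Output mode is 800x600 but keeps the input frame's stride, so an 800x600
# slice of an input frame is already laid out the way the output expects.
out_mode = VideoMode(800, 600, in_mode.bits_per_pixel, stride=in_mode.stride)
hdmi_out.configure(out_mode)

hdmi_in.start()
hdmi_out.start()

in_frame = hdmi_in.readframe()
out_frame = in_frame[0:600, 0:800]   # a view into the same buffer, no copy
hdmi_out.writeframe(out_frame)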

If the output is bigger, that's more painful, as we don't have a way to overallocate the number of lines as part of the video mode, so you'd need to work out how to modify the existing Python code to do that.

Peter

Thanks again!

As you say, the output is bigger (1280x720) and the input is 800x600. The idea is to have some kind of background over which the HDMI input is overlaid. Maybe this is too much for a Z2, or maybe the only way is having logic on the FPGA that could do the copying via VDMA. In my mind the idea is to have two framebuffers that the logic blends without Python having to intervene. But for that I would need to hire someone; it would take me weeks to figure out if it is even possible.

Thanks a lot for your help!

For those who may read this post: I found a solution that I am trying out now. You can use GitHub - Xilinx/PYNQ-ComputerVision: Computer Vision Overlays on Pynq and its remap function.
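To illustrate the remap idea for anyone following along: you build per-pixel coordinate maps that place the 800x600 input inside the 1280x720 output, and the remap engine does the copy for you. Here is a pure-software sketch with OpenCV's cv2.remap (the hardware-accelerated remap in PYNQ-ComputerVision is driven by the same kind of maps, though its exact API may differ, and the PiP offset below is just an example):

import numpy as np
import cv2

IN_W, IN_H = 800, 600      # HDMI input resolution
OUT_W, OUT_H = 1280, 720   # output frame resolution
X0, Y0 = 240, 60           # example top-left corner of the PiP window

# Inside the PiP window the maps point back into the input frame; everywhere
# else they point out of range so the constant border (background) is used.
mapx = np.full((OUT_H, OUT_W), -1, dtype=np.float32)
mapy = np.full((OUT_H, OUT_W), -1, dtype=np.float32)
xs, ys = np.meshgrid(np.arange(IN_W), np.arange(IN_H))
mapx[Y0:Y0 + IN_H, X0:X0 + IN_W] = xs
mapy[Y0:Y0 + IN_H, X0:X0 + IN_W] = ys

in_frame = np.zeros((IN_H, IN_W, 3), dtype=np.uint8)   # placeholder for the captured frame
out_frame = cv2.remap(in_frame, mapx, mapy, cv2.INTER_NEAREST,
                      borderMode=cv2.BORDER_CONSTANT, borderValue=(0, 0, 0))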