Composable Overlay Questions

I am interested in the potential of PYNQ Video Composable Overlays for a project.

I have non-video (focal plane) data that I would like to process and send from IC to IC. For example:

  1. Receive data from the focal plane
  2. DMA it into the PL
  3. Adjust the gain
  4. Fix bad pixels
  5. Suppress the background

  1. I see that there are only a couple of supported boards. How difficult would it be to use a custom board?

  2. How difficult would it be to convert focal plane data into a usable format? It will not be HDMI input at 1080p/60 Hz, etc. It might be 4K at 2 Hz, for example.

Thanks,
John


What is the electrical connection to the sensor? Is it high-speed serial, like MIPI-CSI or LVDS, or is it a parallel format? If it's MIPI-CSI, you could connect via the MIPI interface on a KV260 Kria board, for example. Otherwise you might need an interfacing board to bring the sensor output into the Xilinx board, plus PL logic to convert it into an AXI Video Stream.

You use the term "DMA it into the PL"… I think you mean stream it into the PL.

I'm guessing your pipeline should look something like:

Sensor → interface board → PL logic (data-to-video-stream conversion) → image processing algorithms → VDMA (for sharing in a PYNQ notebook, or redirected to an HDMI/DisplayPort output).


Hi @jcollier,

I will be posting a tutorial/example about the composable overlay shortly. The hardware side of the composable overlay can be implemented in any Xilinx FPGA, as we use standard IP. On the other hand, the software side relies heavily on PYNQ.

As long as your data can be streamed, aka AXI4-Stream, you can use the concepts of a composable overlay.

Mario

@DarthSimpson - Thanks for the reply. Honestly, I don't know the final configuration. The sensor team will provide some electrical connection into the PL; I don't know if it will be via a PHY or an interfacing board. I would guess the latter.

Yes - it would be stream it into the PL. Thanks for helping me with the nomenclature.

For now, I am using simulated data that is received on the PS and then sent into the PL via DMA. What I am trying to do is chain multiple operations together so that the PS does not have to be involved between each step. I want the PS to send the data (an array of floats) via DMA into the PL and execute processing steps like:

  1. Gain correction (multiply the entire focal plane by a float)
  2. Mask out bad pixels (zero out values for certain pixels)
  3. Apply a pixel clipping region
  4. Return the processed frame

As it stands now, I need to create an IP that does all of these steps.

I am hoping that the AXI4-Stream switch (and the composable overlay in general) will give me the flexibility to enable/disable and tune each step of the pipeline.

My concern was that it is not “video.” There is no HDMI timing signal, etc. It does not come in via the HDMI or DisplayPort IP.

I am new to this but have successfully updated and built the bitstreams for the FIR demo (How to accelerate a Python function with PYNQ - FPGA Developer).

Thanks,
John


@marioruiz,

Thanks for the reply. My plan is to use PYNQ for this R&D effort and the answer you gave is very important - “As long as your data can be streamed, aka AXI4-Stream, you can use the concepts of a composable overlay.”

For now, I only have a ZCU104 board, which is not supported by the Video Overlay. I am purchasing a PYNQ-ZU so I can go through the examples.

Can you discuss the porting effort to get the Video Overlay working on a different board? Our final system will likely be based on an XQ-ZU28DR board.

After watching the online demos and reading some documents, this seems like a good fit for what I am trying to do. Or is it specifically tailored to video processing standards?

Thanks,
John

You do not need to worry about making your video match HDMI timings. The key thing is that your video uses the AXI Video Stream format, with the AXI side-band signals TUSER (marking the start of a video frame) and TLAST (marking the end of each line). The pipeline will process video frames as fast as the IP blocks allow. I have implemented processing pipelines with frame rates above 1 kHz, and used PYNQ to display the captured images via VDMA transfers (obviously not at 1 kHz) into a PYNQ notebook as a sanity check.
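To make the side-band convention concrete, here is a minimal Vitis HLS-style sketch (the function name and frame dimensions are made-up placeholders, not from any shipped IP) that writes one ROWS × COLS frame into an AXI4-Stream, asserting TUSER on the first pixel of the frame and TLAST on the last pixel of each line:

```cpp
#include <ap_axi_sdata.h>
#include <hls_stream.h>

// One 32-bit data beat with 1-bit TUSER/TLAST side-band signals
typedef ap_axiu<32, 1, 1, 1> pixel_t;

// Hypothetical frame generator: illustrates where TUSER/TLAST assert.
// ROWS/COLS are placeholders for your focal-plane dimensions.
template <int ROWS, int COLS>
void write_frame(const ap_uint<32> img[ROWS][COLS],
                 hls::stream<pixel_t>& out) {
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
#pragma HLS PIPELINE II=1
            pixel_t beat;
            beat.data = img[r][c];
            beat.keep = -1;                  // all bytes valid
            beat.user = (r == 0 && c == 0);  // start of frame
            beat.last = (c == COLS - 1);     // end of line
            out.write(beat);
        }
    }
}
```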

I haven't used the composable overlay yet, but I'm certain it can do this…

So, without a real sensor in your system yet, you are injecting data into the PL for processing using (V)DMA. That is fine; I have done this with a webcam before. Just make sure your data is formatted correctly for image processing, e.g. 8-bit R, G, B channels, etc.

I would start off with a basic setup: PS → PL (via VDMA), some basic processing, and then PL → PS (via VDMA).

Regarding the image processing algorithms, this sounds like a perfect job for Vivado/Vitis HLS. Have you developed anything with Vivado/Vitis HLS yet?
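To give a flavour of what one of your steps could look like as its own stream IP, here is a hedged Vitis HLS sketch of a gain-correction stage (the function name and the floats-packed-into-32-bit-beats format are assumptions about your data, not a definitive implementation). Each such kernel gets its own ports on the AXI4-Stream switch, so the pipeline can route around it when the step is disabled:

```cpp
#include <ap_axi_sdata.h>
#include <hls_stream.h>
#include <cstdint>

// One 32-bit beat with 1-bit TUSER/TLAST side-band signals
typedef ap_axiu<32, 1, 1, 1> beat_t;

// Hypothetical gain stage: multiplies every focal-plane sample
// (a float packed into the 32-bit data field) by a run-time gain
// that the PS sets over AXI4-Lite.
void gain_stage(hls::stream<beat_t>& in, hls::stream<beat_t>& out,
                float gain) {
#pragma HLS INTERFACE axis port=in
#pragma HLS INTERFACE axis port=out
#pragma HLS INTERFACE s_axilite port=gain
#pragma HLS INTERFACE s_axilite port=return
    beat_t beat;
    do {
#pragma HLS PIPELINE II=1
        beat = in.read();
        union { uint32_t u; float f; } conv;
        conv.u = beat.data.to_uint();
        conv.f *= gain;            // the actual gain correction
        beat.data = conv.u;        // TUSER/TLAST pass through untouched
        out.write(beat);
    } while (!beat.last);          // TLAST marks the end of the DMA packet
}
```

The other steps (bad-pixel masking, clipping) would follow the same pattern, each with its own AXI4-Lite parameters.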

What state is your design in at the moment?


Hi @jcollier,

If you want to port the composable video pipeline to the ZCU104 completely, the effort is huge as you will have to add DFX regions.

However, if you are not interested in the DFX portion, the effort to implement a composable overlay is minimal. I am finalising the documentation/tutorial for this; the steps to create your own composable overlay should be clear then.

Mario

Thanks. I look forward to seeing the tutorial.

Hi @jcollier, where are you purchasing it from? I stumbled on this thread by chance because I am also looking to purchase a PYNQ-ZU board for an HDMI 2.0 project. I’ve been searching for a couple of months now but I haven’t been able to find anywhere to actually buy it.

If you’d be willing to share where/how you were able to buy one, it would be massively appreciated. Thanks!


🙂 I didn't know it was so difficult either. I have been approved to buy one and haven't been able to source it. I have sent multiple emails to TUL and not gotten a response, and I have checked eBay and AliExpress with no success. Sorry that we are in the same boat.


That has been my experience as well.

I’ve sent emails to TUL (and directly to Xilinx) without a response. Granted, I am only a hobbyist, so perhaps I'm just being ignored, but so far, despite lots of documentation online, the board itself seems like vaporware.

If you do manage to get your hands on one, please let us know!

Would it be possible to add support for an UltraScale+ board (like the ZCU104)? There is apparently a supply issue with the PYNQ-ZU boards.

Thanks,
John

Hi @jcollier,

We do not plan to support the ZCU104. You can port the composable video pipeline or create your own composable overlay. In this tutorial you can find the basic steps to create a composable overlay:

https://pynq-composable.readthedocs.io/en/latest/tutorial/composable_overlay.html

Mario

@marioruiz Thanks for the pointer to the overlay tutorial. I am looking at it right now. I saw that the Kria KV260 is supported and tried to get one of those, but they are unavailable right now as well (distributor estimates are August 2022).

I have been taking the PYNQ HelloWorld demo and swapping out functions (like the Gaussian blur), and I have a question comparing the composable overlay source files to the HelloWorld source files:

The composable overlay files use AXIvideo2xfMat, while the HelloWorld file defines its own axis2mat function. I saw a comment that HelloWorld doesn't use the video memory while the new one does. If I am not trying to use HDMI in/out, should I use the custom functions? I am not sure of the differences in memory use cases. Can you provide some insight or pointers to docs that could help?

Thanks,
John

Hi,

The PYNQ HelloWorld uses a regular DMA, as you indicate; this IP does not provide the information necessary to interpret an image. That is why we use the custom axis2mat to generate the start-of-frame and end-of-line information.

The VDMA does generate these signals; you can refer to PG043 (page 13 onwards) for more information about how a video stream is encoded in AXI4-Stream.

If you want to interoperate with the Vitis Vision libraries, you should use AXIvideo2xfMat.
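For reference, a typical wrapper looks something like this sketch (the image dimensions, pixel type, and the choice of Gaussian blur are illustrative placeholders, not the HelloWorld source): AXIvideo2xfMat consumes the TUSER/TLAST-annotated stream into an xf::cv::Mat, the vision function processes it, and xfMat2AXIvideo re-encodes the result:

```cpp
#include <ap_axi_sdata.h>
#include <hls_stream.h>
#include "common/xf_common.hpp"
#include "common/xf_infra.hpp"
#include "imgproc/xf_gaussian_filter.hpp"

#define HEIGHT 1080            // placeholder frame dimensions
#define WIDTH  1920
#define NPC    XF_NPPC1        // one pixel per clock

typedef ap_axiu<8, 1, 1, 1> axis_t;

// Sketch of an AXI4-Stream video wrapper around a Vitis Vision
// function (here a Gaussian blur on 8-bit grayscale).
void vision_stage(hls::stream<axis_t>& in, hls::stream<axis_t>& out,
                  float sigma) {
#pragma HLS INTERFACE axis port=in
#pragma HLS INTERFACE axis port=out
#pragma HLS INTERFACE s_axilite port=sigma
#pragma HLS INTERFACE s_axilite port=return
#pragma HLS DATAFLOW
    xf::cv::Mat<XF_8UC1, HEIGHT, WIDTH, NPC> src(HEIGHT, WIDTH);
    xf::cv::Mat<XF_8UC1, HEIGHT, WIDTH, NPC> dst(HEIGHT, WIDTH);

    // Stream -> Mat: consumes TUSER (start of frame), TLAST (end of line)
    xf::cv::AXIvideo2xfMat(in, src);
    xf::cv::GaussianBlur<3, XF_BORDER_CONSTANT, XF_8UC1, HEIGHT, WIDTH, NPC>(
        src, dst, sigma);
    // Mat -> stream: regenerates the side-band signals
    xf::cv::xfMat2AXIvideo(dst, out);
}
```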

Mario

@marioruiz Thanks for the quick answer. I have a clarifying question (of course).

If my use case doesn't use HDMI in/out but I want to use the Vitis Vision OpenCV functions in HLS, should I use a regular DMA with the custom axis2mat, or should I use AXIvideo2xfMat? I am trying to understand my options.

Some background on my application (which I am sure is not unusual): I currently get data on the PS and send it to the PL for processing. I want to use the axis switch to change the processing steps in the PL. Long term, we will get our data from a PHY on the PL side, process it, and then send processed data to the PS.

This sounds very much like a great use case for the composable overlay.

Thanks much,
John

Hi,

Personally, I prefer the VDMA in this context because it preserves the notion of frames. You will have to use the pixel pack and unpack IP if you go with this option.

If you use the DMA, you can move axis2mat and mat2axis into individual IP, because all of the Vitis Vision functions expect the images/frames to be encoded as indicated in PG043.
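As a rough illustration (not the HelloWorld source), a stand-alone axis2mat-style converter for a plain DMA stream could look like the sketch below; the frame dimensions must be known up front because a plain stream carries no start-of-frame marker:

```cpp
#include <ap_axi_sdata.h>
#include <hls_stream.h>
#include "common/xf_common.hpp"

#define HEIGHT 1080   // placeholder: must be known at compile time
#define WIDTH  1920

typedef ap_axiu<8, 1, 1, 1> axis_t;

// Illustrative stand-alone converter: plain AXI4-Stream (no TUSER
// start-of-frame marker) into an xf::cv::Mat. The pixel count is
// fixed up front because the plain DMA stream is just raw data.
void axis2mat_sketch(hls::stream<axis_t>& in,
                     xf::cv::Mat<XF_8UC1, HEIGHT, WIDTH, XF_NPPC1>& img) {
    for (int i = 0; i < HEIGHT * WIDTH; i++) {
#pragma HLS PIPELINE II=1
        axis_t beat = in.read();
        img.write(i, beat.data);   // stream-style write into the Mat
    }
}
```

A matching mat2axis would do the reverse, asserting TLAST once at the end of the transfer so the DMA can frame the packet.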

Both alternatives have their pros and cons. It is up to you as the designer to use the one that makes the most sense in your design.

Additionally, as part of your evaluation, I would suggest you read the documentation for the Python drivers available for these two IP:

https://pynq.readthedocs.io/en/latest/pynq_package/pynq.lib/pynq.lib.dma.html#module-pynq.lib.dma

Mario

Thanks again - I appreciate the information and the pointers to docs. Have a great day.
John
