I am interested in the potential of the PYNQ composable video overlays for a project.
I have non-video (focal plane) data that I would like to process and then send from IC to IC. For example, I want to:
Receive data from the focal plane
DMA it into the PL
Adjust the Gain
Fix bad pixels
Suppress the background
I see that there are only a couple of supported boards. How difficult would it be to use a custom board?
How difficult would it be to convert focal plane data into a usable format? It will not be HDMI input at 1080p/60 Hz, etc.; it might be 4K at 2 Hz, for example.
What is the electrical connection to the sensor: is it high-speed serial, like MIPI CSI or LVDS, or is it a parallel format? If it's MIPI CSI, you could connect via the MIPI interface on a KV260 Kria board, for example. Otherwise you might need an interfacing board to bring the sensor output into the Xilinx board, plus PL logic to convert it into an AXI video stream.
You use the term "DMA it into the PL"… I think you mean "stream it into the PL".
I'm guessing your pipeline should look something like:
Sensor → interface board → PL logic (data-to-video-stream conversion) → image processing algorithms → VDMA (for sharing in a Pynq notebook, or redirected to an HDMI/DisplayPort output).
I will be posting a tutorial/example about the composable overlay shortly. The hardware side of the composable overlay can be implemented on any Xilinx FPGA, as we use standard IP. The software side, on the other hand, relies heavily on PYNQ.
As long as your data can be streamed, aka AXI4-Stream, you can use the concepts of a composable overlay.
@DarthSimpson - Thanks for the reply. Honestly, I don’t know the final configuration. There will be some electrical connection into the PL by the sensor team. I don’t know if it will be via a PHY or an interfacing board. I would guess the latter.
Yes, it would be "stream it into the PL". Thanks for helping me with the nomenclature.
For now, I am using simulated data that is received on the PS and then sent into the PL via DMA. What I am trying to do is chain multiple operations together so that the PS is not involved between each step. I want the PS to send the data (an array of floats) via DMA into the PL and execute processing steps like the following (see the sketch at the end of this post):
Gain correction (multiply the entire focal plane by a float)
Mask out bad pixels (zero out values for certain pixels)
Apply a pixel clipping region
Return the processed frame
As it stands now, I need to create an IP that does all of these steps.
I am hoping that the AXI4-Stream switch (and the composable overlay in general) will give me the flexibility to enable/disable and tune each step of the pipeline.
My concern was that it is not “video.” There is no HDMI timing signal, etc. It does not come in via the HDMI or DisplayPort IP.
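To make this concrete, roughly what I have in mind on the software side is below. It is only a sketch: the bitstream name, the DMA instance name, the frame size, and the random stand-in for the simulated focal-plane data are all placeholders.

```python
import numpy as np
from pynq import Overlay, allocate

# Placeholder names; the real design exports its own bitstream and DMA instance.
ol = Overlay("stream_pipeline.bit")
dma = ol.axi_dma_0

rows, cols = 512, 512
in_buf = allocate(shape=(rows, cols), dtype=np.float32)
out_buf = allocate(shape=(rows, cols), dtype=np.float32)

# Stand-in for one simulated focal-plane frame.
in_buf[:] = np.random.rand(rows, cols).astype(np.float32)

# Push one frame through the PL chain (gain -> mask -> clip) and read it back,
# with no PS involvement between the steps.
dma.sendchannel.transfer(in_buf)
dma.recvchannel.transfer(out_buf)
dma.sendchannel.wait()
dma.recvchannel.wait()

processed = np.array(out_buf)
```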
Thanks for the reply. My plan is to use PYNQ for this R&D effort and the answer you gave is very important - “As long as your data can be streamed, aka AXI4-Stream, you can use the concepts of a composable overlay.”
I only have a ZCU104 board for now, which is not supported for the Video Overlay. I am purchasing a PYNQ-ZU so I can go through the examples.
Can you discuss the porting effort to get the Video Overlay to work on a different board? Our final system will likely be based on a XQ-ZU28DR board.
It seems like this is a good fit for what I am trying to do after watching the online demos and reading some documents. Or is this specifically tailored for video processing standards?
You do not need to worry about making your video match HDMI timings. The key thing is that your video uses the AXI4-Stream video format, with the side-band signals TUSER (used for start of video frame) and TLAST (used for end of each line). The pipeline will process video frames as fast as the IP blocks allow them to. I have implemented processing pipelines with frame rates >1 kHz, and used Pynq to display captured images using VDMA transfers (obviously not at 1 kHz) in a Pynq notebook as a sanity check.
I haven't used the composable overlay yet, but I'm certain it can do this…
So, without a real sensor in your system yet, you are injecting data into the PL for processing using (V)DMA. That is fine; I have done this with a webcam before. As long as your data is formatted correctly for image processing, e.g. 8-bit R, G, B channels, etc., you should be good.
I would start off with a basic setup: PS → PL (via VDMA), some basic processing, and then PL → PS (via VDMA), as in the sketch below.
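Something like this is what I mean; the bitstream/IP names are placeholders, it uses an 8-bit grayscale mode, and it assumes the PL routes the MM2S stream (through your processing) back to the S2MM channel:

```python
import numpy as np
from pynq import Overlay
from pynq.lib.video import VideoMode

ol = Overlay("vdma_loopback.bit")  # placeholder bitstream name
vdma = ol.axi_vdma_0               # placeholder VDMA instance name

mode = VideoMode(640, 480, 8)      # width, height, bits per pixel
vdma.writechannel.mode = mode      # MM2S: PS memory -> PL stream
vdma.readchannel.mode = mode       # S2MM: PL stream -> PS memory
vdma.writechannel.start()
vdma.readchannel.start()

# Send a test frame into the PL...
frame = vdma.writechannel.newframe()
frame[:] = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
vdma.writechannel.writeframe(frame)

# ...and read the processed frame back for inspection in the notebook.
out = vdma.readchannel.readframe()
```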
Regarding the image processing algorithms, this sounds like a perfect job for Vivado/Vitis HLS. Have you developed anything with Vivado/Vitis HLS yet?
If you want to port the composable video pipeline to the ZCU104 completely, the effort is huge as you will have to add DFX regions.
However, if you are not interested in the DFX portion, the effort to implement a composable overlay is minimal. I am finalising the documentation/tutorial for this; the steps to create your own composable overlay should be clear then.
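As a preview, driving a composable pipeline from a notebook looks roughly like the sketch below. The overlay file and IP names are placeholders, so treat the exact names as assumptions until the tutorial is published:

```python
from pynq_composable import VideoOverlay

ol = VideoOverlay("my_composable.bit")  # placeholder overlay
cpipe = ol.composable

# Route the AXI4-Stream switch:
# stream in -> gain -> bad-pixel mask -> clip -> stream out.
cpipe.compose([cpipe.ps_video_in, cpipe.gain_accel,
               cpipe.badpixel_accel, cpipe.clip_accel, cpipe.ps_video_out])

# Re-compose at runtime to bypass a stage, e.g. drop the bad-pixel mask.
cpipe.compose([cpipe.ps_video_in, cpipe.gain_accel,
               cpipe.clip_accel, cpipe.ps_video_out])
```

Each compose call reprograms the AXI4-Stream switch routing, which is what should give you the enable/disable flexibility discussed above.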
Hi @jcollier, where are you purchasing it from? I stumbled on this thread by chance because I am also looking to purchase a PYNQ-ZU board for an HDMI 2.0 project. I’ve been searching for a couple of months now but I haven’t been able to find anywhere to actually buy it.
If you’d be willing to share where/how you were able to buy one, it would be massively appreciated. Thanks!
I didn’t know it was so difficult either. I have been approved to buy one and haven’t been able to source it either. I have sent multiple emails to TUL and not gotten a response. I have checked ebay and aliexpress with no success either. Sorry that we are in the same boat.
I’ve sent emails to TUL (and directly to Xilinx) without a response. Granted I am only a hobbyist so perhaps I’m just being ignored, but so far, despite lots of documentation on the board online, the board itself seems like vaporware.
If you do manage to get your hands on one, please let us know!
We do not plan to support the ZCU104. You can port the composable video pipeline or create your own composable overlay. In this tutorial you can find the basic steps to create a composable overlay.
@marioruiz Thanks for the pointer to the overlay tutorial. I am looking at that right now. I saw that the Kria KV260 is supported and tried to get one of those, but they are unavailable right now too (distributor estimates are August 2022).
I have been taking the PYNQ HelloWorld demo and changing out functions (like the Gaussian Blur) and have a question comparing the Overlay src files to the HelloWorld src files:
The Composable Overlay files use AXIvideo2xfMat, while the HelloWorld file defines its own axis2mat function. I saw a comment that the HelloWorld doesn't use the video memory while the new one does. If I am not trying to use HDMI in/out, should I use the custom functions? I am not sure of the differences in the memory use cases. Can you provide some insight or pointers to docs that could help?
The PYNQ HelloWorld uses a regular DMA, as you indicate; this IP does not provide the necessary information to interpret an image. That is why we use the custom axis2mat: to generate the start-of-frame and end-of-line signals.
The VDMA generates these signals; you can refer to PG043 for more information about how a video stream is encoded in AXI4-Stream (page 13 onwards).
If you want to interoperate with the Vitis Vision libraries, you should use AXIvideo2xfMat.
@marioruiz Thanks for the quick answer. I have a clarifying question (of course).
If my use case doesn't use HDMI in/out but I want to use the Vitis Vision OpenCV functions in HLS, should I use a regular DMA and the custom axis2mat, or should I use AXIvideo2xfMat? I am trying to understand my options.
Some background on my application (which I am sure is not unusual): I currently get data on the PS and send it to the PL for processing. I want to use the axis switch to change the processing steps in the PL. Long term, we will get our data from a PHY on the PL side, process it, and then send processed data to the PS.
This sounds very much like a great use case for the composable overlay.
Personally, I prefer the VDMA in this context because it preserves the notion of frames. You will have to use pixel pack and unpack if you go with this option.
If you use DMA, you can move the axis2mat and mat2axis into individual IP, because all of the Vitis Vision functions expect the images/frames to be encoded as indicated in PG043.
Both alternatives have their pros/cons. It is up to you as designer to use the one that makes the most sense in your design.
Additionally, I would suggest you read the documentation for the Python drivers available for these two IPs as part of your evaluation.
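For a quick flavour of the two Python drivers side by side (a sketch, with placeholder bitstream/IP names):

```python
import numpy as np
from pynq import Overlay, allocate
from pynq.lib.video import VideoMode

ol = Overlay("design.bit")  # placeholder

# pynq.lib.dma.DMA: flat buffers, no notion of frames.
dma = ol.axi_dma_0
buf = allocate(shape=(480, 640), dtype=np.uint8)
dma.sendchannel.transfer(buf)
dma.sendchannel.wait()

# pynq.lib.video.AxiVDMA: frame-oriented transfers; remember the
# pixel pack/unpack IP in the PL if you take this route.
vdma = ol.axi_vdma_0
vdma.writechannel.mode = VideoMode(640, 480, 24)
vdma.writechannel.start()
frame = vdma.writechannel.newframe()
frame[:] = 0  # fill with image data
vdma.writechannel.writeframe(frame)
```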