I am trying to use a VDMA to stream inputs of varying sizes to a consumer IP that processes them. Currently, I have to stop the IPs in the design and restart the Python kernel before I can change the video mode of the VDMA's writechannel without corrupting the output of the consumer IP. By corrupting I mean the values are correct but out of position in the output buffer, as if the whole buffer had wrapped around and shifted by a few locations.
What is the correct way of stopping the VDMA to allow inputs with different dimensions without restarting the kernel?
- Custom ZU+ SoC
- PYNQ 2.7.0 image
- Xilinx tools 2020.2
- Ubuntu 18.04.6 set up using Vagrant file
```python
import numpy as np
from pynq import Overlay
from pynq.lib.video import VideoMode

# Load overlay
ol = Overlay('my_bitstream.bit')
my_ip = ol.my_ip
vdma = ol.vdma
vdma_in = vdma.writechannel

# Configure VDMA
input_w = 224  # cols
input_h = 224  # rows
vdma_in.mode = VideoMode(input_w, input_h, 24)

# Start IPs
vdma_in.start()
my_ip.start()  # self.write(0, 0x81) in the "driver" for the IP

# Send data
input_data = (np.random.rand(input_h, input_w, 3) * 255).astype(np.uint8)
in_frame = vdma_in.newframe()
in_frame[...] = input_data
vdma_in.writeframe(in_frame)

# Get result
out_data = my_ip.results()  # returns data from output buffer of my_ip in the "driver"

# Stop IPs
my_ip.stop()  # self.write(0, 0) in the "driver" for the IP
vdma_in.stop()

# Delete overlay
del ol
```
I am trying to benchmark my consumer IP with various input sizes to see how the hardware implementation compares to the software implementation at each size. Some of the benchmarks may take a while, and I would like to be able to run them back to back without restarting the kernel.
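A minimal sketch of the back-to-back benchmark loop described above, written against duck-typed objects so it can be tried without hardware. The helper name `run_benchmarks` and the `make_mode` / `make_input` callables are hypothetical; the stop/set-mode/start ordering simply mirrors the script above and is an assumption, not a confirmed fix for the shifted output.

```python
import time


def run_benchmarks(channel, consumer, sizes, make_mode, make_input, n_iters=10):
    """Hypothetical helper: measure frames per second for each (width, height).

    channel    -- object with mode/start/stop/newframe/writeframe,
                  e.g. vdma.writechannel (assumed, as in the script above)
    consumer   -- object with start/stop/results (the custom IP "driver")
    make_mode  -- callable (w, h) -> mode, e.g. lambda w, h: VideoMode(w, h, 24)
    make_input -- callable (w, h) -> data to copy into each frame
    """
    fps = {}
    for width, height in sizes:
        # Set the mode while the channel is stopped, then start both IPs.
        channel.mode = make_mode(width, height)
        channel.start()
        consumer.start()
        try:
            data = make_input(width, height)
            t0 = time.perf_counter()
            for _ in range(n_iters):
                frame = channel.newframe()
                frame[...] = data
                channel.writeframe(frame)
                consumer.results()
            elapsed = time.perf_counter() - t0
        finally:
            # Always stop in the same order as the original script,
            # even if an iteration raises.
            consumer.stop()
            channel.stop()
        fps[(width, height)] = n_iters / elapsed if elapsed > 0 else float("inf")
    return fps
```

On the board this would be called with something like `run_benchmarks(vdma_in, my_ip, [(224, 224), (320, 240)], lambda w, h: VideoMode(w, h, 24), make_input)`, where `make_input` builds the random `uint8` array as in the script above.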
- Do I need to create a new input frame using `vdma_in.newframe()` for every new input I pass to the VDMA? My benchmarked FPS drops by ~30% if I create a new frame for every new input.
When running in a loop, I seem to be able to get away with creating a single input frame outside the loop and reusing it. From the pynq.lib.video module documentation, it seems that once I hand ownership of the buffer back to the DMA using `vdma_in.writeframe(in_frame)`, I shouldn't access the buffer (frame) again, as it may be deleted without warning. I ran this setup for thousands of iterations without any issue.
- When I provide inputs whose dimensions are not multiples of 4, my output becomes inconsistent. Is this a limitation of the VDMA or more likely something in my IP?
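As a way to narrow down the multiple-of-4 question, here is a small sketch of the line-length arithmetic for the 24-bit mode used above. The 4-byte bus width (`bus_bytes=4`) is an assumption for illustration, not something taken from my design, and the check only looks at width. It does not by itself say whether the VDMA or the IP is responsible.

```python
def line_bytes(width, bits_per_pixel=24):
    """Number of bytes in one line of the frame."""
    return width * bits_per_pixel // 8


def is_line_aligned(width, bits_per_pixel=24, bus_bytes=4):
    """True if one line is a whole number of bus words.

    bus_bytes=4 (a 32-bit data path) is an assumed value.
    """
    return line_bytes(width, bits_per_pixel) % bus_bytes == 0


for width in (221, 222, 223, 224):
    print(width, line_bytes(width), is_line_aligned(width))
```

With 3 bytes per pixel, the line length is a multiple of 4 bytes only when the width itself is a multiple of 4, which matches the widths at which the output stays consistent.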
Thank you for your help.