DMA fails when I send data above a specific range


Board: PYNQ-Z2
Version: v2.5

I am trying to modify the DMA tutorial from the website below:

I want to implement a 3*3 convolution in the PL and use DMA to transfer data between the PS and the PL.
My design is set up as shown below:

When I send data below a specific range, the design works correctly. When I send data above that range, the input data is fine the first time, but some values are missing when I run it again. The DMA gets stuck at sendchannel when I re-run it many times.

Did I set something up wrong, or should I redesign my convolution? It seems to work well as long as the input data is within the range.


What do you mean by range?

I see that your kernel is an RTL kernel. Most of the time, DMA-related issues occur because tlast is not handled properly.
If this is the case, you need to post your question in the Xilinx forums, as this is not really related to PYNQ.


Hi, thanks for your reply!
I'm not sure whether the problem is tlast or not. I put my design between the FIFOs, and the FIFO depth is 1024. When I try to get 1031 results from the output buffer, the DMA fails. I tried increasing the output FIFO depth to 2048, and it works, but I don't understand why.

It would help if you could post your code.
You may have a race condition. Are you using dma.recvchannel.wait()?
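
For reference, a common ordering with the PYNQ DMA driver is to start the receive channel before the send channel and then wait on both. This is only a minimal sketch: it assumes `dma` is the AXI DMA object from your overlay and that both buffers are contiguous buffers (for example from `pynq.allocate`). The helper name `run_transfer` is made up for illustration.

```python
def run_transfer(dma, in_buf, out_buf):
    """Run one send/receive round trip on an AXI DMA (illustrative helper)."""
    # Arm the receive channel first so the receive side is ready
    # before any data starts flowing out of the IP.
    dma.recvchannel.transfer(out_buf)
    dma.sendchannel.transfer(in_buf)
    # Block until both directions report completion.
    dma.sendchannel.wait()
    dma.recvchannel.wait()
    return out_buf
```

If a call like this hangs in one of the `wait()` calls, that usually points at a mismatch between what the IP produces and what the receive buffer expects, rather than at the ordering itself.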


Hi, thanks for your reply!
Yes, I am using dma.recvchannel.wait().
Below is my code:

If I try to get 1031 outputs from the buffer, the DMA fails after I transfer data a few times.

Is there something I need to understand or be careful about?

Are you sure you are sending/receiving the right amounts of data?
It looks like you are sending more data than you receive. If the IP produces the same amount of data as it consumes, then this is your issue. You will eventually reach deadlock.

You could check this quickly by trying to increase the size, or do some extra transfers on the recv DMA.
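
To illustrate why a size mismatch can work for a while and then hang, here is a toy model in plain Python. It is not a model of the real AXI DMA or of your IP. It only assumes that any words the IP produces beyond what the receive transfer reads stay in the output FIFO, and that the pipeline stalls once the FIFO is full. The function name and the numbers in the usage lines are hypothetical.

```python
def runs_until_stall(produced, received, fifo_depth, max_runs=1000):
    """Return the run number at which the output FIFO overflows, or None.

    produced   -- words the IP emits per run (assumed value)
    received   -- words the receive DMA reads per run
    fifo_depth -- capacity of the output FIFO in words
    """
    backlog = 0
    for run in range(1, max_runs + 1):
        # Words that were produced but never read stay in the FIFO.
        backlog = max(0, backlog + produced - received)
        if backlog > fifo_depth:
            # The FIFO is full, so back-pressure stalls the IP and the send side.
            return run
    return None


# Hypothetical numbers: 9 unread words are left behind on every run.
shallow = runs_until_stall(produced=1040, received=1031, fifo_depth=1024)
deep = runs_until_stall(produced=1040, received=1031, fifo_depth=2048)
```

In this model a deeper FIFO only delays the stall. It goes away only when the amount read matches the amount produced.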

It looks like you are executing the same cells multiple times and out of order, so it is difficult to track exactly how you are executing your code. It may help if you can show more clearly how your code executes.


Hi! Thanks for your reply!
I am trying to design a 3*3 convolution IP, so I will send more data than I receive.
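
As a rough sketch of the expected sizes: for a convolution with no padding and a stride of 1, the number of outputs follows from the input dimensions. The code below assumes the IP behaves that way, which may not match the actual RTL. The function name and the example sizes are only illustrative.

```python
def conv_output_count(height, width, kernel=3):
    """Number of outputs of a no-padding, stride-1 convolution."""
    if height < kernel or width < kernel:
        # The kernel does not fit inside the input at all.
        return 0
    return (height - kernel + 1) * (width - kernel + 1)


# Example: a 5x5 input with a 3x3 kernel gives a 3x3 output.
n_in = 5 * 5
n_out = conv_output_count(5, 5)
```

Under that assumption, the receive buffer would hold `n_out` elements while the send buffer holds `n_in`.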

I execute the first four cells once and then run the last cell multiple times.