PYNQ: PYTHON PRODUCTIVITY

DMA halted error in PZ7030 custom board

Hi All,
I am continuously writing constant data to DDR through the DMA, and I am trying to read it continuously using the dma.recvchannel.transfer(out_buffer) function. I am able to read data 16 times and it works fine up to that point, but after the 16th read the DMA goes into the halted state. What could be the problem?
And how can I restart the DMA?

Here is the register output up to the 16th read and after the 16th read.
Up to the 16th read:
RegisterMap {
MM2S_DMACR = Register(RS=1, Reset=0, Keyhole=0, Cyclic_BD_Enable=0, IOC_IrqEn=1, Dly_IrqEn=0, Err_IrqEn=0, IRQThreshold=1, IRQDelay=0),
MM2S_DMASR = Register(Halted=0, Idle=0, SGIncld=0, DMAIntErr=0, DMASlvErr=0, DMADecErr=0, SGIntErr=0, SGSlvErr=0, SGDecErr=0, IOC_Irq=0, Dly_Irq=0, Err_Irq=0, IRQThresholdSts=0, IRQDelaySts=0),
MM2S_CURDESC = Register(Current_Descriptor_Pointer=0),
MM2S_CURDESC_MSB = Register(Current_Descriptor_Pointer=0),
MM2S_TAILDESC = Register(Tail_Descriptor_Pointer=0),
MM2S_TAILDESC_MSB = Register(Tail_Descriptor_Pointer=0),
MM2S_SA = Register(Source_Address=0),
MM2S_SA_MSB = Register(Source_Address=0),
MM2S_LENGTH = Register(Length=0),
SG_CTL = Register(SG_CACHE=0, SG_USER=0),
S2MM_DMACR = Register(RS=1, Reset=0, Keyhole=0, Cyclic_BD_Enable=0, IOC_IrqEn=0, Dly_IrqEn=0, Err_IrqEn=0, IRQThreshold=1, IRQDelay=0),
S2MM_DMASR = Register(Halted=0, Idle=1, SGIncld=0, DMAIntErr=0, DMASlvErr=0, DMADecErr=0, SGIntErr=0, SGSlvErr=0, SGDecErr=0, IOC_Irq=1, Dly_Irq=0, Err_Irq=0, IRQThresholdSts=0, IRQDelaySts=0),
S2MM_CURDESC = Register(Current_Descriptor_Pointer=0),
S2MM_CURDESC_MSB = Register(Current_Descriptor_Pointer=0),
S2MM_TAILDESC = Register(Tail_Descriptor_Pointer=0),
S2MM_TAILDESC_MSB = Register(Tail_Descriptor_Pointer=0),
S2MM_DA = Register(Destination_Address=939835392),
S2MM_DA_MSB = Register(Destination_Address=0),
S2MM_LENGTH = Register(Length=8192)
}

After the 16th read:
RegisterMap {
MM2S_DMACR = Register(RS=0, Reset=0, Keyhole=0, Cyclic_BD_Enable=0, IOC_IrqEn=1, Dly_IrqEn=0, Err_IrqEn=0, IRQThreshold=1, IRQDelay=0),
MM2S_DMASR = Register(Halted=1, Idle=0, SGIncld=0, DMAIntErr=0, DMASlvErr=0, DMADecErr=0, SGIntErr=0, SGSlvErr=0, SGDecErr=0, IOC_Irq=1, Dly_Irq=0, Err_Irq=0, IRQThresholdSts=0, IRQDelaySts=0),
MM2S_CURDESC = Register(Current_Descriptor_Pointer=0),
MM2S_CURDESC_MSB = Register(Current_Descriptor_Pointer=0),
MM2S_TAILDESC = Register(Tail_Descriptor_Pointer=0),
MM2S_TAILDESC_MSB = Register(Tail_Descriptor_Pointer=0),
MM2S_SA = Register(Source_Address=0),
MM2S_SA_MSB = Register(Source_Address=0),
MM2S_LENGTH = Register(Length=0),
SG_CTL = Register(SG_CACHE=0, SG_USER=0),
S2MM_DMACR = Register(RS=0, Reset=0, Keyhole=0, Cyclic_BD_Enable=0, IOC_IrqEn=0, Dly_IrqEn=0, Err_IrqEn=0, IRQThreshold=1, IRQDelay=0),
S2MM_DMASR = Register(Halted=1, Idle=0, SGIncld=0, DMAIntErr=1, DMASlvErr=0, DMADecErr=0, SGIntErr=0, SGSlvErr=0, SGDecErr=0, IOC_Irq=1, Dly_Irq=0, Err_Irq=1, IRQThresholdSts=0, IRQDelaySts=0),
S2MM_CURDESC = Register(Current_Descriptor_Pointer=0),
S2MM_CURDESC_MSB = Register(Current_Descriptor_Pointer=0),
S2MM_TAILDESC = Register(Tail_Descriptor_Pointer=0),
S2MM_TAILDESC_MSB = Register(Tail_Descriptor_Pointer=0),
S2MM_DA = Register(Destination_Address=939835392),
S2MM_DA_MSB = Register(Destination_Address=0),
S2MM_LENGTH = Register(Length=8192)
}

It looks like the S2MM/recv channel is signalling an internal error. According to the product guide this could be for the following reasons:

DMA Internal Error. This error occurs if the buffer length specified in the fetched descriptor is set to 0. Also, when in Scatter Gather Mode and using the status app length field, this error occurs when the Status AXI4-Stream packet RxLength field does not match the S2MM packet being received by the S_AXIS_S2MM interface. When Scatter Gather is disabled, this error is flagged if any error occurs during Memory write or if the incoming packet is bigger than what is specified in the DMA length register.

This error condition causes the AXI DMA to halt gracefully. The DMACR.RS bit is set to 0, and when the engine has completely shut down, the DMASR.Halted bit is set to 1.

• 0 = No DMA Internal Errors.
• 1 = DMA Internal Error detected.

Parsing that for your use case seems to indicate that the most likely event is that the function feeding your DMA engine is sending more than 8192 bytes of data on the 16th iteration for some reason.
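
To compare the two register dumps above more easily, the status bits can be decoded from a raw DMASR value. The following is a minimal, hardware-free sketch: `DMASR_FLAGS`, `decode_dmasr` and `has_error` are hypothetical helper names, and the bit positions are assumed from the AXI DMA product guide register description, so check them against the guide for your IP version.

```python
# Assumed bit positions of the single-bit fields in the AXI DMA status
# register (DMASR); field names match the register dump in this thread.
DMASR_FLAGS = {
    "Halted": 0,
    "Idle": 1,
    "SGIncld": 3,
    "DMAIntErr": 4,
    "DMASlvErr": 5,
    "DMADecErr": 6,
    "SGIntErr": 8,
    "SGSlvErr": 9,
    "SGDecErr": 10,
    "IOC_Irq": 12,
    "Dly_Irq": 13,
    "Err_Irq": 14,
}


def decode_dmasr(value):
    """Return the names of all single-bit flags set in a raw DMASR value."""
    return [name for name, bit in DMASR_FLAGS.items() if (value >> bit) & 1]


def has_error(value):
    """True if any of the non-scatter-gather error bits (4..6) is set."""
    return (value & 0x70) != 0


# Raw value rebuilt from the S2MM_DMASR dump after the 16th read:
# Halted=1, DMAIntErr=1, IOC_Irq=1, Err_Irq=1
failed = (1 << 0) | (1 << 4) | (1 << 12) | (1 << 14)
print(hex(failed), decode_dmasr(failed), has_error(failed))
```

On a board, the raw value could presumably be read from the S2MM_DMASR offset (0x34) through the driver's MMIO object, for example `dma.mmio.read(0x34)`, but treat that call as an assumption about the PYNQ driver rather than a documented recipe.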

You can use .start() functions on the DMA channels to restart them if the system is halted.

Peter

Hello Peter Orden,
thanks for the reply!!
After the 16th read I restarted the DMA using the dma.recvchannel.start() function, but I am still facing the same issue.

Thanks,
Gireesha

It’s likely the underlying problem with the stream is still there - can you chipscope the AXI stream connection to the DMA and get a better look at what’s going on?

Peter

Hello Peter Orden,
thanks for the reply!!
Okay, I will do that. One more question:
to restart the DMA I am using only the dma.recvchannel.start() function.
Is this function enough to restart the DMA, or is there a sequence required to restart it?

That will restart the DMA. There is a separate notion of resetting the DMA but we don’t have an API for that. You can do it manually by doing dma.register_map.S2MM_DMACR.Reset = 1 and waiting for a few milliseconds. You’ll need to .start the channel afterwards.
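
The manual reset described above can be sketched as a small helper. This is only an illustration under assumptions: `reset_and_restart` and its timeout parameters are hypothetical names, not part of PYNQ, and the polling relies on the Reset bit reading back 0 once the reset has completed, which is how the product guide is understood to describe it. The `dma` argument is expected to expose `register_map` and `recvchannel` like the DMA object used in this thread.

```python
import time


def reset_and_restart(dma, timeout_s=1.0, poll_s=0.001):
    """Soft-reset the DMA via S2MM_DMACR.Reset, then start the recv channel.

    Hypothetical helper: `dma` only needs `register_map.S2MM_DMACR.Reset`
    and `recvchannel.start()`, as used elsewhere in this thread.
    """
    dma.register_map.S2MM_DMACR.Reset = 1
    deadline = time.monotonic() + timeout_s
    # Assumption: Reset reads 1 while the reset is in progress, 0 when done.
    while dma.register_map.S2MM_DMACR.Reset:
        if time.monotonic() > deadline:
            raise TimeoutError("AXI DMA soft reset did not complete")
        time.sleep(poll_s)
    # The reset leaves the channel stopped, so it has to be started again.
    # If the MM2S side is used too, it would likely need start() as well.
    dma.recvchannel.start()
```

Polling the bit instead of sleeping for a fixed few milliseconds avoids guessing how long the reset takes, and the timeout keeps the notebook from hanging if the bit never clears. A reset clears the halted state but not its cause, so the same error can be expected to return if the stream feeding the DMA is still misbehaving.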

Peter

Hi Peter

I tried verifying the data with an ILA by capturing 16 frames of 8192 bytes each, and I observed no length mismatch between frames. But the problem remains the same. Can you please help me with a solution?

Regards
Akarsh

Are you able to share your design in a form that’s runnable?

Hello peter,

dma_cam_1.zip

I have attached my project. Please find the attachment.

Regards

Gireesha

I don’t have permission to download that file

Peter

Hello peter,

please download the project here.

Regards,
Gireesha

Do you have the code/notebook to go along with it?

hello Peter,
please find the code

from pynq import Overlay
from pynq import Xlnk
import numpy as np
import pynq.lib.dma
from pynq import allocate
import time
from PIL import Image
ol = Overlay("/home/xilinx/pynq/overlays/dma/apk_6.bit")
ol?
get_ipython().magic('pinfo ol')
dma_fifo = ol.axi_dma_0
print(dma_fifo.register_map)
out_buffer = allocate(shape=(2048,), dtype=np.uint32)
array = np.zeros([100, 2048, 3], dtype=np.uint8)
for k in range(100):
    print("\n")
    print(k)
    dma_fifo.recvchannel.start()
    dma_fifo.recvchannel.transfer(out_buffer)
    for m in range(2048):
        array[k,m] = [ (out_buffer[m] & 0x000000FF), ((out_buffer[m] & 0x0000FF00) >> 0x8), ((out_buffer[m] & 0x00FF0000)>> 0x10)]
img = Image.fromarray(array)
img.save('image.png')
print("image written\n")

Regards,
Gireesha

You have an LED attached to your FIFO’s almost_full output, is this showing anything?

My best guess for what’s going on is that your apk_0 core isn’t handling tready at all, meaning that when the DMA is being reconfigured or started up, arbitrary data beats are being lost, so tlast isn’t occurring when it should be (if at all). The 16 iterations is suspicious because it’s exactly the size of your data FIFO, which would be the amount of “clean” data before things start getting dropped. A quick check would be to change the size of the dma/axis_data_fifo_0 and see if that changes which iteration the DMA stops working.
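
The effect described above can be illustrated with a toy cycle-by-cycle model. This is a sketch under invented assumptions, not a model of the actual design: the function name `simulate` and the parameters `packet_len`, `fifo_depth` and `gap` are made up for illustration, and the numbers are not taken from the attached project. The source produces one beat per cycle with tlast on the last beat of each packet, a FIFO sits in between, and the sink stops accepting data for `gap` cycles after each packet to mimic the time spent re-arming the DMA from Python.

```python
from collections import deque


def simulate(honor_tready, packet_len=8, fifo_depth=16, gap=12, cycles=400):
    """Return the packet lengths (in beats) as seen by the sink."""
    fifo = deque()   # each entry is the tlast flag of one queued beat
    seen = []        # length of every packet the sink saw complete
    count = 0        # beats received in the packet currently arriving
    pos = 0          # position of the source inside its current packet
    busy = 0         # cycles left before the sink accepts data again
    for _ in range(cycles):
        # Sink: take one beat per cycle unless it is re-arming.
        if busy:
            busy -= 1
        elif fifo:
            tlast = fifo.popleft()
            count += 1
            if tlast:
                seen.append(count)
                count = 0
                busy = gap
        # Source: offer one beat per cycle.
        full = len(fifo) >= fifo_depth
        if full and honor_tready:
            continue  # stall and keep the beat until there is room
        if not full:
            fifo.append(pos == packet_len - 1)
        # If the FIFO is full and tready is ignored, the beat is lost here.
        pos = (pos + 1) % packet_len
    return seen


print("honouring tready:", simulate(True)[:6])
print("ignoring tready: ", simulate(False)[:6])
```

With back-pressure honoured, every packet arrives with `packet_len` beats. With it ignored, the first few packets are still clean because the FIFO absorbs the backlog, and after that the observed lengths stop matching `packet_len`. A packet that comes out longer than the programmed length is the case the product guide text quoted earlier describes for a halted channel with DMAIntErr set.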

Peter

Hello Peter,

  1. The LED state was changing when the FIFO was full.
  2. We have also changed the FIFO size, but we are still facing the same issue.

Regards,
Gireesha