Unable to restart DMA after transmit decode following error

I’ve got an overlay with PS → DMA MM2S → FIFO (in packet mode) → User Cores → FIFO → DMA S2MM → PS. Looking in the System ILA, I can see that when I kick off a transfer (in the same manner I’ve been doing from Python with other bitstreams) my data gets sent in as I’d expect. The correct number of samples and their content are sent in bursts that make sense based on the DMA core’s config.

After this transfer the MM2S_DMASR (status register) reports 0x5041: Halted due to a DMADecErr, with an error interrupt. Any further attempt to use the core to send gets “RuntimeError: DMA channel not started”, which makes sense as the core is halted.
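For reference, the status value can be decoded bit by bit. The sketch below is a minimal decoder, assuming the DMASR bit layout given in the AXI DMA product guide (PG021); the table and the function name are written out here for illustration and are not part of pynq.

```python
# Bit positions of MM2S_DMASR / S2MM_DMASR, assumed from the AXI DMA
# product guide (PG021). Reserved bits are omitted.
DMASR_BITS = {
    0: "Halted",
    1: "Idle",
    3: "SGIncld",
    4: "DMAIntErr",
    5: "DMASlvErr",
    6: "DMADecErr",
    8: "SGIntErr",
    9: "SGSlvErr",
    10: "SGDecErr",
    12: "IOC_Irq",
    13: "Dly_Irq",
    14: "Err_Irq",
}


def decode_dmasr(value):
    """Return the names of the flags set in a DMASR value (hypothetical helper)."""
    return [name for bit, name in sorted(DMASR_BITS.items()) if value & (1 << bit)]


print(decode_dmasr(0x5041))
```

For 0x5041 this lists Halted, DMADecErr, IOC_Irq and Err_Irq, so the completion interrupt is flagged alongside the decode error.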

Unfortunately the following clears the error but does not allow use of the core:
dma.register_map.MM2S_DMACR.Reset=1
time.sleep(1)
dma.register_map.MM2S_DMACR.RS=1
dma.register_map.S2MM_DMACR.RS=1
time.sleep(1)

resulting in “RuntimeError: DMA channel not idle”. Per the Xilinx DMA docs the DMA will report not idle until the first transfer after a reset. Perhaps there is something in pynq.lib.dma.DMA that also needs resetting. For that matter, a reset() helper might be useful here.
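A rough sketch of what such a reset helper could look like follows. It assumes the register offsets and bit positions from PG021 (MM2S_DMACR at 0x00, S2MM_DMACR at 0x30, RS as bit 0, Reset as bit 2, with Reset self-clearing once the core is done), and it only needs an object with read(offset) and write(offset, value) methods, such as a pynq MMIO. The helper name and the timeout are made up for illustration.

```python
import time

MM2S_DMACR = 0x00  # control register offsets, assumed from PG021
S2MM_DMACR = 0x30
RS_BIT = 0x1       # run/stop
RESET_BIT = 0x4    # soft reset, expected to self-clear when done


def soft_reset(mmio, timeout_s=1.0):
    """Soft-reset the AXI DMA and wait for the Reset bit to clear.

    Hypothetical helper. `mmio` is anything with read(offset) and
    write(offset, value). Returns True if the reset finished in time.
    """
    mmio.write(MM2S_DMACR, RESET_BIT)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        mm2s = mmio.read(MM2S_DMACR)
        s2mm = mmio.read(S2MM_DMACR)
        if not (mm2s & RESET_BIT) and not (s2mm & RESET_BIT):
            return True
        time.sleep(0.001)
    return False
```

On hardware this might be called as soft_reset(dma.mmio), followed by dma.sendchannel.start() and dma.recvchannel.start() rather than writing RS through register_map. My reading of the pynq driver is that the channel objects keep their own first-transfer flag, which start() sets, but I have not confirmed that this clears the “not idle” error.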

As to why I’m getting here in the first place, I’m really not sure.

This config (with unaligned support off) is what I’ve used in a number of overlays routed to a PS HPC Slave with 128 bit width through a smartconnect.

“DMA channel not idle” may mean your DMA is still waiting for more data. I would suggest you change the Width of Buffer Length Register to something larger. 17 bits means you can transfer a maximum of 128 KB of data. Since you did not paste your Python code, I am just guessing that your data buffer sizes have a mismatch somewhere.
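To make the arithmetic concrete: with a length register n bits wide, a single transfer can carry at most 2^n - 1 bytes. A small sketch of that check (the helper names are illustrative only, not part of pynq):

```python
def max_transfer_bytes(length_width_bits):
    """Largest single transfer, in bytes, for a given Buffer Length Register width."""
    return (1 << length_width_bits) - 1


def fits_in_one_transfer(nbytes, length_width_bits):
    """True if a buffer of nbytes can be sent as one DMA transfer."""
    return 0 < nbytes <= max_transfer_bytes(length_width_bits)


print(max_transfer_bytes(17))  # 131071, just under 128 KB
print(max_transfer_bytes(23))  # 8388607, just under 8 MB
```

For a NumPy-style buffer, the value to compare against is its nbytes attribute.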

Hi Rock

Just watching this thread… I’m having similar problems. I think I’ll start a new message for it though…

But in the meantime, and apologies if this is a silly question, does the allocated space in PYNQ have to be the same as the Buffer Length Register? Can you have arbitrary packet sizes, as long as they don’t exceed the Buffer Length Register?

@baileyji I’d be keen to find out what your solution is… my project is an image processing one, and I’m extracting features from each image… the feature list is variable on a frame-by-frame basis.

Thanks Darth

@DarthSimpson the space you allocate can be significantly smaller than the buffer length register. It may (I’m not clear on this, I imagine @Rock is) need to be a multiple of the stream MM word size configured in the DMA core if unaligned transfers aren’t allowed, AND/OR you may need to index when sending data so that your first send index is a multiple of the MM word size. For instance, if you allocate an np.uint16 buffer and have an MM word size of 128 bits, you may need to only start transfers from indices i*8. I’m not sure; this was inherent in my data so I’ve never played with it.
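The index arithmetic from that example can be sketched as below. This assumes the start of the buffer itself sits on a word boundary, and the helper names are made up. Whether the DMA really needs both conditions with unaligned transfers off is exactly the open question, so treat this as a way to test the idea rather than a stated requirement.

```python
def elements_per_word(word_bits, itemsize_bytes):
    """Number of buffer elements that make up one memory-mapped data word."""
    word_bytes = word_bits // 8
    if word_bytes % itemsize_bytes:
        raise ValueError("element size does not divide the word size")
    return word_bytes // itemsize_bytes


def is_word_aligned(start_index, count, word_bits, itemsize_bytes):
    """True if the slice [start_index, start_index + count) starts on a word
    boundary and covers a whole number of words."""
    n = elements_per_word(word_bits, itemsize_bytes)
    return start_index % n == 0 and count % n == 0


print(elements_per_word(128, 2))        # 8 uint16 values per 128-bit word
print(is_word_aligned(16, 64, 128, 2))  # True
print(is_word_aligned(3, 64, 128, 2))   # False
```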

I’ll update this when I know more. I pretty drastically oversimplified in my initial post. Rock is very much correct that trying to kick off a receive when the DMA core hasn’t gotten an inbound TLAST for S2MM will result in the message I posted.

In my case I don’t think that was the problem as I think I was kicking off a MM2S transfer that was smaller than the buffer length window into a pipeline that does not produce back pressure.

@Rock I’d also configured an AXI stream switch and a second DMA to try to support capturing data at different points in my pipeline. I gave up on that for now, so if the problem comes back as I reintroduce it, that might help sort out the root cause.

Yes, you can transfer data as long as it does not exceed the maximum buffer length. For your user core, I don’t know whether it is a Vivado or an HLS core; in the first case, it is likely that TLAST is not generated properly. If you already have an ILA connected, maybe you can verify the transactions near TLAST.
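If the ILA capture is exported, the TLAST check can also be scripted. The sketch below assumes the capture has already been loaded into a list of (tvalid, tready, tlast) tuples, one per clock cycle. The function name and the data layout are illustrative and are not an ILA or Vivado API.

```python
def packet_lengths(samples):
    """Count accepted beats per packet in an AXI4-Stream trace.

    samples: iterable of (tvalid, tready, tlast) values, one per clock cycle.
    A beat counts only when tvalid and tready are both high.
    Returns (lengths of completed packets, beats seen after the last TLAST).
    """
    lengths = []
    beats = 0
    for tvalid, tready, tlast in samples:
        if tvalid and tready:
            beats += 1
            if tlast:
                lengths.append(beats)
                beats = 0
    return lengths, beats


trace = [(1, 1, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 0), (1, 1, 0)]
print(packet_lengths(trace))
```

A non-zero leftover count at the end of a capture that should contain only whole packets would suggest a missing TLAST.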

Thanks Rock. I’ll try a System ILA block and try to catch the transaction.

I have run an RTL/CSIM cosimulation in HLS and TLAST is generated, but of course the System ILA should prove that out.

Thanks baileyji for getting back. My transactions are 64 bits at a time, and there can be a variable number of them per frame. I do not enable unaligned transfers, and I have the Width of Buffer Length Register set to 23 bits… Maybe I need to pad the transfers out… but I just don’t know yet. Hopefully I can make progress on this…