I’m working on a subclass of DefaultIP to drive the Xilinx Multichannel DMA in Scatter Gather mode so we can capture to the PL DDR4. I’ve got an overlay that uses the MIG and a SmartConnect to tie both the MCDMA and an AXI HP Master from the PS into PL DDR. I’m planning on SG Buffer descriptors in PS DDR and data going to PL DDR.
I suspect that I could manually populate all the addresses for the PL DDR4 based on the address space, and that the MCDMA would merrily clobber anything already there or otherwise using those addresses (nothing is). Ultimately I'm going to be repeatedly filling up the DDR4 and doing analysis on the results, so getting the useful parts of the data into a numpy array will be important. While I think I could pull it out via MMIO.read, my understanding is that this is going to be painfully slow.
Thoughts? Is this not a good approach?
Roughly speaking I’d like to be able to modify this snippet or extend allocate so I can do something like this:
from pynq import allocate
import numpy as np

class MCS2MMBufferChain:
    def __init__(self, n, buffer_size=8096, contiguous=True, zero_buffers=True):
        # NOTE: 8096 is not a multiple of 0x40 (8192 is), so in contiguous mode
        # only every other bd_addr entry would land on a 0x40 boundary.
        # zero_buffers is not used yet.
        # TODO ensure starting address is a multiple of 0x40
        self._chain = allocate(16*n, dtype=np.uint32)
        self.contiguous = contiguous
        if contiguous:
            # TODO ensure starting address is on a stream width (here 0x40) boundary as DRE is disabled
            # TODO allocate in the PL DDR4
            buff = allocate(buffer_size*n, dtype=np.uint8)
            self._buffers = [buff]
            self.bd_addr = [buff.device_address + buffer_size*i for i in range(n)]
        else:
            # TODO ensure starting address is on a stream width (here 0x40) boundary as DRE is disabled
            # TODO allocate in the PL DDR4
            self._buffers = [allocate(buffer_size, dtype=np.uint8) for _ in range(n)]
            self.bd_addr = [b.device_address for b in self._buffers]
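For the manual route I mentioned (populating the PL DDR4 addresses by hand), this is roughly the address arithmetic I have in mind. It is only a sketch: `plan_buffer_addresses`, `PL_DDR4_BASE` and `PL_DDR4_SIZE` are names and placeholder values I made up for illustration, not anything from pynq or from my actual address map. It rounds the first address and the per-buffer stride up to the 0x40 stream width, since DRE is disabled.

```python
def plan_buffer_addresses(base, window_size, n, buffer_size, align=0x40):
    """Return n aligned start addresses packed into [base, base + window_size)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if align <= 0 or align & (align - 1):
        raise ValueError("align must be a power of two")
    # Round the stride up so every buffer starts on an aligned boundary.
    stride = (buffer_size + align - 1) & ~(align - 1)
    # Round the first address up to the next aligned boundary.
    first = (base + align - 1) & ~(align - 1)
    # End of the last buffer (exclusive).
    end = first + stride * (n - 1) + buffer_size
    if end > base + window_size:
        raise ValueError("buffers do not fit in the address window")
    return [first + stride * i for i in range(n)]


# Placeholder values for illustration only; the real ones come from the
# address map of the overlay.
PL_DDR4_BASE = 0x4_0000_0000
PL_DDR4_SIZE = 0x1_0000_0000

addrs = plan_buffer_addresses(PL_DDR4_BASE, PL_DDR4_SIZE, n=4, buffer_size=8096)
```

With the 8096-byte default the stride comes out as 8128 bytes, so each buffer is padded by 32 bytes to keep the next one aligned. These addresses would stand in for `bd_addr` in the class above, but that still leaves open how to get the captured data back into numpy without MMIO.read.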