How to allocate DMA buffer for 2D FFT parallelization?

Hi, I am implementing a 2D FFT on my ZCU104 board and using PYNQ for testing. The image is uint8, 512 by 512. Since it is uint8, I can parallelize my design by streaming 256 bits (32 pixels) at a time. The hardware design works; however, I have to reshape the image before copying it to the DMA buffer. This is because it reads a whole column before going to the next column, so the rows change constantly. I want the same rows to keep being read until all columns in those rows have been read.

Right now I am slicing and stacking my image from (512, 512) to (512*16, 32) for it to work as intended. This task is “meh” and adds to pre-processing. Is there a way to tell the allocate function how the image is to be represented in memory without the need for slicing and stacking? I have looked around the forums, but I have not found an answer.
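For readers following along, the slice-and-stack step described above might look roughly like the sketch below. This is only one plausible reading of the description (cut the image into strips 32 columns wide and stack them vertically); the helper name `slice_and_stack` and the toy sizes are made up for illustration, and the exact layout the hardware expects is not stated in the post.

```python
import numpy as np

def slice_and_stack(img, width):
    """Cut img into vertical strips `width` columns wide and stack them.

    Hypothetical helper: assumes the column count is a multiple of `width`.
    """
    strips = [img[:, c:c + width] for c in range(0, img.shape[1], width)]
    return np.vstack(strips)

# Toy example: 4x8 image, strips of 4 columns -> shape (8, 4)
toy = np.arange(32, dtype=np.uint8).reshape(4, 8)
stacked = slice_and_stack(toy, 4)
print(stacked.shape)   # (8, 4)
print(stacked[1])      # [ 8  9 10 11]  (row 1 of the first strip)
print(stacked[4])      # [4 5 6 7]      (row 0 of the second strip)

# Full-size case from the post: (512, 512) -> (512*16, 32)
img = np.zeros((512, 512), dtype=np.uint8)
full = slice_and_stack(img, 32)
print(full.shape)      # (8192, 32)
```

With this layout, each 32-byte row of the result is one 256-bit beat, and consecutive beats walk down the rows of one strip before moving on to the next strip.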

Best regards,
Asbjørn

Hi @Asmami,

Welcome to the PYNQ community.

"Is there a way to tell the allocate function how the image is to be represented in memory without the need for slicing and stacking?"

No, allocate does not support an order argument; I believe this is a constraint from the driver. However, you could do something like this using NumPy:

import numpy as np
from pynq import allocate

shape = (2, 2)
buf0 = allocate(shape=shape, dtype=np.int8)
buf1 = allocate(shape=shape, dtype=np.int8)

# Fill buf0 with each element's row-major index
for i in range(shape[0]):
    for j in range(shape[1]):
        buf0[i][j] = i * shape[1] + j

# Transpose (swap rows and columns) while copying into buf1
buf1[:] = buf0.T
print(f'{buf0=}\n{buf1=}')

Alternatively, you can try reshaping the array. I suggest using the built-in NumPy APIs, which are very efficient.
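As a rough sketch of the reshaping idea, the rearrangement from (512, 512) to (512*16, 32) can be written as a reshape plus an axis swap, so the only copy is the one into the destination buffer. This assumes the target layout is "32-column strips stacked vertically", which is a guess at what the hardware expects. The function name `to_strips` is hypothetical, and a plain NumPy array stands in for the DMA buffer so the snippet runs off-board; on the board, `out` would presumably come from `allocate(shape=(512 * 16, 32), dtype=np.uint8)`.

```python
import numpy as np

BEAT = 32  # 32 uint8 pixels = 256 bits per stream beat

def to_strips(img, out=None, beat=BEAT):
    """Copy img into `out` as vertically stacked strips `beat` columns wide.

    Hypothetical helper; assumes img is C-contiguous and its column
    count is a multiple of `beat`.
    """
    rows, cols = img.shape
    n = cols // beat
    # (rows, n, beat) -> (n, rows, beat): a view, no data is moved yet
    view = img.reshape(rows, n, beat).transpose(1, 0, 2)
    if out is None:
        # Stand-in for a buffer returned by pynq.allocate
        out = np.empty((n * rows, beat), dtype=img.dtype)
    # Single copy straight into the destination buffer
    out.reshape(n, rows, beat)[...] = view
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)
buf = to_strips(img)

# Check against explicit slicing and stacking
ref = np.vstack([img[:, c:c + BEAT] for c in range(0, 512, BEAT)])
print(buf.shape, np.array_equal(buf, ref))
```

Writing into a reshaped view of `out` avoids building an intermediate stacked array first, which may shave a little off the pre-processing time; whether that matters in practice would need to be measured on the board.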

Mario

Thanks for the quick answer. It was not what I hoped for, but it was expected. Thanks for the optimization tips.

Asbjørn
