My accelerator DMA output is giving zeros

anoir_nechi · October 27, 2021, 3:13pm

I created the following IP with Vitis HLS 2020.2 (please see the attached files pobj.cpp (2.5 KB) and pobj.hpp (327 Bytes)). It synthesized and exported as an IP without errors. After that I created the following block design:

The bitstream was also generated successfully.
Then I wrote the Python code as follows:

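from pynq import Overlay, allocate
import numpy as np

ol = Overlay('pobj.bit')
# dma_A, dma_X, dma_YZ, dma_param and pobj_ip used below are handles to the
# DMA engines and the HLS IP in this overlay (their definitions are not shown here)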

# Define dimensions
M = 64
N = 512
# Allocate memory for DMA transfers
A_buffer = allocate(shape=(M,N), dtype=np.float64, cacheable=False)
X_buffer = allocate(shape=(N,1), dtype=np.float64, cacheable=False)
Y_buffer = allocate(shape=(M,1), dtype=np.float64, cacheable=False)
Z_buffer = allocate(shape=(M,1), dtype=np.float64, cacheable=False)
P_buffer = allocate(shape=(2,1), dtype=np.float64, cacheable=False)

CTRL_REG = 0x00
AP_START = (1<<0) # bit 0
AUTO_RESTART = (1<<7) # bit 7
def run_kernel():
    dma_A.sendchannel.transfer(A_buffer)
    dma_X.sendchannel.transfer(X_buffer)
    dma_YZ.sendchannel.transfer(Y_buffer)
    dma_param.sendchannel.transfer(P_buffer)
    
    dma_YZ.recvchannel.transfer(Z_buffer)
    dma_param.recvchannel.transfer(P_buffer)
    
    pobj_ip.write(CTRL_REG, (AP_START | AUTO_RESTART))  # start the kernel with auto-restart enabled
    
    dma_A.sendchannel.wait()
    dma_X.sendchannel.wait()
    dma_YZ.sendchannel.wait()
    dma_param.sendchannel.wait()
    
    dma_YZ.recvchannel.wait()
    dma_param.recvchannel.wait()

A = np.random.rand(M, N).astype(dtype=np.float64)
X = np.random.rand(N,1).astype(dtype=np.float64)
Y = np.random.rand(M,1).astype(dtype=np.float64)
P = np.zeros((2,1)).astype(dtype=np.float64)
P[0] = 0.5  # lambda

A_buffer[:] = A
X_buffer[:] = X
Y_buffer[:] = Y
P_buffer[:] = P

%%timeit
run_kernel()
# 100 loops, best of 3: 17.1 ms per loop
print(Z_buffer)
# all zeros

First of all, the IP is very slow (about 17.1 ms per call), and more importantly, its output is all zeros. What is wrong?
How can I solve these issues?
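
Two host-side checks that may help narrow this down are sketched below. They are not a confirmed diagnosis. The first computes how many bytes each buffer above moves per transfer and the smallest AXI DMA "Width of Buffer Length Register" that could carry it, assuming a single transfer is limited to 2**width - 1 bytes. The second decodes a value read from the control register at offset 0x00, assuming the standard Vitis HLS ap_ctrl_hs bit layout. The helper names (`transfer_bytes`, `min_length_width`, `decode_ctrl`) are illustrative and not part of PYNQ.

```python
# Dimensions and element size taken from the post (float64 = 8 bytes).
M, N, ITEMSIZE = 64, 512, 8

SHAPES = {
    "A": (M, N),
    "X": (N, 1),
    "Y": (M, 1),
    "Z": (M, 1),
    "P": (2, 1),
}


def transfer_bytes(shape, itemsize=ITEMSIZE):
    """Bytes moved by one DMA transfer of a buffer with this shape."""
    nbytes = itemsize
    for dim in shape:
        nbytes *= dim
    return nbytes


def min_length_width(nbytes):
    """Smallest width w with 2**w - 1 >= nbytes (assumed DMA length limit)."""
    return nbytes.bit_length()


# Assumed bit positions in the ap_ctrl_hs control register at offset 0x00.
CTRL_BITS = {
    "ap_start": 0,
    "ap_done": 1,
    "ap_idle": 2,
    "ap_ready": 3,
    "auto_restart": 7,
}


def decode_ctrl(value):
    """Split a raw control register value into named flags."""
    return {name: bool((value >> bit) & 1) for name, bit in CTRL_BITS.items()}


if __name__ == "__main__":
    for name, shape in SHAPES.items():
        nbytes = transfer_bytes(shape)
        print(name, nbytes, "bytes, needs width >=", min_length_width(nbytes))
    # 0x81 is the value written above: AP_START | AUTO_RESTART
    print(decode_ctrl(0x81))
```

With these dimensions the A transfer is 262144 bytes, so under the stated assumption the length register of dma_A would need at least 19 bits. On the board, `decode_ctrl(pobj_ip.read(CTRL_REG))` after `run_kernel()` would show whether the kernel reports idle or done.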

Roua_Zaied · June 6, 2022, 8:36am

I tried to run your code, but the string.h file is missing:
INFO: [HLS 200-10] Analyzing design file ‘roua/pobj.cpp’ …
ERROR: [HLS 207-812] ‘string.h’ file not found: roua/pobj.cpp:2:10
INFO: [HLS 200-111] Finished Command csynth_design CPU user time: 0.76 seconds. CPU system time: 0.55 seconds. Elapsed time: 0.9 seconds; current allocated memory: 197.892 MB.
command ‘ap_source’ returned error code
while executing
“source /home/roua/Desktop/roua/solution1/csynth.tcl”
invoked from within
“hls::main /home/roua/Desktop/roua/solution1/csynth.tcl”
