DMA for float array

Hey everyone,

I’m following the DMA tutorial, but I’m working with floats. Here is my code:

#include "dma_copy.h"
#include <iostream>

using namespace std;

void copier(STREAMTYPE& a, STREAMTYPE& b) {
#pragma HLS INTERFACE s_axilite port = return bundle = control
#pragma HLS INTERFACE axis port=a
#pragma HLS INTERFACE axis port=b

    DTYPE input[N];
    DTYPE output[N];

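    while (1) {

        for (int i = 0; i < N; ++i) { // READ: one stream beat per element
            axis_t temp = a.read();
            input[i] = temp.data;
        }

        for (int i = 0; i < N; ++i) { // PROCESS
            output[i] = input[i] - 5;
        }

        axis_t temp2;
        for (int i = 0; i < N; ++i) { // WRITE
            temp2.data = output[i];
            temp2.last = (i == N - 1); // assert TLAST on the final beat
            temp2.keep = -1;           // mark all bytes as valid
            b.write(temp2);
        }
    }
}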

Here are my data structures:

#ifndef _DMA_COPY_H_
#define _DMA_COPY_H_

#include "hls_stream.h"
#include "ap_axi_sdata.h"

#define N 5

typedef float DTYPE;
typedef hls::axis<float, 0, 0, 0> axis_t;
typedef hls::stream<axis_t> STREAMTYPE;

void copier(STREAMTYPE& a, STREAMTYPE& b);

#endif

I used hls::axis<float, 0, 0, 0> after reading this topic. Here is my Python code:

from pynq import Overlay, allocate
import numpy as np

overlay = Overlay("dma_hls.bit")
dma = overlay.axi_dma_0

data_size = 5
input_buffer = allocate(shape=(data_size,), dtype=np.float32)
output_buffer = allocate(shape=(data_size,), dtype=np.float32)

features = [-0.2,2,0.333,4.1234, -0.62689]

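for i in range(data_size):
    input_buffer[i] = features[i]

dma.sendchannel.transfer(input_buffer)
dma.recvchannel.transfer(output_buffer)
dma.sendchannel.wait()
dma.recvchannel.wait()

for i in range(data_size):
    print("Output vector", output_buffer[i])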
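
As a host-side sanity check, the values the output buffer should hold after a successful transfer can be computed without the board. This is only a minimal sketch, assuming the kernel does nothing more than the input[i] - 5 step shown above on 32-bit floats; to_f32 and expected_output are hypothetical helper names, not part of PYNQ.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def expected_output(values, offset=5.0):
    """Software model of the PROCESS loop: output[i] = input[i] - 5 in float32."""
    return [to_f32(to_f32(v) - offset) for v in values]

features = [-0.2, 2, 0.333, 4.1234, -0.62689]
expected = expected_output(features)
for value in expected:
    print("Expected", value)
```

If the DMA transfer completes, the printed output vector should match these values to within float32 precision; an all-zero output buffer means no data came back from the IP.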

Lastly, here is my block diagram:

When I first ran the Python code, I got "RuntimeError: DMA does not support unaligned transfers; Starting address must be aligned to 6 bytes." I applied this solution.
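
An alignment error of this kind comes down to a modulo check on the buffer's start address. The sketch below illustrates that idea only; is_aligned is a hypothetical helper, and the addresses and the 4-byte width are made-up example values, not numbers read from this design.

```python
def is_aligned(address, align_bytes):
    """Return True when address is a whole multiple of align_bytes."""
    if align_bytes <= 0:
        raise ValueError("align_bytes must be positive")
    return address % align_bytes == 0

# Example values only: a 4-byte (32-bit) stream width.
print(is_aligned(0x1000, 4))  # True
print(is_aligned(0x1002, 4))  # False
```

On the board, the address to test would be the physical address of the allocated buffer, and the required alignment follows from the DMA data width configured in Vivado.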

Now the Python code doesn’t get past the ....wait() calls, so nothing after them runs. When I remove the lines containing ...wait(), my output vector is still all zeros. I’m not sure why the DMA doesn’t work. Am I missing a step? I’d appreciate any help, thank you :))


Hi @Pynq_userrrr,

You may want to have a look at Cathal’s DMA tutorial, "Tutorial: using a HLS stream IP with DMA (Part 1: HLS design)".

Regarding your issue, I do not see the code to start the HLS IP.
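
Starting an HLS IP that has an s_axilite control bundle usually means writing its block-level control register. A minimal sketch, assuming the default Vitis HLS register layout (control register at offset 0x00, bit 0 = ap_start, bit 7 = auto_restart); FakeIP is a hypothetical stand-in so the snippet runs without a board, and on hardware the object would be the IP handle taken from the overlay.

```python
CONTROL_REGISTER = 0x00  # assumed offset of the HLS block-level control register
AP_START = 0x01          # bit 0: start the IP
AUTO_RESTART = 0x80      # bit 7: restart automatically after each run

class FakeIP:
    """Hypothetical stand-in that records register writes instead of touching hardware."""
    def __init__(self):
        self.registers = {}

    def write(self, offset, value):
        self.registers[offset] = value

def start_ip(ip, auto_restart=True):
    """Write the start bits to the control register and return the value written."""
    value = AP_START | (AUTO_RESTART if auto_restart else 0)
    ip.write(CONTROL_REGISTER, value)
    return value

ip = FakeIP()
print(hex(start_ip(ip)))  # 0x81
```

The IP has to be started before the DMA transfers are issued; otherwise nothing consumes the input stream and the wait() calls can block indefinitely.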

You may also want to review the documentation on the pipeline directive and the concept of Initiation Interval (II). Vitis HLS should be issuing a warning about the directive #pragma HLS PIPELINE II=0.
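
For intuition: II is the number of clock cycles between the starts of consecutive loop iterations, so a value below 1 has no meaningful interpretation. A rough sketch of the usual latency estimate for a pipelined loop follows; pipelined_latency and the example depth of 3 cycles are illustrative assumptions, not figures from this design or from a Vitis HLS report.

```python
def pipelined_latency(trip_count, depth, ii):
    """Estimate cycles for a pipelined loop: depth + ii * (trip_count - 1)."""
    if ii < 1:
        raise ValueError("II must be at least 1 cycle")
    if trip_count < 1:
        return 0
    return depth + ii * (trip_count - 1)

# N = 5 iterations with an assumed iteration depth of 3 cycles
print(pipelined_latency(5, 3, 1))  # 7
print(pipelined_latency(5, 3, 2))  # 11
```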



Thank you for your reply, @marioruiz. I did take a look at the tutorial and updated my Python code as follows. Note: I changed typedef hls::axis<float, 0, 0, 0> axis_t; to typedef ap_axis<32,2,5,6> axis_t; and am using integers for now.

from pynq import Overlay
from pynq import allocate
import numpy as np

ol = Overlay("dma.bit")

dma = ol.axi_dma_0
dma_send = dma.sendchannel
dma_recv = dma.recvchannel

hls_ip = ol.copier_0

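CONTROL_REGISTER = 0x0  # control register offset (0x00 for an HLS s_axilite IP)
hls_ip.write(CONTROL_REGISTER, 0x81)  # 0x81 = ap_start | auto_restart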


data_size = 5
input_buffer = allocate(shape=(data_size,), dtype=np.uint32)
output_buffer = allocate(shape=(data_size,), dtype=np.uint32)

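for i in range(data_size):
    input_buffer[i] = i

dma_send.transfer(input_buffer)
dma_recv.transfer(output_buffer)
dma_send.wait()
dma_recv.wait()

for i in range(data_size):
    print("Output vector", output_buffer[i])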

However, I still get the same errors. I am including my DMA configuration too, in case something is incorrect there. I checked Allow Unaligned Transfers because I was getting the alignment error mentioned above.



Please read the DMA tutorial; your DMA and Vivado project are not set up properly.