Vector addition: correct output only in every other element

sensei88 · September 26, 2021, 7:15am

I am implementing a simple vector addition function:
Z = X + Y
where X, Y and Z are integer vectors (arrays).

In HLS simulation, I was able to confirm that array Z contains the correct values. But when implemented on PYNQ, the correct answer appears only in every other element.

For example: array X contains all 1s and array Y contains all 2s, but in array Z the correct results appear only in every other element (i.e., z[0]=3; z[1]=0; z[2]=3; z[3]=0; z[4]=3; z[5]=0, etc.).
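
A pattern like this can be confirmed in software before looking at the hardware. The following is a minimal sketch, assuming plain NumPy arrays; `mismatch_report` and the length `n = 8` are hypothetical names and values chosen for illustration, not part of the original code.

```python
import numpy as np

def mismatch_report(x, y, z):
    """Compare z against x + y and summarize where the two differ."""
    expected = np.asarray(x) + np.asarray(y)
    bad = np.flatnonzero(np.asarray(z) != expected)
    return {
        "num_bad": int(bad.size),
        "first_bad": bad[:8].tolist(),
        # True when there are mismatches and all of them sit at odd indices
        "only_odd_indices": bool(bad.size > 0 and np.all(bad % 2 == 1)),
    }

# Reproduce the reported pattern in software: z[0]=3, z[1]=0, z[2]=3, ...
n = 8  # hypothetical small length for illustration
x = np.full(n, 1, dtype=np.int32)
y = np.full(n, 2, dtype=np.int32)
z = np.where(np.arange(n) % 2 == 0, 3, 0).astype(np.int32)

report = mismatch_report(x, y, z)
print(report)
```

If every mismatch falls on an odd index, the problem is likely in how the data is laid out or transferred rather than in the addition itself.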

Any suggestions will be appreciated.

***PYNQ code as follows:

from pynq import Overlay
# Load the overlay
overlay = Overlay('/home/xilinx/pynq/overlays/vecadd_int/vecadd_int3.bit')
# IP alias
vecadd=overlay.vecadd_int3_0

from pynq import Xlnk
import numpy as np

# Allocate contiguous buffers for memory transfer.
# N, vec_x, vec_y and vec_z are assumed to be defined earlier (not shown in the post).
xlnk = Xlnk()
x_buffer = xlnk.cma_array(shape=(N,), dtype=np.int32)  # int32 matches data_t (int) in the HLS code
y_buffer = xlnk.cma_array(shape=(N,), dtype=np.int32)
z_buffer = xlnk.cma_array(shape=(N,), dtype=np.int32)
# Copy the input vectors into the contiguous buffers
np.copyto(x_buffer,vec_x)
np.copyto(y_buffer,vec_y)
np.copyto(z_buffer,vec_z)
# check buffer status
xlnk.cma_stats()

# initialize AXI with address and length, 
vecadd.write(0x10,z_buffer.physical_address)  # vector Z is AXI so need to initialize address
vecadd.write(0x18,y_buffer.physical_address)  # vector Y is AXI so need to initialize address
vecadd.write(0x20,x_buffer.physical_address)  # vector X is AXI so need to initialize address
vecadd.write(0x28,N)                          # initialize N
vecadd.write(0x00,0x01) # start

while (vecadd.read(0x00) & 0x4) != 0x04:  # mask the value read, not the offset; bit 2 is ap_idle
    pass

vecadd.write(0x00,0x00) # stop   
np.copyto(vec_z, z_buffer)
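
A note on the polling step: Vivado HLS generates a control register at offset 0x00 for blocks using the default `ap_ctrl_hs` protocol, with bit 0 = ap_start, bit 1 = ap_done, bit 2 = ap_idle and bit 3 = ap_ready. The mask has to be applied to the value returned by `read`, not to the offset. The sketch below illustrates this with a hypothetical `FakeIP` class that only mimics the register in software; it is not a PYNQ class and does not model clear-on-read behavior.

```python
# Bit positions in the ap_ctrl_hs control register at offset 0x00
AP_START, AP_DONE, AP_IDLE, AP_READY = 0x1, 0x2, 0x4, 0x8

class FakeIP:
    """Hypothetical register model for illustration; not a PYNQ class."""

    def __init__(self, busy_reads=3):
        self.busy_reads = busy_reads  # polls that still see the core running
        self.ctrl = AP_IDLE

    def write(self, offset, value):
        if offset == 0x00 and value & AP_START:
            self.ctrl = AP_START  # running: ap_start set, ap_idle cleared

    def read(self, offset):
        if offset == 0x00 and self.ctrl & AP_START:
            if self.busy_reads > 0:
                self.busy_reads -= 1
            else:
                self.ctrl = AP_DONE | AP_IDLE | AP_READY  # finished
        return self.ctrl if offset == 0x00 else 0

ip = FakeIP()
ip.write(0x00, AP_START)

polls = 0
while (ip.read(0x00) & AP_IDLE) != AP_IDLE:  # mask the returned value
    polls += 1

finished = ip.read(0x00)
print(polls, hex(finished))
```

Because `0x00 & 0x4` evaluates to 0, placing the mask inside the call still reads offset 0x00 but compares the unmasked register value with 0x04, which fails whenever other status bits are also set.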

***HLS code as follows:

#include <string.h>
typedef int data_t ;

void vecadd_int3(volatile data_t *z, volatile const data_t *y, volatile const data_t *x, unsigned int N) {
#pragma HLS INTERFACE m_axi port=z offset=slave depth=32767 bundle=out_z
#pragma HLS INTERFACE m_axi port=y offset=slave depth=32767 bundle=out_y
#pragma HLS INTERFACE m_axi port=x offset=slave depth=32767 bundle=in_x
#pragma HLS INTERFACE s_axilite port=y bundle=cntl
#pragma HLS INTERFACE s_axilite port=x bundle=cntl
#pragma HLS INTERFACE s_axilite port=N bundle=cntl
#pragma HLS INTERFACE s_axilite port=return bundle=cntl
	data_t x_buff[32767];
	data_t y_buff[32767];
	data_t z_buff[32767];
	memcpy (y_buff, (const data_t*) y, N*sizeof(data_t));
	memcpy (x_buff, (const data_t*) x, N*sizeof(data_t));
	unsigned int i;
VECLOOP: for (i = 0; i < N; i++) {
#pragma HLS PIPELINE
#pragma HLS LOOP_TRIPCOUNT min=1024 max=32767
		z_buff[i] = x_buff[i] + y_buff[i];
	}
	memcpy ((data_t*) z, z_buff, N*sizeof(data_t));
}

PeterOgden · September 27, 2021, 8:32am

My guess would be that there is a mismatch in the setting of the AXI port on the PS block in your diagram. You don’t say which board you are using, but this should be handled automatically in PYNQ 2.6 for all Zynq UltraScale+ boards. For Zynq-7000 you’ll need to make sure that the HP slave ports on the PS match the board default (usually 64-bit width).

Peter
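
To see why a 32-bit versus 64-bit mismatch could show up as "every other element", it helps to look at how a buffer of 32-bit integers maps onto 64-bit words. The sketch below is only an illustration in NumPy under a stated assumption: that the mismatch leaves the upper 32-bit lane of each 64-bit word unwritten. The thread does not confirm the exact hardware mechanism, and the length `n = 8` is a hypothetical value.

```python
import numpy as np

n = 8  # hypothetical small length for illustration
z = np.zeros(n, dtype="<i4")  # little-endian 32-bit integers, like data_t

# Reinterpret the same bytes as 64-bit words: each word spans two int32 elements.
words = z.view("<u8")
assert words.size == n // 2

# Assumed failure mode: only the low 32-bit lane of each 64-bit word receives data.
# In little-endian layout the low lane is the even-indexed element.
words[:] = 3

print(z.tolist())
```

Under that assumption the even-indexed elements hold the result and the odd-indexed ones stay zero, which matches the reported symptom.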

sensei88 · September 27, 2021, 9:01am

Thank you @PeterOgden for your insight.

I am using a PYNQ-Z2 board with PYNQ version 2.5.

I set the HP slave ports to 32-bit (though the default is 64-bit).

Let me check whether the PS and PL settings match.

sensei88 · September 27, 2021, 12:09pm

I set the HP slave port back to the default 64-bit width and the result is now correct!
