Can an ARRAY variable be both read and written?

I am implementing the following SAXPY function:
Y = a*X + Y
where X and Y are float vectors (arrays) and a is a float scalar.

In HLS, I was able to confirm in simulation that array Y contains the updated values. But when implemented in PYNQ, array Y is NOT updated and still contains the original values.

Any suggestions will be appreciated.

***PYNQ code as follows:
from pynq import Overlay
overlay = Overlay('/home/xilinx/pynq/overlays/saxpy/saxpy3.bit')
from pynq import Xlnk
import numpy as np

# Allocate contiguous buffers for memory transfer

xlnk = Xlnk()
x_buffer = xlnk.cma_array(shape=(N,), dtype=np.float32)
y_buffer = xlnk.cma_array(shape=(N,), dtype=np.float32)

# Copy the DNA string to the in_buffer


# Initialize AXI with address and length

sax.write(0x10,y_buffer.physical_address) # vector Y is AXI so need to initialize address
sax.write(0x18,x_buffer.physical_address) # vector X is AXI so need to initialize address
sax.write(0x20,int(int(scalar_a))) # initialize scalar A
sax.write(0x28,N) # initialize N (number of elements)
sax.write(0x00,0x01) # start
while (sax.read(0x00) & 0x4) != 0x04: pass # wait till it's done
sax.write(0x00,0x00) # stop
np.copyto(vec_y, y_buffer) # copy the result from the buffer into vec_y

***HLS code as follows:

 #include <string.h>
 typedef float float_t ;
 void saxpy(volatile float_t *y, volatile const float_t *x, const float_t a, unsigned int N) {
 #pragma HLS INTERFACE m_axi port=y offset=slave depth=1000 bundle=OUT_Y
 #pragma HLS INTERFACE m_axi port=x offset=slave depth=1000 bundle=IN_X
 #pragma HLS INTERFACE s_axilite port=y bundle=ctrl
 #pragma HLS INTERFACE s_axilite port=x bundle=ctrl
 #pragma HLS INTERFACE s_axilite port=a bundle=ctrl
 #pragma HLS INTERFACE s_axilite port=N bundle=ctrl
 #pragma HLS INTERFACE s_axilite port=return bundle=ctrl
 	float x_buff[1000];
 	float y_buff[1000];
 	memcpy (y_buff, (float_t*) y, N*sizeof(float_t));
 	memcpy (x_buff, (const float_t*) x, N*sizeof(float_t));
 	unsigned int i;
 SAXLOOP: for (i=0;i<N;i++) {
 #pragma HLS LOOP_TRIPCOUNT min=5 max=1000
 		   y_buff[i] = a * x_buff[i] + y_buff[i];
 	}
 	memcpy ((float_t*) y, y_buff, N*sizeof(float_t));
 }

Yes, you can have read/write on the same AXI master.

I edited your post to format the code properly. Some of the formatting was messed up by the markdown converter. I think the code is OK. You have additional volatile and const qualifiers that I don’t think you need; without them you wouldn’t need to cast back to the float pointer, but I don’t think they matter.

You have left out some of the Python code. I’m not sure what values you initialised your arrays to, and what value int(int(scalar_a)) has. ‘a’ would need to be a float so you may need to be careful with this when you write it.
You might want to check the value you write by reading it back.

If ‘a’ is zero, or very small, you may think Y is not updating as the algorithm is y[i] ~= y[i].


Thanks Cathal for the feedback.

Actually scalar_a should be floating-point. But I need to convert it to int because the register in AXI-Lite accepts only integers. Is there any suggestion for how AXI-Lite can accept floating point?

I modified the code to handle an integer datatype and it works :slight_smile: But floating-point is still a problem. I am intrigued by your last statement that “a” is zero or a very small number… I will check by reading it back. I wonder if there is a problem converting integer to floating point and vice versa? For the Python input, I assign scalar_a a value of 4.0.

Yes, “scalar_a” is float, but if you just convert to int, it will truncate it to a whole number.
E.g. 1.1 float is 0x3f8ccccd in hex, but int(1.1) is 1.
You need to get the integer representation of the float.
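As a minimal sketch of the difference (the helper name float_bits is made up for this illustration and is not from the original post), Python's struct module can show both values side by side:

```python
import struct

def float_bits(value):
    """Return the IEEE-754 single-precision bit pattern of value as an unsigned int.

    Hypothetical helper name, used only for this illustration.
    """
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an unsigned int
    return struct.unpack('<I', struct.pack('<f', value))[0]

truncated = int(1.1)      # value conversion: drops the fractional part
bits = float_bits(1.1)    # bit reinterpretation: keeps all 32 bits of the float

print(truncated)          # 1
print(hex(bits))          # 0x3f8ccccd
```

The second value is what a 32-bit register needs to hold if the hardware is going to read it as a float.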

I’m not sure if there is a better way, I’d use this:


Check this. I have created a simple float adder

void addfloat(int a, int b, int& c) {
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE s_axilite port=a
#pragma HLS INTERFACE s_axilite port=b
#pragma HLS INTERFACE s_axilite port=c

float afloat =  *(float*)&a;
float bfloat =  *(float*)&b;
float cfloat = afloat+bfloat;
c = *(int*)&cfloat;
}


Thanks bartokon! That is a short and nice example of floating-point to integer :slight_smile: I will try to adopt your float-to-integer conversion (and vice versa) and try it in the saxpy project.

I have a question: what do *(float*)&a and *(int*)&cfloat mean?
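For readers with the same question: those casts take the address of a 32-bit variable and read the same 32 bits back as a different type, so no numeric conversion happens. A rough software model of the addfloat core in Python might look like the sketch below. The function names are invented for this illustration, and it only models the arithmetic, not the HLS interface.

```python
import struct

def bits_to_float(bits):
    # Same idea as *(float*)&a: keep the 32 bits, read them as a float
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def float_to_bits(value):
    # Same idea as *(int*)&cfloat: keep the 32 bits, read them as an integer
    return struct.unpack('<I', struct.pack('<f', value))[0]

def addfloat_model(a_bits, b_bits):
    """Hypothetical software model of addfloat: integers in, integer out."""
    return float_to_bits(bits_to_float(a_bits) + bits_to_float(b_bits))

# The caller converts its floats to bit patterns before "writing the registers"
c_bits = addfloat_model(float_to_bits(1.5), float_to_bits(2.25))
result = bits_to_float(c_bits)
print(result)  # 3.75
```

By contrast, a plain assignment such as c = cfloat converts the value, so 3.75 would become 3.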


@bartokon why are you using ints in your HLS function when you want floats? I think you should be able to replace all the casting by using floats, and replacing your code with:
*c = a+b; (where c is passed as a pointer)

I’m not sure that this casting/dereferencing is actually doing anything in HLS:

c = *(int*)&cfloat;

I think this should be the equivalent, although in your code you may want to cast the value of cfloat to an int:

c = cfloat; 

If it isn’t the same, I’d really like to understand why you did this.


Hmm, you mean something like this?

If you would like me to check something just write full code and I will check :slight_smile:

but you can’t write float to axi register

If you could port this design to axi-stream and share code that would be great :smiley:

Will complete my SAXPY floating-point project and share it with everybody.

Thanks again @bartokon and @cathalmccabe for helping out with my questions.

Appreciate it!

Wait a second! @cathalmccabe was right.

Curious why declare as float &c and not as pointer (float *c)?

Though from your output it works.

Look here:

Also, I read somewhere that & should be used for writing (I think somewhere in the Vitis documentation).

It also works on dma’s :smiley:

So focused with array (and pointers) that I forgot the basic :slightly_smiling_face:

Apparently DMA doesn’t care if something is uint32 or float, unlike the registers…

I finally solved the problem and completed the SAXPY IP project.

Thank you again @cathalmccabe for pointing out that a could be a very small value (almost bordering 0), and that is what happened. The equation of saxpy is y = ax + y.

With a almost zero, the equation becomes y[i] ~= y[i]. That is why I thought that array y has the same value.

The culprit? a = int(scalar_a). For example: if scalar_a = 4.0, then a = 4. But when it is passed to the HLS core (where a is defined as float), the core receives the bit pattern 0x00000004, which as an IEEE-754 float is a subnormal value (a number so small it is effectively 0).

So, the solution is to translate the floating-point value to its IEEE-754 representation. Again, @cathalmccabe pointed to the routine from Stack Overflow, which @bartokon modified to return an integer (instead of hex).
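To make this concrete, here is a small sketch in plain Python (no hardware involved; the helper name and the sample values x = 2.0, y = 3.0 are assumptions chosen for illustration) of what the core sees when the register holds 4 instead of the bit pattern of 4.0:

```python
import struct

def bits_as_float(bits):
    """Read a 32-bit unsigned integer as an IEEE-754 single-precision float."""
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def to_float32(value):
    """Round a Python float to single precision, as the hardware would store it."""
    return struct.unpack('<f', struct.pack('<f', value))[0]

# int(4.0) == 4, so the register holds 0x00000004
a_seen_by_hls = bits_as_float(int(4.0))
print(a_seen_by_hls)  # a subnormal value, roughly 5.6e-45

# With a this small, a*x + y rounds back to y
x, y = 2.0, 3.0
y_new = to_float32(a_seen_by_hls * x + y)
print(y_new == y)     # True
```

This is why array Y looked unchanged even though the core had run.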

The function looks like this:

import struct
def float_to_int(f):
    return int(struct.unpack('<I', struct.pack('<f', f))[0])

So, a = float_to_int(scalar_a)

The rest of the code listed above remains the same.

So, I guess that solves the mystery of the “array variable can be both read and write”.

As a takeaway, simply converting float to integer using the int() function won’t work!

@sensei88 one thing: please double-check what happens if y < 0. On the float adder I had weird results coming back from the accelerator!
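One thing that may be worth checking here (an assumption, not a confirmed cause of those results) is signedness. A negative float has its top bit set, so the same 32 bits give a large positive number when read as unsigned and a negative number when read as signed. The sketch below uses made-up helper names to show both views and a read-back that masks to 32 bits first:

```python
import struct

def float_to_uint32(f):
    # Unsigned view of the float's bits (format '<I')
    return struct.unpack('<I', struct.pack('<f', f))[0]

def float_to_int32(f):
    # Signed view of the same bits (format '<i')
    return struct.unpack('<i', struct.pack('<f', f))[0]

def word_to_float(word):
    # Mask to 32 bits so a signed (negative) word is handled too
    return struct.unpack('<f', struct.pack('<I', word & 0xFFFFFFFF))[0]

neg = -1.5
u = float_to_uint32(neg)
s = float_to_int32(neg)

print(hex(u))             # 0xbfc00000
print(s < 0)              # True
print(word_to_float(u))   # -1.5
print(word_to_float(s))   # -1.5
```

Whether a given register read or write treats the word as signed or unsigned is something to verify on your own setup.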