PYNQ custom overlay for multidimensional array calculation using Vivado HLS

Hi everyone, I am writing matrix multiplication code with Vivado HLS.
The following is my C code in Vivado HLS.


I have exported it to Verilog using Export RTL and packaged it as an IP.

I am now loading the tcl file and bit file on my PYNQ Ultra96-V2 board.
I want to use it from Python code, but I ran into a problem sending the input matrix to the function.
I do not know whether my C code or my Python code is wrong. I hope someone can give me some advice!

Hi, I have created a similar design for research purposes :smiley:

void MMULT(float array1[8], float array2[8][8], float array3[8][8][8],
           float array1o[8], float array2o[8][8], float array3o[8][8][8])
{
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE s_axilite port=array1
#pragma HLS INTERFACE s_axilite port=array2
#pragma HLS INTERFACE s_axilite port=array3
#pragma HLS INTERFACE s_axilite port=array1o
#pragma HLS INTERFACE s_axilite port=array2o
#pragma HLS INTERFACE s_axilite port=array3o

	// Copy the 1-D input array to the 1-D output array.
	for (int i = 0; i < 8; i++)
	{
		array1o[i] = array1[i];
	}

	// Copy the 2-D input array to the 2-D output array.
	for (int o = 0; o < 8; o++)
	{
		for (int i = 0; i < 8; i++)
		{
			array2o[i][o] = array2[i][o];
		}
	}

	// Copy the 3-D input array to the 3-D output array.
	for (int p = 0; p < 8; p++)
	{
		for (int o = 0; o < 8; o++)
		{
			for (int i = 0; i < 8; i++)
			{
				array3o[i][o][p] = array3[i][o][p];
			}
		}
	}
}

And I created something like this in Jupyter:
image

I can write one value by using the base address and incrementing it by 4 bytes (the size of an int).
We could probably write the whole array like this, but there must be another way! I will try to search for a solution.
We could just use DMA and allocate, but this way is more interesting…
Also, I have found out that PYNQ has some errors with the register map if your array is bigger than ?64? elements.
image

MMIO PS/PL Interfaces — Python productivity for Zynq (Pynq)

Any IP connected to the AXI Slave GP port will be mapped into the system memory map. MMIO can be used read/write a memory mapped location. A MMIO read or write command is a single transaction to transfer 32 bits of data to or from a memory location. As burst instructions are not supported, MMIO is most appropriate for reading and writing small amounts of data to/from IP connect to the AXI Slave GP ports.
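
Since each MMIO access moves one 32-bit word, filling an array this way means one write per element. Here is a minimal sketch of that loop. It assumes each float occupies one 32-bit word at consecutive 4-byte offsets from the array's base offset, so check the addresses that HLS generated for your IP. `RecordingMMIO` and the base offset `0x20` are hypothetical stand-ins so the snippet runs without a board. On hardware you would pass a `pynq.MMIO` object instead, whose `write(offset, data)` method takes an int or bytes.

```python
import struct


class RecordingMMIO:
    """Hypothetical stand-in for pynq.MMIO that only records writes."""

    def __init__(self):
        self.words = {}

    def write(self, offset, data):
        self.words[offset] = data


def write_float_array(mmio, base_offset, values):
    """Write each float as one 4-byte word, 4 bytes apart."""
    for i, value in enumerate(values):
        # Pack the float as 4 little-endian bytes (IEEE-754 single precision).
        mmio.write(base_offset + 4 * i, struct.pack("<f", value))


mmio = RecordingMMIO()
write_float_array(mmio, 0x20, [1.0, 2.0, 0.5])  # 0x20 is an assumed base offset
```

This is still one transaction per element, so it only makes sense for small arrays.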

Hi @bartokon, sorry for the late reply. Thanks for the helpful advice. How did you write the Jupyter Python code for your overlay?

Tim

Maybe something like this:
Since the array is square, for example A[2][2],
we could do:
mmio.write(ADDRESS_OFFSET + row*ROW + column*COLUMN, value)
COLUMN is just 4 bytes (the size of a float, for example),
and ROW is COLUMN*2 because we have 2 float variables in one row.
You know what I mean? The array is laid out sequentially in memory, so you just need to calculate the distance and send the new data.
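
The distance calculation can be sketched as a small helper. It assumes the array is stored row-major (last index varying fastest) with one 4-byte element per word. The function name `element_offset` and the base offset `0x10` are made up for illustration.

```python
def element_offset(base_offset, index, shape, itemsize=4):
    """Byte offset of array[index] for a row-major array of the given shape."""
    flat = 0
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError("index out of range")
        flat = flat * dim + i  # row-major: the last index varies fastest
    return base_offset + flat * itemsize


# Offsets of A[0][0], A[0][1], A[1][0], A[1][1] for a 2x2 float array
# at an assumed base offset of 0x10.
offsets = [element_offset(0x10, (r, c), (2, 2))
           for r in range(2) for c in range(2)]
```

The same helper covers the 8x8 and 8x8x8 arrays by passing a different `shape`.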

Give it a try

Hi bartokon, I just modified your code as follows.
I just added the integer 10 to array2o.


But when I export to RTL, it reports that no IP is used.
Do you know what is going on?
Tim

@TingShen_Kuo In your Python code, what are “A” and “B”?
The error is that the data type must be int or bytes. You may not have assigned a value to A or B correctly.
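
If A or B turn out to be Python floats, one possible way to meet the int-or-bytes requirement is to reinterpret each float as its 32-bit IEEE-754 bit pattern before writing it. This sketch assumes the IP port is a C `float`, as in the HLS code earlier in the thread. The helper names are hypothetical.

```python
import struct


def float_to_word(value):
    """Return the 32-bit IEEE-754 bit pattern of a float as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def word_to_float(word):
    """Inverse of float_to_word: decode a 32-bit word read back from the IP."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


word = float_to_word(10.0)    # an int that can be passed to mmio.write
value = word_to_float(word)   # decodes back to 10.0
```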

Can you post more details about the problem you are having?

@bartokon

Also, I have found out that PYNQ has some errors with the register map if your array is bigger than ?64? elements.
Depending on the host CPU (32-bit or 64-bit), registers are 32 or 64 bits wide. This is why you see an error if you try to use something larger than that.

I can write one value by using the base address and incrementing it by 4 bytes (the size of an int).
We could probably write the whole array like this, but there must be another way! I will try to search for a solution.

See this post:

Cathal