Why does a 6x6 matrix only print out 32 elements?

Hi dear all,

I am using Vitis HLS 2022.2 and Vivado 2022.2 for my project.
The expected output is a 6x6 matrix; I have verified it in MATLAB and in a C++ simulation.
But when I run the same structure through Vitis and Vivado and load it onto the FPGA board, only the first 32 elements come out, and those 32 match the MATLAB and C++ results.
Why are the other 4 elements missing? I have no idea why.
Thank you for your attention.

Please refer to my results below:
MATLAB:
P_ss =

12.6831 8.6205 4.6556 6.2474 -4.5267 -0.1616
8.6205 13.6159 6.6142 13.6793 -2.9391 0.2384
4.6556 6.6142 14.9192 11.6875 5.6499 0.9135
6.2474 13.6793 11.6875 19.1106 5.3445 1.2266
-4.5267 -2.9391 5.6499 5.3445 17.6676 2.1206
-0.1616 0.2384 0.9135 1.2266 2.1206 0.4714

Jupyter: [screenshot of the Jupyter output]


@shadow346015

Welcome to PYNQ forum =]

Too much background is missing.
We need more info to resolve this issue.

ENJOY~

Hi briansune, thanks for your reply.

here is the readout from the FPGA


@shadow346015

Can you post the DMA block settings?
If possible, a capture of all the block connections too.
What kind of HLS multiplication did you implement, IEEE 754?

ENJOY~

Hi @briansune
here are the settings of the DMA and the block connections


@shadow346015

My best guess: 512 / 32 = 16 elements per beat.
So, simply speaking, 6x6 = 36, and floor(36 / 16) = 2, i.e. 2x16 = 32.
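The arithmetic can be checked quickly (a minimal sketch; the 512-bit bus width and 32-bit float element size come from the discussion above):

```python
# Each DMA beat on a 512-bit bus carries 512 / 32 = 16 single-precision floats.
BUS_WIDTH_BITS = 512
ELEM_BITS = 32  # 32-bit IEEE 754 float

elems_per_beat = BUS_WIDTH_BITS // ELEM_BITS  # 16
total_elems = 6 * 6                           # 36 matrix elements

# With only aligned (whole-beat) transfers, the element count is floored:
full_beats = total_elems // elems_per_beat    # 2
transferred = full_beats * elems_per_beat     # 32
missing = total_elems - transferred           # 4

print(elems_per_beat, full_beats, transferred, missing)  # 16 2 32 4
```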

So what do you think: which DMA block setting could resolve this issue?
Or how would you make the matrix transfer complete?

ENJOY~

Hi @briansune
I am not sure where the calculation above comes from,
but it may match the limit in my settings (almost the same).

[screenshots of the DMA settings]

@shadow346015

OK: the DMA engine uses 512-bit transfers, and 32-bit IEEE 754 gives 16 elements per beat.
So if your DMA engine cannot handle unaligned transfers, a 36-element transfer will not be able to deliver the remaining 4 elements.

Don’t modify the HLS first; observe what happens if you modify the DMA engine or just the Python script.

Your bus is fixed at 512 bits wide.
So, two possible solutions:
A: try 48 elements and see whether the result returns completely. If that succeeds, the prediction is correct.
B: try activating DMA unaligned transfers. If both A and B settle it, then that is surely the root cause.
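Solution A can be sketched on the host side with plain NumPy (a hypothetical sketch; the buffer names are assumptions, and on a real board the buffer would come from `pynq.allocate` and be filled by the DMA receive channel). The idea is to size the buffer as 48 elements, i.e. three full 16-element beats, then slice the 6x6 matrix back out:

```python
import numpy as np

ELEMS_PER_BEAT = 16          # 512-bit bus / 32-bit float
N = 6                        # 6x6 matrix
total = N * N                # 36 elements

# Round the buffer length up to a whole number of beats: 36 -> 48.
padded_len = -(-total // ELEMS_PER_BEAT) * ELEMS_PER_BEAT  # ceiling division

# On a PYNQ board this would be: buf = allocate(shape=(padded_len,), dtype=np.float32)
buf = np.zeros(padded_len, dtype=np.float32)

# ... the DMA receive would fill buf here, e.g.:
# dma.recvchannel.transfer(buf); dma.recvchannel.wait()

# Recover the 6x6 matrix from the first 36 elements; ignore the 12 padding slots.
P_ss = buf[:total].reshape(N, N)

print(padded_len, P_ss.shape)  # 48 (6, 6)
```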

Always solve issues by making reasonable assumptions and cross-examining them.

ENJOY~

Hi @briansune
Thanks for your replying.
The final script was fixed with j_limit = 16 and i_limit = 2, which caused my loop to run only 32 times because i_limit was floored.

But based on this DMA setting, does it mean I can only transfer a 6-digit value? Even when my data type is set to “double”?

@shadow346015

Nope: the DMA bus width and the data format are both free to choose. It is purely based on your design constraints.

If you have activated unaligned transfers, then the only thing to worry about is the transfer cycle count, and in the final data you can mask out whatever MSB information you don't need.
For example, if your data is 18 bits and the bus is 32 bits, then one trick is a 24-bit word: the 18-bit payload plus an ID to separate the payload destinations (so 6 bits are wasted per transfer).
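The 18-bit-in-24-bit packing idea can be sketched like this (a toy example; the field widths follow the post above, and the exact ID layout is an assumption for illustration):

```python
PAYLOAD_BITS = 18
ID_BITS = 6                           # 24 - 18: the "wasted" bits carry a destination ID
WORD_BITS = PAYLOAD_BITS + ID_BITS    # 24-bit packed word inside a 32-bit bus slot

def pack(payload, dest_id):
    """Pack an 18-bit payload and a 6-bit destination ID into one 24-bit word."""
    assert 0 <= payload < (1 << PAYLOAD_BITS)
    assert 0 <= dest_id < (1 << ID_BITS)
    return (dest_id << PAYLOAD_BITS) | payload

def unpack(word):
    """Mask out the MSB information to recover the ID and payload."""
    payload = word & ((1 << PAYLOAD_BITS) - 1)
    dest_id = word >> PAYLOAD_BITS
    return payload, dest_id

w = pack(0x2ABCD, 5)
print(unpack(w))  # (175053, 5) i.e. (0x2ABCD, 5)
```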

Remember that an aligned transfer always makes computation more effective in the digital world. So here are some suggestions on data format:
Use a byte-based information standard, IEEE 754 for example.
Use a LUT of finite data to reduce resolution with far, far fewer bits.
Reduce bus transfers by computing everything inside the PL and passing only the final result to the CPU (the accelerator design methodology).

ENJOY~

@briansune

thanks for your kind suggestions.
They helped me a lot as a freshman in the FPGA universe. :slight_smile:
