Poor copy performance out of PynqBuffer

baileyji · April 27, 2023, 4:20pm

Hi Folks,
Running PYNQ 3.0.1 on a ZCU111, I’m seeing much slower performance getting data out of a PynqBuffer than I think I should be. I’m not sure why, and I don’t know how to speed it up.

Broadly, I’ve got a PynqBuffer (pynq.buffer.allocate) of about 1 MiB, and operations on it take on the order of the time needed to fill the buffer from the PL side, which makes any sort of ISR timing much tighter than I’d hoped.

Bottom line: Copying data out of a 0.78 MiB PynqBuffer (or into another PynqBuffer!) takes about 10x as long as a numpy array copy.

For some specific numbers:

Pynq version 3.0.1
10.00 ms of data @ 100.0% max rate
102400 items, 0.78 MiB
Copy into Pynq buffer: 1.71 ms
Copy Pynq 2 Pynq: 4.21 ms # buffer2[:]=buffer1
Copy Pynq 2 Pynq (.copy()): 6.44 ms # buffer2=buffer1.copy()
Copy Pynq 2 Numpy: 6.48 ms # array[:]=buffer1
Copy Pynq 2 Numpy (np.array()): 6.38 ms # array=np.array(buffer1)
Copy Numpy 2 Numpy: 0.92 ms # array2[:]=array1
Copy Numpy 2 Numpy (.copy()): 0.69 ms # array2=array.copy()
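
For anyone wanting to reproduce numbers like these, here is a minimal timing sketch. It is not the script used for the results above: the helper names `time_ms` and `benchmark_copies` are made up for illustration, and plain numpy arrays stand in for the buffers so it runs off-board. On a board, a buffer from pynq.allocate could presumably be passed in as `src` or `dst` instead.

```python
import time

import numpy as np


def time_ms(func, repeats=20):
    """Return the best wall-clock time of func() in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, (time.perf_counter() - start) * 1e3)
    return best


def benchmark_copies(src, dst):
    """Time three copy styles from the list above for the given arrays."""

    def slice_assign():
        dst[:] = src

    return {
        "dst[:] = src": time_ms(slice_assign),
        "src.copy()": time_ms(src.copy),
        "np.array(src)": time_ms(lambda: np.array(src)),
    }


# Stand-in data: 102400 uint64 items = 819200 bytes, about 0.78 MiB.
src = np.arange(102400, dtype=np.uint64)
dst = np.empty_like(src)

for label, ms in benchmark_copies(src, dst).items():
    print(f"{label}: {ms:.2f} ms")
```

Taking the best of several repeats rather than a single run is a choice made here to reduce scheduling noise; the original measurements may have been taken differently.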

These two tests involved pip installing python-blosc2 into the pynq venv.
Compress numpy (blosc2.pack_array): 5.37 ms
Compress pynq (blosc2.pack_array): 11.92 ms

These final two tests involve bitshifts on the uint64 values to unpack them into usable form:
Unpack numpy (incl. np.zeros() allocate): 6.40 ms
Unpack pynq (incl. np.zeros() allocate): 36.04 ms
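
To make the unpack step concrete, here is a rough sketch of the kind of bitshift unpacking described. The actual bit layout is not given in this post, so the field names and widths below (`value`, `channel`, `timestamp`) are purely hypothetical. The sketch also makes one bulk copy into an ordinary numpy array before any shifting; going by the timings above (about 6.4 ms to copy out plus about 6.4 ms to unpack numpy, versus about 36 ms to unpack straight from the buffer), that may be the cheaper route, though it has not been verified here.

```python
import numpy as np

# Hypothetical 64-bit word layout (NOT from the original post):
#   bits  0-15 : value
#   bits 16-27 : channel
#   bits 28-63 : timestamp
FIELDS = np.dtype(
    [("timestamp", np.uint64), ("channel", np.uint16), ("value", np.uint16)]
)


def unpack_words(words):
    """Unpack packed uint64 words into a structured array."""
    # One bulk copy into ordinary (cached) memory before doing any arithmetic.
    words = np.array(words, dtype=np.uint64)
    out = np.zeros(words.size, dtype=FIELDS)
    out["value"] = words & np.uint64(0xFFFF)
    out["channel"] = (words >> np.uint64(16)) & np.uint64(0xFFF)
    out["timestamp"] = words >> np.uint64(28)
    return out
```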

JennySmith888 · April 27, 2023, 7:21pm

Here is the same test run on PYNQ 2.7, results are pretty similar:

Pynq version 2.7.0
10.00 ms of data @ 100.0% max rate
102400 items, 1.17 MiB, 0.78 MiB packed
Copy into Pynq: 1.43 ms
Copy Pynq 2 Pynq: 4.61 ms
Copy Pynq 2 Pynq (.copy()): 7.38 ms
Copy Pynq 2 Numpy: 6.58 ms
Copy Pynq 2 Numpy (np.array()): 6.27 ms
Copy Numpy 2 Numpy: 1.44 ms
Copy Numpy 2 Numpy (.copy()): 0.96 ms
Compress numpy (blosc2.pack_array): 7.68 ms
Compress pynq (blosc2.pack_array): 12.71 ms
Unpack numpy (incl. np.zeros() allocate): 6.68 ms
Unpack pynq (incl. np.zeros() allocate): 33.01 ms

jobrien · April 29, 2023, 1:23am

I think the allocator changed to be non-cacheable by default when PYNQ moved from xlnk to xrt.

Try setting the cacheable option to True.

E.g. buffer = pynq.allocate(shape, dtype, cacheable=True)
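
A slightly fuller sketch of that suggestion, under some assumptions: the helper names `allocate_readback_buffer` and `read_out` are made up for illustration, and the numpy fallback exists only so the snippet runs on a machine without pynq. One caveat worth keeping in mind is that a cacheable buffer written by the PL generally needs its CPU cache invalidated before the PS reads it, which is what the `invalidate()` call is for.

```python
import numpy as np

try:
    import pynq  # normally only available on a PYNQ board image
except ImportError:
    pynq = None


def allocate_readback_buffer(shape, dtype=np.uint64, cacheable=True):
    """Allocate a buffer for PL-to-PS readback (hypothetical helper)."""
    if pynq is None:
        # Off-board stand-in so the sketch still runs; not a real DMA buffer.
        return np.zeros(shape, dtype=dtype)
    return pynq.allocate(shape, dtype=dtype, cacheable=cacheable)


def read_out(buffer):
    """Return an ordinary numpy copy of the buffer contents."""
    # With a cacheable buffer, drop stale cache lines before reading
    # data that the PL has written. Plain numpy arrays skip this step.
    if hasattr(buffer, "invalidate"):
        buffer.invalidate()
    return np.array(buffer)


buf = allocate_readback_buffer((102400,), dtype=np.uint64)
data = read_out(buf)
print(type(data).__name__, data.nbytes)
```

Whether this closes the gap to the numpy-to-numpy timings would need to be measured on the board.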
