Allocation function error in PYNQ v2.7

mizan · November 26, 2021, 10:08am

I have a project running on PYNQ v2.6 and have just ported it to the newer PYNQ v2.7. (I noticed some differences in how AXI slaves are accessed through the overlay; that is solved and, I hope, not relevant here.) The main problem now is the repeated allocation of a buffer, which fails with an allocation error.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-5412c6fea447> in <module>
    233                        
    234                         
--> 235             frame = allocate(shape=(int(height/3), int(width/5), 3), dtype=np.uint8, cacheable=True)
    236            
    237             

/usr/local/share/pynq-venv/lib/python3.8/site-packages/pynq/buffer.py in allocate(shape, dtype, target, **kwargs)
    170     if target is None:
    171         target = Device.active_device
--> 172     return target.allocate(shape, dtype, **kwargs)

/usr/local/share/pynq-venv/lib/python3.8/site-packages/pynq/pl_server/device.py in allocate(self, shape, dtype, **kwargs)
    290 
    291         """
--> 292         return self.default_memory.allocate(shape, dtype, **kwargs)
    293 
    294     def reset(self, parser=None, timestamp=None, bitfile_name=None):

/usr/local/share/pynq-venv/lib/python3.8/site-packages/pynq/pl_server/xrt_device.py in allocate(self, shape, dtype, **kwargs)
    167 
    168         """
--> 169         buf = _xrt_allocate(shape, dtype, self.device, self.idx, **kwargs)
    170         buf.memory = self
    171         return buf

/usr/local/share/pynq-venv/lib/python3.8/site-packages/pynq/pl_server/xrt_device.py in _xrt_allocate(shape, dtype, device, memidx, cacheable, pointer, cache)
    122         bo, buf, device_address = pointer
    123     else:
--> 124         bo = device.allocate_bo(size, memidx, cacheable)
    125         buf = device.map_bo(bo)
    126         device_address = device.get_device_address(bo)

/usr/local/share/pynq-venv/lib/python3.8/site-packages/pynq/pl_server/xrt_device.py in allocate_bo(self, size, idx, cacheable)
    412                             xrt.xclBOKind.XCL_BO_DEVICE_RAM, idx)
    413         if bo >= 0x80000000:
--> 414             raise RuntimeError("Allocate failed: " + str(bo))
    415         return bo
    416 

RuntimeError: Allocate failed: 4294967295

What needs to be changed to make it work? Note that this program has had no problems running on v2.6 so far.

Edit: The following information may be useful:
** The first run always works; from the second run onward it shows the error.
** My project uses two VDMAs and DisplayPort output.
** Surprisingly, a custom I2C IP controlled by a custom AXI slave started working in v2.7; it was not working in v2.6.
** Every custom AXI slave had to be redefined, e.g. controller = overlay.pipeline_top_0.axi_s became controller = overlay.pipeline_top_0
** The project is a video pipeline that supports 60 FPS. The frame rate drops sharply (to ~8 FPS) if I call allocate inside the while loop. If I keep the allocation outside the loop I get 60 FPS, but then I can't refresh the buffer, so the overlays come out scrambled.

Thanks

abitofmaya · November 26, 2021, 1:03pm

I’m not sure, but I think you’re out of memory. Try adding some delay in the loop to see if that helps. Or you could use non-cacheable memory with frame = allocate(shape=(int(height/3), int(width/5), 3), dtype=np.uint8, cacheable=False).

Regards,
Frank Shrestha

mizan · November 29, 2021, 5:57am

This is a fresh copy of PYNQ v2.7 on a 32GB SD card. The same program runs on the same board under v2.6 with another SD card. I can't see any way it could be running out of either SD card storage or RAM on the board. If it is running out of memory, that is related to how v2.7 allocates the buffer.
Still, to test your theory I swapped the memory cards: same result. I also tried both programs on another ZCU104: same issue.

With non-cacheable memory, the program either hangs or runs at a very low frame rate (15 FPS) on v2.6. On v2.7 this drops to 4-5 FPS, if it runs at all on the first try. The problem persists as before.

Edit: I also can't add delays, as I need 60 FPS at the output.

Thanks,
Mizan

abitofmaya · November 29, 2021, 6:23am

I meant the CPU cache memory, but that seems unlikely if it was fine on the previous image.

Well then, there must be something going on with the new image.

mizan · December 10, 2021, 8:47am

If anyone from the PYNQ side could comment on this or provide a workaround, it would be much appreciated. Thanks

Edit: This is not limited to this project; it also happens in another project that allocates buffers.

cathalmccabe · December 10, 2021, 11:15am

Can you post your complete code (or a simplified version)?
Which board(s) are you running on? I see you mentioned ZCU104. Did you try any other boards?

Cathal

mizan · December 10, 2021, 11:29am

I have tried with two ZCU104 boards. The code is quite long (around 700 lines), and I don't really know which parts to include, so below is what I could reduce it to. Both programs run fine in PYNQ v2.6. The problem only happens when the buffer allocation is inside a while loop. I am also using jupyter_ui_poll:

with jupyter_ui_poll.ui_events() as poll:
    while (run):
        poll(5)                # React to UI events (up to 5 at a time)
        if halt is False:
            frame = allocate(shape=(200, 250, 3), dtype=np.uint8, cacheable=True)

mizan · December 14, 2021, 6:05am

Can you give us some insight into what has changed in the allocate function in v2.7, or what to change to make it work in a loop as it did in v2.6?

mizan · December 23, 2021, 6:51am

Sorry to disturb you again. I just need to know whether this problem could be related to the Vivado and Vitis versions. I am using Vivado 2020.1; one IP in the project was generated by Vitis HLS 2020.1 and another by Vivado HLS 2020.1. Do I need to use another Vivado version to generate the bitstream for v2.7?
I have found a workaround: keep the buffer outside the loop and fill it with zeros before writing anything to it (otherwise the new content overlaps the previous image). But this reduces the FPS significantly, and I need 60 FPS.

Edit: I also found a post suggesting that this is related to virtual memory addressing on v2.7:

On a 32-bit system, your virtual memory address space cannot exceed 2^31-1 (4294967295) bytes.

and I think this is happening in my program.
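
A quick check of the number itself may help here. Note that 2^31-1 is 2147483647, whereas 4294967295 is 2^32-1, i.e. 0xFFFFFFFF. The traceback above shows allocate_bo raising whenever the returned handle is >= 0x80000000, which suggests (as an interpretation, not something confirmed in this thread) that the value is a negative return code viewed as an unsigned 32-bit integer rather than a size or an address limit. The helper name below is hypothetical:

```python
def describe_bo_handle(value):
    """Interpret the integer printed in 'Allocate failed: <value>'."""
    # Reinterpret the unsigned 32-bit value as a signed 32-bit integer.
    as_signed = value - (1 << 32) if value >= 0x80000000 else value
    return {
        "hex": hex(value),
        "signed_32bit": as_signed,
        # Same threshold as the check shown in the traceback.
        "is_failure": value >= 0x80000000,
    }


info = describe_bo_handle(4294967295)
print(info)
```

Read this way, the message only says that the underlying allocation call reported an error; it does not by itself indicate a 32-bit address-space limit.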

cathalmccabe · January 7, 2022, 11:12am

v2.7 is only verified with 2020.2. You should use this version, but if your design works with 2020.1 I don’t think this is the problem.

You are using a ZCU104 (64-bit). However, you will be limited by the available DRAM memory. Allocate is allocating contiguous memory, so the space available will be less than the total (fragmented) free memory in your system.

Does the allocation fail the first time you run this or after you run your loop a few times?
You can check available system memory with:

mem = !cat /proc/meminfo | grep 'MemFree'
print(mem)
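
The same check can be sketched in plain Python, so it can be called from inside the loop and logged per iteration. The function names are hypothetical. The CmaTotal/CmaFree fields are included on the assumption that the image's kernel has a contiguous memory allocator (CMA) pool and that contiguous buffers are drawn from it; those fields are simply absent from the result if the kernel does not report them.

```python
def parse_meminfo(text, keys=("MemFree", "MemAvailable", "CmaTotal", "CmaFree")):
    """Return {field: value_in_kB} for selected fields of /proc/meminfo text."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys and rest.split():
            values[name] = int(rest.split()[0])  # values are reported in kB
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only); call this on the board."""
    with open(path) as f:
        return parse_meminfo(f.read())


# Stand-in text so the sketch runs anywhere; the numbers are made up.
sample = "MemTotal:  2000000 kB\nMemFree:  1500000 kB\nCmaFree:  100000 kB\n"
print(parse_meminfo(sample))
```

Printing read_meminfo() before and after each allocate call would show whether free contiguous memory shrinks from one iteration to the next.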

I’m not sure what you are doing, but “double buffering” could be used to avoid this. i.e. Allocate two frame buffers, display the “old” one while writing an update to the other buffer. When the update is ready, switch to display this buffer and start writing a new frame to the other (“old”) buffer.
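
A minimal sketch of that double-buffering idea, with both buffers created once and then reused. The class and its method names are hypothetical, and the demo uses bytearrays as stand-ins so it runs without a board:

```python
class DoubleBuffer:
    """Hold two buffers: one being displayed (front), one being drawn (back)."""

    def __init__(self, make_buffer, clear):
        # Allocate both buffers once, up front, instead of once per frame.
        self._buffers = [make_buffer(), make_buffer()]
        self._clear = clear
        self._front = 0

    @property
    def front(self):
        return self._buffers[self._front]

    @property
    def back(self):
        return self._buffers[1 - self._front]

    def begin_frame(self):
        """Clear the back buffer in place and return it for drawing."""
        self._clear(self.back)
        return self.back

    def swap(self):
        """Make the freshly drawn back buffer the front one and return it."""
        self._front = 1 - self._front
        return self.front


# Stand-in demo with bytearrays instead of PYNQ buffers.
def zero_fill(buf):
    buf[:] = bytes(len(buf))


frames = DoubleBuffer(lambda: bytearray(8), zero_fill)
back = frames.begin_frame()   # cleared, safe to draw into
back[0] = 255                 # "draw" something
shown = frames.swap()         # this buffer would now go to the display
```

On the board, make_buffer could be something like lambda: allocate(shape=(200, 250, 3), dtype=np.uint8, cacheable=True) and clear could be lambda b: b.fill(0), assuming the buffers behave like NumPy arrays. Clearing still costs time on every frame, so this removes the per-frame allocation but not the per-frame fill.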

Cathal

mizan · January 7, 2022, 11:54am

It doesn't happen with v2.6; I kept the system running for more than 6 hours. The buffers are allocated inside the while loop.

Generally it runs on the first try, but the processing is slower (the FPS is reduced). Even while it is running, the error appears if I switch between frame buffers (there are 4 frames in total, for showing the menu and the corresponding settings).
It is a continuous loop until I press the stop button. If I don't turn on any buffer while it is running on the first try, no error appears. But it never runs on the second try.

I am not doing anything related to multiprocessing, and I am not using the buffer in two processes at the same time. The buffers are used with cv2.putText and some drawing calls; that's why the buffer gets scrambled if I put it outside the while loop, so I had to keep it inside the loop. I can keep it outside the loop by filling it with zeros in each iteration, but then I don't get the performance I need. Still, I'll try two buffers for each, as you suggested.

The thing I don't understand is why this doesn't occur in v2.6.
