PYNQ3.0.1 allocate ddr4 returns buffer outside of address range

Hello,

I just upgraded my ZCU111 from PYNQ2.7 to PYNQ3.0.1. I’m using the same bitstream + hwh built with Vivado 2022.1 and I’m noticing some weird behavior with pynq.allocate().

I am attempting to allocate a buffer targeting the PL DDR4:

>> buffer = pynq.allocate((n, 2), dtype='i2', target=ol.capture.ddr4_0)
>> print(hex(buffer.device_address))
0x78800000
>> print(hex(ol.capture.ddr4_0.base_address))
0x500000000

pynq.allocate() is returning a buffer below the base address of the PL DDR4? This causes AXI decoding errors when I try to capture data… In PYNQ2.7 everything seems to behave: the first time I call pynq.allocate() it returns a buffer with the same base address as the DDR4 and my capture works just fine.
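
The symptom can be confirmed by checking whether the returned device address falls inside the bank's window. This is a minimal sketch: `in_bank` is a hypothetical helper, and the 4 GiB bank size is an assumption that should be replaced with the range shown in the Vivado address editor.

```python
def in_bank(device_address, base_address, size):
    """Return True if device_address lies in [base_address, base_address + size)."""
    return base_address <= device_address < base_address + size

DDR4_BASE = 0x500000000   # ol.capture.ddr4_0.base_address from the output above
DDR4_SIZE = 4 * 2**30     # assumed 4 GiB bank; take the real range from Vivado

print(in_bank(0x78800000, DDR4_BASE, DDR4_SIZE))   # address returned under PYNQ 3.0.1 -> False
print(in_bank(0x500000000, DDR4_BASE, DDR4_SIZE))  # address returned under PYNQ 2.7 -> True
```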

It looks like PYNQ3.0.1 correctly identifies the MIG and its base address.

>> ol.capture.ddr4_0?
Type:            EmbeddedXrtMemory
String form:     <pynq.pl_server.embedded_device.EmbeddedXrtMemory object at 0xffff58f27ca0>
File:            /usr/local/share/pynq-venv/lib/python3.10/site-packages/pynq/pl_server/embedded_device.py
Docstring:       <no docstring>
Class docstring:
Class representing a memory bank in a card

Memory banks can be both external DDR banks and internal buffers.
XrtMemory instances for the same bank are interchangeable and can
be compared and used as dictionary keys.

I’m not sure what the root cause is. Test .bit and .hwh attached.
test.zip (6.1 MB)

Hi there,

This is a recent issue caused by changes in the way XRT allocation works: it now expects the memory to show up in the device tree. If you look at the dmesg output while your overlay loads, you might see messages like "Memory 0 is not reserved in device tree. Will allocate memory from CMA". We don’t quite have a clean, automated fix for this use case in the current release.
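
If it helps, the kernel log can be scanned for that message from Python. This is only a sketch: `cma_fallback_lines` and `read_dmesg` are hypothetical helper names, the sample text is made up apart from the quoted message, and running `dmesg` may need root on some images.

```python
import subprocess

# Substring of the message quoted above; only this part is matched.
MARKER = "is not reserved in device tree"

def cma_fallback_lines(dmesg_text):
    """Return the log lines that report a fallback to CMA allocation."""
    return [line for line in dmesg_text.splitlines() if MARKER in line]

def read_dmesg():
    """Return the kernel log as text by calling the dmesg command."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout

# Made-up sample input containing the quoted message plus an unrelated line.
sample = ("Memory 0 is not reserved in device tree. "
          "Will allocate memory from CMA\nunrelated line")
print(cma_fallback_lines(sample))
```

On the board, `cma_fallback_lines(read_dmesg())` after loading the overlay would return an empty list if the message never appeared.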

One workaround is to add a device tree overlay with a reserved-memory node for your DDR4 memory. There’s an example of inserting a dtbo for PYNQ in the kria-pynq repo; you’d have to do something similar with your reserved-memory dtbo. Another workaround is to just use MMIO to write to that memory.

Apologies for not having a cleaner solution. Hope this helps.

Thanks
Shawn

Hi Shawn,

Thanks for the quick reply. That is an unfortunate change in XRT. I’d prefer not to use MMIO because in my experience it’s very sensitive to memory access alignment and I’d rather not have to tiptoe around that. XRT allocation worked well in PYNQ2.7 and is a core part of our system, so I may just revert my board and hope it’s fixed in the next release.

I’d be willing to try to develop a reserved-memory node solution but I may need a few more pointers. If I’m understanding your links right, I have to create a reserve_ddr4.dts file which will look something like

reserved-memory {
   #address-cells = <2>;
   #size-cells = <2>;
   ranges;
 
   reserved: buffer@0 {
      no-map;
      reg = <0x0 0x70000000 0x0 0x10000000>;
   };
};
 
reserved-driver@0 {
   compatible = "xlnx,reserved-memory";
   memory-region = <&reserved>;
};

I’m not really clear on what the .dts should actually look like. The link you provided also mentioned a .dtsi and a device driver. Are those relevant? Is there a tool I can use to generate the .dts, or do you have any other resources you can point me to on how to construct that file for a single 4 GiB DDR4 with base address 0x500000000? Assuming I get that right, I would then want to run a script like this:

import os
import warnings

""" Insert DeviceTree Fragment using pynq
If the segment is not inserted already, import pynq and create a
DeviceTreeSegment with the full path to the \'pynq.dtbo\' file.
Then insert the fragment
"""

path = os.path.dirname(__file__)
dtbo = 'reserve_ddr4.dtbo'
sysfs_dir = '/sys/kernel/config/device-tree/overlays/' + \
    os.path.splitext(dtbo)[0]

if not os.path.exists(sysfs_dir):
    try:
        from pynq import DeviceTreeSegment
    except UserWarning:
        pass

    if path != '':
        path = path + '/'

    db = DeviceTreeSegment(path + dtbo)
    db.insert()

whenever the pynq virtual environment is sourced? Please let me know if there are any major steps I missed and if possible include some more resources on the .dts file generation. Thanks for your help with this!

Best,
Jenny

Hi Jenny,

The simplest way is to just copy what’s done in the Kria-PYNQ repo (Kria-PYNQ/pynq.dts at main · Xilinx/Kria-PYNQ · GitHub) and replace those fragments with your own reserved-memory one. You might not need the driver stuff in there; you could add something like this in pynq.dts:

fragment@4 {
        target-path = "/";
        overlay4: __overlay__ {
                reserved-memory {
                        ranges;
                        reserved {
                                reg = <0x0 0x70000000 0x0 0x10000000>;
                        };
                };
        };
};

And just delete the other fragments.

You can re-use the makefile from kria-pynq as well to compile the dtbo (will probably need to run apt-get install device-tree-compiler first).

Thanks
Shawn

Thank you for clarifying. Here’s my latest attempt:

I cloned the Kria repo to my board and edited the pynq.dts as you suggested:

/*
 * Add zocl fragment to the device tree
 */

/dts-v1/;
/plugin/;
/ {
    /*afi*/
    /*TODO is afi necessary?*/
    fragment@4 {
            target-path = "/";
            overlay4: __overlay__ {
                    reserved-memory {
                            ranges;
                            reserved {
                                    reg = <0x0 0x70000000 0x0 0x10000000>;
                            };
                    };
            };
    };
};

I installed device-tree-compiler and used the Kria Makefile to generate pynq.dtbo. I then tried the insert_dtbo.py script in a Jupyter notebook:

import sys
import os
path = '/home/xilinx/Kria-PYNQ/dts/'
dtbo = 'pynq.dtbo'
sysfs_dir = '/sys/kernel/config/device-tree/overlays/' + \
    os.path.splitext(dtbo)[0]

if not os.path.exists(sysfs_dir):
    try:
        from pynq import DeviceTreeSegment
    except UserWarning:
        pass

    if path != '':
        path = path + '/'

    db = DeviceTreeSegment(path + dtbo)
    db.insert()

But it errored out with

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [6], in <cell line: 6>()
     13     path = path + '/'
     15 db = DeviceTreeSegment(path + dtbo)
---> 16 db.insert()

File /usr/local/share/pynq-venv/lib/python3.10/site-packages/pynq/devicetree.py:115, in DeviceTreeSegment.insert(self)
    113     read_back = f.read(1024*1024)
    114     if read_back != dtbo_data:
--> 115         raise RuntimeError('Device tree {} cannot be applied'.format(
    116             self.dtbo_name))
    118 if not self.is_dtbo_applied():
    119     raise RuntimeError('Device tree {} cannot be applied.'.format(
    120         self.dtbo_name))

RuntimeError: Device tree pynq cannot be applied

I also tried

dtbo_path = '/home/xilinx/Kria-PYNQ/dts/pynq.dtbo'
d_test = pynq.devicetree.DeviceTreeSegment(dtbo_path)
d_test.insert()

But got the same error. Interestingly enough d_test.is_dtbo_applied() returned True so I went ahead and tried to download my overlay with the ddr4 and the board crashed. I probably don’t have too much more debugging bandwidth for this now but if I made a simple error please let me know and I can give it another shot when I have time.

Thanks!
Jenny

Hi Jenny,

Sorry, I didn’t mean for you to use that reg value exactly as is; it should be adapted to your DDR base address and the amount of memory you’re allocating. As an example, this is the snippet I just used to get allocate working on the rfsoc4x2:

/dts-v1/;
/plugin/;
/ {
        /*reserved memory*/
        fragment@4 {
                target-path = "/";
                overlay4: __overlay__ {
                        reserved-memory {
                                ranges;
                                reserved {
                                        reg = <0x10 0x00 0x02 0x00>;
                                };
                        };
                };            
        };
};

DDR reg addressing is 64-bit, so each reg value has to be split in two (upper 32 bits and lower 32 bits):
base address (from Vivado address editor) 0x1000000000 => 0x10 0x00
range (from Vivado address editor) (8 GB) => hex(8589934592) => 0x200000000 => 0x02 0x00

I think your reg should look something like
reg = <0x05 0x00 0x01 0x00>;
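
The split can also be done with a few lines of Python. This is a sketch rather than anything shipped with PYNQ: `reg_cells` is a hypothetical helper, and it assumes two address cells and two size cells as in the examples above.

```python
def reg_cells(base_address, size):
    """Format a 64-bit base/size pair as a reg property with four 32-bit cells."""
    cells = []
    for value in (base_address, size):
        if not 0 <= value < 2**64:
            raise ValueError("value does not fit in 64 bits")
        cells.append(value >> 32)          # upper 32 bits
        cells.append(value & 0xFFFFFFFF)   # lower 32 bits
    return "reg = <" + " ".join(f"0x{c:02x}" for c in cells) + ">;"

print(reg_cells(0x1000000000, 8 * 2**30))  # reg = <0x10 0x00 0x02 0x00>;
print(reg_cells(0x500000000, 4 * 2**30))   # reg = <0x05 0x00 0x01 0x00>;
```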

If you’ve tried loading a device tree overlay before (and haven’t rebooted), you can remove it with

rmdir /sys/kernel/config/device-tree/overlays/pynq

though after lots of overlay experimentation I found a reboot is sometimes necessary.
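
The same removal can be scripted so it only runs when the overlay directory exists. This is a minimal sketch: `remove_overlay` is a hypothetical helper that wraps the rmdir above, and it needs root when pointed at the real configfs path.

```python
import os

OVERLAYS_DIR = "/sys/kernel/config/device-tree/overlays"

def remove_overlay(name, overlays_dir=OVERLAYS_DIR):
    """Remove the overlay directory for `name`; return True if it was removed."""
    path = os.path.join(overlays_dir, name)
    if not os.path.isdir(path):
        return False   # nothing inserted under this name
    os.rmdir(path)     # equivalent to the rmdir command above
    return True
```

Calling `remove_overlay("pynq")` as root corresponds to the command above; the `overlays_dir` parameter is there only so the helper can be tried against a scratch directory first.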

Thanks
Shawn

I had a feeling I had to change the register values but wasn’t sure about the specification. Thanks for walking me through that. I updated the values to <0x05 0x00 0x01 0x00> as suggested and it’s working! I added it to pynq_venv.sh following what’s done in the Kria repo, and now allocation behaves as normal. Thanks so much for your help with this! I will repost step-by-step instructions below:

In an ssh terminal to the board:

  1. Activate the pynq virtual environment: source /etc/profile.d/pynq_venv.sh
  2. Install the device-tree-compiler: sudo apt-get install device-tree-compiler
  3. Download the Kria-PYNQ repo dts folder.
  4. Edit pynq.dts, replacing the entire contents of the file with:
/dts-v1/;
/plugin/;
/ {
        /*reserved memory*/
        fragment@4 {
                target-path = "/";
                overlay4: __overlay__ {
                        reserved-memory {
                                ranges;
                                reserved {
                                        reg = <0x10 0x00 0x02 0x00>;
                                };
                        };
                };            
        };
};

The reg field should be updated like so:
The first value is the upper 32 bits of the PL DDR4 base address. For example, if the base address is 0x500000000, this value is 0x500000000 >> 32 = 0x5, written as 0x05.
The second value is the lower 32 bits of the PL DDR4 base address, e.g. 0x500000000 & 0xFFFFFFFF = 0x0, written as 0x00.
The third and fourth values are the upper and lower 32 bits of the allowed memory range. For example, if the PL DDR4 base address is 0x500000000 and the bank is 4 GiB, the high address is 0x5ffffffff, so the allowed range is 0x5ffffffff - 0x500000000 + 1 = 0x100000000. The upper bits are 0x01 and the lower bits are 0x00.
For this example the reg field would read: reg = <0x05 0x00 0x01 0x00>;
  5. In the dts folder, run make to compile pynq.dts into pynq.dtbo.
  6. Make a folder in the pynq-venv for the dts products: mkdir -p /usr/local/share/pynq-venv/pynq-dts/
  7. Copy insert_dtbo.py and pynq.dtbo to the folder: cp insert_dtbo.py pynq.dtbo /usr/local/share/pynq-venv/pynq-dts/
  8. Append the insert script to the pynq virtual environment startup script: echo "python3 /usr/local/share/pynq-venv/pynq-dts/insert_dtbo.py" >> /etc/profile.d/pynq_venv.sh
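
For anyone who would rather not do the step 4 arithmetic by hand, the file contents can be generated from the base and high addresses shown in the Vivado address editor. This is a sketch under assumptions: `render_dts` is a hypothetical helper, and the template simply mirrors the fragment from step 4.

```python
DTS_TEMPLATE = """/dts-v1/;
/plugin/;
/ {{
        /*reserved memory*/
        fragment@4 {{
                target-path = "/";
                overlay4: __overlay__ {{
                        reserved-memory {{
                                ranges;
                                reserved {{
                                        reg = <{cells}>;
                                }};
                        }};
                }};
        }};
}};
"""

def render_dts(base_address, high_address):
    """Return pynq.dts text reserving [base_address, high_address]."""
    size = high_address - base_address + 1
    # Each 64-bit value becomes an (upper, lower) pair of 32-bit cells.
    cells = [base_address >> 32, base_address & 0xFFFFFFFF,
             size >> 32, size & 0xFFFFFFFF]
    return DTS_TEMPLATE.format(cells=" ".join(f"0x{c:02x}" for c in cells))

dts = render_dts(0x500000000, 0x5FFFFFFFF)
print(dts)
```

The printed text can be saved as pynq.dts before running make in step 5.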

I sort of guessed at the permissions: I made insert_dtbo.py, pynq.dtbo, and pynq-dts/ all owned by root and executable, using sudo chown root: <files> and sudo chmod 755 <files>.

Power cycled and now I am able to allocate the PL DDR4 as normal:

import pynq
from pynq import Overlay

ol = Overlay('design_with_mig.bit', ignore_version=True)
ol.download()

buffer = pynq.allocate((2**19, 2), dtype='i2', target=ol.capture.ddr4_0)

hex(buffer.device_address)            # '0x500000000'
hex(ol.capture.ddr4_0.base_address)   # '0x500000000'