AXI BRESP unhandled

Board: Pynq-Z1
Image: v3.0.1

We are working on bringing a pynq system up to see if this is something worth pursuing and have a few questions -

  1. For the BRESP field of an AXI write transaction, we designed in logic so that an invalid transaction (for instance, a write to an invalid register) returns an error code. As best we can tell, this is not handled on the Python side: it crashes and requires a reboot. Holding the BRESP field at ‘OKAY’ (b’00) ‘works’, in the sense that it doesn’t crash. Is it possible to set these bits to other values? The following message appears when the crash occurs:
8<--- cut here ---
Unhandled fault: external abort on non-linefetch (0x1818) at 0xb6707000
pgd = d8b72a69
[b6707000] *pgd=0a375831, *pte=43c00743, *ppte=43c00c33

While not vital here, we would like to use the HDL in other designs (it’s fine if the response is ignored, but the crashes are problematic).
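For reference, the AXI spec defines four BRESP encodings; an invalid-register write like ours would presumably return SLVERR. A quick lookup table:

```python
# The four AXI write-response (BRESP) encodings, per the AMBA AXI spec:
BRESP_CODES = {
    0b00: "OKAY",    # normal access success
    0b01: "EXOKAY",  # exclusive access success
    0b10: "SLVERR",  # slave error (e.g. write to an invalid register)
    0b11: "DECERR",  # decode error (no slave decoded at the address)
}
print(BRESP_CODES[0b10])  # SLVERR
```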

  2. I have noticed that if the Pynq-Z1 board is booted and left idle, it will randomly hang (no response via Jupyter or terminal), maybe once or twice a day. No firmware images are loaded and no programs are running outside the default PYNQ image. I found one related, unresolved thread; is this a known/expected issue, and is there a known way to mitigate it?

  3. After loading an image -

Help on Overlay in module pynq.overlay:

<pynq.overlay.Overlay object>
    Default documentation for overlay bitstream.bit. The following
    attributes are available on this overlay:
    
    IP Blocks
    ----------
    GPIO                       : pynq.lib.axigpio.AxiGPIO
    mymodule               : pynq.overlay.DefaultIP
    processing_system7_0 : pynq.overlay.DefaultIP
    
    Hierarchies
    -----------
    mymodule : pynq.overlay.DefaultHierarchy
    
    Interrupts
    ----------
    None
    
    GPIO Outputs
    ------------
    None
    
    Memories
    ------------
    PSDDR                : Memory

In the above, you can see that the created module (mymodule) exists both as an IP block and as a hierarchy. If I just use the ‘standard’ method, mymod = ol.mymodule, it returns the Hierarchy block, not the IP block. There may be a way to direct which type is returned, but we can’t find it.
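As a toy model of what we think is happening (the dicts below are invented to mirror the help output, not real PYNQ internals): if attribute lookup checks the hierarchy dict before the IP dict, the hierarchy entry shadows the IP entry of the same name.

```python
# Toy sketch of Overlay attribute resolution; contents are hypothetical.
hierarchy_dict = {"mymodule": {"driver": "DefaultHierarchy"}}
ip_dict = {
    "GPIO": {"driver": "AxiGPIO"},
    "mymodule": {"driver": "DefaultIP"},
    "processing_system7_0": {"driver": "DefaultIP"},
}

def resolve(name):
    """Return a driver name, checking hierarchies before IP blocks."""
    if name in hierarchy_dict:
        return hierarchy_dict[name]["driver"]
    if name in ip_dict:
        return ip_dict[name]["driver"]
    raise AttributeError(name)

print(resolve("mymodule"))  # DefaultHierarchy - the hierarchy shadows the IP
print(resolve("GPIO"))      # AxiGPIO
```

That matches the behaviour we observe, though we haven’t confirmed it against the parser source.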

3.a) The hierarchy is mentioned, with an example, in Overlay Tutorial — Python productivity for Zynq (Pynq), but it is tough to follow - the example is mostly about the DMA. Are there any other recommended guides/explanations of what a hierarchy is? The example seems to do the same thing (write to a memory address), but it’s unclear to us what the fundamental difference between an IP block and a hierarchy is.

3.b) One method for forcing the IP block to return was:

ol._ip_map._description['hierarchies'].pop('mymodule')

But, as pointed out, this is a hack (and accesses internals in ways that couldn’t have been intended). In our experience it generally works, though not quite reliably; is there a recommended way around this? Or is it better to tie the DocumentedHierarchy to usable functions with the checkhierarchy method (copied from v3.0.0 of the documentation)?

@staticmethod
def checkhierarchy(description):
    if 'multiply_dma' in description['ip'] \
       and 'multiply' in description['ip']:
        return True
    return False
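For our design, I assume the equivalent predicate would be something along these lines (the ‘mymodule’ name is taken from the help output above, and the description dict shape just mimics what PYNQ passes in):

```python
# Hypothetical checkhierarchy predicate for our design; exercised here
# against hand-written description dicts rather than a real overlay.
def checkhierarchy(description):
    # Match only hierarchies that contain our module's IP.
    return 'mymodule' in description['ip']

print(checkhierarchy({'ip': {'mymodule': {}}}))  # True
print(checkhierarchy({'ip': {'other_ip': {}}}))  # False
```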

Thanks in advance for your time.

  1. PYNQ doesn’t do anything to manage failed AXI transactions. (It may be worth asking on the Xilinx forums about ways to manage this. I’m guessing you would need to modify drivers at the kernel level. One alternative may be to build a wrapper to intercept the BRESP errors and either ignore them if you don’t care, or direct them somewhere else - as an interrupt, by setting a wire, by writing to a register, etc.)
  2. The quality of SD cards varies a lot; some can be unreliable. This would be the first thing I would check.
  3. A hierarchy is a construct in Vivado IPI. PYNQ’s ip_dict is built by parsing the HWH file to determine the IP; hierarchies can confuse this, so the hierarchy dict was introduced to address it. You usually want to target an IP inside the hierarchy, not the top-level hierarchical block.
    I’d suggest that if you know your IP is within a hierarchy, you ignore the entry you are seeing in the IP block list.

Cathal


I’m guessing you would need to modify drivers at the kernel level.

I think that will likely be the response of our firmware engineers (for good reason, they won’t want to maintain special versions just for this), but thanks for confirming that is the expected behavior.

The quality of SD cards varies a lot; some can be unreliable. This would be the first thing I would check.

That aligns somewhat with what I’m seeing - reformatting both SD cards (Samsung high endurance, 32 GB or so) brought the number of hangs back down to one a day. The card will probably just need to be rebuilt every week or two for now. Unfortunately, that means the viability of this is low.

You usually want to target an IP inside the hierarchy, not the top-level hierarchical block.

In our case, since the HDL is directly instantiated within the block diagram, the ‘IP’ (so to speak) exists at the top level. So the IP definition and the hierarchy definition are the same thing.

I’ll have to look for an example of what is differentiating these; the documentation is not very clear.

I’d suggest if you know your IP is within a hierarchy

I’m not sure I understand what you’re saying here. Yes, HDL is hierarchical.

When situations like this occur (duplicate names), is the hierarchy guaranteed to take precedence as the returned object? Or is that left undefined?

Hi @PAL85,

I’m not sure I understand what you’re saying here. Yes, HDL is hierarchical.
When situations like this occur (duplicate names), is the hierarchy guaranteed to take precedence as the returned object? Or is that left undefined?

It is likely that you are hitting an edge case that has not been considered in the parser. Could you share the .bit and .hwh?

Hi Mario,

Unfortunately, as is so often the case, I can’t - there’s a blanket policy against me posting files, regardless of content. My apologies; it’s frustrating that you can’t ‘see what I see.’ Replicating the setup is fairly straightforward, though:

  1. Create a block diagram within Vivado (I’m using 2022.1, but it shouldn’t matter which version)
  2. Right-click and choose ‘Add Module’ → select the HDL file (the highest level within that submodule).

After the normal process (.bit/.hwh download), the name should appear twice - once as an IP and once as a hierarchy. Accessing it as an IP is possible as above, although by default the hierarchy is returned.

That probably explains it to some extent. Our preference is not to use the IP block wrappers that Xilinx developed, as they introduce secondary control systems. Since the HDL (and its dependents) is not viewed as a single module from the block-diagram level, it shows up as a hierarchy (that’s my vague guess here - I have some background in digital design but don’t do much of it).

I think I see how to write a hierarchy-level driver from the overlay API, which hopefully I can try this week.
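Roughly what I have in mind - the register names and offsets below are invented placeholders (they would come from our HDL), and the driver only assumes an object with read(offset)/write(offset, value) methods, which matches pynq.mmio.MMIO’s interface:

```python
# Hypothetical register map; offsets are examples, not from the real design.
REGISTERS = {"control": 0x00, "status": 0x04, "data": 0x08}

class MyModuleDriver:
    """Thin wrapper over anything exposing MMIO-style read/write."""

    def __init__(self, mmio):
        self._mmio = mmio

    def write_reg(self, name, value):
        self._mmio.write(REGISTERS[name], value)

    def read_reg(self, name):
        return self._mmio.read(REGISTERS[name])

# Quick check with a fake MMIO standing in for real hardware:
class FakeMMIO:
    def __init__(self):
        self.mem = {}
    def write(self, offset, value):
        self.mem[offset] = value
    def read(self, offset):
        return self.mem.get(offset, 0)

drv = MyModuleDriver(FakeMMIO())
drv.write_reg("control", 1)
print(drv.read_reg("control"))  # 1
```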

Hi @PAL85,

Even if the steps to reproduce seem straightforward, things get more complicated without files that reproduce the issue.

The hwh file is parsed by the PYNQ-Metadata project, so by digging into its files you may be able to find a solution: GitHub - Xilinx/PYNQ-Metadata - PYNQ-Metadata provides an abstract description of reconfigurable system designs.

I think I see how to write a hierarchy level driver from the overlay API, which hopefully I can try this week.

This is also a potential solution. As you are providing an HDL IP, there’s no register map associated with it in the metadata, so you will have to define it yourself.

Thanks; I understand, it’s just not something I have a say in unfortunately.

This is also a potential solution. As you are providing an HDL IP, there’s no register map associated with it in the metadata, so you will have to define it yourself.

Interesting. Once I pop the hierarchy out, there is a register map within the metadata… so the information is there, it’s just a matter of accessing it correctly.

You can go with this option for now. Alternatively, you can just use MMIO.