What's in My Bitstream - A Pythonic Approach to Discovering FPGA Contents

Keywords: Bitstream, Metadata, Python, FPGA



State-of-the-art System-on-Chip (SoC) devices often include reconfigurable fabric such as an FPGA, also called Programmable Logic (PL). Unlike a traditional CPU, the reconfigurable fabric requires a bitstream to be downloaded so that a meaningful hardware architecture is present before users can run any application. The bitstream (e.g., *.bit or *.bin) is a binary file; to users it looks almost like an encrypted file. This raises an immediate question: given any bitstream file, how do you know what’s inside it?

Without extra help, users have no way to tell what is in the bitstream. If users developed the bitstream themselves, they can at least make some assumptions in their code. Consider a hardware IP at the address 0x80000000: without any help from elsewhere, users must hardcode ip_address = 0x80000000 somewhere in their programs to use it. Such a hardcoded assumption is error-prone and hard to change. Remember, you may have hundreds of IPs in your design!

The Use of Metadata


Metadata is defined as “a set of data that describes and gives information about other data”. Simply put, it is data about other data. Metadata is used in almost all types of systems. A device tree is a classic example of metadata: it lists all the components accessible by the OS. Based on the contents of the device tree, the Linux OS can look for more data in various places and assign compatible drivers to each component. A device tree example is shown below.


We want to do something similar in the PYNQ framework - use metadata to understand what is in the bitstream. The goal is to make such metadata Python-readable, so it can be integrated into the PYNQ framework seamlessly. In detail, the benefits include:

  1. If I have runtime metadata, I never have to look back at the original design to understand it.

  2. My code has better readability. The metadata can be both human-readable and Python-accessible.

  3. The metadata eliminates the use of magic numbers for addresses, ranges, and many others; those numbers could change from build to build.

  4. A hardware developer can write drivers quickly without help from a Linux software expert.
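
As a small illustration of benefit 3, the sketch below contrasts a hardcoded address with a lookup in a metadata dictionary. The two dictionaries, the IP name my_ip, and the key names phys_addr and addr_range are assumptions made for illustration; they stand in for metadata produced from two different builds of the same design.

```python
# Hypothetical metadata from two builds of the same design; the IP name
# and the key names are illustrative assumptions.
build_a = {"my_ip": {"phys_addr": 0x80000000, "addr_range": 0x10000}}
build_b = {"my_ip": {"phys_addr": 0x80010000, "addr_range": 0x10000}}

# Hardcoded approach: silently wrong as soon as the address map changes.
ip_address = 0x80000000


def base_address(ip_dict, name):
    """Return the base address of an IP as recorded in the metadata."""
    return ip_dict[name]["phys_addr"]


for label, ip_dict in (("build A", build_a), ("build B", build_b)):
    addr = base_address(ip_dict, "my_ip")
    print(f"{label}: my_ip at {addr:#010x}, hardcoded matches: {addr == ip_address}")
```

The code that uses base_address stays the same across both builds, while the hardcoded constant is only correct for the first one.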

Let me list a few things we should consider when using metadata:

  1. What data structure should we choose for the metadata?

  2. Where do we get the metadata source containing system information?

  3. How do we efficiently translate the system information into metadata?

I will answer these questions in the next few sections.

What Data Structure

It turns out the dictionary is a very good fit for metadata in Python. It has a small footprint and O(1) average lookup time. It supports both deep and shallow copies, depending on what users want. Compared to a list, tuple, or set, a dictionary is easy to update, expand, prune, and clear. A dictionary also maps naturally onto the human-readable JSON format; keep in mind that JSON-formatted data is easily portable to other high-level programming languages.

For the above reasons, PYNQ keeps most of the metadata in the form of dictionaries. Where necessary, PYNQ also uses nested dictionaries for very complex metadata.
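
The properties above can be sketched with the standard library alone. The nested dictionary below is a hypothetical stand-in for one metadata entry; the address, the range, and the parameter name are illustrative values, not taken from a real design.

```python
import copy
import json

# Hypothetical nested metadata for one IP block (illustrative values).
ip_dict = {
    "leds_gpio": {
        "phys_addr": 0x41200000,
        "addr_range": 0x10000,
        "parameters": {"C_GPIO_WIDTH": "4"},
    }
}

# Access by key is a hash lookup.
print(hex(ip_dict["leds_gpio"]["phys_addr"]))

# A shallow copy shares the nested dictionaries; a deep copy duplicates them.
shallow = copy.copy(ip_dict)
deep = copy.deepcopy(ip_dict)
ip_dict["leds_gpio"]["addr_range"] = 0x1000
print(shallow["leds_gpio"]["addr_range"] == 0x1000)   # shared with the original
print(deep["leds_gpio"]["addr_range"] == 0x10000)     # unaffected by the change

# The dictionary serializes to human-readable JSON and back.
text = json.dumps(deep, indent=2)
restored = json.loads(text)
print(restored == deep)
```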

Where is System Information

The history of the TCL file

In earlier versions of PYNQ, we (PYNQ developers) used the TCL file (*.tcl) as the source for metadata. The same TCL file can also be used to build the hardware design. It contains IP parameters, Processing System (PS) configurations, nets / pins, and interrupt connections.

However, some information is still missing from the TCL file. For example, default parameters are omitted. Consider the following case: the DMA IP has a default length register width of 14, corresponding to a maximum transfer length of 16KB. If a user initiates a 30KB transfer, no exception is thrown and the DMA transfers only a subset of the original data. Since the DMA IP defaults to a low maximum transfer size, and this default parameter is not in the TCL file, the problem often ends in ChipScope debugging.
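
The arithmetic behind this example is short enough to write down. The sketch below follows the article's figure (a width of 14 gives 2^14 = 16384 bytes, i.e. 16KB); the function names are hypothetical, and real hardware may cap the length slightly lower than this, so treat the limit as an assumption. It shows how a check driven by the width parameter could reject the 30KB transfer up front instead of letting it be truncated.

```python
def max_transfer_bytes(length_width):
    """Largest transfer a length register of the given bit width can describe.

    Follows the article's figure (width 14 -> 16 KB); this is an assumption,
    and real hardware may cap the length slightly lower.
    """
    return 2 ** length_width


def check_transfer(nbytes, length_width=14):
    """Raise ValueError instead of letting an oversized transfer be truncated."""
    limit = max_transfer_bytes(length_width)
    if nbytes > limit:
        raise ValueError(
            f"transfer of {nbytes} bytes exceeds the {limit}-byte limit "
            f"for a {length_width}-bit length register"
        )
    return nbytes


print(max_transfer_bytes(14))      # 16384 bytes, i.e. 16 KB
try:
    check_transfer(30 * 1024)      # the 30 KB transfer from the text
except ValueError as err:
    print(err)
```

Such a check is only possible when the width is available as metadata, which is exactly what the TCL file fails to provide for default values.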

The current use of the HWH file

Starting from image v2.4, PYNQ began to adopt the HWH file (*.hwh) as the source of the metadata. The HWH file is in XML format, and includes a complete set of register values, interfaces, and connections. This file is generated automatically by Vivado during implementation.

Let’s see an example of an HWH file below. Note that the HWH file can get pretty large (e.g., the HWH file of the base overlay for Pynq-Z1 has around 60,000 lines). For this reason, we want PYNQ to extract the useful information from it at run-time, instead of using the file directly.

Currently both TCL and HWH files are supported as metadata sources, but we hope to rely solely on the HWH file in the near future.

How to Populate Metadata

The pynq package has several file parsers, for example the HWH parser, which translate the metadata source into Python dictionaries.

Many kinds of metadata are extracted into dictionaries. Among all of those dictionaries, the most useful one is the IP dictionary ip_dict. It contains all the PS-accessible IP blocks in the design.
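
To give a feel for what such a parser does, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not the PYNQ HWH parser: the XML fragment is a heavily simplified, hypothetical stand-in for an HWH file, and the element names, attribute names, and resulting dictionary keys are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A heavily simplified, hypothetical HWH-like fragment. The element and
# attribute names are assumptions for illustration, not the real schema.
HWH_TEXT = """
<EDKSYSTEM>
  <MODULES>
    <MODULE INSTANCE="leds_gpio" MODTYPE="axi_gpio">
      <PARAMETERS>
        <PARAMETER NAME="C_BASEADDR" VALUE="0x41200000"/>
        <PARAMETER NAME="C_HIGHADDR" VALUE="0x4120FFFF"/>
        <PARAMETER NAME="C_GPIO_WIDTH" VALUE="4"/>
      </PARAMETERS>
    </MODULE>
  </MODULES>
</EDKSYSTEM>
"""


def build_ip_dict(xml_text):
    """Collect per-IP metadata from the XML into a nested dictionary."""
    root = ET.fromstring(xml_text.strip())
    ip_dict = {}
    for module in root.iter("MODULE"):
        # Flatten the PARAMETER elements into a name -> value mapping.
        params = {p.get("NAME"): p.get("VALUE") for p in module.iter("PARAMETER")}
        base = int(params["C_BASEADDR"], 16)
        high = int(params["C_HIGHADDR"], 16)
        ip_dict[module.get("INSTANCE")] = {
            "type": module.get("MODTYPE"),
            "phys_addr": base,
            "addr_range": high - base + 1,
            "parameters": params,
        }
    return ip_dict


ip_dict = build_ip_dict(HWH_TEXT)
print(ip_dict["leds_gpio"]["type"], hex(ip_dict["leds_gpio"]["phys_addr"]))
```

The point of the sketch is the shape of the result: a large XML document is reduced to a small nested dictionary keyed by IP name.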

The metadata parsing happens when users load an overlay. For example, in the following notebook cell, when users instantiate an overlay object, the following things happen:

  1. The bitstream is loaded onto the PL.
  2. The HWH parser in the pynq package populates the Python dictionaries.

From the dictionary, now we can access PL information in a Pythonic way; for instance:


With the use of metadata, the code to access PL information is Pythonic, robust, readable, and manageable. If the wrong bitstream has been loaded onto your PL, the metadata forces an early exception (e.g., no key leds_gpio found in the ip_dict, indicating the bitstream does not have that IP); the early exception prevents users from proceeding and triggering a system hang later.
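
The early-exception behavior can be sketched with a plain dictionary standing in for ip_dict. The dictionary contents, the btns_gpio name, and the get_ip helper are hypothetical; the sketch only relies on the fact that looking up a missing key in a Python dictionary raises KeyError.

```python
# Hypothetical metadata for a bitstream that contains only a buttons GPIO.
ip_dict = {"btns_gpio": {"phys_addr": 0x41210000, "addr_range": 0x10000}}


def get_ip(ip_dict, name):
    """Look up an IP by name, failing early with a descriptive error."""
    try:
        return ip_dict[name]
    except KeyError:
        raise KeyError(
            f"IP '{name}' not found; available IPs: {sorted(ip_dict)}"
        ) from None


try:
    get_ip(ip_dict, "leds_gpio")
except KeyError as err:
    # The failure happens at lookup time, before any register access.
    print("Stopped before touching hardware:", err)
```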


In summary, PYNQ uses the HWH parser to extract metadata into Python dictionaries. With the help of these dictionaries, users can find out what is in the bitstream. This concludes the article. Hope you enjoy a PYNQ day.

