PYNQ: PYTHON PRODUCTIVITY

Composable Overlay question

Very interesting feature. I'll definitely give it a try.

If I understood correctly, the composable overlay means you can reconfigure regions on the fly.

Say that within one frame I need to execute more functions than there are available regions. Will the composable overlay reconfigure one region dynamically to finish processing that frame, i.e. include the reconfiguration process in the list of tasks to be executed?

Regardless of this feature, the composable pipeline is already an awesome feature that I hope will unify work across the community with DFX :slight_smile:


Hi,

If I understood correctly, the composable overlay means you can reconfigure regions on the fly.
No, a composable overlay means that you can change the datapath at runtime. Dynamic reconfiguration can be part of a composable overlay, but it is not strictly necessary.
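To illustrate the distinction, here is a toy sketch (plain Python, not the pynq_composable API): the hardware functions stay fixed, and "composing" only changes how data is routed between them, much like programming an AXI4-Stream switch. The function names below are stand-ins, not real accelerators.

```python
# Toy model of a composable datapath: the set of processing stages is fixed
# (like accelerators baked into the bitstream), but the order data flows
# through them is chosen at runtime -- no reconfiguration involved.

def erode(x):
    return x - 1      # stand-in for a real erode accelerator

def dilate(x):
    return x * 2      # stand-in for a real dilate accelerator

def compose(stages):
    """Return a pipeline routing data through `stages` in the given order."""
    def pipeline(frame):
        for stage in stages:
            frame = stage(frame)
        return frame
    return pipeline

# Same two functions, two different datapaths, swapped at runtime:
p1 = compose([erode, dilate])   # (10 - 1) * 2 = 18
p2 = compose([dilate, erode])   # (10 * 2) - 1 = 19
print(p1(10), p2(10))
```

Order matters, which is exactly what composing the datapath controls; downloading a new partial bitstream (DFX) would instead change *which* functions are available.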

Say within one frame I need to execute more functions than the number of available regions, will the composable overlay reconfigure one region dynamically to finish working on that frame? ie including the reconfiguration process in the list of tasks to be executed?
You can reconfigure the region dynamically if the composable overlay has DFX regions. Depending on the size and configuration of the hardware, you may be able to reconfigure the hardware while data is in flight. However, for video streams this is not possible. Typically, you will have everything configured before feeding video to the composable overlay.
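To see why reconfiguring mid-frame gets expensive, here is an illustrative sketch (an assumed least-recently-used replacement policy, not anything pynq_composable does) counting how many partial-bitstream downloads a frame would trigger when it needs more functions than there are DFX regions:

```python
from collections import OrderedDict

def count_reconfigs(tasks, n_regions):
    """Count partial reconfigurations under a simple LRU replacement policy.

    `tasks` is the sequence of functions needed; `n_regions` is the number
    of DFX regions available. Each miss costs one bitstream download.
    """
    loaded = OrderedDict()              # region contents, LRU order
    reconfigs = 0
    for fn in tasks:
        if fn in loaded:
            loaded.move_to_end(fn)      # function already resident: reuse it
            continue
        reconfigs += 1                  # miss: download a partial bitstream
        if len(loaded) >= n_regions:
            loaded.popitem(last=False)  # evict the least recently used region
        loaded[fn] = True
    return reconfigs

# 4 functions per frame, 3 frames. With only 2 regions LRU thrashes and
# every single access triggers a reconfiguration; with 4 regions only the
# first frame pays, which is why configuring up front wins for video.
frame = ["erode", "dilate", "threshold", "blur"]
print(count_reconfigs(frame * 3, 2), count_reconfigs(frame * 3, 4))
```

The exact numbers depend on the replacement policy and task order, but the point stands: when regions are scarce, the reconfiguration time lands inside the frame budget.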

Have a look at this video. It may help clarify your questions.

Mario


Thanks Mario for the video and answers; this indeed clarified a lot of the architecture and hardware questions I had.

I still have a question regarding your last answer; based on your video at around 25:18, the pipeline comprises the following tasks:

pr_0/erode_accel → pr_0/dilate_accel → pr1/dilate_accel → pr1/erode_accel

You mentioned this shouldn't be possible with video streams. Is this because of timing constraints with video acquisition? Anyhow, the use case shown with the looped video works fine for me; I'm just trying to understand what's under the hood.

To clarify my intentions, what I would like to use as a pipeline is the following task graph:

Would this cause problems with the composable overlay? How would the reconfiguration process be managed in such a case? Would the bitstream download be able to pre-fetch (e.g. reconfigure pr_2 with task_9 after task_2 has been executed, even if task_0 still has to wait for tasks 6, 7 and 8 to finish)?


You mentioned this shouldn't be possible with video streams. Is this because of timing constraints with video acquisition? Anyhow, the use case shown with the looped video works fine for me, I'm just trying to understand what's under the hood.

erode and dilate are implemented in the same partial bitstream; that's the reason why I can use them at the same time.

c_dict.loaded shows what is loaded and can be used. If a function is not loaded, you need to load it manually; this is something the provided API does not do automatically.

The composable video pipeline forks into two branches, so you will have to create your own composable overlay that can branch the number of times you need.

Would this cause problems with the composable overlay? How would the reconfiguration process be managed in such a case? Would the bitstream download be able to pre-fetch (e.g. reconfigure pr_2 with task_9 after task_2 has been executed, even if task_0 still has to wait for tasks 6, 7 and 8 to finish)?

It depends; reconfiguring part of the FPGA takes time, so you need to store the intermediate results from one function to another. Depending on the size of this intermediate result, the implementation may or may not be viable. I am assuming these connections are streaming.
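A back-of-the-envelope check makes "reconfiguration takes time" concrete. The bitstream size and configuration-port throughput below are illustrative assumptions (actual values depend on the device, the region size, and the configuration interface), not measured numbers:

```python
def reconfig_time_ms(bitstream_bytes, throughput_bytes_per_s):
    """Lower bound on partial reconfiguration time, in milliseconds."""
    return 1000.0 * bitstream_bytes / throughput_bytes_per_s

# Assumed values for illustration only:
#   1 MB partial bitstream, 128 MB/s effective configuration throughput.
frame_period_ms = 1000.0 / 60                        # 60 fps -> ~16.7 ms/frame
t = reconfig_time_ms(1_000_000, 128_000_000)
print(f"reconfig ~{t:.1f} ms vs frame budget {frame_period_ms:.1f} ms")
```

Under these assumptions a single swap eats roughly half the frame period, before counting the cost of buffering the intermediate stream, which is why configuring everything before feeding the video is the practical choice.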

As I mentioned in the first comment, we use DFX as a feature to augment the overlay functionality, but it is not the core of the composable overlay.
