I am currently looking into adding an IP from the Vitis Vision library to one of the DFX regions of the composable overlay design and synthesising it using the Abstract Shell methodology. The Xilinx guides on how to do so are very clear and straightforward; however, I have some queries about how to adapt the procedure for the composable overlay design.
First, I understand I need to add a Reconfigurable Module (RM) using the DFX wizard. However, when I attempt to do so I am greeted with the following message:
Is there something I need to do first to allow me to add a new RM to the Reconfigurable Partitions?
Secondly, I am curious as to why the static IPs are all connected to the AXI4-Stream Switch by one stream in and one stream out, whereas all of the RMs preloaded in the DFX regions of the design have two streams in and out, passing through the DFX decoupler before reaching the AXI4-Stream Switch?
Is this just to allow for multi-stream functionality (e.g. as used by the dilate and erode IP or by the Subtract IP), so that both streams don't actually need to be connected? For example, if I were to add a simple rgb2bgr IP from the Vitis Vision library, could I just have one stream in and one out, as I did when swapping it into the static region?
If I were to create a new RM with the rgb2bgr IP block, could I simply wire it up directly to the AXI Slice and out again, or would I need to add a FIFO block for the second stream input, as with the rgb2xyz IP:
I have dug a bit deeper into the docs and it has led me down the path of the Abstract Shell flow. However, I have a few questions about how the composable overlay project is created from the makefile.
Does the makefile process write abstract shells for the Partial Reconfigurable regions (pr_0,pr_1,pr_2) after it has finished implementation?
I ask because I notice these .dcp files present in the impl_1 folder:
If I want to add a new filter into pr_0, am I right in saying that I create the .bd file as part of the BDC and carry out synthesis, then open the .bd in a separate Vivado project and add the checkpoint abs_shell_video_cp_i_composable_pr_0.dcp through the Tcl console?
From my understanding, the BDC flow you are describing for adding RMs to PR blocks takes longer to generate the partial bitstream for a new RM, because it implements the full static design together with the new RM.
I am investigating the abstract shell flow, which extracts only a small section of the static design (all the timing and spatial constraints required external to the PR block) and saves it as a checkpoint. Implementation and partial bitstream generation for the new RM are then carried out against that checkpoint, greatly reducing implementation time.
Yes, that is correct. The tutorial also uses this method, and I think it is the only method that works in Vivado 2020 or below. Read more about the Tcl commands in the tutorial.
I am not sure what has changed in newer Vivado versions.
One major thing is that you need a decoupler on each interface to the PR block.
No, the script does not write the abstract shell dcp explicitly; this is something you'll have to do yourself. However, it is using the abstract shell methodology under the hood.
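For anyone wanting to write the abstract shells themselves, here is a rough sketch of how the Tcl could be generated. It is written in Python and only emits text; nothing here drives Vivado directly. `open_checkpoint`, `write_abstract_shell -cell` and `close_design` are standard Vivado Tcl commands, but the routed checkpoint path and the hierarchical cell names are assumptions (the cell path is only guessed from the checkpoint name mentioned earlier in the thread), and the helper `abstract_shell_tcl` is purely illustrative.

```python
# Sketch: emit Vivado Tcl that writes one abstract shell per reconfigurable partition.
# Assumptions: the routed checkpoint path and the RP cell paths are placeholders;
# check the real names in your own build before using the generated script.

def abstract_shell_tcl(routed_dcp, rp_cells, out_dir="."):
    """Return Tcl lines that write one abstract shell checkpoint per RP cell."""
    lines = []
    for cell in rp_cells:
        # Flatten the hierarchy separator so the cell path can be used in a file name.
        name = cell.replace("/", "_")
        # Reopen the routed design for each RP so every shell starts from the same state.
        lines.append(f"open_checkpoint {routed_dcp}")
        lines.append(f"write_abstract_shell -cell {cell} {out_dir}/abs_shell_{name}.dcp")
        lines.append("close_design")
    return lines


if __name__ == "__main__":
    # Hypothetical cell paths for pr_0, pr_1 and pr_2.
    cells = [f"video_cp_i/composable/pr_{i}" for i in range(3)]
    print("\n".join(abstract_shell_tcl("impl_1/routed.dcp", cells, "abs_shells")))
```

The printed lines can be saved to a .tcl file and sourced from a Vivado session once the paths have been checked against the actual project.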
I think the questions are getting too advanced to be answered in this forum. Maybe the Xilinx forum is more appropriate.
Is there something I need to do first to allow me to add a new RM to the Reconfigurable Partitions?
In this design the RPs are already created and configured. I suggest you look at the DFX tutorial to see how to create them.
Secondly, I am curious as to why the static IPs are all connected to the AXI4-Stream Switch by one stream in and one stream out, whereas all of the RMs preloaded in the DFX regions of the design have two streams in and out, passing through the DFX decoupler before reaching the AXI4-Stream Switch?
This was a design decision so that fork and join IP can be placed in any RP.
If I were to create a new RM with the rgb2bgr IP block, could I simply wire it up directly to the AXI Slice and out again, or would I need to add a FIFO block for the second stream input, as with the rgb2xyz IP:
Yes, the design you’re showing should work.
Mario
I think it is better to clarify my response a bit.
My response was referring to the time needed to re-implement a new PR block when the design is updated.
Using checkpoints via Tcl commands, as shown in the tutorial, is highly recommended to reduce design turnaround time.
However, for the abstract shell dcp, @marioruiz has more experience and I would suggest learning from the links he proposed.
I am still attempting to add my own Reconfigurable Modules into the composable overlay Vivado design. I have mainly been following this guide: AMD Adaptive Computing Documentation Portal, but, as with most Xilinx guides, it covers a very basic use case and doesn't cater for many RMs in different RPs.
I have been bounced around on the main Xilinx forums but they cannot provide much guidance, so I'm back here looking for a bit of clarification on the limits of the design:
Is there room in any/all of the p_blocks for me to add my own logic (nothing too large, I am trying to add the rgb2xyz block design to pr_0 as well as pr_1) or do I need to edit the floorplan of the FPGA to extend the p_block and so with it the reconfigurable partition?
→ Currently when trying to implement the design I get a place_design error, suggesting the P_block can’t handle the additional logic.
→ Would it be easier to delete current RMs to replace with my own in the same RP?
Do I need to create my own DFX region via a block design container and connect it to the DFX decoupler to be able to add more RMs?
Bit of a stretch but do you know about the DFX Wizard and what configuration runs I need to carry out to implement (and eventually generate partial bitstream) for the new RM… I.e. do I need a run with just the new RM, or the new RM in one RP and all the other possible RMs in the other RPs?
I know the main purpose of the support forums isn't to help with modifying the design of the overlay, but I believe it is a great open source tool that, if easily customizable, could greatly accelerate many areas of FPGA development.
Hence any help is much appreciated!!
I’ll answer to the best of my knowledge, but these questions are out-of-scope for this forum.
This document is the best place I found where it is described what you’re looking for. Please, check it carefully and look at the associated scripts. All the steps are there. I would suggest you reproduce these steps as you’ll clearly see what is needed (starting from only an abstract shell dcp), although the design is different the steps will be the same.
Yes, there should be room to fit this IP in any of the PR regions.
No, with Abstract shell you can target the existing RP.
This is covered in the document linked above. You need a different project to do this: define the interfaces, add the IP cores, synthesize, link against the dcp and finally generate the partial bitstream.
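As a rough illustration of the last two steps (link against the dcp, then generate the partial bitstream), the sketch below builds the Tcl for one RM. It assumes the RM has already been synthesized out of context into its own checkpoint. The Vivado commands are real ones (`open_checkpoint`, `read_checkpoint -cell`, `opt_design`, `place_design`, `route_design`, `write_bitstream -cell`), but this is only one commonly documented sequence and should be checked against the linked document and the DFX user guide for your Vivado version. All file names, the cell path and the helper `rm_partial_bitstream_tcl` are hypothetical.

```python
# Sketch: emit Tcl that implements one RM against an abstract shell checkpoint
# and writes its partial bitstream. Paths and cell names are placeholders.

def rm_partial_bitstream_tcl(abs_shell_dcp, rm_synth_dcp, rp_cell, bit_out):
    """Return Tcl lines for a non-project abstract shell run of a single RM."""
    for path in (abs_shell_dcp, rm_synth_dcp):
        if not path.endswith(".dcp"):
            raise ValueError(f"expected a .dcp checkpoint, got: {path}")
    return [
        # Load the abstract shell, in which the target RP is an empty black box.
        f"open_checkpoint {abs_shell_dcp}",
        # Fill the black box with the synthesized RM netlist.
        f"read_checkpoint -cell {rp_cell} {rm_synth_dcp}",
        # Implement only the new RM; the shell around it is already placed and routed.
        "opt_design",
        "place_design",
        "route_design",
        # Write the partial bitstream for this RP only.
        f"write_bitstream -force -cell {rp_cell} {bit_out}",
    ]


if __name__ == "__main__":
    tcl = rm_partial_bitstream_tcl(
        "abs_shell_video_cp_i_composable_pr_0.dcp",  # name taken from the thread
        "pr_0_rgb2bgr_synth.dcp",                    # hypothetical RM checkpoint
        "video_cp_i/composable/pr_0",                # hypothetical cell path
        "pr_0_rgb2bgr_partial.bit",                  # hypothetical output name
    )
    print("\n".join(tcl))
```

Because only the RM is placed and routed, a script like this can be rerun for each new filter without touching the full static project.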
OK great, thanks for clearing those few things up for me, that's a big help. I will have a go at following that abstract shell example and attempt to do the same for the composable overlay.
I was reading through the bdc_dfx.tcl file to see how the RMs are instantiated within the RPs and how the implementation runs are set up to gather the partial bitstreams.
Out of curiosity, would it be possible (and if so would it be easier) for me to write my own tcl script emulating these commands to first build a module such as laid out here:
I’m not the most familiar with tcl commands but this seems like an easy approach if wanting to continually add/trial new modules? Can this process be done easily within the tcl console of the main project, or does this only work in the build flow?
Out of curiosity, would it be possible (and if so would it be easier) for me to write my own tcl script emulating these commands to first build a module such as laid out here:
You can certainly do this.
And then also create a config run and child implementation leading to bitstream generation such as is done again in bdc_dfx.tcl:
You can do this as well.
Can this process be done easily within the tcl console of the main project, or does this only work in the build flow?
Yes, you can execute these commands from the Tcl console in Vivado. However, you should make sure that all the variables they use have been defined beforehand.
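One way to spot missing definitions before pasting an excerpt into the console is a quick text scan. The sketch below is a deliberately naive Python check, not a Tcl parser: it only recognises variables created by a `set` at the start of a line, so variables introduced by `foreach`, `proc` arguments, `lappend` and similar will be reported as false positives. The function name `undefined_tcl_vars` and the sample snippet are made up for illustration.

```python
import re

# Matches "set name ..." at the start of a line (but not commands like set_property).
SET_RE = re.compile(r"^\s*set\s+([A-Za-z_][A-Za-z0-9_]*)\b")
# Matches $name and ${name} references.
USE_RE = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def undefined_tcl_vars(snippet, predefined=()):
    """Return variables referenced before any 'set' seen so far, in first-use order."""
    defined = set(predefined)
    missing = []
    for line in snippet.splitlines():
        if line.strip().startswith("#"):
            continue  # skip comment lines
        # Check uses first, so "set a $a" is flagged when a is not yet defined.
        for name in USE_RE.findall(line):
            if name not in defined and name not in missing:
                missing.append(name)
        match = SET_RE.match(line)
        if match:
            defined.add(match.group(1))
    return missing


if __name__ == "__main__":
    excerpt = "\n".join([
        'set pr_0_rgb2xyz_fifo_viatcl "my_new_rm"',
        "lappend rm_list $pr_0_dilate_erode",
        "puts $pr_0_rgb2xyz_fifo_viatcl",
    ])
    print(undefined_tcl_vars(excerpt))
```

Any name it reports needs either its original `set` line copied across from the build script or a value supplied by hand in the console.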
This leads to a Vivado error because the variables for the other RMs in pr_0 are undefined, which prompts: can't read "pr_0_dilate_erode": no such variable
If I only synthesise the new pr_0_rgb2xyz_fifo_viatcl, it leads to the deletion of the sources from the pr_0 block and I can't set the pr_0_rgb2xyz_fifo_viatcl block as an active source.
Do I need to synthesise all the blocks or only focus on my new RM?
I would like to trial and error with new filter designs in one of the RP. So ideally I would like to be able to build a template tcl script that I can run within an already fully implemented design, to add another RM to one of the RP - then implement and generate the full and partial bitstream required.
By adding my new module to the bdc_dfx.tcl script, are you suggesting running it from the makefile to process it, or could I then run bdc_dfx.tcl from the Tcl console in the main project?
By adding my new module to the bdc_dfx.tcl script, are you suggesting the procedure of running from the makefile to process it.
Yes
or could I then run the bdc_dfx.tcl from the tcl console in the main project?
I haven’t tried this flow myself. But, I assume if you try this, all changes made by bdc_dfx.tcl for the RP you’re working on will be overwritten. On top of this, previous partial bitstreams may not be usable anymore because the main bitstream can change.
Ok Cheers, well I will try running from the makefile with my new module added into bdc_dfx.tcl to be able to say I have been able to add in my new module to the design.
With regard to opening a new Vivado instance and sourcing cv_dfx_3_pr, am I correct in guessing that this will essentially do the same as running the makefile?
Earlier you mentioned using a dcp from within a new Vivado instance. Could I write a tcl script following abstract shell flow in a new instance to generate a partial bitstream for a new module?
EDIT: Also just wondering Mario if I add my new module to the RP in the project and set it as the active synthesis source will it require a new full bitstream?
Yes, but only the Vivado project part. It won't run the dependencies.
Earlier you mentioned using a dcp from within a new Vivado instance. Could I write a tcl script following abstract shell flow in a new instance to generate a partial bitstream for a new module?
EDIT: Also just wondering Mario if I add my new module to the RP in the project and set it as the active synthesis source will it require a new full bitstream?
No, it shouldn’t. But, I haven’t tried to do this once the full project is created.
An Update:
I have given up on trying to create a Tcl script to run in project mode (within the full generated project). It leads to a dead end where I can't launch an implementation run through to bitstream generation for impl_1 (full static) and child_7_impl_1 (new RM in pr_0): Vivado tells me “write_bitstream needs reset - reset runs” (I'm assuming for impl_1), but when I do so it resets impl_1 and all the child runs.
When relaunched after this, the implementation fails with a “ports mismatch - 296 ports are missing” error. Am I effectively destroying the implemented design in an attempt to regenerate an image of it with my new module in? Is this why you suggest going to a different project?
I am attempting now to follow the guide you sent (adding a new RM to the dfx KV260 design) and carry out the steps but for the composable overlay. In this guide it references working from the abstract shell design checkpoints within that project. Do these correspond to those written as part of the impl_1 run stored in the build files of the composable overlay:
Lastly, the DFX KV260 example guide creates an abstract shell, from which the partial bitstream is generated. Do I not also need a new full static bitstream for the partial bitstream to be recognised?
Thanks Mario for the continued assistance and great support!
Cameron
Disregard my last message. Thanks for the handy guide to the abstract shell flow; it was easy enough to follow and adapt to the composable overlay now that I am more familiar with reading Tcl scripts.
I’ll post a tutorial in the learn section for anyone else interested so they don’t have to struggle through these posts and so I can add in any deviations from the tutorial I had to make.