Understanding how to use custom overlay

So I’ve made an overlay, and now I have a simple question: how do I use the overlay and access the accelerated functions in a Jupyter notebook? I’ve uploaded an image of what my overlay looks like:

I’m trying to use the HOG descriptor accelerator, but I’m not sure how to feed data to it or read the output. Any help would be appreciated.

https://pynq.readthedocs.io/en/v2.0/overlay_design_methodology/overlay_tutorial.html

Instead of the .tcl file, however, you need to copy the .hwh file along with your bitfile.
Both have to have the same name (except for the file extension).
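To illustrate the naming convention (all file names here are hypothetical): PYNQ’s `Overlay` class is pointed at the .bit file and looks for the matching .hwh next to it. The pynq call is shown as a comment since it needs the board; the helper below is just a runnable sketch of the pairing rule.

```python
# On the board (hypothetical path and file names):
#   from pynq import Overlay
#   overlay = Overlay('/home/xilinx/hog_accel.bit')
# PYNQ then expects '/home/xilinx/hog_accel.hwh' alongside it.

import os

def matching_hwh(bitfile):
    """Return the .hwh path PYNQ will look for next to a given .bit file."""
    base, ext = os.path.splitext(bitfile)
    assert ext == '.bit', 'expected a .bit file'
    return base + '.hwh'

print(matching_hwh('hog_accel.bit'))  # hog_accel.hwh
```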

Best of luck

Download the .bit, .tcl, and .hwh files to Jupyter, and use them.

Picked up this project again now that I have time, but I’m still stuck on this part. How do I write the image into the register? I’m not sure how to convert it into a byte array so it can be written into the register.

Your video section is currently a black box, so it’s not really possible to see how everything is wired in there. Also, if it’s a video processing module, it may be easier to use an AXI DMA. Both Python and the DMA should allow 64-bit-wide transfers over the AXI-HP ports. Using it is then fairly straightforward: you reserve some space in DDR RAM using two Xlnk buffers (one for TX, one for RX), copy your image to the TX buffer, and start the DMA. After it’s done, you can read the new image out of the RX buffer. You just need to make sure everything is correctly formatted so both Python and the hardware can interpret the data. Naturally, for this to work your IP needs AXI-Stream interfaces instead of AXI-Full or AXI-Lite interfaces.
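A minimal sketch of that flow, assuming the PYNQ v2.x API (`Xlnk` buffers, a DMA instance called `axi_dma_0` — the instance names are hypothetical). The board-only calls are left as comments; the runnable part below shows the formatting step: an HxWx3 uint8 image flattened into the contiguous R,G,B,R,G,B… order the stream side expects.

```python
import numpy as np

# On the board (hypothetical names), the PYNQ side would look like:
#   from pynq import Overlay, Xlnk
#   overlay = Overlay('hog_accel.bit')
#   dma = overlay.axi_dma_0
#   xlnk = Xlnk()
#   in_buf  = xlnk.cma_array(shape=(h * w * 3,), dtype=np.uint8)  # TX buffer
#   out_buf = xlnk.cma_array(shape=(h * w * 3,), dtype=np.uint8)  # RX buffer
#   in_buf[:] = img.reshape(-1)
#   dma.sendchannel.transfer(in_buf)
#   dma.recvchannel.transfer(out_buf)
#   dma.sendchannel.wait(); dma.recvchannel.wait()

# The formatting step itself, runnable anywhere:
h, w = 2, 3
img = np.arange(h * w * 3, dtype=np.uint8).reshape(h, w, 3)  # HxWx3 RGB image
flat = np.ascontiguousarray(img).reshape(-1)  # R,G,B per pixel, row-major
assert flat[0:3].tolist() == img[0, 0].tolist()  # first pixel's R,G,B come first
```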

The video section I just ripped out of the original base overlay, but here’s a picture of what’s inside:

I’m not sure how to set up or access the AXI DMA. I’d assume I’ll have to link it to the custom IP, but I don’t know how to get started with that…

Watch these 3 videos:
the first shows how to make, implement, and access an HLS IP;
the 2nd how to make and implement an HLS AXI-Stream IP with an AXI DMA;
the 3rd how to use it from Python on the PS.

Before calling the DMA wait() functions, be sure to start the IP by writing 0x01 to address 0x00 of the IP core (HOG_accel.write(0x00, 0x01)).
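In notebook terms, the call order matters: kick off both DMA channels, start the core, then wait. The instance names below are hypothetical, and the board-only calls are comments; a tiny stand-in `write` records the register access so the sketch runs anywhere.

```python
# On the board this would be (hypothetical instance names):
#   dma.sendchannel.transfer(in_buf)
#   dma.recvchannel.transfer(out_buf)
#   HOG_accel.write(0x00, 0x01)   # set ap_start in the HLS control register
#   dma.sendchannel.wait()
#   dma.recvchannel.wait()

writes = []

def write(offset, value):
    """Stand-in for MMIO write(): records the access instead of touching hardware."""
    writes.append((offset, value))

write(0x00, 0x01)  # start the IP before waiting on the DMA
assert writes == [(0x00, 0x01)]
```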

I’m trying to connect the input and output of the DMA to the accelerator, but it seems the ports are incompatible? Not sure where to go from there.

1st, you need to write your IP so it has AXI-Stream interfaces; this is explained and shown in the 2nd video. You will have one or two AXI-Stream inputs and outputs, plus your control signals.
2nd, a normal DMA is easier to use to begin with; the VDMA is a bit weird in its use of the AXI-Stream control signals.

Maybe stupidly enough: reproduce the tutorial given in the 2nd video and implement it first? Then you’ll understand how AXI-Stream interfaces work at the most basic level, and you can go from there.

Below is an excerpt (probably horribly inefficient) of a convert-to-grayscale HLS core using a normal DMA. On the PS, you just open an image, put it in an [n]x[m]x[3] array, and stream it to the PL.


#include "cvt2gray.h"

void cvtgray(int xres, int yres, axi_stream& img_in, axi_stream& img_out) {

#pragma HLS INTERFACE s_axilite port=yres
#pragma HLS INTERFACE s_axilite port=xres
#pragma HLS INTERFACE axis port=img_in
#pragma HLS INTERFACE axis port=img_out
#pragma HLS INTERFACE s_axilite port=return

	axi_pixel pix_in1_r, pix_in1_g, pix_in1_b;
	axi_pixel pix_out_r, pix_out_g, pix_out_b;
	uint8_t r1, g1, b1, gray1;

	for (int i = 0; i < yres; i++) {
#pragma HLS LOOP_TRIPCOUNT min=1 max=4000
		for (int j = 0; j < xres; j++) {
// pipeline the per-pixel work rather than the whole function
#pragma HLS PIPELINE II=2
#pragma HLS LOOP_TRIPCOUNT min=1 max=4000

			// read the 3 colour values of one pixel
			pix_in1_r = img_in.read();
			pix_in1_g = img_in.read();
			pix_in1_b = img_in.read();

			// insert mathemagic here: standard luma weights are
			// ~0.30 R + 0.59 G + 0.11 B (the original excerpt had the
			// R and B weights swapped, which is only right if the
			// stream arrives in BGR order)
			r1 = pix_in1_r.data * 0.30;
			g1 = pix_in1_g.data * 0.59;
			b1 = pix_in1_b.data * 0.11;
			gray1 = r1 + g1 + b1;

			// manually copy the AXI-Stream control signals to the output
			pix_out_r.data = gray1;
			pix_out_r.dest = pix_in1_r.dest;
			pix_out_r.id   = pix_in1_r.id;
			pix_out_r.keep = pix_in1_r.keep;
			pix_out_r.last = pix_in1_r.last;
			pix_out_r.strb = pix_in1_r.strb;
			pix_out_r.user = pix_in1_r.user;
			pix_out_g.data = gray1;
			pix_out_g.dest = pix_in1_g.dest;
			pix_out_g.id   = pix_in1_g.id;
			pix_out_g.keep = pix_in1_g.keep;
			pix_out_g.last = pix_in1_g.last;
			pix_out_g.strb = pix_in1_g.strb;
			pix_out_g.user = pix_in1_g.user;
			pix_out_b.data = gray1;
			pix_out_b.dest = pix_in1_b.dest;
			pix_out_b.id   = pix_in1_b.id;
			pix_out_b.keep = pix_in1_b.keep;
			pix_out_b.last = pix_in1_b.last;
			pix_out_b.strb = pix_in1_b.strb;
			pix_out_b.user = pix_in1_b.user;

			// write it to the output stream
			img_out.write(pix_out_r);
			img_out.write(pix_out_g);
			img_out.write(pix_out_b);
		}
	}
}


The .h file:

#include <stdio.h>
#include <stdint.h>
#include "hls_stream.h"
#include "hls_video.h"
#include "ap_utils.h"
#include "ap_fixed.h"
#include "ap_int.h"

#define max_width 1920
#define max_height 1200

typedef ap_axiu<8,2,5,6> axi_pixel;        // 8-bit data, 2-bit user, 5-bit id, 6-bit dest
typedef hls::stream<axi_pixel> axi_stream;

void cvtgray(int xres, int yres, axi_stream& img_in, axi_stream& img_out);
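For checking the hardware output on the PS side, a small software golden model of the same per-pixel math can help. This is only a sketch: it uses integer arithmetic with the same ~0.30/0.59/0.11 weights and truncates each channel separately like the core above, so rounding may differ from the hardware by an LSB or so.

```python
import numpy as np

def cvtgray_golden(img):
    """Golden model of the grayscale core: gray ~= 0.30*R + 0.59*G + 0.11*B,
    each weighted channel truncated separately, like the HLS code."""
    r = (img[..., 0].astype(np.uint32) * 30) // 100
    g = (img[..., 1].astype(np.uint32) * 59) // 100
    b = (img[..., 2].astype(np.uint32) * 11) // 100
    return (r + g + b).astype(np.uint8)

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[:, :] = (200, 100, 50)          # R, G, B for every pixel
print(cvtgray_golden(img)[0, 0])    # 60 + 59 + 5 = 124
```

Compare this against the RX buffer after the DMA completes to confirm the core and the data formatting agree.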