PYNQ: PYTHON PRODUCTIVITY

Stuck at recvchannel.wait()

I am using an Ultra96v2 and tried my own HLS IP, but the code in the Jupyter notebook always gets stuck at recvchannel.wait(). How do I deal with this?

Here is my block design; the HLS IP is called Hls_accel_0.

The HLS code is attached below, and the main interface function is defined as

void Hls_accel(hls::stream<AXI_DMA_IO> &stream_in, hls::stream<AXI_DMA_IO> &stream_out,
		int a, int b, int &c){
#pragma HLS INTERFACE s_axilite register port=return
#pragma HLS INTERFACE s_axilite port=b
#pragma HLS INTERFACE s_axilite port=a
#pragma HLS INTERFACE s_axilite port=c
#pragma HLS INTERFACE axis register both port=stream_in
#pragma HLS INTERFACE axis register both port=stream_out

c = a + b;

#pragma HLS DATAFLOW
stream<ap_uint<IN0_CH*ACT_BW_D> > img_in("img_in_stream");
strm_image<IN0_ROW,IN0_COL,IN0_CH,ACT_BW_D>(stream_in, img_in);
strm_out<IN0_ROW, IN0_COL, IN0_CH, ACT_BW_D>(img_in, stream_out);
}

AXI_DMA_IO is defined in main.h.

struct AXI_DMA_IO{
        ap_uint<64> data;
        ap_uint<1> last;
};

strm_image and strm_out are defined in function.h

template <	unsigned IN_ROW,
			unsigned IN_COL,
			unsigned IN_CH,
			unsigned BW
		>
void strm_image(hls::stream<AXI_DMA_IO> &stream_in, hls::stream<ap_uint<IN_CH*BW> > &in){
	AXI_DMA_IO tmp;
//	static_assert(sizeof(tmp.data)==sizeof(ap_int<IN_CH*BW>),"DMA width != img_in width");
	for (int i = 0; i < IN_ROW*IN_COL; i++){
#pragma HLS PIPELINE
		tmp = stream_in.read();
		in.write(tmp.data);
	}
}

template <	unsigned IN_ROW,
			unsigned IN_COL,
			unsigned IN_CH,
			unsigned BW
		>
void strm_out(hls::stream<ap_uint<IN_CH*BW> > & img_out, hls::stream<AXI_DMA_IO>& stream_out){
	AXI_DMA_IO tmp;
//	static_assert(sizeof(tmp.data)==sizeof(ap_int<IN_CH*BW>),"DMA width != img_in width");
	for (int i = 0; i < IN_ROW*IN_COL; i++){
#pragma HLS PIPELINE
		tmp.data = img_out.read();
		if(i == IN_ROW*IN_COL -1)
			tmp.last = 1;
		else
			tmp.last = 0;
		stream_out.write(tmp);
	}

}

hls_code.zip (2.4 KB)

The Jupyter code is

from pynq import Overlay
import pynq
import numpy as np
overlay = Overlay('./design_1.bit')
dma_x = overlay.axi_dma_0.sendchannel
dma_y = overlay.axi_dma_0.recvchannel
x = np.random.randint(0, 10, size=(4,4), dtype=np.uint64)
buff_x = pynq.allocate(shape=(4,4), dtype=np.int64)
buff_x[:] = x
buff_y = pynq.allocate(shape=(4,4), dtype=np.uint64)
dma_x.transfer(buff_x)
dma_y.transfer(buff_y)
dma_x.wait()
dma_y.wait()

The program gets stuck at dma_y.wait()

My .bit/.tcl/.hwh files are attached below
bit_hwh_tcl.zip (261.3 KB)

Hi @sunwy,

Which PYNQ SD card image version are you using?
This may be related to this question

Mario

Hi, I'm using ultra96v2_v2.5. Do you have any suggestions? Thanks very much!

Is the AXI HP0 FPD Data Width set to 128, as suggested here?

I would also suggest burning the PYNQ 2.6 image.

You may also want to check whether stream_out is asserting tlast by adding a ChipScope to your design.

Mario
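
A software-side check that complements the tlast suggestion above: strm_image and strm_out each loop exactly IN_ROW*IN_COL times, so the core only sets last after it has received that many 64-bit beats. The values of IN0_ROW and IN0_COL are not shown in this thread, so the minimal Python sketch below takes them as parameters. The helper names are made up for illustration and are not part of PYNQ.

```python
from math import prod

def buffer_beats(shape, item_bytes=8, bus_bytes=8):
    """Beats the DMA sends for a buffer, assuming a 64-bit (8-byte) stream."""
    nbytes = prod(shape) * item_bytes
    if nbytes % bus_bytes:
        raise ValueError("buffer is not a whole number of stream beats")
    return nbytes // bus_bytes

def tlast_pattern(n_beats):
    """Software model of strm_out: last is 1 only on the final beat."""
    return [1 if i == n_beats - 1 else 0 for i in range(n_beats)]

def sizes_match(shape, in_row, in_col):
    """True when the buffer supplies exactly the beats the HLS loops expect."""
    return buffer_beats(shape) == in_row * in_col

# The notebook allocates a (4, 4) buffer of 64-bit words, i.e. 16 beats.
print(buffer_beats((4, 4)))       # 16
print(sizes_match((4, 4), 4, 4))  # True (hypothetical IN0_ROW = IN0_COL = 4)
print(sum(tlast_pattern(16)))     # 1
```

If the buffer supplied fewer beats than the loops expect, the core would block on stream_in.read() and never reach the iteration that sets last, which would also look like a hang in recvchannel.wait().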


You don’t seem to be starting your HLS IP.
Add to your Jupyter code:
hls_ip = overlay.Hls_accel_0

and before your wait functions:
hls_ip.write(0x00, 0x01)  # this sets the ap_start bit so the core starts processing the data

Also, connect the DMA interrupts to your PS.
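
Putting the suggestion together with the original notebook, the call order might look like the sketch below. The function name and its parameters are hypothetical. It only uses the calls already shown in this thread (write, transfer, wait) and takes the DMA and IP objects as arguments, so it does not import pynq itself. Writing the start bit before the transfers is one possible placement; the advice above only requires that it happens before the wait calls.

```python
def run_stream_transfer(dma, hls_ip, buff_in, buff_out,
                        ctrl_offset=0x00, start_bit=0x01):
    """Start the HLS core, then move one buffer in and one buffer out."""
    # Set the start bit in the control register so the core begins
    # consuming the input stream.
    hls_ip.write(ctrl_offset, start_bit)
    # Queue both DMA directions before blocking on either of them.
    dma.sendchannel.transfer(buff_in)
    dma.recvchannel.transfer(buff_out)
    dma.sendchannel.wait()
    dma.recvchannel.wait()
    return buff_out

# On the board, with the names used in this thread:
#   run_stream_transfer(overlay.axi_dma_0, overlay.Hls_accel_0, buff_x, buff_y)
```

Without auto-restart, the start bit would need to be written again before each new transfer, which calling this function once per transfer takes care of.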


Thanks for the suggestions, I will try 2.6 image later!

Yes, hls_ip.write(0x00,0x01) really works. Thanks very much!
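
For anyone debugging a similar hang, reading the control register back can show whether the core ever started or finished. The sketch below decodes the value at offset 0x00, assuming the default ap_ctrl_hs layout that Vivado HLS generates for an s_axilite return port (ap_start, ap_done, ap_idle, ap_ready in bits 0 to 3, auto_restart in bit 7). The generated *_hw.h driver header for the IP lists the actual layout and should be treated as authoritative. The helper name is hypothetical.

```python
# Bit positions in the HLS control register at offset 0x00
# (assumed default ap_ctrl_hs layout).
AP_CTRL_BITS = {
    "ap_start": 0,
    "ap_done": 1,
    "ap_idle": 2,
    "ap_ready": 3,
    "auto_restart": 7,
}

def decode_ap_ctrl(value):
    """Return a dict mapping each control flag name to True/False."""
    return {name: bool((value >> bit) & 1)
            for name, bit in AP_CTRL_BITS.items()}

# On the board this would be called as decode_ap_ctrl(hls_ip.read(0x00)).
print(decode_ap_ctrl(0x04))  # only ap_idle is True: the core was never started
```

A core that still reports ap_idle after the transfers were queued has not been started, which matches the symptom in this thread.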
