FIFO Buffer Depth when using PYNQ

Hello,

I am taking the PYNQ-HelloWorld example and substituting in other video functions, and I have a question about MAXDEPTH, depth, and FIFOs.

For example, in the gaussian_diff example, I see that one matrix is created with an extra 15360 argument.

xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC1> imgInput(rows, cols);
xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC1> imgin1(rows, cols);
xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC1> imgin2(rows, cols);
xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC1, 15360> imgin3(rows, cols);
xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC1> imgin4(rows, cols);
xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPC1> imgOutput(rows, cols);

Two of the xfOpenCV kernel functions (duplicateMat and subtract) take the same argument as well.

xf::cv::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, TYPE, HEIGHT, WIDTH, NPC1>(imgInput, imgin1, sigma);
xf::cv::duplicateMat<TYPE, HEIGHT, WIDTH, NPC1, 15360>(imgin1, imgin2, imgin3);
xf::cv::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, TYPE, HEIGHT, WIDTH, NPC1>(imgin2, imgin4, sigma);
xf::cv::subtract<XF_CONVERT_POLICY_SATURATE, TYPE, HEIGHT, WIDTH, NPC1, 15360>(imgin3, imgin4, imgOutput);

This 15360 seems to correspond to a #define called MAXDELAY, but the code does not refer to it by name, so it could be a coincidence.

I copied the hello-world axis functions and added this code to get it to run.

void gaussian_diff_accel(stream_t& img_inp, stream_t& img_out,
                  int rows, int cols,
                  float sigma) {


    #pragma HLS INTERFACE axis register both port=img_inp
    #pragma HLS INTERFACE axis register both port=img_out

    #pragma HLS INTERFACE s_axilite port=sigma
    #pragma HLS INTERFACE s_axilite port=rows
    #pragma HLS INTERFACE s_axilite port=cols
    #pragma HLS INTERFACE s_axilite port=return


    xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPIX> src_mat(rows, cols);
    xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPIX> imgin1(rows, cols);
    xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPIX> imgin2(rows, cols);
    xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPIX, 15360> imgin3(rows, cols);
    xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPIX> imgin4(rows, cols);
    xf::cv::Mat<TYPE, HEIGHT, WIDTH, NPIX> dst_mat(rows, cols);

    #pragma HLS DATAFLOW

    // Convert stream to xf::cv::Mat
    axis2xfMat<DATA_WIDTH, TYPE, HEIGHT, WIDTH, NPIX>(img_inp, src_mat);

    // Run xfOpenCV kernel:

    xf::cv::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, TYPE, HEIGHT, WIDTH, NPIX>(src_mat, imgin1, sigma);
    xf::cv::duplicateMat<TYPE, HEIGHT, WIDTH, NPIX, 15360>(imgin1, imgin2, imgin3);
    xf::cv::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, TYPE, HEIGHT, WIDTH, NPIX>(imgin2, imgin4, sigma);
    xf::cv::subtract<XF_CONVERT_POLICY_SATURATE, TYPE, HEIGHT, WIDTH, NPIX, 15360>(imgin3, imgin4, dst_mat);

    // Convert xf::cv::Mat to stream
    xfMat2axis<DATA_WIDTH, TYPE, HEIGHT, WIDTH, NPIX>(dst_mat, img_out);
}

Without the 15360 arguments, it compiles but the DMA hangs.

I don’t know what these do or when to apply them. Copying them seems to work but I am looking for details so that I can independently use custom functions.

(FYI - I get a very dark image at the end of the subtract so I am not sure the science is working but at least I am getting the DMAs to complete.)

Thanks,
John

Hi @jcollier,

This is not really a PYNQ question; it is more about hardware design. I will try to give a generic answer. If you need more information, you may be better off asking in the Xilinx forums.

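You are trying to implement a dataflow pipeline; as the name indicates, the data is continuously flowing. The pipeline forks the data stream in two (duplicateMat): one branch goes directly to the subtract function, whereas the other passes through a second GaussianBlur first. The subtract function then joins the two branches.
The GaussianBlur takes some cycles to produce its first output, and the subtract function cannot start processing until both branches have data.
If you do not have a mechanism to synchronize the branches, the design will deadlock: the direct branch fills up and stalls the fork, so the blur stops receiving input before it has produced anything. This is why you need a FIFO on the shorter path, deep enough to absorb the data that arrives while the longer path is still working toward its first output.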
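
To make the deadlock concrete, here is a minimal software sketch of the fork/join pipeline. It is a plain C++ model, not HLS code and not how the Vitis Vision library is implemented. The names Fifo, run_pipeline and blur_latency are hypothetical, and it assumes that every FIFO other than the short path has a depth of 2 and that the blur must hold more than blur_latency pixels before it emits anything.

```cpp
#include <cstddef>

// Bounded FIFO modeled only by its occupancy; pixel values do not matter here.
struct Fifo {
    std::size_t depth;
    std::size_t count;
    explicit Fifo(std::size_t d) : depth(d), count(0) {}
    bool full() const { return count >= depth; }
    bool empty() const { return count == 0; }
};

// Simulates fork -> (short path | blur) -> subtract, one step per loop pass.
// Returns true if all `total` pixels reach the output, false on deadlock.
bool run_pipeline(std::size_t total, std::size_t blur_latency,
                  std::size_t short_depth) {
    Fifo imgin2(2);            // fork -> blur (assumed shallow FIFO)
    Fifo imgin3(short_depth);  // fork -> subtract (the short path)
    Fifo imgin4(2);            // blur -> subtract (assumed shallow FIFO)
    std::size_t forked = 0, blur_in = 0, blur_out = 0, subtracted = 0;

    while (subtracted < total) {
        bool progress = false;

        // subtract: needs one pixel from each branch.
        if (!imgin3.empty() && !imgin4.empty()) {
            --imgin3.count; --imgin4.count; ++subtracted;
            progress = true;
        }
        // blur output: only once it holds more than blur_latency pixels,
        // or when the input has ended and it is flushing.
        std::size_t held = blur_in - blur_out;
        if (held > 0 && (held > blur_latency || blur_in == total) &&
            !imgin4.full()) {
            ++imgin4.count; ++blur_out;
            progress = true;
        }
        // blur input: accepts a pixel while its internal buffer has room.
        held = blur_in - blur_out;
        if (!imgin2.empty() && held <= blur_latency) {
            --imgin2.count; ++blur_in;
            progress = true;
        }
        // fork: writes the same pixel to both branches, so it stalls
        // as soon as either branch is full.
        if (forked < total && !imgin2.full() && !imgin3.full()) {
            ++imgin2.count; ++imgin3.count; ++forked;
            progress = true;
        }
        if (!progress) return false;  // no stage can move: deadlock
    }
    return true;
}
```

In this model, run_pipeline(1000, 8, 2) deadlocks, while run_pipeline(1000, 8, 9) completes: the short path has to hold at least blur_latency + 1 pixels. The real latency of GaussianBlur depends on the filter width and image width, so treat the numbers here as illustrative only.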

Hope this helps.

Mario

Perfect, and good point that it is not really a PYNQ question.

Thanks again,
John