Canny Edge Detection Vitis Vision Library

Hi all,

I am relatively new to the PYNQ community and have only followed a few tutorials, so please bear with me!

I am currently trying to implement the Canny edge detector from the Vitis Vision Library on my PYNQ-Z2 board.

Here is the xf_canny_accel.cpp file:

#include "ap_int.h"
#include "common/xf_common.hpp"
#include "common/xf_utility.hpp"
#include "hls_stream.h"
#include "imgproc/xf_canny.hpp"
#include "imgproc/xf_edge_tracing.hpp"
#include "xf_config_params.h"

#include "xf_canny_config.h"

extern "C" {
void canny_accel(ap_uint<INPUT_PTR_WIDTH>* img_inp,
                 ap_uint<OUTPUT_PTR_WIDTH>* img_out,
                 int rows,
                 int cols,
                 int low_threshold,
                 int high_threshold) {
// clang-format off
    #pragma HLS INTERFACE m_axi     port=img_inp  offset=slave bundle=gmem1
    #pragma HLS INTERFACE m_axi     port=img_out  offset=slave bundle=gmem2

// clang-format on

// clang-format off
    #pragma HLS INTERFACE s_axilite port=rows     
    #pragma HLS INTERFACE s_axilite port=cols     
    #pragma HLS INTERFACE s_axilite port=low_threshold     
    #pragma HLS INTERFACE s_axilite port=high_threshold     
    #pragma HLS INTERFACE s_axilite port=return
    // clang-format on

    int npcCols = cols;
    int divNum = (int)(cols / 32);
    int npcColsNxt = (divNum + 1) * 32;
    if (cols % 32 != 0) {
        npcCols = npcColsNxt;
    }
    printf("actual number of cols is %d \n", npcCols);

    xf::cv::Mat<XF_8UC1, HEIGHT, WIDTH, INTYPE> in_mat(rows, cols);
    xf::cv::Mat<XF_2UC1, HEIGHT, WIDTH, XF_NPPC32> dst_mat(rows, npcCols);

#pragma HLS DATAFLOW

    xf::cv::Array2xfMat<INPUT_PTR_WIDTH, XF_8UC1, HEIGHT, WIDTH, INTYPE>(img_inp, in_mat);
    xf::cv::Canny<FILTER_WIDTH, NORM_TYPE, XF_8UC1, XF_2UC1, HEIGHT, WIDTH, INTYPE, XF_NPPC32, XF_USE_URAM>(
        in_mat, dst_mat, low_threshold, high_threshold);
    xf::cv::xfMat2Array<OUTPUT_PTR_WIDTH, XF_2UC1, HEIGHT, WIDTH, XF_NPPC32>(dst_mat, img_out);
}
}

The xf_canny_config.h file is also unmodified.

I exported this from Vitis HLS as an IP for my Vivado design.
Here is my Vivado block diagram, which I connected using the automatic tool:

I have also enabled the S AXI HP0 interface. I run synthesis and implementation, then generate the bitstream for use in Jupyter notebooks.

Here is the notebook:
New Canny-Copy1.ipynb (132.5 KB)

Sometimes I get an empty black plot as output. Other times, after playing around with the values and running the notebook a few times, I get an edge-detected image, but the object appears in it four times.

Help and guidance would be much appreciated. I think I might have built the block diagram incorrectly in Vivado and be missing a required block. Another possibility is that I am not passing the image to the buffer correctly for processing.

Here is an image of the output I sometimes get:

Hi @MrMastive,

Welcome to our community.

Please note that the number of bits used per pixel in the output of your function is different to the number of bits at the input.

You are passing a single-channel image with 8 bits per pixel, XF_8UC1, whereas the output is a single-channel image with 2 bits per pixel, XF_2UC1. That is 4 times fewer bits per pixel, which explains the image you are getting.
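
To illustrate the mismatch, here is a minimal pure-Python sketch of unpacking a 2-bit-per-pixel buffer into one value per pixel. The helper name unpack_2bpp is hypothetical, and the bit order (first pixel in the least-significant bits of each byte) is an assumption that has not been checked against the Vitis Vision packing, so swap the shift order if the result looks scrambled.

```python
def unpack_2bpp(packed):
    """Expand each byte into four 2-bit pixel values (0..3).

    Assumption: pixel 0 occupies bits 1:0, pixel 1 bits 3:2, and so on.
    """
    pixels = []
    for byte in packed:
        for shift in (0, 2, 4, 6):
            pixels.append((byte >> shift) & 0b11)
    return pixels


# One byte carries four pixels, so N bytes describe 4 * N pixels.
example = bytes([0b11000100, 0b00000011])
print(unpack_2bpp(example))  # [0, 1, 0, 3, 3, 0, 0, 0]
```

Read as 8-bit pixels, the same bytes would cover only a quarter of the pixels they actually encode.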

You have two options:

Mario

Hi @marioruizm,

I exported both of the accelerated functions into IP Blocks separately. Here was the code of the edge tracing file:

/*
 * Copyright 2019 Xilinx, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include "xf_canny_config.h"

extern "C" {
void edgetracing_accel(ap_uint<INPUT_PTR_WIDTH>* img_inp, ap_uint<OUTPUT_PTR_WIDTH>* img_out, int rows, int cols) {
// clang-format off
    #pragma HLS INTERFACE m_axi     port=img_inp  offset=slave bundle=gmem3
    #pragma HLS INTERFACE m_axi     port=img_out  offset=slave bundle=gmem4
// clang-format on

// clang-format off
    #pragma HLS INTERFACE s_axilite port=rows     
    #pragma HLS INTERFACE s_axilite port=cols     
    #pragma HLS INTERFACE s_axilite port=return
    // clang-format on

    int npcCols = cols;
    int divNum = (int)(cols / 32);
    int npcColsNxt = (divNum + 1) * 32;
    if (cols % 32 != 0) {
        npcCols = npcColsNxt;
    }

    int npcCols_8 = cols;
    int divNum_8 = (int)(cols / 8);
    int npcColsNxt_8 = (divNum_8 + 1) * 8;
    if (cols % 8 != 0) {
        npcCols_8 = npcColsNxt_8;
    }

    xf::cv::Mat<XF_2UC1, HEIGHT, WIDTH, XF_NPPC32> _dst1(rows, npcCols);
    xf::cv::Mat<XF_8UC1, HEIGHT, WIDTH, XF_NPPC8> _dst2(rows, npcCols_8);

    xf::cv::Array2xfMat<INPUT_PTR_WIDTH, XF_2UC1, HEIGHT, WIDTH, XF_NPPC32>(img_inp, _dst1);
    xf::cv::EdgeTracing<XF_2UC1, XF_8UC1, HEIGHT, WIDTH, XF_NPPC32, XF_NPPC8, XF_USE_URAM>(_dst1, _dst2);
    xf::cv::xfMat2Array<OUTPUT_PTR_WIDTH, XF_8UC1, HEIGHT, WIDTH, XF_NPPC8>(_dst2, img_out);
}
}

It takes a 2-bit-per-pixel image as input and outputs an 8-bit-per-pixel image.
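
As a rough sketch of what those two pixel formats mean for buffer sizes on the notebook side, the following mirrors the column padding done in the kernel (multiples of 32 for the 2-bit input, multiples of 8 for the 8-bit output). The helper names and the 480x636 image size are hypothetical, and the sketch assumes the buffers are tightly packed with padded rows, which has not been verified on hardware.

```python
def round_up(value, multiple):
    """Round value up to the next multiple (unchanged if already aligned)."""
    return ((value + multiple - 1) // multiple) * multiple


def edge_tracing_buffer_bytes(rows, cols):
    """Return (input_bytes, output_bytes) for the edge tracing kernel.

    Input:  2 bits per pixel, columns padded to a multiple of 32.
    Output: 8 bits per pixel, columns padded to a multiple of 8.
    """
    in_cols = round_up(cols, 32)
    out_cols = round_up(cols, 8)
    input_bytes = rows * in_cols * 2 // 8
    output_bytes = rows * out_cols
    return input_bytes, output_bytes


# Hypothetical 480x636 image: 636 pads to 640 under both alignments.
print(edge_tracing_buffer_bytes(480, 636))  # (76800, 307200)
```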

I placed the IP blocks into the block design in Vivado and used the automation tool again:

Then, in the Jupyter notebook, I repeated the process but added the edge tracing block by accessing its registers. I write the image into the input buffer, the Canny block processes it, and the result lands in the output buffer. I then copy the data from that output buffer into a new input buffer for edge tracing, which processes it and writes its own output. However, when I plot the output after edge tracing, it is all black, essentially an empty plot. I’m still unsure how to implement this correctly. I’ve also tried other methods to include xf::cv::edgetracing within the canny_accel function but no luck as well. I’m trying my best to understand this and to make sure all the data types are correct, but I’m still unsure.

Here is my jupyter notebook file:
New Canny.ipynb (172.7 KB)

More guidance would be much appreciated!

I think your problem is that the intermediate buffers, out_buffer and in_buffer2, are not the correct size.

You could try to allocate these with 4 times less resolution, to account for the fact that Canny outputs only 2 bits per pixel while the buffer data type is 8-bit.

out_buffer = allocate(shape=(height//4, width//4), dtype=np.uint8, cacheable=1)

In fact, you do not need to allocate in_buffer2, you can reuse out_buffer as input of the EdgeTracing function, perhaps with a better name.

*Note that I have not tried this in hardware with your code. My comment above is only a suggestion.
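
A quick arithmetic check of candidate buffer shapes, as a sketch only: the 480x640 image size and the helper names are hypothetical, and nothing here has been run against the hardware. It compares the number of bytes a 2-bit-per-pixel image needs with the number of bytes in a uint8 buffer of a given shape.

```python
def bytes_needed_2bpp(height, width):
    """Bytes required to hold height x width pixels at 2 bits per pixel."""
    return height * width * 2 // 8


def bytes_in_uint8_shape(shape):
    """Bytes held by a uint8 buffer of the given shape (one byte per element)."""
    total = 1
    for dim in shape:
        total *= dim
    return total


height, width = 480, 640  # hypothetical image size
needed = bytes_needed_2bpp(height, width)

candidates = {
    "(height // 4, width // 4)": (height // 4, width // 4),
    "(height, width // 4)": (height, width // 4),
}
for label, shape in candidates.items():
    print(label, bytes_in_uint8_shape(shape), "bytes; needed:", needed)
```

Under these assumptions, only a shape whose element count is one quarter of height * width matches the 2-bit output exactly; dividing both dimensions by 4 gives one sixteenth.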

I’ve also tried other methods to include xf::cv::edgetracing within the canny_accel function but no luck as well.

This should work. Can you include the HLS code where you are doing this?

I also noticed that you are using Array2xfMat, whereas the example that I linked earlier does not use it.

Mario

When I try to run the synthesis for the above EdgeTracing example in Vitis, I get an error:

#include "xf_canny_config.h"

extern "C" {
void edgetracing_accel(ap_uint<INPUT_PTR_WIDTH>* img_inp, ap_uint<OUTPUT_PTR_WIDTH>* img_out, int rows, int cols) {
// clang-format off
    #pragma HLS INTERFACE m_axi     port=img_inp  offset=slave bundle=gmem3
    #pragma HLS INTERFACE m_axi     port=img_out  offset=slave bundle=gmem4
// clang-format on

// clang-format off
    #pragma HLS INTERFACE s_axilite port=rows     
    #pragma HLS INTERFACE s_axilite port=cols     
    #pragma HLS INTERFACE s_axilite port=return
    // clang-format on

    int npcCols = cols;
    int divNum = (int)(cols / 32);
    int npcColsNxt = (divNum + 1) * 32;
    if (cols % 32 != 0) {
        npcCols = npcColsNxt;
    }

    int npcCols_8 = cols;
    int divNum_8 = (int)(cols / 8);
    int npcColsNxt_8 = (divNum_8 + 1) * 8;
    if (cols % 8 != 0) {
        npcCols_8 = npcColsNxt_8;
    }
    // printf("actual number of cols is %d \n", npcCols);
    // printf("actual number of cols is multiple 8 :%d \n", npcCols_8);

    // printf("\nbefore allocate\n");
    xf::cv::Mat<XF_2UC1, HEIGHT, WIDTH, XF_NPPC32> _dst1(rows, npcCols, img_inp);
    xf::cv::Mat<XF_8UC1, HEIGHT, WIDTH, XF_NPPC8> _dst2(rows, npcCols_8, img_out);
    // printf("\nbefore kernel call\n");
    xf::cv::EdgeTracing<XF_2UC1, XF_8UC1, HEIGHT, WIDTH, XF_NPPC32, XF_NPPC8, XF_USE_URAM>(_dst1, _dst2);
    // printf("\nafter kernel call\n");
}
}

Console:

WARNING: [RTGEN 206-101] Design contains AXI ports. Reset is fixed to synchronous and active low.
ERROR: [RTGEN 206-102] Illegal connection is found on FIFO pin 'edgetracing_accel|p_dst2_data' connecting to 'call_ln50'('edgetracing_accel_Pipeline_VITIS_LOOP_719_1_VITIS_LOOP_720_2_VITIS_LOOP_721_31|p_dst2_data').
ERROR: [RTGEN 206-102] Illegal connection is found on FIFO pin 'edgetracing_accel|p_dst2_data' connecting to 'call_ln35'('edgetracing_accel_Pipeline_WR_FIN_PIPE|p_dst2_data').
INFO: [RTGEN 206-500] Setting interface mode on port 'edgetracing_accel/gmem3' to 'm_axi'.
INFO: [RTGEN 206-500] Setting interface mode on port 'edgetracing_accel/gmem4' to 'm_axi'.
INFO: [RTGEN 206-500] Setting interface mode on port 'edgetracing_accel/img_inp' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'edgetracing_accel/img_out' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'edgetracing_accel/rows' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'edgetracing_accel/cols' to 's_axilite & ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on function 'edgetracing_accel' to 's_axilite & ap_ctrl_hs'.
INFO: [RTGEN 206-100] Bundling port 'img_inp', 'img_out', 'rows', 'cols' and 'return' to AXI-Lite port control.
INFO: [RTGEN 206-100] Generating core module 'mul_mul_11ns_11ns_22_4_1': 2 instance(s).
INFO: [RTGEN 206-100] Generating core module 'mul_mul_12s_6ns_18_4_1': 1 instance(s).
INFO: [RTGEN 206-100] Generating core module 'mul_mul_13ns_11ns_23_4_1': 1 instance(s).
INFO: [RTGEN 206-100] Generating core module 'mul_mul_23ns_6ns_29_4_1': 1 instance(s).
INFO: [RTGEN 206-100] Generating core module 'udiv_11s_6ns_11_15_seq_1': 1 instance(s).
INFO: [RTGEN 206-100] Generating core module 'udiv_12ns_11ns_12_16_seq_1': 1 instance(s).
INFO: [RTGEN 206-100] Generating core module 'urem_13ns_3ns_13_17_seq_1': 1 instance(s).
INFO: [RTGEN 206-100] Finished creating RTL model for 'edgetracing_accel'.
ERROR: [HLS 200-103] RTL generation terminated by exceptions!

So I experimented with modifying it to avoid the errors, and the result is what I posted in my reply above. However, my modifications may be why it no longer works.

What Vitis HLS version are you using? And what Vitis Accelerated Libraries branch?

Did you try the changes to the Python code that I suggested?

Mario

I am using Vitis HLS 2022.1 with the 2022.1 update 3 branch. I originally tried the master branch as well, but it produced the same error. It is really frustrating that this error is occurring!

I did try the Python changes; the edge tracing block still output a blank black plot. The new allocation for Canny, however, seemed to work somewhat. Dividing the height and width of the output buffer by 4 produces this plot:

And dividing the width and height by 2, as an experiment, produced:

However, to get only one apple object, I divided only the width by 4 and left the height unchanged.

This produces a really squashed object. I am not sure if this is the correct way to view it.

EDIT: After restarting the PYNQ kernel, I can no longer plot these with the divided height and width. Jupyter Notebook just hangs.

Okay, my mistake: EdgeTracing is somewhat working now, using the same version you replied to, the one where I used Array2xfMat. That was how I modified it yesterday so that it would export without errors. I’m not sure why it wasn’t working before, since I didn’t change anything. I get this result in the plot:

Here is my notebook for a clearer view:
New Canny (2).ipynb (204.0 KB)

EDIT: I restarted the PYNQ kernel and the edge tracing block stopped working again. It’s really odd…

If I remove the // 4 on the width and height, the weird line noise disappears from the edge tracing plot, but there are still a lot of small apple objects. Not quite working yet :frowning: