I am building a Sobel filter by myself as an exercise.
Here is my block design
The input and output are 1080x1920 grayscale images. I use the High Performance port to transfer image data, and a GPIO port to control the Sobel filter IP.
Here are my port addresses
In the Python driver code, I first load the image into contiguous memory in DRAM, pass the read and write addresses to the PL, and then trigger the IP to run with hw_sobel.write(0x00, 0x1).
My problem is that I don't know how to measure the running time of my design properly. I currently use %timeit hw_sobel.write(0x00, 0x1), but it gives me an unreasonable result:
The slowest run took 4.26 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 29.8 µs per loop
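I think the reason is that hw_sobel.write(0x00, 0x1) returns without waiting for the IP to finish, so it runs asynchronously with respect to the IP's operation, and %timeit only measures the register write itself. I would appreciate suggestions for measuring the running time in this case.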
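One possible approach, sketched here under assumptions: start a timer, trigger the IP, then poll a completion flag until it is set. This assumes the IP exposes some "done" bit that software can read; my design as described above does not necessarily have one, so the register offset and bit position in the usage comment are hypothetical. The helper takes plain callables, so it does not depend on any particular driver API.

```python
import time

def time_until_done(start, is_done, timeout_s=5.0):
    """Return the seconds elapsed between calling start() and the first
    time is_done() returns a truthy value.

    start   -- callable that triggers the hardware (e.g. a register write)
    is_done -- callable that returns truthy once the hardware has finished
    """
    t0 = time.perf_counter()
    start()
    # Busy-wait on the completion flag; give up after timeout_s seconds.
    while not is_done():
        if time.perf_counter() - t0 > timeout_s:
            raise TimeoutError("IP did not report completion within the timeout")
    return time.perf_counter() - t0

# Hypothetical usage on the board. DONE_OFFSET and bit 0 are assumptions,
# not taken from the actual design:
#
# DONE_OFFSET = 0x08
# elapsed = time_until_done(
#     lambda: hw_sobel.write(0x00, 0x1),
#     lambda: hw_sobel.read(DONE_OFFSET) & 0x1,
# )
# print("Sobel run took %.3f ms" % (elapsed * 1e3))
```

Since a single register write from Python took about 30 µs in the measurement above, each polling iteration likely costs a similar amount, so a result from this method is only accurate to roughly that granularity. A cycle counter inside the PL would be needed for anything finer.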
Thanks.