PYNQ-Z2, v2.6 I think, Vitis HLS and Vivado 2023
Hello! I’m new to PYNQ and FPGAs. I have an IP design that takes two input streams and produces two output streams, and it works correctly. However, the input data is slow to calculate on the host, and the board could calculate it by itself if given 6 constant input values. I’m trying to make a new version that takes only those 6 values (2 of them are ints, the height and width of the output buffer, which will be relevant in a second) and outputs the same two streams.
However, the issue I’m experiencing is that the DMA transfer hangs. Writing the 6 input values works fine, but when I call the DMA recv transfer, then write a 1 to AP_START (or in the opposite order), I find that the DMA wait() method hangs forever. It should terminate in under a second.
I had a similar issue before, where the problem was that the board did not know the length of the input/output buffers. That was fixed once I passed the length to the function and called read()/write() inside a for-loop that ran that many times. Now, instead of one length input, the stream length is the product of the two int inputs (height * width, since the output is 2D). There are 2 nested for-loops, and the write() method is called inside the inner one, so it should run the correct number of times. However, if there’s some way I’m supposed to manually signal the end of the stream, I’m not aware of it. (Note: the stream uses ap_axis packets, ap_axis<32,0,0,0> in the code below, which I assumed already take care of TLAST, so I don’t think that’s the issue. If it were, I would have expected my previous example with input and output streams not to work either.)
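To make the loop structure concrete, here is a minimal host-side sketch in plain C++ with no HLS headers. `Beat` and `emit_frame` are hypothetical stand-ins that are not part of my IP, and the payload is a placeholder. It only shows that the nested loops produce width * height beats, and where an explicit `last` flag would go if it turns out TLAST has to be set by hand on the final beat (that part is an assumption on my side).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one AXI4-Stream beat; NOT the real ap_axis type.
struct Beat {
    std::int32_t data;
    bool last;  // models the TLAST side-channel signal
};

// Emits width_px * height_px beats in the same loop order as my_ip and
// flags only the very final beat as `last`.
std::vector<Beat> emit_frame(unsigned width_px, unsigned height_px) {
    assert(width_px > 0 && height_px > 0);
    std::vector<Beat> out;
    out.reserve(static_cast<std::size_t>(width_px) * height_px);
    for (unsigned x = 0; x < width_px; x++) {
        for (unsigned y = 0; y < height_px; y++) {
            Beat b;
            // Placeholder payload: the linear index of this beat.
            b.data = static_cast<std::int32_t>(x * height_px + y);
            // Assumption: the receiver wants TLAST only on the last beat.
            b.last = (x == width_px - 1) && (y == height_px - 1);
            out.push_back(b);
        }
    }
    return out;
}
```

Calling `emit_frame(4, 3)` gives 12 beats with exactly one of them marked `last`. In the real IP the equivalent would presumably be assigning the `last` field of each output packet before the write() call, but I have not confirmed that this is what the DMA is waiting for.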
Does anyone know what the problem might be? I know I haven’t given a ton of context, so please let me know what other info would be helpful!
Here is my HLS code:
#include <math.h>
#include <complex>
#include "ap_axi_sdata.h"
#include "hls_stream.h"
typedef ap_axis<32,0,0,0> transPkt;
// For converting byte-wise from float to integer and back, because hls streams use ints
union fp_int {
    int i;
    float fp;
};
inline std::complex<float> func_to_test(std::complex<float> z) {
    return __complex_cos(z);
}
void my_ip(int width_px, int height_px, float xMin, float xMax, float yMin, float yMax,
           hls::stream<transPkt>&out_hues, hls::stream<transPkt>&out_brightnesses) {
#pragma HLS INTERFACE mode=axis port=in_angles,in_moduluses,out_hues,out_brightnesses
#pragma HLS INTERFACE s_axilite port=width_px
#pragma HLS INTERFACE s_axilite port=height_px
#pragma HLS INTERFACE s_axilite port=xMin
#pragma HLS INTERFACE s_axilite port=xMax
#pragma HLS INTERFACE s_axilite port=yMin
#pragma HLS INTERFACE s_axilite port=yMax
#pragma HLS INTERFACE mode=s_axilite port=return
    fp_int angle, modulus, hue, brightness;
    transPkt io1pkt, io2pkt; // output packets for hue and brightness respectively
    // here are the for-loops that should create a height*width length output
    for (unsigned int x = 0; x<width_px; x++){
        for (unsigned int y = 0; y<height_px; y++){
            std::complex<float> z(
                xMin + (x/(float)width_px)*(xMax-xMin),
                yMin + (y/(float)height_px)*(yMax-yMin)
            );
            [various calculations removed -- it's a complex function color plot generator]
            //AXIS output packets are expecting integer type
            io1pkt.data = (angle / TWOPI) * 255;
            io2pkt.data = frac_lightness * 255; // frac_lightness is generated in the removed code
            out_hues.write(io1pkt); // write values to stream
            out_brightnesses.write(io2pkt);
        }
    }
}
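In case the pixel-to-plane mapping matters for diagnosing this, here is the same arithmetic as the `z(...)` constructor above, pulled out into a plain C++ function that can be checked on the host without any HLS headers. `pixel_to_plane` is a hypothetical helper name used only for this sketch; it does not exist in my IP.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: maps pixel (x, y) to a point in the complex plane
// using the same formula as the z(...) constructor in my_ip.
std::complex<float> pixel_to_plane(unsigned x, unsigned y,
                                   int width_px, int height_px,
                                   float xMin, float xMax,
                                   float yMin, float yMax) {
    assert(width_px > 0 && height_px > 0);
    float re = xMin + (x / (float)width_px) * (xMax - xMin);
    float im = yMin + (y / (float)height_px) * (yMax - yMin);
    return {re, im};
}
```

With a 4 x 2 buffer over [-2, 2] x [-1, 1], pixel (0, 0) maps to -2 - 1i and pixel (2, 1) maps to the origin. Because the division is by width_px and height_px rather than by one less, the last pixel in each direction stops one step short of xMax and yMax.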