Can you post your HLS code? Specifically the loop where you’re calling input.read() and output.write(). I had a similar issue recently, though on a PYNQ-Z2: my board did not know the correct length of data to read in and write out, so the DMA transfer never terminated because it never knew when to stop.
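To show what I mean by the loop knowing when to stop, here is a rough sketch in plain C++ rather than real HLS code, so it compiles without the Vitis headers. Everything in it is made up for illustration: `Beat`, `Stream`, `double_stream` and `make_packet` are hypothetical names, and `std::queue` is only standing in for `hls::stream`. I'm assuming your design is like mine was, where each stream element carries a "last" flag and the DMA uses that flag to decide the transfer is finished. Your actual code may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical stand-in for one stream element: a payload plus a "last" flag.
struct Beat {
    int32_t data;
    bool last;  // true only on the final element of a transfer
};

// Stand-in for hls::stream; a std::queue is enough to show the control flow.
using Stream = std::queue<Beat>;

// Reads elements until one arrives with last set, doubles each value, and
// copies the last flag to the output so the receiving side can tell where
// the data ends. Returns the number of elements processed.
int double_stream(Stream &input, Stream &output) {
    int count = 0;
    bool done = false;
    while (!done) {
        assert(!input.empty());  // a real blocking stream read would wait here
        Beat in = input.front();
        input.pop();

        Beat out;
        out.data = in.data * 2;
        out.last = in.last;  // if this is never set, the receiver keeps waiting
        output.push(out);

        done = in.last;
        ++count;
    }
    return count;
}

// Builds an input transfer with last set only on the final element.
Stream make_packet(const std::vector<int32_t> &values) {
    Stream s;
    for (std::size_t i = 0; i < values.size(); ++i) {
        s.push(Beat{values[i], i + 1 == values.size()});
    }
    return s;
}
```

If you call `double_stream` on the stream built by `make_packet({1, 2, 3})`, the loop exits after three elements and the third output element has `last` set. The thing to check in your own loop is the equivalent of the `out.last = in.last;` line, or a fixed loop count that really matches the length you pass to the DMA.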
The other catch is that if the DMA stream fails once, it may not work again until you fully restart the board, even if you switch which overlay you’re using. So maybe one thing you tried failed, and the next thing you tried would have worked if you’d restarted the board first. I have no idea whether this is also true for a KRIA, but it’s worth a try.
I can’t say for certain if this is the same issue since I’m new to FPGA development, but it sounds like the exact same thing as far as I can tell.