Understanding DMA's behavior

Hi all,
Recently I have been working on an introductory project with some IPs created in System Generator and HLS. The goal of the project notebook was to understand the behavior of the DMA.
I ran into some problems and some rather strange behavior that I cannot understand.
Instead of writing a huge post, I preferred to create a repository with all the necessary code and a semi-explanatory notebook.
The notebook is AXI_interfaces/AXI_interfaces.ipynb on the master branch of the Kayzh3r/AXI_interfaces repository on GitHub.
The idea is to focus on the hierarchies multiplier_hls_stream and multiplier_stream. Both hierarchies do the same thing: they multiply an input AXI4-Stream of UInt32 samples by a UInt32 value supplied over AXI4-Lite. I have two main questions:

  1. Why is the transfer buffer in the HLS Multiplier limited to 1024 UInt32 samples? If I select more, for example 2048, the application hangs when I call the wait method.

  2. Why is the transfer buffer in the Stream Multiplier limited to 4 UInt32 samples? If I select more, for example 5, the application hangs when I call the wait method, even though it has the same DMA IP block configuration as the HLS Multiplier.
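
For reference, the sample counts in the two questions can be converted to byte counts, since the DMA works with transfer lengths in bytes. This is only a minimal sketch in plain Python: `transfer_size_bytes` is a hypothetical helper written for illustration, not part of the notebook or of any library, and it does not by itself explain either limit.

```python
BYTES_PER_UINT32 = 4  # a UInt32 sample occupies 4 bytes


def transfer_size_bytes(num_samples, bytes_per_sample=BYTES_PER_UINT32):
    """Return the size in bytes of a buffer holding num_samples samples.

    Hypothetical helper for illustration only.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    return num_samples * bytes_per_sample


# Sample counts taken from the two questions above.
cases = [
    ("HLS Multiplier, completes", 1024),
    ("HLS Multiplier, hangs on wait", 2048),
    ("Stream Multiplier, completes", 4),
    ("Stream Multiplier, hangs on wait", 5),
]

for label, num_samples in cases:
    size = transfer_size_bytes(num_samples)
    print(f"{label}: {num_samples} samples = {size} bytes")
```

So the HLS Multiplier works up to 4096 bytes and hangs at 8192 bytes, while the Stream Multiplier works up to 16 bytes and hangs at 20 bytes.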

Thanks so much, kind regards