I have an issue when loading a certain data pattern into my PL dual-port BRAM, from which I then stream the loaded data to my DAC (RFDC).
Before I start, here are the settings I used in my Vivado block design:
Board: ZCU111
BRAM Controller Settings:
Data width: 32 bit
Memory Depth (auto): 1024
BRAM:
Mode: BRAM Controller
Type: True Dual Port RAM
Port A & B: Write/Read width is 32 bit, Write/Read Depth 1024
Binary Counter IP:
Final Count Value: 1000(hex) corresponding to 4096(dec)
Increment value (hex): 1 !
I would like to generate a 1024-word data sequence in Python (PYNQ) and store it in my BRAM (hence the BRAM depth of 1024). The data width is 32 bits, and the data from PYNQ is loaded into the BRAM through Port A. From Port B, I would then like to stream the data pattern cyclically to my RFDC DAC. The address is generated by a binary counter whose final count value is set to 4096.
And here is the issue: I can only see my whole data pattern on my oscilloscope if I set the counter to count up to 4096 instead of 1024 (which is what I would normally expect, since my BRAM depth is only 1024…).
If I set the final count to 1024, I observe only the first quarter of my data pattern on the oscilloscope.
Moreover, if I compare the repetition rate of my pattern on the oscilloscope with the clock rate used, the pattern rate is a factor of 4096 smaller than my clock.
Thus it seems that my BRAM is indeed 4096 deep and filled with my pattern…
I cannot explain this behaviour, since in PYNQ I followed all the instructions from other tutorials (e.g. the tutorial PYNQ Controlled NeoPixel LED Cube - Hackster.io).
I have attached a screenshot of my Jupyter Notebook code where I write my data to the BRAM.
Please note that I used the range command up to 1024 only, which corresponds to the 1024 data words I generated. The address increments in steps of 4 (since each 32-bit word occupies 4 bytes). I set the memory size at the beginning to 4096, since the address editor in Vivado tells me that the address range is 4K (4 x 1024 bytes). Setting mem_size to 4096 allows me to write a 1024-word array into the BRAM. I cannot see any issue in my code…
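For readers without the screenshot, the write loop described above can be sketched roughly as follows. This is only a minimal sketch and not the original notebook code: `FakeMMIO` is a hypothetical stand-in for `pynq.MMIO` so the snippet runs off-board, and the pattern values are placeholders.

```python
import struct

WORD_BYTES = 4                 # one 32-bit word occupies 4 bytes
DEPTH = 1024                   # BRAM depth in words
MEM_SIZE = DEPTH * WORD_BYTES  # 4096 bytes, the 4K range from the address editor


class FakeMMIO:
    """Hypothetical stand-in for pynq.MMIO, backed by a bytearray."""

    def __init__(self, length):
        self.mem = bytearray(length)

    def write(self, offset, value):
        # Store a 32-bit little-endian word at the given byte offset.
        struct.pack_into("<I", self.mem, offset, value)

    def read(self, offset):
        # Load a 32-bit little-endian word from the given byte offset.
        return struct.unpack_from("<I", self.mem, offset)[0]


def load_pattern(mmio, pattern):
    """Write one word per BRAM location at byte offsets 0, 4, 8, ..."""
    for index, word in enumerate(pattern):
        mmio.write(index * WORD_BYTES, word)


pattern = list(range(DEPTH))   # placeholder data, not the real waveform
mmio = FakeMMIO(MEM_SIZE)
load_pattern(mmio, pattern)
```

On the board, the same loop would take a real `MMIO` object created from the BRAM controller's base address and a length of 4096 bytes.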
Are you connecting the address generator to a second AXI port on the BRAM controller or to the BRAM_B port on the block memory directly? Are you able to supply an image of the block diagram?
I may have just made some further progress…
I set the increment of my counter to 4 instead of 1.
Thus, I now count to 4096 in steps of 4 (1024 steps in total).
Doing so shows me the whole pattern with the correct pattern repetition rate (1/1024 * clock rate).
I would explain this behaviour as follows:
Normally the address of a BRAM within the PL is incremented by 1.
However, when using the PS & PYNQ, we need to increment the address in steps of 4.
Therefore, the address range is 4096 bytes. This is how I addressed the BRAM when I wrote data into it (see my Jupyter Notebook).
I then assumed that I could read from the BRAM with address increments of 1.
However, I now assume that the read address also needs to be incremented by 4, since that is how the data was written into the BRAM.
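The three counter settings tried so far can be compared with a small model. This is a sketch under two assumptions that are not confirmed by the tools: the counter wraps after `final_count` clock cycles, and Port B ignores the two lowest address bits (byte address to word index). The function name `words_read` is made up for illustration.

```python
DEPTH = 1024  # BRAM depth in 32-bit words


def words_read(final_count, step):
    """Return the word indexes visited on Port B during one counter cycle.

    Assumption: the BRAM drops the two low address bits, so several
    consecutive byte addresses can map to the same 32-bit word.
    """
    return [(addr >> 2) % DEPTH for addr in range(0, final_count, step)]


for final_count, step in [(1024, 1), (4096, 1), (4096, 4)]:
    visited = words_read(final_count, step)
    print(f"final={final_count} step={step}: "
          f"{len(set(visited))} distinct words, period {len(visited)} clocks")
```

Under these assumptions the model reproduces the observations: counting to 1024 in steps of 1 reaches only 256 words (the first quarter), counting to 4096 in steps of 1 reaches all 1024 words but takes 4096 clocks, and counting to 4096 in steps of 4 reaches all 1024 words in 1024 clocks.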
I think you are on the right track. I’ve been working my way through the user guide for the Block Memory Generator and it looks like it uses pseudo-byte-level addresses when 32-bit addressing is enabled (as is required when using the AXI BRAM controller). A quick search came up with this reddit thread.
I think the solution is to pad out the lower couple of address bits with zeros and attach your counter to the higher-order bits - you need the 32-bit addressing for the AXI connection.
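In terms of arithmetic, padding the low bits with zeros is the same as shifting the counter value left. A minimal sketch, assuming a power-of-two data width; the function name `counter_to_byte_address` is made up for illustration:

```python
def counter_to_byte_address(count, data_width_bits=32):
    """Append zero bits below the counter value (like concatenating with a constant 0).

    Assumption: the data width is a power of two, so the number of padded
    bits is log2 of the number of bytes per word.
    """
    bytes_per_word = data_width_bits // 8
    pad_bits = bytes_per_word.bit_length() - 1  # 2 for 32-bit words
    return count << pad_bits


for count in (0, 1, 2, 1023):
    print(count, "->", counter_to_byte_address(count))
```

With this mapping the counter can keep an increment of 1 and a final count of 1024, while the BRAM still sees byte addresses 0, 4, 8, and so on.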
For your notebook cell 11, it looks like the final data you wrote is:
mmio.write(1022*4, 0x80018001)
And if you read mmio.read(1023*4), this is outside the range you just wrote. Something else (or your previous code) might have written that location.
And remember that the address you provide is a byte address. For this to work, the address should always be a multiple of 4; otherwise I am not sure how the system handles writes to unaligned addresses.
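Both points can be checked offline before touching the hardware. The following is only an illustrative sketch; the helper names `written_offsets` and `is_word_aligned` are made up here and are not part of PYNQ.

```python
WORD_BYTES = 4  # 32-bit words


def written_offsets(n_words):
    """Byte offsets touched by a loop over range(n_words)."""
    return [i * WORD_BYTES for i in range(n_words)]


def is_word_aligned(offset):
    """True if the byte offset falls on a 32-bit word boundary."""
    return offset % WORD_BYTES == 0


offsets = written_offsets(1023)          # range(1023) stops at index 1022
print("last written offset:", offsets[-1])
print("offset 1023*4 written:", 1023 * WORD_BYTES in offsets)
print("4088 aligned:", is_word_aligned(4088))
print("4090 aligned:", is_word_aligned(4090))
```

A loop over `range(1024)` would cover offset `1023 * 4` as well.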
Sorry for my late response.
Thank you all for the feedback!
You’re right, Peter: I was addressing the BRAM in PYNQ in address steps of 4, but inside my Vivado design I used address increments of 1, assuming that the BRAM works as in the “common” mode. However, if the checkbox “Generate address interface with 32 bits” is enabled, I need to address the BRAM the same way I did in PYNQ.
Doing so, I can now also see the correct data pattern when using an asymmetric BRAM with different input and output data widths.
Regarding rock’s remark:
Thanks for the hint! Unfortunately, I got no error message in PYNQ, but you’re right.
My range was set to 1023, which excludes the index 1023.