Hello PYNQ community:
I hope you are all doing great =]
Several previous support posts have run into trouble with AXI Stream in HLS for CNN designs and DMA:
So I am going to share the details of a design built with the 2020.2 HLS tool and PYNQ 2.7.
The GitHub repository link for this tutorial:
HLS AXI Stream
People keep asking why TLAST is missing. This is common: after some tool version (I am not sure exactly which), a plain C struct is no longer interpreted as the AXI Stream side channels we expected:
Old
struct AXI_DMA_IO {
    ap_int<16> data;
    ap_int<1>  last;
};
New
#include <ap_axi_sdata.h>
typedef qdma_axis<16,0,0,0> AXI_DMA_IF;
Some may ask why it is qdma_axis rather than ap_axiu. In my experience the difference comes down to the extra control signals, such as TSTRB, that ap_axiu carries. If your TX/RX data is always aligned, qdma_axis is enough!
Next, how do we TX/RX on the HLS side?
Block slave → Block master
Remember that connections between internal blocks can only use
hls::stream<data type> &<name>
rather than an AXI Stream interface.
#include <hls_stream.h>
#include <ap_axi_sdata.h>
#include <ap_fixed.h>

typedef qdma_axis<16, 0, 0, 0> AXI_DMA_IF; // AXI Stream beat with data/keep/last
typedef ap_int<16> AXI_VAL;                // payload type for internal streams

// Slave side: read one beat from the DMA and forward only the payload.
void AXI_DMA_SLAVE(
    hls::stream<AXI_DMA_IF> &stream_in,
    hls::stream<AXI_VAL> &stream_out)
{
#pragma HLS INTERFACE axis port=stream_in
    AXI_DMA_IF Inbuf = stream_in.read();
    stream_out.write(Inbuf.data);
}

// Master side: take a payload from the internal stream and rebuild the
// side-channel signals before sending it back to the DMA.
void AXI_DMA_MASTER(
    hls::stream<AXI_VAL> &stream_in,
    hls::stream<AXI_DMA_IF> &stream_out)
{
#pragma HLS INTERFACE axis port=stream_out
    AXI_VAL tmp_val = stream_in.read();
    AXI_DMA_IF Outbuf;
    Outbuf.data = tmp_val;
    Outbuf.last = 1;  // assert TLAST so the DMA knows the transfer is done
    Outbuf.keep = -1; // all bytes valid
    stream_out.write(Outbuf);
}
Now we have a complete picture of the HLS side.
Next, we can construct our Vivado project:
Top design view
DMA blocks with HLS block
DMA Engine settings
Wonderful! After synthesizing and implementing our design, generate the bitstream.
Put both the .bit and .hwh files in the same folder on the PYNQ board's storage:
This is the Jupyter Notebook design:
mnist.ipynb (12.9 KB)
For the MNIST CNN, we are going to train on the host PC:
Reference https://www.kaggle.com/code/oricou/mnist-without-cnn-and-softmax/notebook
These are the steps:
- Float32 training with the minimum layers required to achieve good accuracy: mnist_keras.py (1.9 KB)
- Quantization + CNN inference comparison: load_mnist.py (1.7 KB)
- Model format conversion: convert.py (246 Bytes)
We can see that the weights are converted into 1+7 fixed-point format here.
Accuracy changes from 95.4% to 95.4%, only about a 0.02% loss. Great!
After the conversion we get a file named ‘model.tflite’; this is loaded in the Jupyter Notebook with TensorFlow Lite.
Final FPGA inference result:
Accuracy actually increases by 0.75%, to roughly 93%.
ARM runtime (10000 images) = 290.4783687591553 s
FPGA runtime (10000 images) = 39.431140661239624 s
Total acceleration = 7.366724976452236×