Hello,

I am working with the PYNQ DPU on a ZCU104. I retrained ResNet50 on the MNIST dataset and got an accuracy of 97.29%. However, after running the resulting xmodel on the ZCU104, I got an accuracy of 101.83%.

I don’t know what the problem is.


Hi there,

Can you share some more info/code/model input/output shapes? I assume you just swapped the output layer of resnet for the 10 classes of mnist? Did you change anything in the inference notebook?

Thanks

Shawn

For the training of ResNet50 I used the following code:

    import numpy as np
    import matplotlib.pyplot as plt
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    train_images_gr = x_train.reshape(x_train.shape[0], 28, 28, 1)
    train_images_3ch = np.stack([x_train]*3, axis=-1)

    import cv2

    def resize_image_array(img, img_size_dims):
        img = cv2.resize(img, dsize=img_size_dims,
                         interpolation=cv2.INTER_CUBIC)
        img = np.array(img, dtype=np.float32)
        return img

    IMG_DIMS = (32, 32)

    X_train = np.array([resize_image_array(img, img_size_dims=IMG_DIMS) for img in train_images_3ch])
    Y_train = np.eye(10)[y_train]

    input = tf.keras.Input(shape=(32, 32, 3))
    efnet = tf.keras.applications.ResNet50(weights='imagenet',
                                           include_top=False,
                                           input_tensor=input)

    gap = tf.keras.layers.GlobalMaxPooling2D()(efnet.output)
    output = tf.keras.layers.Dense(10, activation='softmax', use_bias=True)(gap)

    func_model = tf.keras.Model(efnet.input, output)
    func_model.compile(optimizer='adam',
                       loss="categorical_crossentropy",
                       metrics=['accuracy'])
    func_model.fit(X_train, Y_train, epochs=5, steps_per_epoch=60000//32)

And for the deployment of the final xmodel using the PYNQ DPU on the ZCU104, I used the following:

    from pynq_dpu import DpuOverlay

    overlay = DpuOverlay("dpu.bit")
    overlay.load_model("Resnet50_test.xmodel")

    from time import time
    import numpy as np
    import mnist
    import matplotlib.pyplot as plt
    %matplotlib inline

    from six.moves import urllib
    opener = urllib.request.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    urllib.request.install_opener(opener)

    with np.load("mnist.npz", allow_pickle=True) as f:
        x_test, y_test = f['x_test'], f['y_test']

    test_images_gr = x_test.reshape(x_test.shape[0], 28, 28, 1)
    test_images_3ch = np.stack([x_test]*3, axis=-1)

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    import cv2

    def resize_image_array(img, img_size_dims):
        img = cv2.resize(img, dsize=img_size_dims,
                         interpolation=cv2.INTER_CUBIC)
        img = np.array(img, dtype=np.float32)
        return img

    IMG_DIMS = (32, 32)

    test_data = np.array([resize_image_array(img, img_size_dims=IMG_DIMS) for img in test_images_3ch])
    test_label = np.eye(10)[y_test]

    print('Test_images.shape: {}, of {}'.format(test_data.shape, test_data.dtype))
    print('Test_label.shape: {}, of {}'.format(test_label.shape, test_data.dtype))

    dpu = overlay.runner

    inputTensors = dpu.get_input_tensors()
    outputTensors = dpu.get_output_tensors()

    shapeIn = tuple(inputTensors[0].dims)
    shapeOut = tuple(outputTensors[0].dims)
    outputSize = int(outputTensors[0].get_data_size() / shapeIn[0])

    softmax = np.empty(outputSize)

    output_data = [np.empty(shapeOut, dtype=np.float32, order="C")]
    input_data = [np.empty(shapeIn, dtype=np.float32, order="C")]
    image = input_data[0]

    def calculate_softmax(data):
        result = np.exp(data)
        return result

    total = test_data.shape[0]
    predictions = np.empty_like(test_label)
    print("Classifying {} digit pictures ...".format(total))

    start = time()
    for i in range(total):
        image[0,...] = test_data[i]
        job_id = dpu.execute_async(input_data, output_data)
        dpu.wait(job_id)
        temp = [j.reshape(1, outputSize) for j in output_data]
        softmax = calculate_softmax(temp[0][0])
        predictions[i] = softmax.argmax()
    stop = time()

    correct = np.sum(predictions==test_label)
    execution_time = stop-start
    print("Overall accuracy: {}".format(correct/total))
    print(" Execution time: {:.4f}s".format(execution_time))
    print(" Throughput: {:.4f}FPS".format(total/execution_time))

Train_images.shape: (60000, 32, 32, 3), of float32

Train_label.shape: (60000, 10), of float32

Test_images.shape: (10000, 32, 32, 3), of float32

Test_label.shape: (10000, 10), of float32

In the future, please use code blocks for code instead of blockquotes; it makes it much easier for a reader to parse. I’m a bit confused about why there are 3 channels in the last dimension of your MNIST data.

When you run this code, does it not give you any warnings? I would double-check the shapes of all your intermediate arrays to make sure there’s no unexpected behavior in the matrix operations. For example, you shouldn’t need to convert your test labels into a one-hot representation with `np.eye(10)[y_test]` when the output from the DPU has already been softmaxed and its argmax taken.

Thanks

Shawn

Dear Shawn,

Thank you for your prompt reply!

OK, I will, and sorry for that.

There are 3 channels because I am using ResNet50, and I have read here that for ResNet50 MNIST should be resized to (32,32,3).

Actually when running the part of predictions I got the following warning **/usr/lib/python3/dist-packages/ipykernel_launcher.py:2: RuntimeWarning: overflow encountered in exp** .
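A note on the overflow warning: the posted `calculate_softmax` only calls `np.exp` and never normalizes, so large raw scores can exceed the float32 range. A minimal sketch of a numerically stable softmax is shown here, assuming the DPU output is a vector of 10 raw scores. The `raw` values and the name `stable_softmax` are made up for illustration and are not taken from the notebook.

```python
import numpy as np

def stable_softmax(scores):
    """Softmax that subtracts the maximum score before exponentiating."""
    scores = np.asarray(scores, dtype=np.float64)
    shifted = scores - np.max(scores)  # largest entry becomes 0, so exp() <= 1
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical raw scores; np.exp(120.0) would overflow in float32.
raw = np.array([120.0, 3.0, -5.0, 95.0, 0.0, 1.0, 2.0, 7.0, 8.0, 9.0],
               dtype=np.float32)

probs = stable_softmax(raw)
predicted_digit = int(np.argmax(probs))
```

Because softmax preserves the ordering of the scores, `np.argmax(raw)` gives the same digit, so for accuracy alone the exponential can be skipped entirely.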

Regarding the one-hot encoding of the test labels with `np.eye(10)[y_test]`: I did it to keep the same preprocessing as in training, where, as I mentioned, it is needed for retraining ResNet50 on MNIST.

I have checked the shapes of all the intermediate arrays as you mentioned, and I found that the output_data shape is (1, 1, 1, 1, 10), which I think should be (1,1,10). But I don’t know what is causing this or how to solve it.
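The extra singleton dimensions are probably harmless for the prediction itself, since they hold the same 10 values. A small sketch follows, using a stand-in array with the reported shape rather than real DPU output, to show that flattening leaves the argmax unchanged.

```python
import numpy as np

# Stand-in for output_data[0]; the values are made up, only the shape
# (1, 1, 1, 1, 10) matches what was reported above.
raw_output = np.arange(10, dtype=np.float32).reshape(1, 1, 1, 1, 10)

flat = raw_output.reshape(-1)            # shape (10,)
predicted_digit = int(flat.argmax())     # index of the largest score

# argmax on the 5-D array works on the flattened data by default,
# so it returns the same index.
same_digit = int(raw_output.argmax())
```

The posted loop already does an equivalent flattening with `j.reshape(1, outputSize)`, so the output shape alone would not explain an accuracy above 100%.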

Thanks

I changed test_label to y_test instead of np.eye(10)[y_test] and got an accuracy of 98.52%. I don’t know if what I did is correct?

That seems like a fairly common result for MNIST. I suppose the extra dimension introduced by the eye operation is what mucked up the original calculation.

Thanks

Shawn
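One plausible reading of how the accuracy exceeded 100%: `predictions` was created with `np.empty_like(test_label)`, so it has shape (N, 10), and assigning a scalar argmax to `predictions[i]` broadcasts it across all 10 columns. The elementwise comparison with the one-hot labels then counts matching cells rather than matching images. The sketch below reproduces that pattern in plain NumPy with made-up labels and predictions, without the DPU.

```python
import numpy as np

# Hypothetical integer labels and argmax results; 4 of 5 are right.
y_test = np.array([0, 1, 2, 0, 1])
predicted = np.array([0, 1, 2, 0, 7])

# Pattern from the posted notebook: one-hot labels, (N, 10) predictions.
test_label = np.eye(10)[y_test]
predictions = np.empty_like(test_label)
for i, p in enumerate(predicted):
    predictions[i] = p  # scalar is broadcast into all 10 columns of row i

# Counts matching cells: a predicted 0 matches the nine zeros of a one-hot
# row, a predicted 1 matches at most the single 1, other digits match nothing.
inflated = np.sum(predictions == test_label) / len(y_test)

# Comparing one integer per image gives the real accuracy.
actual = np.mean(predicted == y_test)
```

In this toy case `inflated` is 3.8 while `actual` is 0.8, which is why the number is not bounded by 1 and depends mostly on how often the digits 0 and 1 are predicted.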

I was wondering if it is correct because I had an eye operation during training.
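On the training-versus-evaluation question: the one-hot encoding in training is there because the `categorical_crossentropy` loss expects one-hot targets. It is a requirement of the loss, not a preprocessing step applied to the images, so the evaluation does not need to repeat it. A short sketch with made-up labels, showing that integer labels and one-hot labels carry the same information:

```python
import numpy as np

# Hypothetical integer labels and predicted digits.
y = np.array([3, 0, 9, 1])
predicted = np.array([3, 0, 4, 1])

one_hot = np.eye(10)[y]                  # what the training loss consumed
recovered = np.argmax(one_hot, axis=1)   # back to integer labels

acc_from_ints = np.mean(predicted == y)
acc_from_one_hot = np.mean(predicted == recovered)
```

Both accuracies are identical, so comparing the argmax of each DPU output against the integer `y_test` is consistent with how the model was trained.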
