This post summarizes my unsuccessful attempt to run TensorFlow 2.5 correctly on the Zybo-Z7 running Pynq 2.7.
The stages described in this post come from the following sources:
- Installing TensorFlow in Pynq by Logictronix: https://logictronix.com/wp-content/uploads/2019/04/TensorFlow_Installation_on_PYNQ_Nov_6_2018.pdf
- Installing TensorFlow in Raspberry Pi by Sam Westby Tech: https://www.youtube.com/watch?v=QLZWQlg-Pk0
- Wheels available for the armv7l architecture by Katsuya Hyodo
Lastly, a demonstration video of the TensorFlow test can be found at: https://www.youtube.com/watch?v=3m7kavySEYE. Note that I haven’t sped up the video, to give you an idea of the execution time. If you are only interested in the big picture, please fast forward through the video.
1. Installing Python3.7 (50 min approx.)
The Pynq 2.7 release comes with Python3.8, but the latest TensorFlow wheel available for the armv7l architecture is compiled for Python3.7. For this reason, Python3.7 must be installed on the Pynq.
- $ cd /home/xilinx/
- $ wget http://www.python.org/ftp/python/3.7.0/Python-3.7.0.tar.xz
- $ tar -xf Python-3.7.0.tar.xz
- $ cd Python-3.7.0
- $ ./configure
- $ make install
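Before moving on, it may help to confirm that the new interpreter matches what the wheel expects. The following is a minimal sketch and not part of the original procedure: `wheel_tags` and `matches_interpreter` are hypothetical helper names, the parsing assumes the usual `name-version-pythontag-abitag-platformtag.whl` layout of wheel filenames, and only CPython (`cp`) tags are considered.

```python
import platform
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # The last three dash-separated fields are the compatibility tags.
    return tuple(parts[-3:])


def matches_interpreter(filename, version_info=None, machine=None):
    """Return True if the wheel's tags fit the given Python version and CPU."""
    version_info = version_info or sys.version_info
    machine = machine or platform.machine()
    python_tag, _abi_tag, platform_tag = wheel_tags(filename)
    expected_python = "cp{}{}".format(version_info[0], version_info[1])
    return python_tag == expected_python and platform_tag.endswith(machine)


if __name__ == "__main__":
    wheel = "tensorflow-2.5.0-cp37-none-linux_armv7l.whl"
    print(wheel_tags(wheel))
    # Uses the interpreter and machine this script is running on.
    print(matches_interpreter(wheel))
```

Running it with `python3.7` on the board should report a match, while running it with the stock Python3.8 should not.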
2. Create a virtual environment for Python3.7 (5 min approx.)
To avoid unresolved dependencies between packages, and to be able to create an IPython kernel later, it is necessary to create a virtual environment.
- $ cd /home/xilinx/
- $ python3.7 -m pip install virtualenv
- $ python3.7 -m virtualenv env
- $ source env/bin/activate
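To double-check that the environment is really active, a small script can be run with `python3.7`. This is only a sketch under my own assumptions: `in_virtual_environment` is a hypothetical helper, and it relies on the standard `sys.prefix` / `sys.base_prefix` attributes (plus `sys.real_prefix`, which older virtualenv releases set).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
```

If the environment is active, the interpreter path should point inside /home/xilinx/env/.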
3. Install TensorFlow in the virtual environment (> 5h approx.)
Warning: This step can take a lot of time. I’ve added the estimated time as a comment next to each command.
To get the TensorFlow 2.5 wheel there are two alternatives:
The first one is to use the download script:
- $ wget https://raw.githubusercontent.com/PINTO0309/Tensorflow-bin/main/previous_versions/download_tensorflow-2.5.0-cp37-none-linux_armv7l_numpy1195.sh
- $ chmod +x download_tensorflow-2.5.0-cp37-none-linux_armv7l_numpy1195.sh
- $ ./download_tensorflow-2.5.0-cp37-none-linux_armv7l_numpy1195.sh
The second one is to download the wheel file from https://drive.google.com/uc?id=1iqylkLsgwHxB_nyZ1H4UmCY3Gy47qlOS and copy it to /home/xilinx/, which is located in the ROOT partition.
Then, use these commands:
- (env) $ apt update # 2 min approx.
- (env) $ apt install libhdf5-dev # 2 min approx.
- (env) $ pip3.7 install --no-binary=h5py h5py # 2 h approx.
- (env) $ pip3.7 install tensorflow-2.5.0-cp37-none-linux_armv7l.whl # 2h approx.
- (env) $ exec $SHELL
- $ cd /home/xilinx/
- $ source env/bin/activate
- (env) $ python3.7 -m pip install matplotlib # 30 min approx.
- (env) $ python3.7
- >>> import tensorflow
- >>> tensorflow.__version__
The last command should return ‘2.5.0’.
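The same check can be scripted for all the packages installed in this step. The sketch below is just one way to do it: `installed_version` is a hypothetical helper, and the list of package names is my assumption based on the commands above. Note that importing TensorFlow on this board can itself take a while.

```python
import importlib
import importlib.util


def installed_version(module_name):
    """Return the module's __version__ string, or None if it is missing."""
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    for name in ("numpy", "h5py", "matplotlib", "tensorflow"):
        print(name, installed_version(name))
```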
4. Create the IPython kernel (1h approx.)
To use TensorFlow in a Jupyter notebook a kernel must be created.
- (env) $ python3.7 -m pip install ipykernel # 1h approx.
- (env) $ python3.7 -m ipykernel install --user --name=tfenv
After this step, the kernel for TensorFlow has been created.
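To confirm that the new kernel points at the Python3.7 of the virtual environment, the generated kernel.json can be inspected. This is a sketch under assumptions: `kernel_interpreter` is a hypothetical helper, and the directory used below is the usual location for `--user` installs on Linux, which may differ depending on which user runs the command on your image.

```python
import json
from pathlib import Path


def kernel_interpreter(kernel_dir):
    """Return the interpreter path recorded in a kernelspec directory."""
    spec = json.loads((Path(kernel_dir) / "kernel.json").read_text())
    # argv[0] is the executable Jupyter launches for this kernel.
    return spec["argv"][0]


if __name__ == "__main__":
    # Assumed default location for `--user` installs on Linux.
    kernel_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels" / "tfenv"
    if (kernel_dir / "kernel.json").exists():
        print(kernel_interpreter(kernel_dir))
    else:
        print("no kernel.json found in", kernel_dir)
```

The printed path should be the python3.7 binary inside the env directory created in step 2.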
5. Testing TensorFlow
The notebook I used for testing TensorFlow can be found at https://gitlab.com/dorfell/fer_sys_dev/-/tree/master/01_hw/Pynq_Zybo-Z7. It is the same notebook used in the demonstration video.
In the last cell when executing “model.fit” the reported errors are:
- 2022-05-05 22:33:04.821349: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
- 2022-05-05 22:33:04.882152: W tensorflow/core/platform/profile_utils/cpu_utils.cc:118] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
As the second message indicates, cpu_utils.cc should have a “condition” for the armv7l architecture, similar to the one added for aarch64 in https://github.com/tensorflow/tensorflow/pull/46643/files/239839b09e02b5766d61041f31027de4eb882c45.
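To see which fields the board actually exposes, the file named in the warning can be inspected directly. The sketch below is not TensorFlow's implementation: `find_fields` is a hypothetical helper that only looks for the two field names quoted in the warning, and the case-sensitive default is my assumption, meant to illustrate why a per-architecture condition could matter. The fallback sample text is made up.

```python
def find_fields(cpuinfo_text, names=("bogomips", "clock"), ignore_case=False):
    """Return {field: value} for cpuinfo lines whose key matches `names`."""
    wanted = {n.lower() if ignore_case else n for n in names}
    found = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key : value"
        key = key.strip()
        lookup = key.lower() if ignore_case else key
        if lookup in wanted and key not in found:
            found[key] = value.strip()
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        # Made-up sample for machines without /proc/cpuinfo.
        text = "processor\t: 0\nBogoMIPS\t: 100.00\n"
    print("exact match:", find_fields(text))
    print("ignoring case:", find_fields(text, ignore_case=True))
```

If the exact match comes back empty while the case-insensitive one does not, the field is present but spelled differently from what the lookup expects.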
However, as shown in the video, this *.cc file is not available, since the wheel is a pre-compiled package.
After a quick search using ls and grep, the only files available are *.so libraries and *.h header files.
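The same search can be repeated from Python by counting files per extension. This is a sketch, not the exact commands I ran: `count_extensions` is a hypothetical helper, and the path assumes the package was installed into the virtual environment created in step 2.

```python
import os
from collections import Counter


def count_extensions(root):
    """Count files under `root` by extension (for example '.so', '.h', '.cc')."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            counts[os.path.splitext(name)[1]] += 1
    return counts


if __name__ == "__main__":
    # Assumed location of the package inside the virtual environment.
    root = "/home/xilinx/env/lib/python3.7/site-packages/tensorflow"
    counts = count_extensions(root)
    for ext in (".cc", ".h", ".so", ".py"):
        print(ext, counts[ext])
```

A count of zero for `.cc` would match what the ls and grep search showed.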
A possible solution would be to cross-compile TensorFlow from source for the armv7l architecture, targeting the Python3.8 that comes with the Pynq 2.7 release.
- The total installation time can be more than 5 hours.
- Executing the notebook can take more than several seconds, and in the end it failed at the model.fit step.
- This post didn’t aim to test the inference process; the performance may be better in forward propagation. In addition, integer inference with TensorFlow Lite should be considered as well.
- Is it worth creating a DPU overlay to increase performance (maybe something similar to Xilinx DNNDK)?
- If training on the embedded system is really needed, other libraries such as PyTorch should be taken into account.
Thanks for reading the whole post. I hope you weren’t bored.