With a custom 2022.1 design, I am able to load the overlay, but fail to load the model

I am using the following components :

  • PYNQ 3.0.1 on Ultra96-V2
  • PYNQ-DPU 2.5.1

I am attempting to use my custom Vitis 2022.1 / Vitis-AI 2.5 design, for which I have :

  • bitstream : u96v2-benchmark.bit
  • hardware description : u96v2-benchmark.hwh
  • vitis package : u96v2-benchmark.xclbin

I also have the following device tree content (for the DPU driver and interrupt scheme), but it is not clear to me whether PYNQ-DPU uses it:

  • device tree : u96v2-benchmark.dtbo

My design contains a B2304 DPU, for which I have compiled the MNIST model using the arch.json file associated with my custom design.

I am able to load the overlay :

from pynq_dpu import DpuOverlay
overlay = DpuOverlay("u96v2-benchmark.bit")

I can successfully query the overlay :

overlay?

Type: DpuOverlay
String form: <pynq_dpu.dpu.DpuOverlay object at 0xffff73890040>
File: /usr/local/share/pynq-venv/lib/python3.10/site-packages/pynq_dpu/dpu.py
Docstring:
Default documentation for overlay u96v2-benchmark.bit. The following
attributes are available on this overlay:

IP Blocks
----------
DPUCZDX8G_1 : pynq.overlay.DefaultIP
axi_gpio_1 : pynq.lib.axigpio.AxiGPIO
axi_gpio_2 : pynq.lib.axigpio.AxiGPIO
system_management_wiz_0 : pynq.overlay.DefaultIP
zynq_ultra_ps_e_0 : pynq.overlay.DefaultIP

Hierarchies
-----------
None

Interrupts
----------
None

GPIO Outputs
------------

...

(1) inside this module; (2) an absolute path; (3) the relative path of the current working directory. By default, this class will set the runtime to be `dnndk`.

I am not, however, able to load the model :

overlay.load_model("mnist_classifier.xmodel")

I get the following error:

[image: screenshot of the error]

Looking for insight to resolve the issue …

Regards,

Mario.


Hi Mario,

Just to make sure, do you have all files (.bit, .hwh, .xclbin, .xmodel) in the same folder when running the notebook?
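A quick way to check this from the notebook is sketched below. It is only an illustration: `missing_companions` is a hypothetical helper (not part of PYNQ or pynq_dpu), and it assumes the .hwh and .xclbin files share the bitstream's base name and sit next to it.

```python
from pathlib import Path


def missing_companions(bitfile, xmodel=None):
    """Return the expected files that are not present on disk.

    Hypothetical helper: assumes the .hwh and .xclbin files use the same
    base name as the bitstream and live in the same folder.
    """
    bit = Path(bitfile)
    expected = [bit, bit.with_suffix(".hwh"), bit.with_suffix(".xclbin")]
    if xmodel is not None:
        expected.append(Path(xmodel))
    # Keep only the paths that do not point at an existing regular file.
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # File names taken from the original post; run from the notebook folder.
    print(missing_companions("u96v2-benchmark.bit", "mnist_classifier.xmodel"))
```

An empty list means all four files were found where the notebook expects them.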

When the jupyter kernel crashes like that, we might be able to get some insight from the /var/log/jupyter.log. Could you share if the log there has any errors?

Also, VART/XRT messages appear in the dmesg output; run dmesg after the crash to see whether any errors give you a clearer idea of what is going on.
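If the logs are long, a small filter can help narrow them down. The sketch below is an assumption-laden convenience, not an official tool: the keyword list is my own guess at what is worth looking at, and `suspicious_lines` and `read_dmesg` are hypothetical names. Reading dmesg may require root on some systems.

```python
import re
import subprocess

# Keywords that are often worth a second look (an assumed, non-exhaustive list).
SUSPICIOUS = re.compile(r"\b(?:error|fail\w*|segfault|oops)\b", re.IGNORECASE)


def suspicious_lines(text):
    """Return the lines of a log that contain one of the keywords above."""
    return [line for line in text.splitlines() if SUSPICIOUS.search(line)]


def read_dmesg():
    """Return the kernel log as text (empty if dmesg cannot be read)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout


# Example usage after the kernel crash:
#   for line in suspicious_lines(read_dmesg()):
#       print(line)
#   with open("/var/log/jupyter.log") as log:
#       print("\n".join(suspicious_lines(log.read())))
```

A match is only a hint; some messages (for example, a fallback to polling) can be harmless.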

Thanks
Shawn


Shawn,
Thank you for your reply and suggestions.

I have the following files in the same directory as the notebook:

  • mnist_classifier.ipynb
  • mnist_classifier.xmodel
  • u96v2_benchmark.bit
  • u96v2_benchmark.hwh
  • u96v2_benchmark.xclbin

Here is the output of dmesg and /var/log/jupyter.log at certain points of execution …

BEFORE loading/running notebook:
• /var/log/jupyter.log
xilinx@pynq:~$ cat /var/log/jupyter.log
[I 18:17:18.644 NotebookApp] [jupyter_nbextensions_configurator] enabled 0.4.1
[W 2023-02-14 18:17:23.448 LabApp] ‘allow_root’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.448 LabApp] ‘ip’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.449 LabApp] ‘notebook_dir’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.449 LabApp] ‘password’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.449 LabApp] ‘port’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.449 LabApp] ‘iopub_data_rate_limit’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.449 LabApp] ‘cookie_options’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-02-14 18:17:23.450 LabApp] ‘cookie_options’ has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2023-02-14 18:17:23.505 LabApp] JupyterLab extension loaded from /usr/local/share/pynq-venv/lib/python3.10/site-packages/jupyterlab
[I 2023-02-14 18:17:23.506 LabApp] JupyterLab application directory is /usr/local/share/pynq-venv/share/jupyter/lab
[I 18:17:25.043 NotebookApp] Serving notebooks from local directory: /home/xilinx/jupyter_notebooks
[I 18:17:25.044 NotebookApp] Jupyter Notebook 6.4.12 is running at:
[I 18:17:25.044 NotebookApp] http://pynq:9090/
[I 18:17:25.044 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 18:17:25.891 NotebookApp] 302 GET / (127.0.0.1) 2.860000ms
[I 18:17:25.898 NotebookApp] 302 GET /tree? (127.0.0.1) 3.600000ms
[I 14:36:55.178 NotebookApp] 302 GET / (10.0.0.29) 2.960000ms
[W 14:36:55.191 NotebookApp] Clearing invalid/expired login cookie username-10-0-0-172-9090
[W 14:36:55.193 NotebookApp] Clearing invalid/expired login cookie username-10-0-0-172-9090
[I 14:36:55.196 NotebookApp] 302 GET /tree? (10.0.0.29) 7.020000ms
[I 14:36:57.583 NotebookApp] 302 POST /login?next=%2Ftree%3F (10.0.0.29) 5.600000ms

AFTER execution of : overlay = DpuOverlay("u96v2-benchmark.bit")
• dmesg
[ 290.556677] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 290.556823] zocl-drm axi:zyxclmm_drm: zocl_destroy_client: client exits pid(1551)
[ 290.557130] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 293.773326] fpga_manager fpga0: writing u96v2-benchmark.bin to Xilinx ZynqMP FPGA Manager
[ 293.976770] [drm] skip kind 29(AIE_RESOURCES) return code: -22
[ 293.976805] [drm] found kind 8(IP_LAYOUT)
[ 293.976815] [drm] skip kind 9(DEBUG_IP_LAYOUT) return code: -22
[ 293.976835] [drm] skip kind 25(AIE_METADATA) return code: -22
[ 293.976840] [drm] found kind 7(CONNECTIVITY)
[ 293.976848] [drm] found kind 6(MEM_TOPOLOGY)
[ 293.977080] [drm] Memory 0 is not reserved in device tree. Will allocate memory from CMA
[ 293.977138] [drm] Memory 1 is not reserved in device tree. Will allocate memory from CMA
[ 293.977184] [drm] Memory 2 is not reserved in device tree. Will allocate memory from CMA
[ 293.977228] [drm] Memory 3 is not reserved in device tree. Will allocate memory from CMA
[ 293.977272] [drm] Memory 4 is not reserved in device tree. Will allocate memory from CMA
[ 293.977316] [drm] Memory 5 is not reserved in device tree. Will allocate memory from CMA
[ 293.978785] [drm] Failed to initial CU interrupt. Fall back to polling
[ 293.978842] cu_drv CU.2.auto: cu_probe: CU[0] created
[ 293.978917] cu_drv CU.2.auto: ffff00000792ec10 kds_cfg_update: CU(0) doesnt support interrupt, running polling thread for all cus
[ 293.979089] [drm] zocl_xclbin_read_axlf ac79398d-ff4e-ecc4-defe-596e025ac395 ret: 0
[ 294.150729] [drm] bitstream ac79398d-ff4e-ecc4-defe-596e025ac395 locked, ref=1
[ 294.150781] zocl-drm axi:zyxclmm_drm: ffff000001b39010 kds_add_context: Client pid(1551) add context Domain(0) CU(0xffffffff) shared(true)
[ 294.150848] zocl-drm axi:zyxclmm_drm: ffff000001b39010 kds_del_context: Client pid(1551) del context Domain(0) CU(0xffffffff)
[ 294.150859] [drm] bitstream ac79398d-ff4e-ecc4-defe-596e025ac395 unlocked, ref=0
• /var/log/jupyter.log
[W 14:40:35.632 NotebookApp] Config option template_path not recognized by ExporterCollapsibleHeadings. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:35.689 NotebookApp] Config option template_path not recognized by ExporterCollapsibleHeadings. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:35.847 NotebookApp] Config option template_path not recognized by TocExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:35.882 NotebookApp] Config option template_path not recognized by TocExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:35.959 NotebookApp] Config option template_path not recognized by LenvsHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:36.018 NotebookApp] Config option template_path not recognized by LenvsHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:36.104 NotebookApp] Config option template_path not recognized by LenvsTocHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:36.164 NotebookApp] Config option template_path not recognized by LenvsTocHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:36.363 NotebookApp] Config option template_path not recognized by LenvsLatexExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:36.395 NotebookApp] Config option template_path not recognized by LenvsLatexExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:37.237 NotebookApp] Config option template_path not recognized by LenvsSlidesExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:37.275 NotebookApp] Config option template_path not recognized by LenvsSlidesExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.089 NotebookApp] Config option template_path not recognized by ExporterCollapsibleHeadings. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.139 NotebookApp] Config option template_path not recognized by ExporterCollapsibleHeadings. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.296 NotebookApp] Config option template_path not recognized by TocExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.331 NotebookApp] Config option template_path not recognized by TocExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.392 NotebookApp] Config option template_path not recognized by LenvsHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.451 NotebookApp] Config option template_path not recognized by LenvsHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.536 NotebookApp] Config option template_path not recognized by LenvsTocHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.595 NotebookApp] Config option template_path not recognized by LenvsTocHTMLExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.788 NotebookApp] Config option template_path not recognized by LenvsLatexExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:38.820 NotebookApp] Config option template_path not recognized by LenvsLatexExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:39.639 NotebookApp] Config option template_path not recognized by LenvsSlidesExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:39.679 NotebookApp] Config option template_path not recognized by LenvsSlidesExporter. Did you mean one of: extra_template_paths, template_name, template_paths?
[W 14:40:40.674 NotebookApp] Notebook u96v2-benchmark-3.0.1/mnist_classifier.ipynb is not trusted
[I 14:40:41.231 NotebookApp] Kernel started: 66b72bc4-f130-48d3-8b49-c537f3440097, name: python3
[IPKernelApp] ERROR | No such comm target registered: jupyter.widget.control
[IPKernelApp] WARNING | No such comm: 2be93046-2697-4644-8e9f-5a9531a09f61

AFTER execution of : overlay.load_model("mnist_classifier.xmodel")
• dmesg
[ 354.901768] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 354.901873] zocl-drm axi:zyxclmm_drm: zocl_destroy_client: client exits pid(1551)
[ 354.902536] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 354.902579] zocl-drm axi:zyxclmm_drm: zocl_destroy_client: client exits pid(1551)
[ 354.903190] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 354.903231] zocl-drm axi:zyxclmm_drm: zocl_destroy_client: client exits pid(1551)
[ 354.903468] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 354.903521] zocl-drm axi:zyxclmm_drm: zocl_destroy_client: client exits pid(1551)
[ 354.903816] zocl-drm axi:zyxclmm_drm: zocl_create_client: created KDS client for pid(1551), ret: 0
[ 354.903994] zocl-drm axi:zyxclmm_drm: zocl_destroy_client: client exits pid(1551)
• /var/log/jupyter.log
[I 14:42:20.149 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel 66b72bc4-f130-48d3-8b49-c537f3440097 restarted

I don’t see anything egregious in the logs… Also, I think you can attach logs as files; next time please do that, as it’s a bit easier to parse.

When loading the dpu overlay, did you get a message about the /etc/vart.conf being updated? What are the contents of your /etc/vart.conf?
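To inspect the file from Python, something like the following could work. It is a sketch under an assumption: that /etc/vart.conf holds simple `key: value` lines, with a `firmware` entry pointing at the .xclbin. `read_vart_conf` and `firmware_status` are hypothetical helper names, not part of VART or pynq_dpu.

```python
from pathlib import Path


def read_vart_conf(path="/etc/vart.conf"):
    """Parse 'key: value' lines into a dict; a missing file gives {}.

    Assumption: the file uses one 'key: value' pair per line.
    """
    conf_path = Path(path)
    if not conf_path.is_file():
        return {}
    entries = {}
    for line in conf_path.read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            entries[key.strip()] = value.strip()
    return entries


def firmware_status(path="/etc/vart.conf"):
    """Report the configured firmware path and whether it exists on disk."""
    firmware = read_vart_conf(path).get("firmware")
    if firmware is None:
        return "no firmware entry"
    return firmware if Path(firmware).is_file() else f"{firmware} (missing)"


if __name__ == "__main__":
    print(firmware_status())
```

If the reported path does not point at the .xclbin built for the custom design, that would be worth mentioning in the reply.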

One thing you could try: rename your files u96v2_benchmark.(bit,hwh,xclbin) → dpu.(bit,hwh,xclbin), and copy them over to /usr/local/share/pynq-venv/lib/python3.10/site-packages/pynq_dpu. You can make backups of the original ones. Then run rm /usr/lib/*.xclbin to make sure the old dpu.xclbins are purged. After that, try running your notebook again, but with the bitstream set to dpu.bit.
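The rename, backup and copy steps could be scripted roughly as follows. This is a minimal sketch, not a tested procedure: `stage_as_dpu` and the `.orig` backup suffix are hypothetical choices, the file stem and directories are parameters you supply, and writing into the pynq_dpu package folder will likely need root. Purging the old .xclbin files from /usr/lib is left as the manual command above.

```python
import shutil
from pathlib import Path

EXTENSIONS = (".bit", ".hwh", ".xclbin")


def stage_as_dpu(src_dir, stem, dest_dir, backup_suffix=".orig"):
    """Copy <stem>.bit/.hwh/.xclbin into dest_dir as dpu.bit/.hwh/.xclbin.

    An existing dpu.* file is backed up once (dpu.bit -> dpu.bit.orig)
    before it is overwritten. Returns the list of files written.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    copied = []
    for ext in EXTENSIONS:
        source = src_dir / f"{stem}{ext}"
        target = dest_dir / f"dpu{ext}"
        backup = dest_dir / f"dpu{ext}{backup_suffix}"
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)  # keep the file shipped with the package
        shutil.copy2(source, target)  # raises FileNotFoundError if source is absent
        copied.append(target)
    return copied


# Example usage (paths from this thread; adjust the stem to your file names):
#   stage_as_dpu(
#       ".",
#       "u96v2-benchmark",
#       "/usr/local/share/pynq-venv/lib/python3.10/site-packages/pynq_dpu",
#   )
```

Because the backup is only made when no `.orig` copy exists yet, running the function twice does not overwrite the original files' backups.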

Thanks
Shawn