Interrupt latency on IRQ_F2P and support for CORE0_NFIQ or CORE1_NFIQ

I am seeing 1 to 2 seconds of latency using a PL-to-PS DMA. I have a counter connected to the DMA, and I measure how many clock ticks elapse from when the DMA raises the interrupt until the PYNQ asyncio code clears it and returns for the next DMA transfer. A snippet of the code is shown below:
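    # Buffer length in 32-bit words; ** is exponentiation (^ is XOR in Python),
    # and // keeps the result an integer so it can be used as a shape.
    MEM = (2**26 - 2) // np.dtype(np.uint32).itemsize

    with xlnk.cma_array(shape=(MEM,), dtype=np.uint32) as buf:
        while True:
            dma_recv.recvchannel.transfer(buf)
            await dma_recv.recvchannel.wait_async()
            # do something async with buf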

On the PL side, the DMA part is running at 200 MHz and the AXI-Lite bus is running at 100 MHz.
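
For reference, here is a minimal sketch of how a raw counter value maps to latency. It assumes the counter is clocked from the 200 MHz DMA clock; the function name and the sample tick count are hypothetical and only for illustration.

```python
def ticks_to_seconds(ticks, clock_hz=200_000_000):
    """Convert a counter value to elapsed seconds.

    Assumption: the counter increments once per cycle of a clock
    running at clock_hz (200 MHz here, the DMA clock).
    """
    return ticks / clock_hz


# Hypothetical reading: 300 million ticks at 200 MHz.
print(ticks_to_seconds(300_000_000))
```

If the counter were instead on the 100 MHz AXI-Lite clock, passing `clock_hz=100_000_000` would double the computed time for the same tick count.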

Questions:

  1. Are NFIQ interrupts supported?
  2. Are there special settings in the AXI interrupt controller?
  3. What is the typical interrupt latency in PYNQ?

Another thing I noticed after inspecting /proc/interrupts before and after running my program is that “fabric”, which is IRQ #61, has a count about 2x what I expect: when wait_async returns 30 times, I get 59 interrupts.
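
The before/after comparison can be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes the usual /proc/interrupts layout (a header row naming the CPUs, then one row per IRQ with per-CPU counts and the handler name as the last field).

```python
def irq_count(proc_text, name):
    """Sum the per-CPU counts of every IRQ row whose last field is `name`."""
    lines = proc_text.splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        fields = line.split()
        # Skip short rows (e.g. "Err: 0") and rows for other handlers.
        if len(fields) > n_cpus and fields[-1] == name:
            total += sum(int(c) for c in fields[1:1 + n_cpus])
    return total


def read_irq_count(name, path="/proc/interrupts"):
    """Read the live interrupt table and return the count for `name`."""
    with open(path) as f:
        return irq_count(f.read(), name)


# Usage on the board (not executed here):
#   before = read_irq_count("fabric")
#   ... run the DMA loop ...
#   after = read_irq_count("fabric")
#   print(after - before)
```

Comparing the difference against the number of times wait_async returned gives the ratio I describe above.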