Yes, in Windows that works - supposedly (haven't tried it myself). It's the same mechanism that makes iGPU and dGPU work together in laptops, where the display is only connected to the iGPU. There is an extra framebuffer copy that can introduce latency, but you still get accelerated graphics. I don't know if this works with Linux, though.

Does something like this Quadro P4 work for accelerating CAD programs like AutoCAD or the likes of Adobe Photoshop or Premiere when using another GPU for video output? I'm thinking workstation use, not vGPU or any virtualization.
I can't get this to work in a kvm guest, though. The kvm process just hangs on guest startup after a crash in vfio, as soon as I pass the physical GPU through along with the vGPU. If I just pass through one or the other I have no problems. I bet you won't run into this if you pass both GPUs through as physical (rather than mediated) devices, or if you're running on bare metal.
EDIT: I failed to mention this earlier, but while multi-GPU should work in general, the NVIDIA Tesla cards in particular can be set to compute-only (TCC) and graphics (WDDM) modes, and default to TCC out of the box. I believe you will need the GRID vGPU driver to put them in WDDM mode, and the regular Quadro driver does not let you do that. Take that with a grain of salt: I haven't played with that myself, and there may be patches/tools/hacks that I am unaware of. Also remember that this only matters if you use the GPU on bare metal or in non-vGPU PCI pass-through mode.
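
If you want to check which mode each card is currently in, here is a rough Python sketch that asks `nvidia-smi` for the `driver_model.current` field and picks out the cards reported as TCC. The helper names (`parse_driver_models`, `gpus_in_tcc`, `query_driver_models`) are made up for illustration, and it assumes `nvidia-smi` is on the PATH of a Windows machine; on Linux that field is reported as N/A, so this is only useful on the Windows side.

```python
import csv
import io
import subprocess

# Assumption: nvidia-smi is installed and on PATH (Windows host or guest).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,driver_model.current",
    "--format=csv,noheader",
]


def parse_driver_models(csv_text):
    """Turn nvidia-smi CSV output into a list of (index, name, driver_model)."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        index, name, model = (field.strip() for field in row)
        rows.append((int(index), name, model))
    return rows


def gpus_in_tcc(rows):
    """Return only the GPUs whose current driver model is TCC (compute-only)."""
    return [r for r in rows if r[2].upper() == "TCC"]


def query_driver_models():
    """Run nvidia-smi and parse its output (hypothetical convenience wrapper)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_driver_models(result.stdout)
```

Calling `gpus_in_tcc(query_driver_models())` lists the cards that would not be usable for graphics as-is. `nvidia-smi` also has a `-dm` option for requesting a driver model change, but whether the switch is actually allowed depends on the card and the installed driver, as noted above.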