I just set one up in my network a little over a week ago. I don't have any other GB10 machines to test against, but I can't say I've noticed any show-stopping negatives so far, considering I moved from an old test box with an AMD mobile 6600M with 8GB VRAM. Limitations in software definitely exist and if you stick to Ollama to serve models you're likely going to be disappointed. I just moved to llama.cpp with CUDA 13 in Docker with tweaks to improve concurrency and it's cut a lot of my agents' tool calling time by a factor of 3 or 4. Because of this I was able to move from a Qwen 3.6 35B, A3B Q4 quant to a Q5 with vision mods. The tokens per second throughput is slightly lower, but it's more than made up for it with the efficiency in using tools!
Realistically, the community is still pulling more and more performance out of this thing almost weekly, despite its spec-sheet weaknesses. I'm considering picking up a Gigabyte as a second unit, just to keep playing in the sandbox.