Yes, come to the BlackwellPerformance rtx6kpro Discord; there is a channel dedicated to its testing.
Thoughts?
Let us know how that works once loaded.
This takes a lot of memory! It is loading on the 8x GB10 cluster right now, IIRC, using a 4-bit quant.
Please run the MMLU-Pro benchmark and compare with the model's baseline score to see how brain-damaged NVFP4 is.
Totally get it. One of the reasons we have the GB10 cluster is to run models at FP8 instead of going to 4-bit.
That quant format has been a singular disappointment in both performance and output accuracy across the Blackwell performance community.