@mashrooms thank you. I dug a bit deeper, and it seems that both the DGX Spark and the Ryzen AI Max+ are plenty fast for large MoE models but struggle with the bigger dense models because of their limited memory bandwidth: a dense model has to read all of its weights for every generated token, while an MoE model only reads the experts that are active for that token. I could not find any hard data on this beyond the usual benchmaxed numbers and some user reports.
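To put a rough number on the bandwidth argument, here is a back-of-envelope sketch. It assumes decoding is purely memory-bound and that each token streams the active weights from memory exactly once; it ignores the KV cache, compute limits and runtime overhead, so real throughput will be lower. The bandwidth figure, the 4-bit quantization and the two model sizes are placeholder assumptions for illustration, not measurements of either machine or of any specific model.

```python
def decode_tokens_per_second(bandwidth_gb_s, active_params_billion, bytes_per_param):
    """Upper bound on decode speed if each token reads all active weights once."""
    # Billions of parameters times bytes per parameter gives GB read per token.
    gb_read_per_token = active_params_billion * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Assumed values, for illustration only.
BANDWIDTH_GB_S = 256.0   # assumed memory bandwidth of the machine
BYTES_PER_PARAM = 0.5    # assumed 4-bit quantization

# Hypothetical dense model: all 70B parameters are read for every token.
dense_70b = decode_tokens_per_second(BANDWIDTH_GB_S, 70, BYTES_PER_PARAM)

# Hypothetical MoE model: only 3B parameters are active per token.
moe_3b_active = decode_tokens_per_second(BANDWIDTH_GB_S, 3, BYTES_PER_PARAM)

print(f"dense 70B:     {dense_70b:.1f} tokens/s upper bound")
print(f"MoE 3B active: {moe_3b_active:.1f} tokens/s upper bound")
```

Under these assumptions the dense model tops out in the single digits of tokens per second, while the MoE model has a ceiling more than twenty times higher on the same hardware, even though the MoE model's total size may be similar or larger.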
I still plan to set up a local AI infrastructure, but I think I should first try the models via OpenRouter, DeepInfra or a similar service. While many hail "open" models like Qwen as nearly as good as ChatGPT or Claude, it might just be that they would not cope well with the things I intend to do with them ... which is not vibe coding until it appears to work and then calling it a day.