I think you're right that it can vary a lot depending on what else is going on in the system. Because it is such a short test, I don't think the sampling duration is long enough to get a good "average" result. So I guess it's not a great benchmark in that sense...
For CPU performance, I like looking at PassMark scores. They have both a single-threaded score and an overall (all cores) score, which makes it easy to distinguish single-threaded from multi-threaded performance.
Well, yes and no. I think it does an adequate job of benchmarking the CPU as long as the testing environment is consistent between comparisons. Even with a "real" benchmark, the results will skew if you run 100 other things at the same time.
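To put a number on how much a short test varies from run to run, one option is to repeat it several times and look at the median and the spread rather than a single result. This is only a rough sketch: `workload` is a made-up stand-in for a CPU-bound task, not what speed-battle actually runs, and the run count of 10 is an arbitrary choice.

```python
import statistics
import time


def workload():
    # Hypothetical stand-in for a CPU-bound task (assumption, not speed-battle's test).
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def time_runs(fn, runs=10):
    """Call fn `runs` times and return each run's duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, spread), where spread is (max - min) / median."""
    med = statistics.median(durations)
    spread = (max(durations) - min(durations)) / med
    return med, spread


if __name__ == "__main__":
    med, spread = summarize(time_runs(workload))
    print(f"median {med * 1000:.1f} ms, spread {spread:.0%}")
```

If the spread is large on an otherwise idle machine, a single run probably isn't telling you much. If it stays small, the short test may be good enough for rough comparisons.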
I actually use a few others to give me an overall idea of what performance level a system is at:
Octane 2.0 JavaScript Benchmark - 40k+ on this is fast for me. It's also browser based, so the browser can skew the results, but not as much as with speed-battle.
MotionMark 1.0 - I've not used this extensively simply because of how long it takes to run and because it does seem like resolution makes a huge impact on the results. The results are all over the place, but it does seem like the test is good if I can control all the other variables.
Lite Brite Browser Performance Benchmark - This one is pretty quick and can also be swayed by the browser, but only about as much as the Octane test. I think my wife's MacBook holds the record, completing the 'all' test in just over 20 seconds.
PenguinMark - This one is fun to watch and must be meant for higher-powered systems. Most of my systems are too slow/old, so I'm lucky to see double-digit results, if I get anything above 0 at all.
I'll use these to figure out whether a system upgrade makes sense, or whether one I've already done really improved performance.
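One way to make that "did the upgrade do anything" call less subjective is to compare the median score before and after against the run-to-run noise. The sketch below is just an illustration: the function names are hypothetical, it assumes a higher score means a faster system, and the scores in the usage part are made-up placeholders, not real measurements.

```python
import statistics


def upgrade_gain(before, after):
    """Relative change in median score (assumes higher score = faster)."""
    b = statistics.median(before)
    a = statistics.median(after)
    return (a - b) / b


def noise(scores):
    """Run-to-run spread as a fraction of the median score."""
    return (max(scores) - min(scores)) / statistics.median(scores)


def is_meaningful(before, after):
    """True only if the gain is larger than the noisier of the two sets of runs."""
    return abs(upgrade_gain(before, after)) > max(noise(before), noise(after))


if __name__ == "__main__":
    # Made-up placeholder scores, not real benchmark results.
    old_runs = [100, 102, 98]
    new_runs = [101, 99, 103]
    print(f"gain {upgrade_gain(old_runs, new_runs):.1%}, "
          f"meaningful: {is_meaningful(old_runs, new_runs)}")
```

A gain that sits inside the noise is hard to tell apart from no gain at all. It also only covers what the benchmark measures, so it says nothing about things like perceived lag.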
I use PassMark as well to compare CPUs and their single-thread performance. But it is interesting to do a comparison test in a real-world scenario as well. For example, an upgrade from an i5-2500 to an i7-2600K should have made a difference according to PassMark:
PassMark - CPU Comparison Intel i5-2500 vs Intel i7-2600K
But when I did the upgrade to a superior i7-2600K specimen that was actually delidded and could overclock very well, the speed-battle results at stock clocks were exactly the same, so not worth the upgrade at all.
And then there's the difference between some old LGA 771 Xeon 5130s that were upgraded to dual X5470s:
PassMark - CPU Comparison Intel Xeon 5130 vs Intel Xeon X5470
This should have been an absolutely huge upgrade, and while the speed-battle score did almost double, there's a sense of lag with the new processors that I haven't quite figured out yet.
PS to OP: free bump