Take any CPU from the same generation with the same number of cores and the same cache. Extrapolate its hash rate using the CPU frequencies. Done. Accurate to within a few percent.
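As a quick sketch of that extrapolation (the rates and clocks below are made-up illustrative numbers, not measurements from this thread):

```python
# Linear frequency scaling between two CPUs of the same generation with
# matching core count and cache. Values are assumed for illustration.
known_rate_hs = 500.0    # measured H/s on the reference CPU
known_freq_ghz = 2.6     # reference CPU's sustained all-core clock
target_freq_ghz = 3.0    # target CPU's sustained all-core clock

estimated_rate_hs = known_rate_hs * (target_freq_ghz / known_freq_ghz)
print(f"Estimated: {estimated_rate_hs:.0f} H/s")  # ~577 H/s
```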
> Hmm, I am getting vastly lower mining rates on my E5-2670 V1s. I'm maxing out at around 400 H/s.

That's about right. My E5-2660s only do 400 H/s each. You may need to tweak the BIOS a bit to turn off Hyper-Threading and play around with how many processors are active. You may get ~420-430 per E5-2670.
> That's about right. My E5-2660s only do 400 H/s each. You may need to tweak the BIOS a bit to turn off Hyper-Threading and play around with how many processors are active. You may get ~420-430 per E5-2670.

Why would you turn Hyper-Threading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.
> Why would you turn Hyper-Threading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.

If you actually play around with the settings you will see the hash/watt ratios. You will more than likely see the same hash numbers with HT off and threads = nproc. That leftover cache you can't utilize isn't really performant when you use HT to work it.
> Why would you turn Hyper-Threading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.

xmr-stak has a low power flag that can be used to allocate more cache (4 MB per thread vs. 2 MB for "false"). I found with 2680 v2s that low power mode results in a higher overall hash rate vs. affining to hyperthreaded cores.
> Do you get more than 630 per 2680 v2?

That sounds about right. My 2x 2680 v2 get ~980-1000 H/s, so 60% on one of them sounds reasonable. I haven't tried it though, so don't take my word for it...
> Hmm, I'm getting 1260 on a pair. I wonder what I'm doing right. All cores and threads enabled, 24 threads total, 10 cores and two hyperthreads per socket. xmr-stak, Linux, no NUMA stuff other than the thread assignments.

I just ran it default/auto and didn't really configure threads. I'm assuming it ran 20 threads (half of those available), and I think 24-25 would have been ideal. I'm not mining, so I didn't put much time into it; I was just benchmarking to make sure the server was performing within spec and keeping good temperatures after installing Noctua coolers/fans (it was).
Are your hyperthreads not 2 each per socket, maybe?
> OK, that makes sense. After a week of tuning to get the best perf/watt ratio, my dual E5-2450 got ~810 H/s @ 145 W with 18 threads, one 16 GB RAM stick running at 800 MHz (PC3L DDR3 4X4), onboard GPU, and 4 cores disabled (2 cores disabled per socket).

Sharpnel, you really got 800+ on a 2450? Best I could ever tweak to was 660 on my dual 2450L. What was the biggest mod to boost the hash rate, if I could ask?
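For comparison purposes, the perf/watt figure being tuned above works out like this (using the numbers reported in the post):

```python
# Hash-per-watt ratio from the reported dual E5-2450 result.
hashrate_hs = 810.0   # reported H/s after tuning
power_w = 145.0       # reported wall power in watts

efficiency = hashrate_hs / power_w
print(f"{efficiency:.2f} H/s per watt")  # ~5.59 H/s/W
```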
> Sharpnel, you really got 800+ on a 2450? Best I could ever tweak to was 660 on my dual 2450L. What was the biggest mod to boost the hash rate, if I could ask?

The L variant won't go 100% C0; it seems to max out around 70-80% of clock time. That 20% margin seems to make sense here.
For CPUs with more cache than threads, I don't really suggest running HTs. You're better off just doing multiple hashes on a single core; the AES unit isn't your bottleneck, so it won't hurt your hash rate any. A little work and optimization will net you ~950 H/s on a pair of 2670 v1s. The best I've been able to get out of these older chips is 1400 H/s on a pair of 2667 v2s.
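The cache arithmetic behind this advice can be sketched as follows. CryptoNight uses a 2 MB scratchpad per in-flight hash, and xmr-stak's low power mode works two hashes on one thread (~4 MB) instead of one (~2 MB); the E5-2670 v1 figures below are its published specs, while the thread split is an assumed example:

```python
# Planning how many concurrent CryptoNight hashes fit in L3.
SCRATCHPAD_MB = 2  # per in-flight hash

def max_hashes(l3_mb):
    """How many concurrent hashes fit in L3 without thrashing it."""
    return l3_mb // SCRATCHPAD_MB

# E5-2670 v1: 20 MB L3, 8 cores.
l3_mb, cores = 20, 8
hashes = max_hashes(l3_mb)
print(f"{hashes} hashes across {cores} cores")  # 10 hashes, 8 cores

# With HT off and 8 threads, you can still use all 20 MB by putting
# 2 of the 8 threads in low power mode: 2*2 + 6*1 = 10 hashes.
assert 2 * 2 + 6 * 1 == hashes
```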
> Anyone have their hash rate decrease with time?
>
> E5-2699 v4 getting around 875 H/s (in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Overnight, it's around 871 H/s.
>
> lm-sensors (Ubuntu 16.04.3 LTS) reports all cores are < 40 C. The heatsink is only slightly warm. Using the servethehome/monero_xmring docker container.

Hash rate is based on network latency and the pool's latency for processing your shares, plus your shares expiring or being rejected because someone else may have solved a block quicker than you.
> How do you have your 2667s configured? My 2667s get 630 each = 1260 for the pair. xmr-stak on Windows, 12 full-power threads per NUMA node (slightly better for me than running 7-8 threads with low power mode). They're also QS, if that matters.

Best results I get are running mixed 4/4 single/dual threads per socket, single miner process. If you configure affinity properly there's no need to run two copies of stak - that'll only increase cache contention with the extra management threads running. I interleave the single/dual threads like so:
"cpu_threads_conf" : [
    /* Socket 1 */
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 2 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 6 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 8 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 10 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 12 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 14 },
    /* Socket 2 */
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 16 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 18 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 20 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 22 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 24 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 26 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 28 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 30 },
],
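As a sanity check on that config, here's a quick sketch tallying one socket's cache footprint, assuming the usual CryptoNight figures of ~2 MB of scratchpad per hash and two hashes per low-power thread:

```python
# Tally the L3 footprint of one socket's 8 threads: 4 with
# low_power_mode (two hashes, ~4 MB each) and 4 without (~2 MB each).
socket1_low_power = [True, False, True, False, True, False, True, False]

def per_thread_mb(low_power):
    return 4 if low_power else 2

footprint_mb = sum(per_thread_mb(lp) for lp in socket1_low_power)
print(f"Socket 1 footprint: {footprint_mb} MB")  # 4*4 + 4*2 = 24 MB
# An E5-2667 v2 has 25 MB of L3, so this leaves almost no headroom
# once OS and miner management threads touch the cache.
```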
> Anyone have their hash rate decrease with time?
>
> E5-2699 v4 getting around 875 H/s (in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Overnight, it's around 871 H/s.
>
> lm-sensors (Ubuntu 16.04.3 LTS) reports all cores are < 40 C. The heatsink is only slightly warm. Using the servethehome/monero_xmring docker container.

4 H/s of variance is nothing to be concerned about; it's normal behavior. Compared to GPU mining, that's actually exceptionally stable. The variance is likely just occasional cache pressure from OS tasks or other processes - if configured properly, your miner is using the cache to its fullest, so any other running code (even small kernel tasks) will have a slight hash rate impact.
> Hash rate is based on network latency and the pool's latency for processing your shares, plus your shares expiring or being rejected because someone else may have solved a block quicker than you.

Latency doesn't have any bearing on hash rate. It does affect the pool's estimate of your hash rate, but that estimate is calculated from difficulty and share submission interval and is never going to be exact. Given that the reported drop is a mere 4 H/s, this isn't pool variance; it's a local reading from the mining software.
Look at hash rate in 24 hour increments, not 1 hour benchmarks.
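The pool-side estimate mentioned above works roughly like this: a share of difficulty D takes D hashes to find on average, so the pool estimates rate as difficulty times shares over elapsed time. The numbers below are illustrative, not from the thread:

```python
# Pool-side hashrate estimate from share submissions. Over a short
# window this estimate is noisy, which is why 24-hour averages are
# more meaningful than 1-hour benchmarks.
difficulty = 50_000   # share difficulty assigned by the pool (assumed)
shares = 20           # shares submitted during the window (assumed)
window_s = 1000.0     # length of the observation window in seconds

est_rate_hs = difficulty * shares / window_s
print(f"Pool-side estimate: {est_rate_hs:.0f} H/s")  # 1000 H/s
```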
> Best results I get are running mixed 4/4 single/dual threads per socket, single miner process. If you configure affinity properly there's no need to run two copies of stak - that'll only increase cache contention with the extra management threads running. I interleave the single/dual threads like so:

Interesting technique; I'll have to try it when I get home. I'm sure NiceHash will be less likely to drop the connection too (they set difficulty really high). I notice you're actually overcommitting the cache too.
> The interleaving seems to help with the cache pressure slightly, instead of jamming a bunch of dual threads onto the same L3 segments. Keep in mind that Sandy Bridge and newer use a segmented LLC, not a monolithic one, so there is a small amount of latency involved when considering multiple threads.

Just had a lightbulb moment, because I hadn't thought about that. It makes total sense though, and explains why stacking all the low power threads at the beginning didn't work too well. I'll also check whether I can get any gains on my dual 2680 v2 nodes (those are at 580-600 H/s using 9 threads, with threads 0-3 in low power mode, under Linux, running two docker images due to NUMA).