Monero Mining Performance


alex_stief

Well-Known Member
May 31, 2016
884
312
63
39
Take any CPU from the same generation with the same number of cores and the same amount of cache. Extrapolate its hash rate using the CPU frequencies. Done. Accurate to within a few percent.
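As a rough sketch of that rule of thumb, the extrapolation is a simple linear scaling by clock speed. The function name and the numbers below are placeholders for illustration, not measurements from this thread, and which clock to plug in (base, or the sustained all-core clock under load) is an assumption you have to make yourself.

```python
def extrapolate_hashrate(known_hashrate: float, known_ghz: float, target_ghz: float) -> float:
    """Scale a measured hash rate linearly with clock frequency.

    Only meant for CPUs of the same generation with the same core count
    and cache size, as described in the post above.
    """
    if known_ghz <= 0 or target_ghz <= 0:
        raise ValueError("clock frequencies must be positive")
    return known_hashrate * (target_ghz / known_ghz)


# Placeholder inputs for illustration only, not measurements.
estimate = extrapolate_hashrate(known_hashrate=400.0, known_ghz=2.0, target_ghz=2.2)
print(f"estimated hash rate: {estimate:.0f} H/s")
```

Treat the result as a starting estimate and check it against a real benchmark run.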
 

GCM

Active Member
Aug 24, 2015
133
43
28
Hmm, I am getting vastly lower mining rates on my E5-2670 v1s. I'm maxing out at around 400 H/s.
 

onsit

Member
Jan 5, 2018
98
26
18
33
GCM said:
Hmm, I am getting vastly lower mining rates on my E5-2670 v1s. I'm maxing out at around 400 H/s.

That's about right. My E5-2660s only do 400 H/s each. You may need to tweak the BIOS a bit to turn off Hyper-Threading and play around with how many processors are active. You may get ~420-430 H/s per E5-2670.
 

mantis

Member
Nov 17, 2017
38
6
8
52
onsit said:
That's about right. My E5-2660s only do 400 H/s each. You may need to tweak the BIOS a bit to turn off Hyper-Threading and play around with how many processors are active. You may get ~420-430 H/s per E5-2670.

Why would you turn hyperthreading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.
 

onsit

mantis said:
Why would you turn hyperthreading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.

If you actually play around with the settings, you will see the difference in the hash/watt ratios. You will more than likely see the same hash rate with HT off and threads = nproc. The leftover cache that you can't otherwise utilize isn't really performant when you use HT to work it.

That's the same reason some people even run with HT off and nproc - 2 cores.
 

Joel

Active Member
Jan 30, 2015
865
209
43
43
mantis said:
Why would you turn hyperthreading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.

xmr-stak has a low power flag that can be used to allocate more cache per thread (4 MB per thread vs. 2 MB when it is set to "false"). I found with 2680v2s that low power mode results in a higher overall hash rate than affining threads to hyperthreaded cores.
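To make the cache arithmetic concrete, here is a small sketch of how many threads fit in L3 in each mode. It assumes the 2 MB per hash figure quoted above and a 25 MB L3 per socket (Intel's published figure for the 2680 v2, not something stated in this thread); the function names are made up for the example.

```python
SCRATCHPAD_MB = 2  # cache needed per hash in flight, per the figures quoted above


def cache_per_thread_mb(low_power_mode: bool) -> int:
    """A low power thread works on two hashes at once, so it needs twice the cache."""
    hashes_per_thread = 2 if low_power_mode else 1
    return SCRATCHPAD_MB * hashes_per_thread


def threads_that_fit(l3_mb: int, low_power_mode: bool) -> int:
    """Number of threads whose working set fits in L3 without spilling."""
    return l3_mb // cache_per_thread_mb(low_power_mode)


L3_MB = 25  # assumption: L3 size per socket
for mode in (False, True):
    n = threads_that_fit(L3_MB, mode)
    print(f"low_power_mode={mode}: {n} threads x {cache_per_thread_mb(mode)} MB")
```

Under these assumptions both modes keep 12 hashes in flight, but the full power layout needs 12 threads (more than the 10 physical cores, so some land on hyperthreads) while the low power layout needs only 6.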
 

Trebor

New Member
Dec 18, 2017
12
0
1
61
Hmm, I'm getting 1260 on a pair. I wonder what I'm doing right. ;) All cores and threads enabled, 24 threads total, 10 cores and two hyperthreads per socket. xmr-stak, linux, no NUMA stuff other than the thread assignments.

Are your hyperthreads not 2 each per socket, maybe?
 

levifig

Member
Nov 27, 2017
52
13
8
levifig.com
Trebor said:
Hmm, I'm getting 1260 on a pair. I wonder what I'm doing right. ;) All cores and threads enabled, 24 threads total, 10 cores and two hyperthreads per socket. xmr-stak, linux, no NUMA stuff other than the thread assignments.

Are your hyperthreads not 2 each per socket, maybe?

I just ran it default/auto and didn't really configure threads. I'm assuming it ran 20 threads (half of available) and I think 24-25 would've been ideal. I'm just not mining so I didn't really put much time into it… I was just benchmarking to make sure the server was performing within spec and keeping good temperature after installing Noctua coolers/fans (it was :p). :D
 

dexvx

Member
Mar 6, 2014
43
4
8
Dumb question, but is it worth it to mine XMR using a single (X99 board) E5-2699 v4 and 1 DIMM? Or do I need to populate all DIMM channels?
 

Joel

Hashing doesn't care about system memory too much, mostly the CPU cache. So mine away!
 

Godfr33

Member
Jan 20, 2018
94
16
8
48
Mining Turtle with 2 rigs. The first: 2x E5-2620 (6 cores, 12 threads each, 6mb/30mb), 14 cores active, plus a Vega 56, an RX 580, and a 1060, totaling 2900 H/s at 450 W @ .2kwh, running xmr-stak.

Next rig: 2x E5-2620 and an MSI 660 Ti, totaling 760 H/s at 240 W, running xmr-stak.

I have an open-air, stripped 2x E5-2620 getting 500 H/s on Ubuntu with XMRig.

I have 2 more servers on standby and a Dell 5810 with an E5-1650v3 that I haven’t used yet. The E5-1650v3 nets 450 H/s. Haven’t tested power draw.

Netted 40k coins in the past 2 days, half-assed!!
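For comparing rigs like these, a quick hashes-per-watt and power-cost calculation helps. This is only a sketch using the two rigs with stated power draw; reading ".2kwh" as a price of 0.20 per kWh is an assumption, and the function names are invented for the example.

```python
def hashes_per_watt(hashrate_hs: float, watts: float) -> float:
    """Efficiency in H/s per watt."""
    return hashrate_hs / watts


def daily_energy_cost(watts: float, price_per_kwh: float) -> float:
    """Cost of running a constant load for 24 hours."""
    return watts / 1000 * 24 * price_per_kwh


# (hash rate in H/s, power draw in W) as quoted in the post above
rigs = {"rig 1": (2900, 450), "rig 2": (760, 240)}
PRICE_PER_KWH = 0.20  # assumption: ".2kwh" read as 0.20 currency units per kWh

for name, (hs, watts) in rigs.items():
    print(f"{name}: {hashes_per_watt(hs, watts):.2f} H/s per W, "
          f"{daily_energy_cost(watts, PRICE_PER_KWH):.2f} per day")
```

Note that the first rig's figure includes three GPUs, so it says little about the CPUs on their own.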
 

tom kennedy

New Member
Jan 27, 2018
1
0
1
44
OK, that makes sense. After a week of tuning to get the best perf/watt ratio, my dual E5-2450 got ~810 H/s @ 145 W with 18 threads, one 16 GB RAM stick running at 800 MHz (PC3L DDR3 4X4), and the onboard GPU and 4 cores disabled (2 cores disabled per socket).

Sharpnel, you really got 800+ on a 2450? The best I could ever tweak to was 660 on my dual 2450L. What was the biggest mod to boost hashrate, if I could ask?
 

eureka

New Member
Jan 30, 2018
8
9
3
32
Las Vegas, NV
astr.al
tom kennedy said:
Sharpnel, you really got 800+ on a 2450? The best I could ever tweak to was 660 on my dual 2450L. What was the biggest mod to boost hashrate, if I could ask?

The L variant won't stay in C0 100% of the time; it seems to max out around 70-80% of clock time. That roughly 20% margin makes sense here (660 vs. ~810 H/s).

For CPUs with more cache than their cores can fill at one hash each, I don't really suggest running hyperthreads. You're better off just doing multiple hashes on a single core; the AES unit isn't your bottleneck, so it won't hurt your hashrate any. A little bit of work and optimization will net you ~950 H/s on a pair of 2670 v1s. The best I've been able to get out of these older chips is 1400 H/s on a pair of 2667 v2s.
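One way to turn that advice into numbers is to work out how many hashes the L3 can hold and spread them over physical cores, doubling up only where needed. This is a sketch of that reasoning, not the poster's actual method; it assumes 2 MB of cache per hash, and the function name and example core/cache figures (8 cores with 20 MB or 25 MB of L3) are assumptions taken from published specs rather than from the thread.

```python
def plan_threads(cores: int, l3_mb: int, scratchpad_mb: int = 2) -> dict:
    """Split physical cores into single-hash and dual-hash threads, with HT unused.

    Every core gets one thread; a core runs two hashes only when the L3
    has room for more hashes than there are cores.
    """
    hashes = l3_mb // scratchpad_mb          # hashes that fit in L3
    if hashes <= cores:
        return {"single": hashes, "dual": 0}
    dual = min(hashes - cores, cores)        # cores that must carry a second hash
    return {"single": cores - dual, "dual": dual}


print(plan_threads(cores=8, l3_mb=20))  # 10 hashes over 8 cores
print(plan_threads(cores=8, l3_mb=25))  # 12 hashes over 8 cores
```

With 8 cores and 25 MB this gives 4 single-hash and 4 dual-hash threads per socket; whether that beats a hyperthreaded layout on a given chip still needs to be measured.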
 

Joel

eureka said:
The L variant won't stay in C0 100% of the time; it seems to max out around 70-80% of clock time. That roughly 20% margin makes sense here (660 vs. ~810 H/s).

For CPUs with more cache than their cores can fill at one hash each, I don't really suggest running hyperthreads. You're better off just doing multiple hashes on a single core; the AES unit isn't your bottleneck, so it won't hurt your hashrate any. A little bit of work and optimization will net you ~950 H/s on a pair of 2670 v1s. The best I've been able to get out of these older chips is 1400 H/s on a pair of 2667 v2s.

How do you have your 2667s configured?

My 2667s get 630 H/s each = 1260 H/s for the pair. xmr-stak on Windows, 12 full power threads per NUMA node (slightly better for me than running 7-8 threads with low power mode). They're also QS (qualification samples), if that matters.
 

dexvx

Anyone have their hash rate decrease with time?

E5-2699v4 getting around 875 H/s (which is in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Now, overnight, it's around 871 H/s.

lm-sensors (Ubuntu 16.04.3 LTS) reports cores are all < 40C. Heatsink is only slightly warm. Using the servethehome/monero_xmring Docker container.
 

onsit

dexvx said:
Anyone have their hash rate decrease with time?

E5-2699v4 getting around 875 H/s (which is in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Now, overnight, it's around 871 H/s.

lm-sensors (Ubuntu 16.04.3 LTS) reports cores are all < 40C. Heatsink is only slightly warm. Using the servethehome/monero_xmring Docker container.

Hash rate is based on network latency, and the pool's latency for processing your shares. Plus your shares expiring or being rejected because someone else may have solved a block quicker than you.

Look at hash rate in 24 hour increments, not 1 hour benchmarks.
 

eureka

Joel said:
How do you have your 2667s configured?

My 2667s get 630 H/s each = 1260 H/s for the pair. xmr-stak on Windows, 12 full power threads per NUMA node (slightly better for me than running 7-8 threads with low power mode). They're also QS (qualification samples), if that matters.

Best results I get are running mixed 4/4 single/dual threads per socket, single miner process. If you configure affinity properly there's no need to run two copies of stak - that'll only increase the cache contention with the extra management threads running. I interleave the single/dual threads like so:

Code:
"cpu_threads_conf" :
[
    /* Socket 1: one thread per even CPU ID, alternating dual-hash
       (low_power_mode true) and single-hash (low_power_mode false) */
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 2 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 6 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 8 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 10 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 12 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 14 },

    /* Socket 2: same interleaved pattern */
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 16 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 18 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 20 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 22 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 24 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 26 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 28 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 30 },
],
The interleaving seems to help slightly with cache pressure, compared with jamming a bunch of dual threads onto the same L3 segments. Keep in mind that Sandy Bridge and newer use a segmented LLC, not a monolithic one, so there is a small amount of latency to account for when running multiple threads.
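If you'd rather generate that interleaved layout than type it out, something like the following works as a sketch. It assumes the same CPU numbering as the listing above (one physical core per even CPU ID, second socket starting at 16); check your own topology before reusing it, since the numbering differs between systems. The helper names are made up for the example.

```python
import json


def interleaved_threads(first_cpu: int, cores: int, cpu_step: int = 2) -> list:
    """Alternate dual-hash and single-hash threads across consecutive cores."""
    threads = []
    for i in range(cores):
        threads.append({
            "low_power_mode": i % 2 == 0,            # dual hash on every other core
            "no_prefetch": True,
            "affine_to_cpu": first_cpu + i * cpu_step,
        })
    return threads


def render(threads: list) -> str:
    """Format the entries in the same style as the config listing."""
    lines = []
    for t in threads:
        fields = ", ".join(f'"{k}" : {json.dumps(v)}' for k, v in t.items())
        lines.append("    { " + fields + " },")
    return "\n".join(lines)


print("/* Socket 1 */")
print(render(interleaved_threads(first_cpu=0, cores=8)))
print("/* Socket 2 */")
print(render(interleaved_threads(first_cpu=16, cores=8)))
```

Paste the output into the thread list of your CPU config and benchmark it against your current layout.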

dexvx said:
Anyone have their hash rate decrease with time?

E5-2699v4 getting around 875 H/s (which is in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Now, overnight, it's around 871 H/s.

lm-sensors (Ubuntu 16.04.3 LTS) reports cores are all < 40C. Heatsink is only slightly warm. Using the servethehome/monero_xmring Docker container.

A 4 H/s variance (under 0.5% of 875 H/s) is nothing to be concerned about; it's normal behavior. Compared to GPU mining, that's actually exceptionally stable. The variance is likely just due to cache pressure at times from OS tasks or other processes. If configured properly, your mining should be using the cache to its fullest, so any other running code (even small kernel tasks) will have a slight hashrate impact.

onsit said:
Hash rate is based on network latency, and the pool's latency for processing your shares. Plus your shares expiring or being rejected because someone else may have solved a block quicker than you.

Look at hash rate in 24 hour increments, not 1 hour benchmarks.

Latency doesn't have any bearing on hashrate. It does have a relationship with the pool's estimate of your hashrate, but that's calculated from difficulty and share submission interval, and is never going to be accurate. Based on the description of the hashrate going down a mere 4 H/s, it's not pool variance here; that's a local reading from the software.
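To illustrate why the pool-side number is noisy, here is a simplified model of how a pool could estimate hashrate from share difficulty and timing. It assumes each accepted share of difficulty d stands for d expected hashes; the difficulty, window, and share counts are hypothetical, and real pools may average differently.

```python
def pool_estimated_hashrate(share_difficulties: list, window_seconds: float) -> float:
    """Estimate H/s from the shares accepted within a time window."""
    if window_seconds <= 0:
        raise ValueError("window must be positive")
    return sum(share_difficulties) / window_seconds


DIFFICULTY = 15000   # hypothetical fixed share difficulty
WINDOW = 600         # hypothetical 10-minute window, in seconds

# The same miner can land a few shares more or fewer in a window by luck alone.
for n_shares in (33, 35, 37):
    estimate = pool_estimated_hashrate([DIFFICULTY] * n_shares, WINDOW)
    print(f"{n_shares} shares -> {estimate:.0f} H/s")
```

In this made-up example, two shares either way move the estimate by 50 H/s, far more than the 4 H/s drift being discussed, which is why the local reading from the miner is the one to watch.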
 

Joel

eureka said:
Best results I get are running mixed 4/4 single/dual threads per socket, single miner process. If you configure affinity properly there's no need to run two copies of stak - that'll only increase the cache contention with the extra management threads running. I interleave the single/dual threads like so:

Interesting technique, I'll have to try it when I get home. I'm sure NiceHash will be less likely to drop the connection too (they set difficulty really high). I notice you're actually overcommitting the cache too.

eureka said:
The interleaving seems to help slightly with cache pressure, compared with jamming a bunch of dual threads onto the same L3 segments. Keep in mind that Sandy Bridge and newer use a segmented LLC, not a monolithic one, so there is a small amount of latency to account for when running multiple threads.

Just had a lightbulb moment, because I hadn't thought about that. Makes total sense though, and explains why stacking all the low power threads at the beginning didn't work too well for me. I'll also check to see if I can get any gains on my dual 2680v2 nodes (those are at 580/600 H/s using 9 threads with 0-3 low power under Linux, 2 Docker images due to NUMA).

If nothing else, your philosophy definitely shows that there are reasons to read up on the architecture.