Take any CPU from the same generation with the same number of cores and the same cache. Extrapolate its hash rate using the CPU frequencies. Done. Accurate to within a few percent.
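As a quick sketch of that extrapolation (the rates and clocks below are made-up illustrative numbers, not measurements from this thread):

```python
# Linear frequency scaling between two CPUs of the same generation with
# matching core count and cache. Values are assumed for illustration.
known_rate_hs = 500.0    # measured H/s on the reference CPU
known_freq_ghz = 2.6     # reference CPU's sustained all-core clock
target_freq_ghz = 3.0    # target CPU's sustained all-core clock

estimated_rate_hs = known_rate_hs * (target_freq_ghz / known_freq_ghz)
print(f"Estimated: {estimated_rate_hs:.0f} H/s")  # ~577 H/s
```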
> Hmm, I am getting vastly lower mining rates on my E5-2670 V1s. I'm maxing out at around 400 H/s.

That's about right. My E5-2660s only do 400 H/s each. You may need to tweak the BIOS a bit to turn off Hyper-Threading and play around with how many processors are active. You may get ~420-430 per E5-2670.
> That's about right. My E5-2660s only do 400 H/s each. You may need to tweak the BIOS a bit to turn off Hyper-Threading and play around with how many processors are active. You may get ~420-430 per E5-2670.

Why would you turn Hyper-Threading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.
> Why would you turn Hyper-Threading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.

If you actually play around with the settings you will see the hash/watt ratios. You will more than likely see the same hash numbers with HT off and threads = nproc. That leftover cache you can't utilize isn't really performant when you use HT to work it.
> Why would you turn Hyper-Threading off when you need to run 10 threads on an 8-core CPU? You need the hyperthreads.

xmr-stak has a low power flag that can be used to allocate more cache (4 MB per thread vs. 2 MB for "false"). I found with 2680 v2s that low power mode results in a higher overall hash rate vs. affining to hyperthreaded cores.
> Do you get more than 630 per 2680 v2?

That sounds about right. My 2x 2680 v2 get ~980-1000 H/s, so 60% on one of them sounds reasonable. I haven't tried it though, so don't take my word for it...
> Hmm, I'm getting 1260 on a pair. I wonder what I'm doing right. All cores and threads enabled, 24 threads total, 10 cores and two hyperthreads per socket. xmr-stak, Linux, no NUMA stuff other than the thread assignments.

I just ran it default/auto and didn't really configure threads. I'm assuming it ran 20 threads (half of those available), and I think 24-25 would have been ideal. I'm not mining, so I didn't put much time into it; I was just benchmarking to make sure the server was performing within spec and keeping good temperatures after installing Noctua coolers/fans (it was).
Are your hyperthreads not 2 each per socket, maybe?
> OK, that makes sense. After a week of tuning to get the best perf/watt ratio, my dual E5-2450 got ~810 H/s @ 145 W with 18 threads, one 16 GB RAM stick running at 800 MHz (PC3L DDR3 4X4), onboard GPU, and 4 cores disabled (2 cores disabled per socket).

Sharpnel, you really got 800+ on a 2450? Best I could ever tweak to was 660 on my dual 2450L. What was the biggest mod to boost the hash rate, if I could ask?
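For comparison purposes, the perf/watt figure being tuned above works out like this (using the numbers reported in the post):

```python
# Hash-per-watt ratio from the reported dual E5-2450 result.
hashrate_hs = 810.0   # reported H/s after tuning
power_w = 145.0       # reported wall power in watts

efficiency = hashrate_hs / power_w
print(f"{efficiency:.2f} H/s per watt")  # ~5.59 H/s/W
```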
> Sharpnel, you really got 800+ on a 2450? Best I could ever tweak to was 660 on my dual 2450L. What was the biggest mod to boost the hash rate, if I could ask?

The L variant won't go 100% C0; it seems to max out around 70-80% of clock time. That 20% margin seems to make sense here.
For CPUs with more cache than threads, I don't really suggest running HTs. You're better off just doing multiple hashes on a single core; the AES unit isn't your bottleneck, so it won't hurt your hash rate any. A little work and optimization will net you ~950 H/s on a pair of 2670 v1s. The best I've been able to get out of these older chips is 1400 H/s on a pair of 2667 v2s.
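The cache arithmetic behind this advice can be sketched as follows. CryptoNight uses a 2 MB scratchpad per in-flight hash, and xmr-stak's low power mode works two hashes on one thread (~4 MB) instead of one (~2 MB); the E5-2670 v1 figures below are its published specs, while the thread split is an assumed example:

```python
# Planning how many concurrent CryptoNight hashes fit in L3.
SCRATCHPAD_MB = 2  # per in-flight hash

def max_hashes(l3_mb):
    """How many concurrent hashes fit in L3 without thrashing it."""
    return l3_mb // SCRATCHPAD_MB

# E5-2670 v1: 20 MB L3, 8 cores.
l3_mb, cores = 20, 8
hashes = max_hashes(l3_mb)
print(f"{hashes} hashes across {cores} cores")  # 10 hashes, 8 cores

# With HT off and 8 threads, you can still use all 20 MB by putting
# 2 of the 8 threads in low power mode: 2*2 + 6*1 = 10 hashes.
assert 2 * 2 + 6 * 1 == hashes
```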
> Anyone have their hash rate decrease with time?
>
> E5-2699 v4 getting around 875 H/s (in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Overnight, it's around 871 H/s.
>
> lm-sensors (Ubuntu 16.04.3 LTS) reports all cores are < 40 C. The heatsink is only slightly warm. Using the servethehome/monero_xmring docker container.

Hash rate is based on network latency and the pool's latency for processing your shares, plus your shares expiring or being rejected because someone else may have solved a block quicker than you.
> How do you have your 2667s configured? My 2667s get 630 each = 1260 for the pair. xmr-stak on Windows, 12 full-power threads per NUMA node (slightly better for me than running 7-8 threads with low power mode). They're also QS, if that matters.

Best results I get are running mixed 4/4 single/dual threads per socket, single miner process. If you configure affinity properly there's no need to run two copies of stak - that'll only increase cache contention with the extra management threads running. I interleave the single/dual threads like so:
"cpu_threads_conf" : [
    /* Socket 1 */
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 0 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 2 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 4 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 6 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 8 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 10 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 12 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 14 },
    /* Socket 2 */
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 16 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 18 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 20 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 22 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 24 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 26 },
    { "low_power_mode" : true,  "no_prefetch" : true, "affine_to_cpu" : 28 },
    { "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 30 },
],
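As a sanity check on that config, here's a quick sketch tallying one socket's cache footprint, assuming the usual CryptoNight figures of ~2 MB of scratchpad per hash and two hashes per low-power thread:

```python
# Tally the L3 footprint of one socket's 8 threads: 4 with
# low_power_mode (two hashes, ~4 MB each) and 4 without (~2 MB each).
socket1_low_power = [True, False, True, False, True, False, True, False]

def per_thread_mb(low_power):
    return 4 if low_power else 2

footprint_mb = sum(per_thread_mb(lp) for lp in socket1_low_power)
print(f"Socket 1 footprint: {footprint_mb} MB")  # 4*4 + 4*2 = 24 MB
# An E5-2667 v2 has 25 MB of L3, so this leaves almost no headroom
# once OS and miner management threads touch the cache.
```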
> Anyone have their hash rate decrease with time?
>
> E5-2699 v4 getting around 875 H/s (in line with the STH benchmarks) for the first hour or so. Then it drops to around 872 H/s. Overnight, it's around 871 H/s.
>
> lm-sensors (Ubuntu 16.04.3 LTS) reports all cores are < 40 C. The heatsink is only slightly warm. Using the servethehome/monero_xmring docker container.

4 H/s of variance is nothing to be concerned about; it's normal behavior. Compared to GPU mining, that's actually exceptionally stable. The variance is likely just occasional cache pressure from OS tasks or other processes - if configured properly, your miner is using the cache to its fullest, so any other running code (even small kernel tasks) will have a slight hash rate impact.
> Hash rate is based on network latency and the pool's latency for processing your shares, plus your shares expiring or being rejected because someone else may have solved a block quicker than you.

Latency doesn't have any bearing on hash rate. It does affect the pool's estimate of your hash rate, but that estimate is calculated from difficulty and share submission interval and is never going to be exact. Given that the reported drop is a mere 4 H/s, this isn't pool variance; it's a local reading from the mining software.
Look at hash rate in 24 hour increments, not 1 hour benchmarks.
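The pool-side estimate mentioned above works roughly like this: a share of difficulty D takes D hashes to find on average, so the pool estimates rate as difficulty times shares over elapsed time. The numbers below are illustrative, not from the thread:

```python
# Pool-side hashrate estimate from share submissions. Over a short
# window this estimate is noisy, which is why 24-hour averages are
# more meaningful than 1-hour benchmarks.
difficulty = 50_000   # share difficulty assigned by the pool (assumed)
shares = 20           # shares submitted during the window (assumed)
window_s = 1000.0     # length of the observation window in seconds

est_rate_hs = difficulty * shares / window_s
print(f"Pool-side estimate: {est_rate_hs:.0f} H/s")  # 1000 H/s
```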
> Best results I get are running mixed 4/4 single/dual threads per socket, single miner process. If you configure affinity properly there's no need to run two copies of stak - that'll only increase cache contention with the extra management threads running. I interleave the single/dual threads like so:

Interesting technique; I'll have to try it when I get home. I'm sure NiceHash will be less likely to drop the connection too (they set difficulty really high). I notice you're actually overcommitting the cache too.
> The interleaving seems to help with the cache pressure slightly, instead of jamming a bunch of dual threads onto the same L3 segments. Keep in mind that Sandy Bridge and newer use a segmented LLC, not a monolithic one, so there is a small amount of latency involved when considering multiple threads.

Just had a lightbulb moment, because I hadn't thought about that. It makes total sense though, and explains why stacking all the low power threads at the beginning didn't work too well. I'll also check whether I can get any gains on my dual 2680 v2 nodes (those are at 580-600 H/s using 9 threads, with threads 0-3 in low power mode, under Linux, running two docker images due to NUMA).