Monero Mining Performance


DrPeter

New Member
Mar 10, 2017
12
3
3
69
AMD RX 480 and 470 cards seem to do better than Nvidia ones when it comes to Monero. On a slightly tweaked Sapphire RX 480 I'm doing around 570 hashes/second and probably pulling around 70 watts on the GPU itself. I'm going to tweak the BIOS to try to drop that power usage even more, but for now this is just using MSI Afterburner to lower the core clock and raise the memory clock, and fiddling with parameters on the Claymore GPU miner. Check the bitcointalk.org Altcoin forums for much more discussion on this. Folks with modded BIOSes are getting a much better ratio of hashes to power.
Wow. Are you sure you're using only 70 watts for 570 H/s? That's really nice. The best I had seen was 90 watts for 690 H/s with a flashed BIOS.

That could be the more efficient card I'm looking for...

My 750Ti uses approx 50-60w to produce 220H/s.
Damn, is this the power used by your 750 Ti alone? I read in several places that it could use far less (for example, "Power Usage of GeForce GTX 750 Ti With Various Crypto Algorithms" on Crypto Mining Blog).

I'll probably get one of each and do some testing to see how it goes.

At the prices I can find them second hand, though, the 750 Ti would have a slightly better initial price per H/s.
 

Marsh

Moderator
May 12, 2013
2,647
1,498
113
My best H/s per watt comes from an ASUS 4-node dual-CPU server.
It depends on how much $$ you spend on the CPUs.

Patrick has the same server with more expensive and better CPUs (E5-2628L v4, 30MB L3 cache); it is producing slightly over 1000H/s at 160w. Maybe Patrick can correct my assumption here.

I use cheap $110 E5-2650 v3 CPUs (25MB L3 cache each); it is producing 915H/s at 160w = 5.7 H/s per watt.

I was hoping the newly purchased, cheap PowerColor RX480 4GB card would be more energy efficient, but that is not the case.
Out of the box the RX480 produces 574H/s but consumes approx. 130-150w.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,520
5,828
113
@DrPeter on the AMD side the RX 480 and 470 with BIOS mods seem to be good.

On the NVIDIA side the GRID M40s are like 4x 750 Ti per card. The GTX 1050 Ti cards may be slightly faster at just under 300H/s but I have had entire systems running sub 100w with one of those mining.

@Marsh I think 897H/s per node x4 and the total power is more like 640w for the entire chassis. Apparently, the BIOS modded ones use less power and do more H/s.

The other consideration is that if you go on to do machine learning, you want NVIDIA cards for CUDA. To me, the RX 480 is really a gaming card and maybe a mining card. The NVIDIA cards cover gaming, mining and machine learning.
 

DrPeter

New Member
Mar 10, 2017
12
3
3
69
I use some cheap $110 E5-2650 v3 each ( with 25MB L3 cache ), it is producing 915H/s with 160w = 5.7 Hashrate / watt
Impressive. I can't find any E5 v3 at such a cheap price in my area though. Would have been a nice base.

GRID M40s are like 4x 750 Ti per card
This one seems really nice but is more expensive than 4x 750 Ti.

The NVIDIA are gaming, mining and machine learning.
Thanks, will take that into account.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,520
5,828
113
This may make sense, but I did confirm that turning SMT off yielded the same H/s on the Ryzen 7 1700 as SMT on.

87w at the wall with 2x SSDs (1x 400GB 750 NVMe and 1x 800GB S3610), 32GB RAM and the low-end 610 GPU.

@Marsh that comes out to 5.55H/s per W. The higher-end Ryzen 7 1700X I believe was 133W.
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
I am really waiting for all the Ryzen CPUs to be available before I dive in, and for the motherboard kinks to get worked out.

 

Marsh

Moderator
May 12, 2013
2,647
1,498
113
I am going to strip down a system to test out my EVGA 750Ti SC card.
@DrPeter
What is your OS? Which miner program? Could you share your config?

Thanks
 

DrPeter

New Member
Mar 10, 2017
12
3
3
69
@Marsh I'm running ccminer on Linux with CUDA 8.0.
I ran some other tests:

203 H/s with ccminer 2.0 (by tpruvot@github), no special config
232 H/s using the '-l 8x30' option with the same software (same power consumption, or maybe +1w)
243 H/s with ccminer-cryptonight (by tsiv@github) (+8w!!! - probably not worth it for about 10 H/s more)

I haven't found any way to reduce power consumption yet.
 

Marsh

Moderator
May 12, 2013
2,647
1,498
113
@Klee
In the beginning of using xmr-stak-cpu, I did not understand how to configure the thread assignment to the CPU cores (a NUMA issue).

My hashrate was only 815H/s with dual E5-2650 v3 CPUs. I ran the "lscpu" command, and then came the light bulb moment:
I had not assigned affine_to_cpu correctly.

NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39

Here is my corrected config:
"cpu_thread_num" : 24,
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 20},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 21},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 30},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 31},

Incorrect one:
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 20},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 21},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 22},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 23},

After I made the change to config.txt, my hashrate jumped up to 915H/s.
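
A minimal Python sketch of the idea above: read the "NUMA node" lines from lscpu and emit the same number of affine_to_cpu entries for each node, so the threads are spread evenly over both sockets. The helper names and the choice of 12 threads per node (24 threads over two sockets) are assumptions for illustration. The entry fields are the ones shown in the config above, and taking the first CPUs listed for each node is just one plausible selection.

```python
import json

def parse_cpu_list(spec):
    """Expand a list such as '0-9,20-29' into individual CPU numbers."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus

def parse_numa_nodes(lscpu_text):
    """Map NUMA node id -> logical CPUs, from 'NUMA nodeN CPU(s):' lines."""
    nodes = {}
    for line in lscpu_text.splitlines():
        line = line.strip()
        if line.startswith("NUMA node") and "CPU(s):" in line:
            name, spec = line.split("CPU(s):")
            node_id = int(name.replace("NUMA node", "").strip())
            nodes[node_id] = parse_cpu_list(spec.strip())
    return nodes

def balanced_affinity(nodes, threads_per_node):
    """Pick the first `threads_per_node` CPUs of every node (an assumption)."""
    entries = []
    for node_id in sorted(nodes):
        for cpu in nodes[node_id][:threads_per_node]:
            entries.append(
                {"low_power_mode": False, "no_prefetch": True, "affine_to_cpu": cpu}
            )
    return entries

lscpu_text = """NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39"""

entries = balanced_affinity(parse_numa_nodes(lscpu_text), threads_per_node=12)
for entry in entries:
    print(json.dumps(entry) + ",")
```

With the node map above this selects CPUs 0-9, 20 and 21 on node 0 and CPUs 10-19, 30 and 31 on node 1, which is consistent with the 20, 21, 30, 31 entries in the corrected excerpt.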
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
Well, my E5-2667 V3 ES CPUs are HCC parts with all but 8 cores deactivated. Sometimes, especially under Windows, the cores are numbered in very interesting ways. I have not had an issue with that under Linux so far, except that cat /proc/cpuinfo gives odd output compared to lscpu.

Should be interesting... that's why I have not dived into it yet and have been running Wolf's GPU miner.

Here is the lscpu output.
@LinuxBeast:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Genuine Intel(R) CPU @ 2.90GHz
Stepping: 1
CPU MHz: 1214.587
CPU max MHz: 3200.0000
CPU min MHz: 1200.0000
BogoMIPS: 5802.24
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb intel_ppin tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts

And diving in more:
@LinuxBeast:~$ lscpu -a -e=socket,cpu,core,address,online,configured
SOCKET CPU CORE ADDRESS ONLINE CONFIGURED
0 0 0 - yes -
0 1 1 - yes -
0 2 2 - yes -
0 3 3 - yes -
0 4 4 - yes -
0 5 5 - yes -
0 6 6 - yes -
0 7 7 - yes -
1 8 8 - yes -
1 9 9 - yes -
1 10 10 - yes -
1 11 11 - yes -
1 12 12 - yes -
1 13 13 - yes -
1 14 14 - yes -
1 15 15 - yes -
0 16 0 - yes -
0 17 1 - yes -
0 18 2 - yes -
0 19 3 - yes -
0 20 4 - yes -
0 21 5 - yes -
0 22 6 - yes -
0 23 7 - yes -
1 24 8 - yes -
1 25 9 - yes -
1 26 10 - yes -
1 27 11 - yes -
1 28 12 - yes -
1 29 13 - yes -
1 30 14 - yes -
1 31 15 - yes -
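
The table above shows which logical CPUs share a physical core (CPU 0 and CPU 16 are both core 0 on socket 0, and so on). A small Python sketch, assuming the column order produced by the -e=socket,cpu,core,... option used above, can group them automatically; the function name and the shortened sample are hypothetical.

```python
from collections import defaultdict

def sibling_groups(lscpu_table):
    """Group logical CPUs by (socket, core) from `lscpu -e=socket,cpu,core,...` text."""
    groups = defaultdict(list)
    for line in lscpu_table.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a data row.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        socket, cpu, core = (int(f) for f in fields[:3])
        groups[(socket, core)].append(cpu)
    return dict(groups)

# Shortened sample taken from the rows above.
sample = """SOCKET CPU CORE ADDRESS ONLINE CONFIGURED
0 0 0 - yes -
1 8 8 - yes -
0 16 0 - yes -
1 24 8 - yes -
"""

for (socket, core), cpus in sorted(sibling_groups(sample).items()):
    print(f"socket {socket} core {core}: logical CPUs {cpus}")
```

Knowing the sibling pairs helps when deciding whether to pin miner threads to one hyperthread per core or to both.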

When I do cat /proc/cpuinfo
it gives an interesting output for "core id". It is too long to post it all here, so I'll post a heavily edited output for CPU 0.

If you look at "core id", it lists 0,2,3,5,8,10,12,14 but skips 1,4,6,7,9,11,13. So I'm thinking it is a 15 or 16 core CPU with all but 8 deactivated, which would also explain "L3 cache: 35840K" in the output of lscpu.

(For comparison, the "core id" values for CPU 1 are also 0,2,3,5,8,10,12,14.)

@LinuxBeast:~$ cat /proc/cpuinfo
processor : 0
physical id : 0
siblings : 16
core id : 0
cpu cores : 8
apicid : 0
initial apicid : 0

processor : 1
physical id : 0
siblings : 16
core id : 2
cpu cores : 8
apicid : 4
initial apicid : 4

processor : 2
physical id : 0
siblings : 16
core id : 3
cpu cores : 8
apicid : 6
initial apicid : 6

processor : 3
physical id : 0
siblings : 16
core id : 5
cpu cores : 8

processor : 4
physical id : 0
siblings : 16
core id : 8
cpu cores : 8
apicid : 16
initial apicid : 16

processor : 5
physical id : 0
siblings : 16
core id : 10
cpu cores : 8
apicid : 20
initial apicid : 20

processor : 6
physical id : 0
siblings : 16
core id : 12
cpu cores : 8
apicid : 24
initial apicid : 24

processor : 7
physical id : 0
siblings : 16
core id : 14
cpu cores : 8
apicid : 28
initial apicid : 28
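
The gap check described above can be sketched in Python: collect the "core id" values per "physical id" from /proc/cpuinfo text and list the ids that never appear. The function names are hypothetical, and a gap in the numbering is only a hint that cores are fused off, not proof.

```python
def core_ids_by_socket(cpuinfo_text):
    """Map physical id -> set of core ids seen in /proc/cpuinfo-style text."""
    ids = {}
    socket = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "physical id":
            socket = int(value)
        elif key == "core id" and socket is not None:
            ids.setdefault(socket, set()).add(int(value))
    return ids

def missing_core_ids(present):
    """Return the core ids below the highest seen id that never appear."""
    return sorted(set(range(max(present) + 1)) - set(present))

# Core ids reported for socket 0 in the output above.
present = {0, 2, 3, 5, 8, 10, 12, 14}
print("missing core ids:", missing_core_ids(present))

# On a live Linux system the text could be read with:
#   with open("/proc/cpuinfo") as f:
#       per_socket = core_ids_by_socket(f.read())
```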


Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
First try:

HASHRATE REPORT
| ID | 2.5s | 60s | 15m | ID | 2.5s | 60s | 15m |
| 0 | 38.5 | 38.5 | (na) | 1 | 38.3 | 38.2 | (na) |
| 2 | 41.6 | 41.6 | (na) | 3 | 40.2 | 40.4 | (na) |
| 4 | 41.3 | 41.3 | (na) | 5 | 40.7 | 40.9 | (na) |
| 6 | 37.0 | 36.9 | (na) | 7 | 36.4 | 35.9 | (na) |
| 8 | 38.8 | 38.7 | (na) | 9 | 37.1 | 37.1 | (na) |
| 10 | 41.4 | 41.4 | (na) | 11 | 40.7 | 40.7 | (na) |
| 12 | 41.3 | 41.2 | (na) | 13 | 41.6 | 41.5 | (na) |
| 14 | 36.6 | 36.5 | (na) | 15 | 36.4 | 36.4 | (na) |
| 16 | 38.8 | 38.6 | (na) | 17 | 38.2 | 38.1 | (na) |
| 18 | 41.7 | 41.7 | (na) | 19 | 40.8 | 40.7 | (na) |
| 20 | 41.0 | 40.9 | (na) | 21 | 41.6 | 41.3 | (na) |
| 22 | 37.0 | 36.9 | (na) | 23 | 36.8 | 36.7 | (na) |
| 24 | 38.2 | 38.1 | (na) | 25 | 37.1 | 37.1 | (na) |
| 26 | 41.7 | 41.7 | (na) | 27 | 40.3 | 40.4 | (na) |
| 28 | 40.9 | 40.9 | (na) | 29 | 41.2 | 41.2 | (na) |
| 30 | 36.4 | 36.4 | (na) | 31 | 36.7 | 36.7 | (na) |
-----------------------------------------------------
Totals: 1256.3 1254.4 (na) H/s
Highest: 1259.7 H/s
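
For anyone who wants to compare runs, here is a rough Python sketch that pulls the per-thread 60s column out of a report laid out like the one above, sums it, and points out the slowest thread. The parsing is based only on the layout shown here (two groups of four cells per row); the function name and the shortened sample are assumptions for illustration.

```python
def parse_hashrate_report(text):
    """Return {thread id: 60s hashrate} from a HASHRATE REPORT table."""
    rates = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Each data row holds two groups of: ID, 2.5s, 60s, 15m.
        for i in range(0, len(cells) - 3, 4):
            thread_id, _rate_2s, rate_60s, _rate_15m = cells[i:i + 4]
            if not thread_id.isdigit():
                continue  # header row
            try:
                rates[int(thread_id)] = float(rate_60s)
            except ValueError:
                pass  # "(na)" or other non-numeric cell
    return rates

# First rows of the report above.
sample = """| ID | 2.5s | 60s | 15m | ID | 2.5s | 60s | 15m |
| 0 | 38.5 | 38.5 | (na) | 1 | 38.3 | 38.2 | (na) |
| 2 | 41.6 | 41.6 | (na) | 3 | 40.2 | 40.4 | (na) |
"""

rates = parse_hashrate_report(sample)
slowest = min(rates, key=rates.get)
print(f"threads: {len(rates)}, total 60s rate: {sum(rates.values()):.1f} H/s")
print(f"slowest thread: {slowest} at {rates[slowest]} H/s")
```

Run against the full report, this makes it easier to spot the groups of threads that sit a few H/s below the rest.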

Nice improvement from 1150 H/s

I used:
"cpu_thread_num" : 32,
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 0},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 1},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 2},
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 3},

All the way to:
{ "low_power_mode" : false, "no_prefetch" : true, "affine_to_cpu" : 31}

I don't know if that's optimal.
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
RESULT REPORT
Difficulty : 100001
Good results : 16 / 16 (100.0 %)
Avg result time : 55.9 sec
Pool-side hashes : 1525014

Top 10 best results found:
| 0 | 1033126 | 1 | 581793 |
| 2 | 424778 | 3 | 303741 |
| 4 | 217583 | 5 | 212512 |
| 6 | 158809 | 7 | 137214 |
| 8 | 136236 | 9 | 130390 |

Error details:
Yay! No errors.


CONNECTION REPORT
Pool address : pool.minexmr.com:3333
Connected since : 2017-03-11 20:10:11
Pool ping time : 189 ms

Network error log:
Yay! No errors.
 

Marsh

Moderator
May 12, 2013
2,647
1,498
113
What happens if you up the thread count to 34? (35MB cache / 2MB per thread = 17 threads per CPU)
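
The cache arithmetic behind that suggestion can be written out as a short Python sketch. It relies on CryptoNight's 2 MB scratchpad per hashing thread; the function name is hypothetical, and capping the result at the number of logical CPUs per socket is an added assumption rather than something stated above.

```python
SCRATCHPAD_KIB = 2048  # CryptoNight keeps a 2 MiB scratchpad per hashing thread

def threads_per_socket(l3_kib, logical_cpus):
    """Threads whose scratchpads fit in L3, capped at the logical CPU count."""
    return min(l3_kib // SCRATCHPAD_KIB, logical_cpus)

# E5-2667 V3 ES from the lscpu output: 35840K L3, 8 cores x 2 threads per socket.
print(2 * threads_per_socket(35840, 16))      # cache allows 17, CPU count caps it at 16

# E5-2650 v3: 25 MB L3, 10 cores x 2 threads per socket.
print(2 * threads_per_socket(25 * 1024, 20))  # 12 per socket
```

The second figure lines up with the cpu_thread_num of 24 used on the dual E5-2650 v3 system earlier in the thread.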
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
Now with Wolf's GPU miner on just my dual RX480s plus xmr-stak-cpu, I'm hitting just over 2400 H/s.

I still plan to tinker with the settings on xmr-stak-cpu, maybe freeing up one or two threads instead of using all 32.

But first I'll let it run overnight.
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
What happens if you up the thread count to 34? (35MB cache / 2MB per thread = 17 threads per CPU)

It runs no faster in Wolf's CPU and GPU miners, I assume because each CPU has only 16 threads.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,520
5,828
113
I tried getting Wolf's miner and sg-miner working on Ubuntu for the RX 480. The AMD Pro driver and SDK are installed, but the build keeps failing with a compiler error: it cannot find CL/cl.h.
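
A missing CL/cl.h usually means the OpenCL headers are not on the compiler's include path. As a hedged starting point, this Python sketch checks a few directories for the header. The candidate paths are guesses, not known install locations for any particular driver or SDK version, so adjust them to your system.

```python
from pathlib import Path

# Assumed candidate locations; none of these is guaranteed to exist.
CANDIDATE_INCLUDE_DIRS = [
    "/usr/include",
    "/usr/local/include",
    "/opt/amdgpu-pro/include",
    "/opt/AMDAPPSDK-3.0/include",
]

def find_opencl_header(include_dirs):
    """Return every <dir>/CL/cl.h that exists among the given include dirs."""
    found = []
    for directory in include_dirs:
        header = Path(directory) / "CL" / "cl.h"
        if header.is_file():
            found.append(header)
    return found

hits = find_opencl_header(CANDIDATE_INCLUDE_DIRS)
if hits:
    for header in hits:
        # Passing the parent of CL/ to the compiler with -I makes <CL/cl.h> resolvable.
        print(f"found {header}; try adding -I{header.parent.parent} to the build")
else:
    print("CL/cl.h not found in the candidate directories")
```

If nothing turns up, installing the distribution's OpenCL headers package is the usual next step.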

I want to see what this Ryzen box can do!