Monero Mining Performance


cafcwest

Member
Feb 15, 2013
136
14
18
Richmond, VA
It does make me ponder taking the spare C6100 I have in the FS and running it. Payback period 10 months if it is just eating extra power.
What value are you trying to 'payback'? The asking price of the server? I considered picking it up from you before you opened it up and discovered how much RAM it had and priced it more appropriately.
 

Kal G

Active Member
Oct 29, 2014
160
44
28
44
Just for comparison, I fired Wolf's cpuminer up on a D-1541 (using nproc - 1). Resulted in: 179 H/s.

* Updated to fix issue with thread count.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
This is not hurting my desire to explore and benchmark more systems:
[Image attachment: upload_2017-1-4_8-24-46.png]

@cafcwest that was my thinking.
 

Patrick

Added 3 more CPUs today including the E3-1515M V5

I was not able to get an OpenCL miner running but did not try too hard.
 

cafcwest


A little dip today, but nothing too serious.

Because the gap between CPU and GPU isn't as significant with Monero, I did some budget server gear number crunching this evening searching for cheap core counts. I see a path to putting together a 4P Opteron 6276 build in the $300-$400 range. At current value, that's a 3-4 month ROI. Hmmmm...
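The payback estimates in this thread come down to hardware cost divided by net monthly earnings. A minimal sketch of that arithmetic follows; the function name `payback_months` and the monthly revenue and power figures in the example are hypothetical placeholders, not numbers from the thread:

```python
def payback_months(hardware_cost_usd, monthly_revenue_usd, monthly_power_cost_usd=0.0):
    """Estimate how many months of mining it takes to recover the hardware cost.

    All inputs are assumptions supplied by the caller; coin price and
    network difficulty change daily, so treat the result as a rough guide.
    """
    net_per_month = monthly_revenue_usd - monthly_power_cost_usd
    if net_per_month <= 0:
        # Power costs eat all the revenue: the build never pays for itself
        return float("inf")
    return hardware_cost_usd / net_per_month


# Hypothetical example: a $350 build earning an assumed $100/month net of power
print(payback_months(350.0, 100.0))
```

With those assumed inputs the estimate lands inside the 3-4 month range mentioned above, but a price dip like today's stretches it quickly.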
 

Patrick

XMR to USD is down to $12.50. Such is the fun with cryptocurrencies!

Added dual E5-2699 V4, dual E5-2670 V1 and dual E5-2620 V1
 

fractal

Active Member
Jun 7, 2016
309
69
28
33
BTW @fractal any tips on getting these to work on ARM? I may try on some larger CPUs.
No real tricks. Wolf has a version for ARMv8 processors with the AES crypto extensions that is faster than the version without, but the stock version runs just fine on ARMv7/ARMv8 chips. See "v0.10 beta for ARMv8-A (ARM64)" on /r/Monero for more details on the AES version. I just downloaded the code on my Pine64/Ubuntu system and built it. It took a few iterations to install all the dependencies, but that is to be expected when building open source projects from source.

It does work your processor pretty hard, so make sure your power and cooling are up to snuff. This shouldn't be an issue with server-class hardware, but many of the cheap SBCs lack adequate cooling and you may need to be careful how you power them. My Pine64 hit its thermal limit almost immediately after starting the client and dropped to half speed to keep the core below 100°C. A cheap stick-on heat sink on the processor let it run at 3/4 speed.
 

Patrick

Yea, there is a big difference between dev board CPUs and server class CPUs in terms of expectations on cooling.
 

Patrick

Well, I got 16.4 KH/s tonight. Not quite sure if I could hit 20 KH/s at this point, but still not a bad figure.

The good/bad news is that I found that different CPUs work better with different thread counts relative to their core counts. That makes things more complex from a docker perspective.
 

Patrick

So here is the big update for today: nproc-1 works well on lower core count Xeon D, Xeon E3 and Xeon Phi x200 chips. On some of the dual socket systems, setting the number of threads equal to the number of physical cores seems to work best. This is a good illustration of why testing on more than 3 different types of platforms is a good idea :)

I now have a second docker image doing nproc/2 for the number of threads. The results on the higher-end machines are dramatic (figures are nproc-1 -> nproc/2):
4x Intel Xeon E7-8890 V4 = 2280H/s - no longer with us :-(
2x Intel Xeon E5-2699 V4 = 1220H/s -> 1723H/s
1x Intel Xeon Phi 7210 = 1117H/s -> 602H/s (case to use nproc-1)
2x Intel Xeon E5-2698 V4 = 1010H/s -> 1572H/s
2x Intel Xeon E5-2690 V3 = 840H/s -> 1100H/s
2x Intel Xeon E5-2683 V3 = 826H/s -> 969H/s
2x Intel Xeon E5-2670 V3 = 793H/s -> 989H/s
2x Intel Xeon E5-2650L V3 = 720H/s -> 809H/s
2x Intel Xeon E5-2620 V4 = 620H/s -> 824H/s
2x Intel Xeon E5-2670 V1 = 590H/s -> 785H/s
2x Intel Xeon E5-2620 V1 = 363H/s -> 462H/s
2x Intel Xeon X5675 = 340H/s (Fractal)
1x Intel Xeon D-1587 = 219H/s -> 318H/s
1x Intel Xeon E3-1515M V5 = 185H/s -> 175H/s (case to use nproc-1)
1x Intel Xeon D-1541 = 178H/s -> 178H/s (case to use either nproc/2 or nproc-1)
1x Intel Xeon D-1540 = 157H/s -> 157H/s (case to use either nproc/2 or nproc-1)
2x Intel Xeon E5620 = 150H/s (cafcwest)
1x Intel Xeon E3-1245 V5 = 140H/s -> 138H/s (case to use nproc-1)
1x Pentium D1508 = 50H/s -> 47H/s (case to use nproc-1)
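To make the list easier to read, here is a rough sketch that turns a few of the pairs above into percentage changes and picks the better setting. The helper names `percent_change` and `better_setting` are made up for illustration; only the hash rates come from the list:

```python
def percent_change(before_hs, after_hs):
    """Relative change in hash rate when moving from nproc-1 to nproc/2, in percent."""
    return (after_hs - before_hs) / before_hs * 100.0


def better_setting(nproc_minus_1_hs, nproc_half_hs):
    """Name the thread-count setting that produced the higher hash rate."""
    if nproc_half_hs > nproc_minus_1_hs:
        return "nproc/2"
    if nproc_half_hs < nproc_minus_1_hs:
        return "nproc-1"
    return "either"


# (nproc-1 H/s, nproc/2 H/s) pairs copied from the list above
results = {
    "2x Xeon E5-2699 V4": (1220, 1723),
    "1x Xeon Phi 7210": (1117, 602),
    "1x Xeon D-1541": (178, 178),
}

for system, (before, after) in results.items():
    change = percent_change(before, after)
    print(f"{system}: {change:+.1f}% -> use {better_setting(before, after)}")
```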

I also tried nproc/2 - 1 (i.e. physical cores - 1 on an HT system). In some cases it did better and in some it did worse, but the margin was very small either way. On 2-core machines, nproc/2 - 1 ravages performance.
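For anyone scripting this, a minimal sketch of the three heuristics tried so far. The function name `candidate_thread_counts` is hypothetical, and treating nproc/2 as the physical core count assumes Hyper-Threading is enabled:

```python
import os


def candidate_thread_counts(logical_cpus=None):
    """Return the thread-count heuristics from this thread, keyed by name.

    logical_cpus defaults to what the OS reports (the same idea as `nproc`).
    """
    n = logical_cpus if logical_cpus is not None else (os.cpu_count() or 1)
    candidates = {
        "nproc-1": n - 1,
        "nproc/2": n // 2,        # physical cores, assuming 2 threads per core
        "nproc/2-1": n // 2 - 1,
    }
    # Clamp so that small machines never end up with zero threads
    return {name: max(1, count) for name, count in candidates.items()}


# Example: a 16-thread part such as the Xeon D-1541
print(candidate_thread_counts(16))
```

On a 2-core/4-thread machine the nproc/2-1 entry comes out as a single thread, which fits the performance drop noted above. Benchmarking each candidate per platform still seems to be the only reliable way to choose.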

When I mentioned adding over 3KH/s last evening, that is what I did. Also added E5-2683 V3 to this list with both figures.
 

Kal G

Updated hash rate for D-1541 (using 6 threads): 249 H/s

Try setting the number of threads to the CPU cache size / 2.
 

Patrick

Hmmm... L3 cache size? How many threads are you using?
 

Patrick

@Kal G and the plot thickens, worse performance on the E5-2699 V4 using L3/2 than CPU/2.

That change (to 12MB L3 / 2 = 6 threads) got me to 242 H/s on the D-1541 and 388 H/s on the D-1587.
 

Kal G

@Patrick, it's definitely not a perfect formula, but the CryptoNight algorithm requires 2MB of memory per thread to calculate hashes. Keeping it all in L3 cache or below definitely sped it up on the platforms I've tested.

For example:

Intel Xeon E3-1240 v1 went from 129 H/s to 231 H/s (going from 7 to 4 threads, respectively). L3 cache size is 8MB.
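As a rough sketch of that rule of thumb (the function name `threads_for_cache` is made up here, and the L3 size has to be supplied by hand, e.g. from the CPU spec sheet):

```python
CRYPTONIGHT_SCRATCHPAD_MB = 2  # per-thread memory figure quoted above


def threads_for_cache(l3_cache_mb, logical_cpus):
    """Pick a thread count so every thread's 2MB working set fits in L3.

    The result is capped at the number of logical CPUs and never drops
    below one thread.
    """
    by_cache = int(l3_cache_mb) // CRYPTONIGHT_SCRATCHPAD_MB
    return max(1, min(by_cache, logical_cpus))


# Xeon E3-1240 v1: 8MB L3, 8 logical CPUs
print(threads_for_cache(8, 8))
# Xeon D-1541: 12MB L3, 16 logical CPUs
print(threads_for_cache(12, 16))
```

This reproduces the 4 threads on the E3-1240 v1 and the 6 threads on the D-1541 mentioned in the thread. As noted, it is not a perfect formula, so it is best treated as a starting point for benchmarking.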
 

Patrick

@Kal G very interesting. What pool are you using? I tried the Minergate (here is my affiliate) pool + their Ubuntu CLI miner, but that one seems slower than Wolf's + moneropool even using the same number of threads. The advantage is that it is easier to mine other coins and they have a way better dashboard.

Are you doing Docker or bare metal?
 

Kal G

@Patrick, bare metal on the D-1541s, under a VM on ESXi 6.0U2 on the E3-1240 v1.

I'm using MoneroHash for a pool and Wolf's cpuminer compiled from the Git repo.