Monero Mining Performance

Marsh

Moderator
May 12, 2013
2,644
1,496
113
@zer0sum

1 process of xmr-stak-cpu.

Please see my post on page 18 post 353

If you need more, I am happy to post my config.txt.
 

zer0sum

Well-Known Member
Mar 8, 2013
849
473
63
I think I'm still a little confused :)

So my E5-2670 v1 CPUs should be:
  • NUMA node0 CPU(s): 0-7,16-23
  • NUMA node1 CPU(s): 8-15,24-31
They have 20MB of L3 cache each, so I think I need "cpu_thread_num" : 20,
But then how do I set up the thread affinity:

"cpu_threads_conf" : [
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 0 },
THROUGH TO >
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 19 },

OR

"cpu_threads_conf" : [
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 0 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 1 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 2 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 3 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 4 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 5 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 6 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 7 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 16 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 17 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 8 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 9 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 10 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 11 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 12 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 13 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 14 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 15 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 24 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 25 },
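One way to settle the affinity question is to generate the entries from the lscpu NUMA lines rather than typing them by hand. The Python below is only a sketch under assumptions not confirmed in this thread: it applies the 2MB-of-L3-per-thread rule of thumb, takes CPUs from each node in the order lscpu lists them, and assumes the first IDs in each list are the physical cores. The helper names (parse_cpu_list, affinity_entries) are made up for illustration and are not part of xmr-stak-cpu.

```python
import json


def parse_cpu_list(spec):
    """Expand an lscpu-style list such as '0-7,16-23' into CPU IDs."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def affinity_entries(numa_nodes, l3_mb_per_socket, mb_per_thread=2):
    """Build cpu_threads_conf entries: one thread per 2MB of L3 on each node.

    Assumption: one NUMA node per socket, CPUs taken in listed order.
    """
    per_node = l3_mb_per_socket // mb_per_thread
    entries = []
    for spec in numa_nodes:
        for cpu in parse_cpu_list(spec)[:per_node]:
            entries.append({"low_power_mode": False,
                            "no_prefetch": False,
                            "affine_to_cpu": cpu})
    return entries


if __name__ == "__main__":
    # NUMA layout quoted in the post above (dual E5-2670 v1, 20MB L3 each).
    nodes = ["0-7,16-23", "8-15,24-31"]
    entries = affinity_entries(nodes, l3_mb_per_socket=20)
    print('"cpu_thread_num" : %d,' % len(entries))
    for entry in entries:
        print(json.dumps(entry) + ",")
```

For the layout in the post this yields 20 entries: CPUs 0-7, 16 and 17 on node0, then 8-15, 24 and 25 on node1, which is the second of the two layouts above. Whether that beats the plain 0-19 layout is something to measure rather than assume.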
 

apollo69

New Member
Mar 27, 2017
6
0
1
39
Hi @apollo69 I am on a ski slope right now checking the forums on a chair lift so I am going to take this as a to-do for later today.

The simple answer is that I am going to make an image specifically for testing this.

The AWS side is a bit tricky. You will likely not use every thread because what you are trying to do is occupy 2MB L3 cache chunks.

I will research and revert.
Thanks that would be great, enjoy the slopes!

The specs for this processor say there is 30MB of smart cache, so that would mean 15 threads, right?
Intel® Xeon® Processor E5-2670 v3 (30M Cache, 2.30 GHz) Product Specifications


And is there any way for now to stop and start the miner manually once it's running?
 

Marsh

Moderator
May 12, 2013
2,644
1,496
113
Config.txt should be

{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 16 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 17 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 24 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 25 },

You could try both config.txt and see the result very quickly.
 

zer0sum

Well-Known Member
Mar 8, 2013
849
473
63
So I should start at 16 and work upwards from there?

Interestingly...Nicehash miner runs 2 separate instances with 10 threads each and does not affine them at all.
It is pulling 460+ H/s on each CPU and I can't get anywhere near 920+ H/s using a single instance of xmr-stak :(
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
Config from my dual E5-2667 V3 ES, with 35MB L3 cache per CPU, so 70MB total.

I have not played with it much.

"cpu_thread_num" : 32,

"cpu_threads_conf" : [
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 0 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 1 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 2 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 3 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 4 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 5 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 6 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 7 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 8 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 9 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 10 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 11 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 12 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 13 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 14 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 15 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 16 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 17 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 18 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 19 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 20 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 21 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 22 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 23 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 24 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 25 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 26 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 27 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 28 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 29 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 30 },
{ "low_power_mode" : false, "no_prefetch" : false, "affine_to_cpu" : 31 },
],

[2017-03-27 19:26:16] : New block detected.
HASHRATE REPORT
| ID | 2.5s | 60s | 15m | ID | 2.5s | 60s | 15m |
| 0 | 37.1 | (na) | (na) | 1 | 36.8 | (na) | (na) |
| 2 | 40.8 | (na) | (na) | 3 | 35.5 | (na) | (na) |
| 4 | 40.6 | (na) | (na) | 5 | 40.0 | (na) | (na) |
| 6 | 35.6 | (na) | (na) | 7 | 35.1 | (na) | (na) |
| 8 | 37.5 | (na) | (na) | 9 | 36.0 | (na) | (na) |
| 10 | 38.5 | (na) | (na) | 11 | 38.8 | (na) | (na) |
| 12 | 39.4 | (na) | (na) | 13 | 40.2 | (na) | (na) |
| 14 | 35.4 | (na) | (na) | 15 | 35.1 | (na) | (na) |
| 16 | 37.6 | (na) | (na) | 17 | 37.1 | (na) | (na) |
| 18 | 39.8 | (na) | (na) | 19 | 38.8 | (na) | (na) |
| 20 | 36.4 | (na) | (na) | 21 | 40.0 | (na) | (na) |
| 22 | 35.7 | (na) | (na) | 23 | 35.9 | (na) | (na) |
| 24 | 37.2 | (na) | (na) | 25 | 37.4 | (na) | (na) |
| 26 | 40.9 | (na) | (na) | 27 | 39.3 | (na) | (na) |
| 28 | 40.5 | (na) | (na) | 29 | 40.3 | (na) | (na) |
| 30 | 35.3 | (na) | (na) | 31 | 32.2 | (na) | (na) |
-----------------------------------------------------
Totals: 1207.1 (na) (na) H/s
Highest: 1206.5 H/s
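When comparing configs it can help to pull the per-thread numbers out of a pasted report instead of eyeballing them. The following Python is a minimal sketch that assumes the report layout shown above (two threads per row, "(na)" for windows that are not filled yet); the function name is hypothetical and nothing here comes from xmr-stak-cpu itself.

```python
import re

# One "| id | short | 60s | 15m " group; each report row holds two of these.
ROW = re.compile(
    r"\|\s*(\d+)\s*\|\s*([\d.]+|\(na\))\s*\|\s*([\d.]+|\(na\))\s*\|\s*([\d.]+|\(na\))\s*"
)


def parse_hashrate_report(text):
    """Return {thread_id: short-window H/s} from a HASHRATE REPORT block."""
    rates = {}
    for line in text.splitlines():
        for thread_id, short, _minute, _quarter in ROW.findall(line):
            if short != "(na)":
                rates[int(thread_id)] = float(short)
    return rates


if __name__ == "__main__":
    sample = (
        "| ID | 2.5s | 60s | 15m | ID | 2.5s | 60s | 15m |\n"
        "| 0 | 37.1 | (na) | (na) | 1 | 36.8 | (na) | (na) |\n"
        "| 2 | 40.8 | (na) | (na) | 3 | 35.5 | (na) | (na) |\n"
    )
    rates = parse_hashrate_report(sample)
    slowest = min(rates, key=rates.get)
    print("threads:", len(rates), "slowest thread:", slowest)
```

Sorting the result by rate makes it easier to see whether the slower threads cluster on one socket or on hyperthread siblings.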
 

apollo69

New Member
Mar 27, 2017
6
0
1
39
Hi @Patrick - I read all 25 pages on this thread, and thought it might make sense to list the lscpu output, to help determine what is the best configuration whenever you get a chance.

I did get a 20-25% boost from 820 h/s to 1000 h/s which is nice, just by adding support for large pages with this command: sudo sysctl -w vm.nr_hugepages=128

Not sure if that is the optimal number as I haven't tested it, and not sure exactly what this does!
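As a rough explanation of what that sysctl does: it asks the kernel to reserve a pool of huge pages, and on typical x86-64 Linux each huge page is 2MB (the Hugepagesize line in /proc/meminfo shows the actual value), so 128 pages is about 256MB. Assuming each mining thread wants roughly one 2MB scratchpad, the Python sketch below estimates how many pages a given thread count needs and reads the current pool. The helper names are hypothetical and not part of any miner.

```python
import math


def hugepages_needed(threads, mb_per_thread=2, hugepage_kb=2048):
    """Pages required to back one scratchpad per thread (assumed 2MB each)."""
    total_kb = threads * mb_per_thread * 1024
    return math.ceil(total_kb / hugepage_kb)


def parse_meminfo(text):
    """Extract the huge page counters from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            info[key] = int(value.split()[0])
    return info


if __name__ == "__main__":
    # Linux only: show the current pool, then the estimate for 30 threads.
    with open("/proc/meminfo") as meminfo:
        print(parse_meminfo(meminfo.read()))
    print("pages for 30 threads:", hugepages_needed(30))
```

Under these assumptions 128 pages is more than the thread counts discussed here would use, so the gain probably comes from having huge pages at all rather than from the exact number.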

I also read in the xmr-stak-cpu GitHub config about changing the memory lock settings with ulimit, but I couldn't figure out how to do this or what the optimal setting is for this box, or for the Wolf CPU miner (screenshots below): xmr-stak-cpu/config.txt at master · fireice-uk/xmr-stak-cpu · GitHub

upload_2017-3-28_17-19-24.png

upload_2017-3-28_17-24-40.png

upload_2017-3-28_17-19-39.png
 

Marsh

Moderator
May 12, 2013
2,644
1,496
113
If you run xmr-stak-cpu as root, then ulimit is unlimited.

To allow a normal user to lock memory, if you are using Ubuntu,
add these two lines to /etc/security/limits.conf:
* soft memlock 626688
* hard memlock 626688

Log out and log in again,
then check by running the command "ulimit -l".
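The memlock values in limits.conf are given in KB, so 626688 works out to 612MB. As a quick cross-check from inside a process, the Python sketch below reads the limit with the standard resource module (Unix only). The required size passed in is an assumption you supply yourself, for example 2MB per mining thread, and the function names are illustrative.

```python
import resource


def memlock_kb_to_bytes(kb):
    """limits.conf memlock values are in KB; getrlimit reports bytes."""
    return kb * 1024


def memlock_ok(required_bytes, limit=None):
    """True if the soft RLIMIT_MEMLOCK allows locking required_bytes.

    limit may be a (soft, hard) tuple for testing; by default the limit
    of the current process is queried.
    """
    if limit is None:
        limit = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    soft, _hard = limit
    return soft == resource.RLIM_INFINITY or soft >= required_bytes


if __name__ == "__main__":
    wanted = 32 * 2 * 1024 * 1024  # assumed: 32 threads x 2MB each
    print("configured bytes:", memlock_kb_to_bytes(626688))
    print("current limit sufficient:", memlock_ok(wanted))
```

If the last line prints False after logging back in, the new limits were probably not picked up by that session.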
 

apollo69

New Member
Mar 27, 2017
6
0
1
39
Thanks, do you know what the best settings are for the Wolf CPU miner on Patrick's Docker images?
 

apollo69

New Member
Mar 27, 2017
6
0
1
39
Also, I'm still only getting 82% CPU utilization overall, so I ran mpstat. It turns out vCPUs 0-17 are only running at 70%. How do I fix this or set the affinity?

vCPUs 18-35 are running at 95%, which I assume is normal?

Not sure if this is related to the NUMA node configuration:

upload_2017-3-28_21-3-57.png


upload_2017-3-28_20-59-6.png
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
What works best is using 2MB of L3 cache per thread. Therefore, if you had full use of those two E5-2676 V3 chips with 30MB L3 cache each, you would ideally want 30 threads total even though you have 48 hardware threads.

xmr-stak-cpu does work better for tuning the last bit in terms of affinity, but I asked the author for a way to do this automatically and was rejected.
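The 2MB-per-thread rule can be written as a small calculation. This Python is a sketch of the rule of thumb only, with a hypothetical function name; it ignores anything the miner or a hypervisor may do differently.

```python
def recommended_threads(l3_mb_per_socket, sockets, logical_cpus, mb_per_thread=2):
    """Thread count from the 2MB-of-L3-per-thread rule of thumb.

    The result is capped at the number of logical CPUs, since running more
    miner threads than hardware threads is not the goal here.
    """
    by_cache = (l3_mb_per_socket * sockets) // mb_per_thread
    return min(by_cache, logical_cpus)


if __name__ == "__main__":
    # Two 30MB sockets with 48 hardware threads, as in the post above.
    print(recommended_threads(30, sockets=2, logical_cpus=48))
    # A single 30MB socket with 24 hardware threads.
    print(recommended_threads(30, sockets=1, logical_cpus=24))
```

On a shared cloud instance the L3 actually available may be less than the spec sheet figure, so treat the result as a starting point for testing.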
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
I prefer xmr-stak-cpu for two reasons: first, it compiled every time without error, whereas with Wolf's I was having to chase down compilation errors and fix them; and second, it seems to give slightly better speed.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
Well, the Docker versions never have compilation errors so that has not been an issue.

The main problem I have had to solve is figuring out how to run these on a host of different CPUs. With a handful of machines, stak is fine to tune. When every machine is different, tuning custom config files is a no-go.
 

Klee

Well-Known Member
Jun 2, 2016
1,289
396
83
I only have three machines mining now; if I had a rack of them, that definitely would be an issue.

Speaking of Wolf's, on my dual E5-2667 V3 ES machine, at least once a day Wolf's GPU miner seems to be mining but the pool shows it's not. When I looked at my stats on the pool's website a while ago, it showed:

| Hash Rate | Total Hashes Submitted | Last Share Submitted | Worker ID |
| 1.17 KH/s | 1721290348 | 3 minutes ago | 45P...iRL.worker1 | <<< (xmr-stak-cpu miner on the 2667 machine)
| 1.02 KH/s | 124552243 | less than a minute ago | 45P...iRL.worker4 |
| 650.68 H/s | 201306305 | less than a minute ago | 45P...iRL.worker3 |
| 0.00 H/s | 5050762408 | about 7 hours ago | 45P...iRL | <<< (Wolf's GPU miner on the 2667 machine)
| 0.00 H/s | 141130704 | 11 minutes ago | 45P...iRL.worker2 |


Wolf's GPU miner has not connected to the pool for the last 7 hours, even though it seems to be running fine in the terminal window.

So I just now compiled the xmr-stak-amd miner and will run it today.

Also, I changed my config on both xmr-stak-cpu and xmr-stak-amd and dedicated two CPU threads to the GPU miner, leaving two fewer for the CPU miner (30 threads now, down from 32).

After running it a few minutes, I seemed to have better GPU performance and just slightly less CPU mining performance compared to having affinity=0 on both GPUs.

[2017-03-29 06:23:27] : New block detected.
[2017-03-29 06:23:38] : Difficulty changed. Now: 166413.
[2017-03-29 06:23:38] : New block detected.
[2017-03-29 06:23:47] : Result accepted by the pool.
[2017-03-29 06:24:08] : Difficulty changed. Now: 249620.
[2017-03-29 06:24:08] : New block detected.
HASHRATE REPORT
| ID | 10s | 60s | 15m | ID | 10s | 60s | 15m |
| 0 | 573.7 | 573.7 | 572.2 | 1 | 580.3 | 580.4 | 580.2 |
-----------------------------------------------------
Totals: 1154.0 1154.1 1152.4 H/s
Highest: 1154.4 H/s
HASHRATE REPORT
| ID | 10s | 60s | 15m | ID | 10s | 60s | 15m |
| 0 | 573.3 | 573.7 | 572.3 | 1 | 580.3 | 580.4 | 580.2 |
-----------------------------------------------------
Totals: 1153.5 1154.1 1152.5 H/s
Highest: 1154.4 H/s

HASHRATE REPORT
| ID | 2.5s | 60s | 15m | ID | 2.5s | 60s | 15m |
| 0 | 41.1 | 41.7 | (na) | 1 | 40.3 | 40.5 | (na) |
| 2 | 41.2 | 41.3 | (na) | 3 | 40.7 | 41.7 | (na) |
| 4 | 36.9 | 37.1 | (na) | 5 | 37.0 | 37.0 | (na) |
| 6 | 37.3 | 37.1 | (na) | 7 | 37.0 | 36.6 | (na) |
| 8 | 40.2 | 40.1 | (na) | 9 | 39.5 | 39.2 | (na) |
| 10 | 40.4 | 40.1 | (na) | 11 | 40.5 | 39.8 | (na) |
| 12 | 35.8 | 35.8 | (na) | 13 | 36.2 | 36.1 | (na) |
| 14 | 41.1 | 41.3 | (na) | 15 | 40.5 | 40.6 | (na) |
| 16 | 41.8 | 41.8 | (na) | 17 | 40.8 | 40.7 | (na) |
| 18 | 38.8 | 40.7 | (na) | 19 | 41.4 | 41.7 | (na) |
| 20 | 36.5 | 36.9 | (na) | 21 | 36.6 | 36.7 | (na) |
| 22 | 37.6 | 36.9 | (na) | 23 | 37.6 | 37.7 | (na) |
| 24 | 40.6 | 40.7 | (na) | 25 | 39.8 | 39.7 | (na) |
| 26 | 39.9 | 39.9 | (na) | 27 | 40.5 | 40.0 | (na) |
| 28 | 36.3 | 36.3 | (na) | 29 | 35.4 | 35.5 | (na) |
-----------------------------------------------------
Totals: 1169.4 1171.1 (na) H/s
Highest: 1178.8 H/s
[2017-03-29 06:24:30] : Result accepted by the pool.
[2017-03-29 06:24:44] : New block detected.
 

poutnik

Member
Apr 3, 2013
119
14
18
When using xmr-stak-cpu, the number of bad shares shot up rather quickly in my case, to something like 25% bad shares. I had to go back to Wolf's miner, where I have 99.9% - 100% good shares. I'm connecting to MinerGate, which could also be the source of the bad shares. But the strange thing is that Wolf's miner works perfectly... Do you also have any such issue?
 

Marsh

Moderator
May 12, 2013
2,644
1,496
113
No, I was mining with MinerGate for a while with xmr-stak-cpu.
There were bad shares, but only a very small number.
I am still using xmr-stak-cpu (getting a higher hash rate than Wolf's miner) with minexmr.com

Added: since I don't have a problem with xmr-stak-cpu, I have never played with the timeout settings:
"call_timeout" : 10,
"retry_time" : 10,
"giveup_limit" : 0,

You may want to play with these settings.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
@apollo69 - the test container is running on a few different systems right now.

What pool are you using?