Monero Mining Performance

Discussion in 'Cryptocurrency Mining and Markets' started by Patrick, Nov 18, 2016.

  1. Joel

    Joel Active Member

    Joined:
    Jan 30, 2015
    Messages:
    807
    Likes Received:
    155
    6 cores, same as you (note I actually went into BIOS and disabled unused cores). I do run two separate Docker images pinned to each NUMA node, though when I was testing it didn't make much difference for me.
     
    #1281
  2. jetbird

    jetbird New Member

    Joined:
    Dec 28, 2017
    Messages:
    19
    Likes Received:
    4
    For comparison, what pool address are you using to mine? I am currently using:

    us.monero.hashvault.pro:3333

    Thanks,

    Jeff
     
    #1282
  3. Joel

    Joel Active Member

    Joined:
    Jan 30, 2015
    Messages:
    807
    Likes Received:
    155
    Nicehash. The pool won't matter at all for what your miner software reports. Actual effective hash rate is a different story of course.
     
    #1283
  4. sno.cn

    sno.cn Active Member

    Joined:
    Sep 23, 2016
    Messages:
    158
    Likes Received:
    49
    My new Ryzen 7 1700 is up and running.

    3.8 GHz @ 1.29 V, running nice and cool at 45 °C under a cheap AC Freezer 33. I haven't touched the memory yet so it's running low and slow right now.

    xmrig with hugepages and affinity 0x5555 (cores 0,2,4,6,8,10,12,14) in Windows 10 is doing 608 h/s.

    My 2 x E5-2670V2 is doing around 870 h/s on 20 threads without hugepages and no affinity set.
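
For readers unfamiliar with affinity masks, here is a minimal Python sketch of how a hex mask such as 0x5555 maps to logical CPU numbers. The function names are hypothetical helpers for illustration, not part of xmrig.

```python
from typing import Iterable, List


def cpus_from_mask(mask: int) -> List[int]:
    """Return the logical CPU indices whose bits are set in an affinity mask."""
    cpus = []
    index = 0
    while mask:
        if mask & 1:          # bit `index` set -> CPU `index` is allowed
            cpus.append(index)
        mask >>= 1
        index += 1
    return cpus


def mask_from_cpus(cpus: Iterable[int]) -> int:
    """Build an affinity mask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    print(cpus_from_mask(0x5555))                  # [0, 2, 4, 6, 8, 10, 12, 14]
    print(hex(mask_from_cpus(range(0, 16, 2))))    # 0x5555
```

On a 16-thread CPU where logical CPUs 2n and 2n+1 are SMT siblings, this mask selects one hardware thread per physical core. Whether the siblings are numbered that way depends on the OS, so it is worth checking before relying on it.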
     
    #1284
  5. ari2asem

    ari2asem Member

    Joined:
    Dec 26, 2018
    Messages:
    189
    Likes Received:
    16
    A question about a note in the OP:

    Using (MB L3 cache/ 2) for threads

    What does this mean? Suppose you have 2 CPUs, each with 32 cores/64 threads and 64 MB of L3 cache.
    Does it mean running 64 MB / 2 = 32 threads inside the mining software? I don't mean 32 CPU threads.

    Can someone explain this to me?
     
    #1285
  6. alex_stief

    alex_stief Active Member

    Joined:
    May 31, 2016
    Messages:
    404
    Likes Received:
    103
    IIRC, the default algorithm needs 2 MB of last-level cache per thread. So in your case, 32 threads per CPU.
    It's been a long time, but I think there is a different algorithm that only needs 1 MB of cache per thread. This would allow you to use all 64 hardware threads of each CPU. I could be wrong though.
    My guess is that you are talking about AMD Epyc CPUs? In order to get decent performance out of them, you need to run one worker per NUMA node. Spinning up one worker on all cores/threads simultaneously will lead to poor performance.
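
The rule of thumb above can be written as a small calculation. This is only a sketch under the assumption stated in the post (one scratchpad of 2 MB, or 1 MB for the lighter algorithm, per mining thread); the function name and parameters are made up for illustration.

```python
from typing import Optional


def max_mining_threads(l3_cache_mb: float,
                       scratchpad_mb: float = 2.0,
                       hw_threads: Optional[int] = None) -> int:
    """Number of mining threads whose scratchpads fit in L3 at once.

    l3_cache_mb   -- L3 cache available to the worker (per CPU or per node)
    scratchpad_mb -- cache needed per mining thread (assumed 2 MB by default)
    hw_threads    -- optional cap: hardware threads available to the worker
    """
    threads = int(l3_cache_mb // scratchpad_mb)
    if hw_threads is not None:
        threads = min(threads, hw_threads)
    return threads


if __name__ == "__main__":
    # One CPU with 64 MB of L3 and 64 hardware threads, as in the question.
    print(max_mining_threads(64, 2.0, 64))   # 32
    print(max_mining_threads(64, 1.0, 64))   # 64
```

The same function can be applied per NUMA node by passing that node's share of the L3 cache and its hardware thread count.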
     
    #1286
    Last edited: Oct 21, 2019
  7. ari2asem

    ari2asem Member

    Joined:
    Dec 26, 2018
    Messages:
    189
    Likes Received:
    16
    Thanks for your answer.

    I am running xmrig on a 2-socket EPYC 7551 system (2 × 32 cores, 2 × 64 CPU threads).

    Does this mean that I have to run 4 instances of xmrig (on Windows 10, 64-bit) to fully load all my CPUs with mining?
     
    #1287
  8. alex_stief

    alex_stief Active Member

    Joined:
    May 31, 2016
    Messages:
    404
    Likes Received:
    103
    Each of your CPUs has 4 NUMA nodes, at least if you did not fiddle with the memory interleaving options in the BIOS. In Linux you can check the NUMA topology with lscpu and/or lstopo. In Windows a quick look at the Task Manager should suffice: right-click on the CPU utilization graph and change the view to NUMA.
    That makes 8 workers in total. No idea how to get the workers pinned to the correct cores in Windows. I only did this in Linux.
    Or if you really want to squeeze the last percent of performance out of it: The L3 cache on each NUMA node is segmented again into 2 chunks. Running one worker for each chunk of L3 (consisting of 4 cores in your case) would be ideal. But then again, you would probably want to run Linux if the last bit of performance was important ;)
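
On Linux the same NUMA information that lscpu prints can also be read from sysfs. The following is a rough Python sketch that lists which logical CPUs belong to each node; it assumes the standard /sys/devices/system/node/nodeN/cpulist files are present, and the helper names are hypothetical.

```python
import glob
import os
import re
from typing import Dict, List


def parse_cpulist(text: str) -> List[int]:
    """Expand a kernel cpulist string such as '0-7,64-71' into CPU numbers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:                      # a node without CPUs has an empty list
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_topology(sysfs_root: str = "/sys/devices/system/node") -> Dict[int, List[int]]:
    """Map NUMA node id -> logical CPUs; returns {} if sysfs is not available."""
    topology: Dict[int, List[int]] = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        node_id = int(re.sub(r"\D", "", os.path.basename(path)))
        with open(os.path.join(path, "cpulist")) as handle:
            topology[node_id] = parse_cpulist(handle.read())
    return topology


if __name__ == "__main__":
    for node, cpus in sorted(numa_topology().items()):
        print(f"node {node}: {cpus}")
```

The per-node CPU lists are what you would hand to a pinning tool, one worker per node.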
     
    #1288
  9. ari2asem

    ari2asem Member

    Joined:
    Dec 26, 2018
    Messages:
    189
    Likes Received:
    16
    upload_2019-10-22_18-21-42.png

    I have a total of 8 NUMA nodes. I use Bitsum Process Lasso to set CPU affinity when I am running BOINC/WCG and F@H, but with xmrig I can't get Process Lasso to set CPU affinity for multiple instances of xmrig.exe.

    And another question: with the L3 and its 2 chunks, do you mean the 16-way L3 as shown in CPU-Z?

    If I switch to Linux, can you help with binding miner workers to NUMA nodes? And which Linux distro? The last time I used Linux was in 2000, 19 years ago :)
     
    #1289
  10. jims2321

    jims2321 Active Member

    Joined:
    Jul 7, 2013
    Messages:
    180
    Likes Received:
    42

    You're running cn/r, so you might need to install numactl first. Then run something like this:

    seq 0 1 | xargs -P 0 -I node numactl -N node '/.../bin/randomx-benchmark' --mine --largePages --jit --nonces 100000 --init 8 --threads 8

    Substitute the appropriate app for the randomx-benchmark app. The critical part is seq 0 1 | xargs -P 0 -I node numactl -N node, which starts one process per NUMA node in parallel (here nodes 0 and 1; extend the seq range to cover all of your nodes).

    Here is a link to how to do it.

    Managing Process Affinity in Linux
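
As an alternative to the seq/xargs one-liner, the same idea (one miner process per NUMA node, each wrapped in numactl) could be scripted in Python. This is a sketch only: the miner command line is a placeholder, and --membind is an addition that is not in the one-liner above; it asks numactl to also allocate memory on the same node.

```python
import subprocess
from typing import List, Sequence


def numactl_command(node: int, miner_argv: Sequence[str]) -> List[str]:
    """Wrap a miner command so it runs on the CPUs and memory of one NUMA node."""
    return [
        "numactl",
        f"--cpunodebind={node}",   # long form of -N used in the one-liner
        f"--membind={node}",       # assumption: also keep allocations node-local
        *miner_argv,
    ]


def launch_workers(nodes: Sequence[int], miner_argv: Sequence[str]) -> List[subprocess.Popen]:
    """Start one worker per node without waiting for any of them to exit."""
    return [subprocess.Popen(numactl_command(node, miner_argv)) for node in nodes]


if __name__ == "__main__":
    # Placeholder miner command; substitute your own binary and options.
    example_miner = ["./xmrig", "-t", "8"]
    for node in range(8):
        print(" ".join(numactl_command(node, example_miner)))
```

Printing the commands first, as the example does, is a cheap way to check the node list before calling launch_workers for real.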
     
    #1290
    Last edited: Oct 22, 2019
  11. ari2asem

    ari2asem Member

    Joined:
    Dec 26, 2018
    Messages:
    189
    Likes Received:
    16
    Any clue how to do it in Windows 10?
     
    #1291
  12. alex_stief

    alex_stief Active Member

    Joined:
    May 31, 2016
    Messages:
    404
    Likes Received:
    103
    No, each die in the Zen microarchitecture consists of 2 compute complexes, each with its own L3 Cache. See e.g. Sizing Up Servers: Intel's Skylake-SP Xeon versus AMD's EPYC 7000 - The Server CPU Battle of the Decade?

    Pinning workers to a certain range of cores is easy in Linux. I had several "worker" scripts that started an instance of xmrig with taskset. E.g.
    taskset -c 0-3,32-34 xmrig -t 7 ...
    This started xmrig with 7 threads, pinned to hardware threads 0,1,2,3,32,33,34. These threads belong to the first die on the first Epyc 7301 CPU in a dual-socket system, SMT enabled.

    And then a "supervisor" script that starts all workers at the same time. It is a bit of work to set it up, but the performance and efficiency increase is worth it.
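
A rough Python sketch of what such worker commands could look like is shown below. It assumes the Linux CPU numbering implied by the taskset example (physical cores numbered first, with the SMT sibling of core c at c + total cores); that layout is not guaranteed, so check it with lscpu or lstopo first. All function names are hypothetical.

```python
from typing import Iterable, List


def die_cpus(die: int, cores_per_die: int, total_cores: int, n_threads: int) -> List[int]:
    """Logical CPUs for one die: physical cores first, then their SMT siblings."""
    first = die * cores_per_die
    physical = list(range(first, first + cores_per_die))
    siblings = [core + total_cores for core in physical]   # assumed numbering
    return (physical + siblings)[:n_threads]


def format_cpulist(cpus: Iterable[int]) -> str:
    """Compress CPU numbers into taskset -c syntax, e.g. '0-3,32-34'."""
    ranges: List[List[int]] = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu            # extend the current run
        else:
            ranges.append([cpu, cpu])      # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def worker_command(die: int, cores_per_die: int, total_cores: int, n_threads: int) -> List[str]:
    """Build 'taskset -c <cpus> xmrig -t <n>' for one die."""
    cpus = die_cpus(die, cores_per_die, total_cores, n_threads)
    return ["taskset", "-c", format_cpulist(cpus), "xmrig", "-t", str(n_threads)]


if __name__ == "__main__":
    # Dual Epyc 7301: 4 cores per die, 32 physical cores in total.
    print(" ".join(worker_command(0, 4, 32, 7)))   # taskset -c 0-3,32-34 xmrig -t 7
```

A supervisor would then loop over all dies and start each command, for example with subprocess.Popen.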
     
    #1292
  13. jims2321

    jims2321 Active Member

    Joined:
    Jul 7, 2013
    Messages:
    180
    Likes Received:
    42
    #1293
  14. Klee

    Klee Well-Known Member

    Joined:
    Jun 2, 2016
    Messages:
    1,211
    Likes Received:
    361

    I only use Windows 10 if I mine with GPUs, and only if it is a dedicated GPU miner.

    I played around a bit with my Open Compute servers with Hyper-Threading disabled, and yes, it does mine with a slightly higher hash rate. I don't remember the details, but it was not a huge difference.

    But since it's a pain to shut down and add a video card and keyboard just to go into the BIOS to turn it off or back on on the OC servers, and Hyper-Threading is so useful for other things, I just leave it enabled.
     
    #1294
  15. ari2asem

    ari2asem Member

    Joined:
    Dec 26, 2018
    Messages:
    189
    Likes Received:
    16
    My system is 2 × EPYC 7551. This means 2 × 32 cores, or a total of 64 cores, which are also seen (64 real hardware cores) in Process Lasso (from Bitsum).

    I can set CPU affinity inside Process Lasso for XMRIG-01.EXE to 0-31; hash speed is around 3000 h/s.

    Running a second instance of xmrig (XMRIG-02.EXE) with CPU affinity 32-63 gives a hash speed of around 450 h/s, while the hashrate of XMRIG-01 drops to around 2500-2600 h/s.

    When I run xmrig, it says 64 threads available.

    Logs of xmrig:

    Code:
     * ABOUT        XMRig/4.3.1-beta MSVC/2017
     * LIBS         libuv/1.31.0 OpenSSL/1.1.1c hwloc/2.0.4
     * HUGE PAGES   permission granted
     * CPU          AMD EPYC 7551 32-Core Processor (2) x64 AES
                    L2:32.0 MB L3:128.0 MB 64C/128T NUMA:8
     * DONATE       1%
     * ASSEMBLY     auto:ryzen
     * POOL #1      pool.supportxmr.com:7777 coin monero
     * COMMANDS     hashrate, pause, resume
     * OPENCL       disabled
    [2019-10-23 00:04:57.787] use pool pool.supportxmr.com:7777  94.23.247.226
    [2019-10-23 00:04:57.788] new job from pool.supportxmr.com:7777 diff 40000 algo cn/r height 1950534
    [2019-10-23 00:04:57.789]  cpu  use profile  cn  (64 threads) scratchpad 2048 KB
    [2019-10-23 00:04:58.443]  cpu  READY threads 64/64 (64) huge pages 100% 64/64 memory 131072 KB (654 ms)
    My total L3 cache is 128 MB (for dual-socket EPYC 7551). This means I can run xmrig on a total of 64 cores/threads, because of 2 MB of L3 cache per thread.

    So this is a hardware limitation, am I correct? Having 64 cores, mining on 32 cores (or 64 threads): half of my total cores.

    Or can I fully load all my cores (or threads) under Linux with mining? Or is it also limited under Linux by the L3 cache size per mining thread?
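
The numbers in the log above can be cross-checked with a little arithmetic. This is a sketch using only values that appear in the log (128 MB L3, 2048 KB scratchpad, 64C/128T); the function and key names are made up for illustration.

```python
from typing import Dict


def cache_budget(l3_total_mb: float, scratchpad_kb: int,
                 cores: int, hw_threads: int) -> Dict[str, float]:
    """Relate the cache-limited mining thread count to cores and hardware threads."""
    mining_threads = int(l3_total_mb * 1024 // scratchpad_kb)
    return {
        "mining_threads": mining_threads,
        "scratchpad_total_kb": mining_threads * scratchpad_kb,
        "threads_per_core": mining_threads / cores,
        "hw_threads_used_fraction": mining_threads / hw_threads,
    }


if __name__ == "__main__":
    # Values from the xmrig banner: L3:128.0 MB, 64C/128T, scratchpad 2048 KB.
    for key, value in cache_budget(128.0, 2048, 64, 128).items():
        print(f"{key}: {value}")
```

With these inputs the result is 64 mining threads and 131072 KB of scratchpad memory, which matches the "threads 64/64 ... memory 131072 KB" line. That works out to one mining thread per physical core, or half of the 128 hardware threads when SMT is enabled.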
     
    #1295
    Last edited: Oct 23, 2019
  16. alex_stief

    alex_stief Active Member

    Joined:
    May 31, 2016
    Messages:
    404
    Likes Received:
    103
    Things don't quite add up here.
    First things first: do you have SMT enabled?
    Running a second instance of xmrig on the second CPU should not affect the hash rate of the first CPU. And you should see about the same hash rate on each of your CPUs. Provided pinning worked correctly.
    And you need to tell each instance of xmrig how many threads it should use.
     
    #1296
  17. ari2asem

    ari2asem Member

    Joined:
    Dec 26, 2018
    Messages:
    189
    Likes Received:
    16
    SMT is enabled. No other changes in the BIOS regarding memory and CPU; everything in the BIOS is at default.

    upload_2019-10-23_18-0-31.png

    Each green bar in the upper-right corner is around 100% core load. Under RULE you can see I assigned different cores to different EXE files from different locations.

    xmrig.exe is version 4.4.0 beta
    xmrig-01.exe is version 4.3.1 beta

    upload_2019-10-23_18-2-8.png

    ......................................................................

    The screenshots below are after changing CPU affinity to
    xmrig.exe ==> 0-31 (xmrig version 4.4.0)
    xmrig-01.exe ==> 32-63 (xmrig version 4.3.1)
    Hashrate is around 3000 hashes/second for the 2 instances together.
    upload_2019-10-23_18-11-50.png

    upload_2019-10-23_18-12-20.png


    Still having a max hashrate of around 3000 hashes/second for both instances together.

    I think my limiting factor here is L3 cache size. I expect the same under Linux.

    @alex_stief, can you post screenshots of your hash rates with that many miners running, along with CPU usage?
     

    #1297
  18. alex_stief

    alex_stief Active Member

    Joined:
    May 31, 2016
    Messages:
    404
    Likes Received:
    103
    Sorry, it has been more than a year since I mined my last coins. Not sure if I will get back to it any time soon.
    What you could do is disable SMT. You won't use more than one thread per core anyway, and this makes finding the correct threads to pin easier.
    Isn't there a tool similar to lstopo in Windows?
     
    #1298
    Last edited: Oct 23, 2019