System absurdly underperforming on Madmax plotter (chia)

boomheadshot

Member
Mar 20, 2021
64
4
8
Can anybody help me out?

I've been trying to use the Mad max plotter on my 64 core Rome ES, but it's really slow.

2 x 300 gb 10k SAS drives = 7.5 hour plots with 128 threads (used to be ~15 hours on the GUI/powershell but on each drive, so no improvement really)

Sloth Tech TV churned out a plot with 2 x 300 gb 10k SAS drives in 73 mins on DDR3 memory, I'm nowhere near that.

I even tried making a RAMDisk with ImDisk for tmp2, with an NVMe SSD for tmp1 (Corsair Force MP600 1TB; yes, it's not great, but it's not supposed to be slowing me down this much). I have 128 GB total, and when I made a 110 GiB RAMDisk, the plots crashed. I tried giving a few more gigs to the RAMDisk and setting fewer threads, but it was still TOO slow.

The first thing that I immediately notice is how long it takes to create Table 1. A Threadripper with a RAMDisk took 6 seconds to make Table 1. When I tried it with the RAMDisk overnight, it was like 140 seconds, but it crashed. On SAS drives it takes ~300 seconds; on HDDs it takes 400+ seconds.


Here is my current attempt, on 2 x 300gb 10k SAS drives, but it's still way slower than it should be, and I'm running out of ideas:
View attachment 19068


What I have already tried:
  1. Different versions of the windows Chia plotter
  2. Different OS's (right now on Windows 2019 Server, tried Win 10 Pro, and Ubuntu.)
    On Ubuntu I couldn't get the overclock tool to work, and at default clocks it was still really slow on phase 1 (same times as Windows)
  3. 256 buckets seems to be faster than 128, but not a drastic difference
I've tried the windows Madmax plotter on my i9 7900X, I just tested Table 1 on a SATA ssd and it was 79 seconds with 20 threads.

Another interesting thing that I noticed is that when I launch the plotter with just 1 thread on the 2S CPU, the table gets made in ~130 seconds, but when I add more threads, it actually gets slower. Btw, Task Manager shows 2 sockets instead of 1; on Win 10 Pro it showed one, as it should, but the times were the same. So when I set 128 threads, I see only half of them getting used. But it's still way slower than it should be.

So there really must be a problem with my EPYC platform.
My setup:

ASUS KRPA-U16 version 0302
2s1404e2vjug5 64-core CPU
M393A2K43DB3-CWE x 8 sticks
Windows Server 2019 version 1809 build 17763
HP Smart array p410 1gb for the SAS drives, so yeah it might be shit BUT what's up with the RAMDisk tmp2 + NVMe temp 1? It's still not supposed to be that slow.
The BIOS is pretty much stock.

My only wild guess is that maybe there's a bug with 8 channels or something.

$20 PayPal to the first who figures out the culprit.

Thanks in advance.
 

iotapi322

Member
Sep 8, 2017
66
14
8
45
I run exclusively on Ubuntu with a dual E5-2640 v4 and 192 GB of RAM; I have another system with an i9-9900K running with an NVMe.
From reading all the posts, I don't think this stuff runs very well on Windows. It was really designed for Linux.
256 buckets uses less RAM per thread but creates more memory I/O, so you can run in less memory at the cost of more CPU.

Your screen capture looks like there is something super wrong with your 2x300gb sas drives. How do you have them striped? With windows or hardware raid? If I were you I would do this:
run ubuntu
setup disk 1 to just be your ssd.
setup disk 2 to be your ram disk
run with the defaults for buckets which is 256
and set the number of threads to just 16, that way with your huge ram disk you have enough memory left over to run your threads. Let's see where you are at after that.
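
A minimal sketch of the command line those suggestions translate to, using only the flags that appear in the commands posted in this thread (-n, -r, -u, -t, -2, -d, -p, -f). The binary path, directories, and key placeholders are hypothetical, and `build_plot_command` is just an illustrative helper, not part of the plotter.

```python
import shlex

def build_plot_command(binary, threads, buckets, tmp1, tmp2, final_dir,
                       pool_key, farmer_key, count=1):
    """Assemble a chia_plot argument list from the flags seen in this thread."""
    return [
        binary,
        "-n", str(count),    # number of plots
        "-r", str(threads),  # number of threads
        "-u", str(buckets),  # number of buckets
        "-t", tmp1,          # working directory (SSD)
        "-2", tmp2,          # working directory 2 (RAM disk)
        "-d", final_dir,     # final directory
        "-p", pool_key,
        "-f", farmer_key,
    ]

# Hypothetical paths and placeholder keys, for illustration only.
cmd = build_plot_command("./build/chia_plot", 16, 256,
                         "/mnt/ssd/", "/mnt/ramdisk/", "/mnt/farm/",
                         "<pool key>", "<farmer key>")
print(shlex.join(cmd))
```

Printing the command first makes it easy to double-check which directory ends up as tmp1 and which as tmp2 before committing to a long run.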

My setup is two 10k rpms in a raid 0 and ram disk.



Bash:
Multi-threaded pipelined Chia k32 plotter - 8903136
Final Directory: /mnt/farm/
Number of Plots: 20
Crafting plot 1 out of 20
Process ID: 1828247
Number of Threads: 40
Number of Buckets: 2^8 (256)
Pool Public Key:   ab46dfde7939950e47ce288fa713f4e129201b4873046920796ed64f6ffdcce1c17f43e02a0d2a2013d172def96ae21a
Farmer Public Key: b65eed2d4d9e6914cb44639cda5ca3b2538985f7c80429fc55b077475a2aa574f81dfa1b92ae9d0d46c0188d72d1096a
Working Directory:   /chia-tmp/
Working Directory 2: /tmp/ramdisk/
Plot Name: plot-k32-2021-06-16-07-11-76db43d7d4d1288754c1be41623f09d76d96e59a59715c959425acfc6de169e5
[P1] Table 1 took 11.1062 sec
[P1] Table 2 took 110.523 sec, found 4295052787 matches
[P1] Table 3 took 123.456 sec, found 4295139572 matches
[P1] Table 4 took 149.252 sec, found 4295123836 matches
[P1] Table 5 took 145.532 sec, found 4295216237 matches
[P1] Table 6 took 141.122 sec, found 4295197070 matches
[P1] Table 7 took 111.7 sec, found 4295141838 matches
Phase 1 took 793.689 sec
[P2] max_table_size = 4295216237
[P2] Table 7 scan took 7.61618 sec
[P2] Table 7 rewrite took 40.7403 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 27.007 sec
[P2] Table 6 rewrite took 136.613 sec, dropped 581442769 entries (13.537 %)
[P2] Table 5 scan took 27.2503 sec
[P2] Table 5 rewrite took 124.904 sec, dropped 762199480 entries (17.7453 %)
[P2] Table 4 scan took 77.371 sec
[P2] Table 4 rewrite took 179.682 sec, dropped 828987343 entries (19.3007 %)
[P2] Table 3 scan took 87.3058 sec
[P2] Table 3 rewrite took 110.638 sec, dropped 855266638 entries (19.9124 %)
[P2] Table 2 scan took 80.4731 sec
[P2] Table 2 rewrite took 120.328 sec, dropped 865654723 entries (20.1547 %)
Phase 2 took 1029.82 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 61.9299 sec, wrote 3429398064 right entries
[P3-2] Table 2 took 43.6586 sec, wrote 3429398064 left entries, 3429398064 final
[P3-1] Table 3 took 45.3018 sec, wrote 3439872934 right entries
[P3-2] Table 3 took 44.1284 sec, wrote 3439872934 left entries, 3439872934 final
[P3-1] Table 4 took 193.582 sec, wrote 3466136493 right entries
[P3-2] Table 4 took 43.3227 sec, wrote 3466136493 left entries, 3466136493 final
[P3-1] Table 5 took 225.865 sec, wrote 3533016757 right entries
[P3-2] Table 5 took 43.3526 sec, wrote 3533016757 left entries, 3533016757 final
[P3-1] Table 6 took 234.513 sec, wrote 3713754301 right entries
[P3-2] Table 6 took 46.1179 sec, wrote 3713754301 left entries, 3713754301 final
[P3-1] Table 7 took 56.4786 sec, wrote 4295141838 right entries
[P3-2] Table 7 took 54.0049 sec, wrote 4294967296 left entries, 4294967296 final
Phase 3 took 1096.15 sec, wrote 21877145845 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 114.646 sec, final plot size is 108835493589 bytes
Total plot creation time was 3034.39 sec (50.5732 min)
 

GStorie

New Member
Jun 17, 2021
1
0
1
From the Madmax plotter github

RAM usage depends on <threads> and <buckets>. With the new default of 256 buckets it's about 0.5 GB per thread at most.

128 Threads may be using up all your ram. Like 1st reply said, lower your threads and see if you get a better plot time.
I have
2-2680V2
128GB Ram
112GB Ram Drive and 500GB Sata SSD
10 Threads takes 1 hour
40 Threads takes 40 Minutes
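
A rough back-of-the-envelope check of that, as a sketch: it takes the README's figure of about 0.5 GB per thread as a worst case, treats the 128 GB of RAM and the 110 GiB RAM disk from the first post as the same unit, and ignores whatever the OS itself needs. `ram_headroom_gib` is a made-up helper name.

```python
def ram_headroom_gib(total_gib, ramdisk_gib, threads, gib_per_thread=0.5):
    """RAM left over after the RAM disk and the plotter's worst-case thread buffers."""
    plotter_gib = threads * gib_per_thread
    return total_gib - ramdisk_gib - plotter_gib

# 128 GB system with a 110 GiB RAM disk, at the two thread counts discussed here.
for threads in (128, 16):
    print(f"{threads} threads: {ram_headroom_gib(128, 110, threads):+.1f} GiB headroom")
```

A negative result means the RAM disk plus the plotter's buffers would not fit in memory at that thread count, before the OS is even counted.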
 

boomheadshot

Member
Mar 20, 2021
64
4
8

Okay, so I'm trying to get it to work on Ubuntu again


artem@artem-KRPA-U16-Series:~/Desktop/chia-plotter$ ./build/chia_plot -n 1 -r 128 -u 256 -t /mnt/sas10k1/ -2 /mnt/sas10k2/ -p 91e9a8b4b85a95084d99279382b03ea4e2304b3eaf4b7abfab8fb7cfc494f69e3a8e665b743af1550cfeb650b5cac7cc -f 82fa17feede93d58ca05c3e101eb66d8eb8bdc01f31b5bf46f3ac1548e6b699ed0c9eaed714a410b16c37dec1142f304
Multi-threaded pipelined Chia k32 plotter - 5ff6462
Final Directory: /mnt/sas10k1/
Number of Plots: 1
Crafting plot 1 out of 1
Process ID: 5630
Number of Threads: 128
Number of Buckets: 2^8 (256)
Pool Public Key: 91e9a8b4b85a95084d99279382b03ea4e2304b3eaf4b7abfab8fb7cfc494f69e3a8e665b743af1550cfeb650b5cac7cc
Farmer Public Key: 82fa17feede93d58ca05c3e101eb66d8eb8bdc01f31b5bf46f3ac1548e6b699ed0c9eaed714a410b16c37dec1142f304
Working Directory: /mnt/sas10k1/
Working Directory 2: /mnt/sas10k2/
Plot Name: plot-k32-2021-06-17-21-28-c30c39b28ebafb442009de2691db434b241bbc503c66f034382d2a1d45a85ee9

And.... nothing. The terminal window doesn't even give me a Table 1 after like 10 minutes, although in htop it seems to be working. But the threads are underutilized and so is the ram

 

iotapi322

Member
Sep 8, 2017
66
14
8
45
You're swinging for the fences every time.

change your threads to 16

are your hard drives in a raid array? What does your /etc/fstab look like?
 

boomheadshot

Member
Mar 20, 2021
64
4
8

I've tried with 16 threads and with 128 threads; it's really slow in both cases, even on Ubuntu.
Each drive is in its own RAID 0.

I'm not using 128 threads AND a RAMDisk right now, just 2 sas drives for the temps

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda33 during installation
UUID=9b0e2734-f2d3-495e-80e6-089fdf987aaf / ext4 errors=remount-ro 0 1
# /boot/efi was on /dev/sda2 during installation
UUID=6467-1BAC /boot/efi vfat umask=0077 0 1
/swapfile none swap sw 0 0
/dev/disk/by-uuid/1AD2FD01D2FCE245 /mnt/sas10k1 auto nosuid,nodev,nofail,x-gvfs-show 0 0
/dev/disk/by-uuid/6C48E92F48E8F92A /mnt/sas10k2 auto nosuid,nodev,nofail,x-gvfs-show 0 0
/dev/disk/by-uuid/2E70F54270F51177 /mnt/sas10k3 auto nosuid,nodev,nofail,x-gvfs-show 0 0
/dev/disk/by-uuid/8AD40501D404F16D /mnt/sas10k4 auto nosuid,nodev,nofail,x-gvfs-show 0 0

okay, so I think the problem in Ubuntu is that all the cores are at 400 MHz and not turboing up.

I'm trying to get ZenStates-Linux to work. I've almost nailed everything down, but I can't get the GUI to work. Whenever I try to run


$ sudo apt install pip3 python3-tk wheel

this is what I get

artem@artem-KRPA-U16-Series:~/Desktop$ sudo apt install pip3 python3-tk wheel
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package pip3
E: Unable to locate package wheel

What I did:
1. python --version doesn't work
2. tried to install it, could install python 3, but then when I install python, python --version still didn't work
3. Then I ran something like "python is python3"
4. Still had the same error
5. Decided to install PySimpleGui
6. Worked once, but now I have the same error as with tkinter
ModuleNotFoundError: No module named 'tkinter'
the same with pysimplegui, but it's installed... I tried googling for solutions, but just can't get it to work. FML, why isn't this retard-friendly?! I can read and follow instructions, but this is just driving me nuts :(
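
For anyone hitting the same errors: as far as I know, `pip3` and `wheel` are not apt package names on Ubuntu. The packages are normally called `python3-pip` and `python3-wheel` (with `python3-tk` for tkinter), which would explain the "Unable to locate package" lines. A module installed with pip for one interpreter can also be invisible to another one. A small sketch to check what the interpreter you are actually running can see (`missing_modules` is a hypothetical helper name):

```python
import importlib.util
import sys

def missing_modules(names):
    """Return the module names this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Show which Python is running and which of the GUI modules it cannot find.
print("interpreter:", sys.executable)
print("missing:", missing_modules(["tkinter", "PySimpleGUI"]))
```

Running it with the same `python3` that launches the GUI shows whether the modules are missing for that interpreter specifically.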

Okay, after tearing out all of the remaining hair out of my head, I give up on getting this gui to work. Will try to get the sensors to work for now and then use the overclocking tool without the gui (which I'm kind of scared of)
 

NateS

Active Member
Apr 19, 2021
124
64
28
Sacramento, CA, US
Have you tried doing some standard non-Chia benchmarks on your system, just to make sure its baseline performance is in line with expectations? It's possible that this isn't a Chia plotting problem at all, and something about your system really is making it underperform.
 

boomheadshot

Member
Mar 20, 2021
64
4
8
I never have to adjust the turbo on my cpu in linux, the motherboard handles that.

For some reason, whenever I start plotting, the CPU freq sags down to 400 MHz on all cores. I was using " cat /proc/cpuinfo | grep "MHz"" to output the results.
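
A minimal sketch for watching this without eyeballing the grep output: it pulls the `cpu MHz` lines out of `/proc/cpuinfo`-style text and reports min, average, and max. The sample text is made up for illustration; on a real x86 Linux box you would pass in the contents of `/proc/cpuinfo` instead.

```python
import re

def core_mhz(cpuinfo_text):
    """Extract per-core clock speeds in MHz from /proc/cpuinfo-style text."""
    matches = re.findall(r"^cpu MHz\s*:\s*([\d.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]

def summarize(mhz):
    """Return (min, average, max) of a non-empty list of clock speeds."""
    return min(mhz), sum(mhz) / len(mhz), max(mhz)

# Made-up two-core sample; real use: open("/proc/cpuinfo").read()
sample = (
    "processor\t: 0\ncpu MHz\t\t: 400.000\n"
    "processor\t: 1\ncpu MHz\t\t: 3000.000\n"
)
low, avg, high = summarize(core_mhz(sample))
print(f"min {low:.0f} MHz, avg {avg:.0f} MHz, max {high:.0f} MHz")
```

Sampling this every few seconds while the plotter runs would show whether all cores really sit at 400 MHz or only some of them.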

Okay, I must really be doing something wrong. I tried RAMdisk tmp2 + nvme tmp 1 with 16 threads and it was still quite slow. Here is the log:


PS C:\Users\Administrator> cd C:\Users\Administrator\Desktop\misc5
PS C:\Users\Administrator\Desktop\misc5> .\chia_plot.exe -n 1 -r 16 -u 256 -t K:\ -2 R:\ -d E:\chia\ -p [redacted] -f [redacted]
Multi-threaded pipelined Chia k32 plotter - aff2601
Build 0.5.0 for Windows. Check for latest updates: https://stotiks.github.io/chia-plotter/

Final Directory: E:\chia\
Number of Plots: 1
Crafting plot 1 out of 1
Process ID: 3084
Number of Threads: 16
Number of Buckets: 2^8 (256)
Pool Public Key: [redacted]
Farmer Public Key: [redacted]
Working Directory: K:\ <--- NVMe ssd
Working Directory 2: R:\ <--- this is the RAMdisk that I made using ImDisk
Plot Name: plot-k32-2021-06-19-13-44-f8fad5e48f6cc1e0b8d32a516fe7a919e3f0498dd4b434e220dcbe76164cc8cd
[P1] Table 1 took 57.4672 sec
[P1] Table 2 took 436.65 sec, found 4294940427 matches
[P1] Table 3 took 537.128 sec, found 4294981486 matches
[P1] Table 4 took 660.143 sec, found 4294958771 matches
[P1] Table 5 took 677.28 sec, found 4294877532 matches
[P1] Table 6 took 715.938 sec, found 4294744527 matches
[P1] Table 7 took 349.345 sec, found 4294570107 matches
Phase 1 took 3434.19 sec
[P2] max_table_size = 4294981486
[P2] Table 7 scan took 18.6943 sec
[P2] Table 7 rewrite took 66.6579 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 13.1188 sec
[P2] Table 6 rewrite took 34.6522 sec, dropped 581251855 entries (13.534 %)
[P2] Table 5 scan took 13.4228 sec
[P2] Table 5 rewrite took 31.846 sec, dropped 761959252 entries (17.7411 %)
[P2] Table 4 scan took 13.251 sec
[P2] Table 4 rewrite took 31.6031 sec, dropped 828873955 entries (19.2988 %)
[P2] Table 3 scan took 12.9941 sec
[P2] Table 3 rewrite took 31.9996 sec, dropped 855084604 entries (19.9089 %)
[P2] Table 2 scan took 13.2812 sec
[P2] Table 2 rewrite took 31.226 sec, dropped 865533924 entries (20.1524 %)
Phase 2 took 317.49 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 59.7737 sec, wrote 3429406503 right entries
[P3-2] Table 2 took 47.3752 sec, wrote 3429406503 left entries, 3429406503 final
[P3-1] Table 3 took 60.352 sec, wrote 3439896882 right entries
[P3-2] Table 3 took 41.5664 sec, wrote 3439896882 left entries, 3439896882 final
[P3-1] Table 4 took 59.1728 sec, wrote 3466084816 right entries
[P3-2] Table 4 took 39.7642 sec, wrote 3466084816 left entries, 3466084816 final
[P3-1] Table 5 took 59.8889 sec, wrote 3532918280 right entries
[P3-2] Table 5 took 42.2431 sec, wrote 3532918280 left entries, 3532918280 final
[P3-1] Table 6 took 64.564 sec, wrote 3713492672 right entries
[P3-2] Table 6 took 44.7969 sec, wrote 3713492672 left entries, 3713492672 final
[P3-1] Table 7 took 71.5989 sec, wrote 4294570107 right entries
[P3-2] Table 7 took 51.2289 sec, wrote 4294570107 left entries, 4294570107 final
Phase 3 took 644.22 sec, wrote 21876369260 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 113.1 sec, final plot size is 108830601267 bytes
Total plot creation time was 4509.15 sec
Started copy to E:\chia\plot-k32-2021-06-19-13-44-f8fad5e48f6cc1e0b8d32a516fe7a919e3f0498dd4b434e220dcbe76164cc8cd.plot
Copy to E:\chia\plot-k32-2021-06-19-13-44-f8fad5e48f6cc1e0b8d32a516fe7a919e3f0498dd4b434e220dcbe76164cc8cd.plot finished, took 772.639 sec, 0 MB/s avg.

When I look at other people's logs (for example, here), even with SAS drives their P1 table times, Table 1 included, are much shorter, so it's got me scratching my head.
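
One way to compare logs is to look at how the total splits across phases. A sketch that parses the "Phase N took X sec" lines; the two inputs are the phase lines from the 16-thread NVMe + RAMdisk log above and from the 40-thread log posted earlier in the thread, and `phase_times` and `phase_shares` are illustrative helper names.

```python
import re

def phase_times(log_text):
    """Map phase number -> seconds from 'Phase N took X sec' lines."""
    found = re.findall(r"^Phase (\d) took ([\d.]+) sec", log_text, re.MULTILINE)
    return {int(n): float(sec) for n, sec in found}

def phase_shares(log_text):
    """Fraction of the summed phase time spent in each phase."""
    times = phase_times(log_text)
    total = sum(times.values())
    return {n: sec / total for n, sec in times.items()}

mine = """Phase 1 took 3434.19 sec
Phase 2 took 317.49 sec
Phase 3 took 644.22 sec, wrote 21876369260 entries to final plot
Phase 4 took 113.1 sec, final plot size is 108830601267 bytes"""

reference = """Phase 1 took 793.689 sec
Phase 2 took 1029.82 sec
Phase 3 took 1096.15 sec, wrote 21877145845 entries to final plot
Phase 4 took 114.646 sec, final plot size is 108835493589 bytes"""

for name, log in (("mine", mine), ("reference", reference)):
    shares = phase_shares(log)
    print(name, {n: f"{share:.0%}" for n, share in shares.items()})
```

In these two logs, Phase 1 accounts for roughly three quarters of the slow run but only about a quarter of the faster one, so Phase 1 is where the two runs diverge most.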

Instead of my ES rome cpu, I've taken out my x99 + e5-2678v3 build and tried to see if something's just not working on the amd epyc.

Well, with 2 parallel plots p1 table 1 times being 482 seconds and 523 seconds respectively, it seems like there is a fundamental problem, and I've got no idea what it is.


My drives are in NTFS, can that even make that much of a difference? I've left them in NTFS so that I can switch between ubuntu and windows to see the performance difference. I don't know what else I'm doing differently

Edit: okay, so right now on the x99 + e5-2678v3 build p1 table 2 was finished in 1032 seconds, which is actually the fastest that I've ever had before without a RAMdisk. I've recently noticed that p1 table 1 isn't always indicative of the whole situation, especially if the cores downclock and then they have to turbo back up, so if the frequencies aren't locked, it turns out to be much slower. But from P1 table 2, you can gauge the performance just by looking at the table times.

edit2: the other plotter showing 1365 seconds for p1 table 2 -_-

edit3: also really slow on the x99 build
PS C:\WINDOWS\system32> cd C:\Users\Artem\Desktop\misc03
PS C:\Users\Artem\Desktop\misc03> .\chia_plot.exe -n -1 -r 10 -u 256 -t I:\ -2 M:\ -d J:\chia\ -p [redacted] -f [redacted]
Final Directory: J:\chia\
Number of Plots: infinite
Process ID: 6092
Number of Threads: 10
Number of Buckets: 2^8 (256)
Pool Public Key: [redacted]
Farmer Public Key: [redacted]
Working Directory: I:\
Working Directory 2: M:\
Plot Name: plot-k32-2021-06-20-13-23-87b823acbe5df445f4418d63f23b023dc3f4ef57b96f3e1dfaa1ebaba2e19226
[P1] Table 1 took 482.303 sec
[P1] Table 2 took 1365.52 sec, found 4294931052 matches
[P1] Table 3 took 2013.65 sec, found 4294882170 matches
[P1] Table 4 took 2476.01 sec, found 4294765808 matches
[P1] Table 5 took 2343.94 sec, found 4294454390 matches
[P1] Table 6 took 2035.41 sec, found 4293954419 matches
[P1] Table 7 took 1477.04 sec, found 4293021057 matches
Phase 1 took 12196.2 sec
[P2] max_table_size = 4294967296

*sigh*, I still don't get what I'm doing wrong.
 

amalurk

Active Member
Dec 16, 2016
254
78
28
100
I don't think 75 minutes is that bad for an ES CPU that might not turbo or have great clocks at 16 threads, and for an unknown NVMe that might not write that fast after the SLC cache is used up. These amazing 20 min plot times are with RAMdisks on very fast, low-timing RAM, which you probably don't have, and with many more threads than 16 or very high-performance threads like a 5900X.

Here is something you could do in just 5 minutes, download CPU-Z and post a shot of the CPU and Memory screens so we can see more info about your processor and RAM and then go to the Bench Tab in CPU-Z and run the CPU Single and CPU Multi Thread benchmarks and tell us the scores.
 

boomheadshot

Member
Mar 20, 2021
64
4
8
Bro, 75 minutes was with 3.0 GHz locked in the overclocking program that was shared in one of the neighboring threads. A locked frequency makes a big difference for Table 1; I don't know how important it really is overall, but I guess it does make a difference. It does 7700 in R15 like this, so the CPU isn't that weak. Peeps are doing plots in 20 minutes on 128 threads, so I really feel like I should be doing at least 40-45 minute plots with 16 threads and temp2 being a RAMdisk. My X99 build also seems slow right now, even on Ubuntu:


oem@KRPA-U16-Series:~/Desktop/chia-plotter/build$ ./chia_plot -n 1 -r 20 -u 128 -t /mnt/sas10k1/ -2 /mnt/sas10k2/ -d /media/oem/Elements/ -p [redacted] -f [redacted]
Multi-threaded pipelined Chia k32 plotter - 9e649ae
Final Directory: /media/oem/Elements/
Number of Plots: 1
Crafting plot 1 out of 1
Process ID: 3896
Number of Threads: 20
Number of Buckets P1: 2^7 (128)
Number of Buckets P3+P4: 2^7 (128)
Pool Public Key: [redacted]
Farmer Public Key: [redacted]
Working Directory: /mnt/sas10k1/
Working Directory 2: /mnt/sas10k2/
Plot Name: plot-k32-2021-06-20-18-00-ce6aac96de4b3223ec168ebfa8e1542c07d1e7d496d4ff603d7acd993122af65
[P1] Table 1 took 240.553 sec
[P1] Table 2 took 1274.85 sec, found 4294882881 matches
[P1] Table 3 took 2219.32 sec, found 4294821772 matches

Here is htop; the CPUs do spike up once in a while, but they still seem underutilized for the most part.
Here is iostat; not sure if it means anything.

I'm using the same SSDs for the OSes, just swapping them between hardware (one has Windows Server 2019, the other has Win 10 Pro and Ubuntu 20.04). IDK if it matters that I'm swapping them like this; mentioning it just in case.

I'll try finishing phase 1 on this X99 build, I'll write everything down and then I'll hook up my Rome ES build again and benchmark it in CPU-Z.
 

amalurk

Active Member
Dec 16, 2016
254
78
28
100
Yeah it is very strange then. RAM in wrong config leading to not using all the channels? Or not enough power from power supply or processor overheating and backing off frequencies significantly. I think if you are going through the HP P410 that could be an issue too but, obviously not with the RAMdisk so doesn't really change things. Hard to say any of those though if it can sustain good benches in other highly threaded benchmarks.

Table 1 should definitely be a lot lower, even my 5600x does Table 1 in 16.7s with a measly 4 threads to PCIe4 Enterprise NVME. Strangely, table 2 seems to take around 10x-12x Table 1.
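
As a quick check of that ratio against the logs posted in this thread (a sketch; the labels are shorthand for each run as its poster described it, and `table2_over_table1` is an illustrative name):

```python
# (Table 1 seconds, Table 2 seconds) copied from the [P1] lines of each posted log.
runs = {
    "dual E5-2640 v4, 40 threads": (11.1062, 110.523),
    "Rome ES, 16 threads, NVMe + RAMdisk": (57.4672, 436.65),
    "X99 on Windows, 10 threads": (482.303, 1365.52),
    "X99 on Ubuntu, 20 threads": (240.553, 1274.85),
}

def table2_over_table1(t1_sec, t2_sec):
    """How many times longer Table 2 took than Table 1."""
    return t2_sec / t1_sec

ratios = {name: table2_over_table1(*times) for name, times in runs.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

Only the 40-thread run lands near 10x; the three slow runs come in well below that, which suggests Table 1 is disproportionately slow on them rather than everything scaling up evenly. That is an observation from four logs, not a diagnosis.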

My understanding is lower buckets take more RAM so if lower buckets are slower then could be RAM issue?
 

iotapi322

Member
Sep 8, 2017
66
14
8
45
I agree with this ^^^

I think you might be thermally throttled or something like that.

I have a dual-CPU E5-2640 v4 board with 160 GB of RAM (so 40 threads in a RAM disk), and it takes me 31 mins to make a plot.
 

boomheadshot

Member
Mar 20, 2021
64
4
8
My X99 can't be overheating (Noctua NH-D15S), and when I made plots on my Epyc ES, I wouldn't even load all of the cores. It was much more stressed under Monero mining, but it held up well. Icegiant prosiphon elite on the Epyc CPU.

1000 Watt PSU (FSP Aurum PRO 1000W), but under load when plotting this CPU only consumes ~200 Watts. It was 400 when I mined monero, and that was fine.

Edit: in the screenshot at the bottom of this message, you see that the 12V reading doesn't really sag much, on one of my shitty PSU's it would drop down to like 11.7V lol

There's a problem where the CPU reports abnormally high temperatures, but it doesn't affect the clocks, at the middle of this page I went into more detail about this.



The only thing that I've noticed is that the system would only use ~9 GB of RAM when doing 2 plots, 16 threads each. 100 GB would go into standby mode (I've stumbled into this problem while gaming, and yes, it does cause stuttering in games). I tried using EmptyStandbyList.exe, but it didn't make a difference.

@iotapi322: but what about my X99 build? It wasn't thermally throttling, I just checked, the temps were in the 40s on ubuntu while plotting. And it wasn't thermally throttling when I mined monero on all 128 threads @ ~420 watts. Now it's like ~180ish watts while plotting

edit: Here is my CPU-Z benchmark (SMT disabled; it seemed to do a better job that way on Windows Server 2019). I noticed there is a bug that displays 2 NUMA nodes, but that's just a bug (forgot where I read about it). I still tried disabling SMT: it did improve Table 1 times, but from Table 2 onwards there is no difference. I don't believe the overheating theory, because it doesn't throttle, and the IPMI sensor reports the same temps as the CCD temps. It was under much more stress when I mined Monero, and it didn't die/shut off on me. I have a huge fan blowing on the VRM and another fan blowing on the HP P410.

Here is the memory tab, although under the SPD tab I get this, and none of the slots show the modules. Maybe it's because I just swapped from the X99 back to the ASUS KRPA-U16 motherboard.
 

amalurk

Active Member
Dec 16, 2016
254
78
28
100
Your scores are much lower than they should be. Single core is especially low; I think it should be closer to double that. A 7502P scores 400 or so in single-core CPU-Z. Could it just be ES processor issues?
 

boomheadshot

Member
Mar 20, 2021
64
4
8
First it's at about 250 for the single core, then drops to about 200. I tried the single core performance preset, and it was actually 340, but it set the voltage to 1.1, and when I lowered it to 1.05 (afraid of using >1.05), it actually crashed on the second bench.

But when I LOWERED the frequency to 2.6 from 3.0, the single core actually went up from 200 to 260.
1624217516603.png

The ES's are indeed finicky, but I still think I'm doing something wrong, because it was still slow on the 2678v3.

EDIT: DISREGARD THIS COMMENT, THIS WAS HAPPENING ON AN UNSTABLE WINDOWS 10 WITH A MEMORY LEAK
 

mirrormax

Active Member
Apr 10, 2020
136
58
28
hm something seems off, my guess is thermal throttling of some sort?
for reference, i can do 4.5s table1 with 128 threads, on my dual epyc system(numactl to one cpu)
lowest plot time doing singles i was right under 1300sec, big overclock though 3.5 or 3.6ghz.
this on nvmes, ramdisk didnt help speeds for me, even running the whole thing on tmpfs.
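
For reference, a sketch of the CPU-pinning half of what `numactl` does, in Python. It assumes a Linux system that exposes NUMA topology under `/sys/devices/system/node/`; `parse_cpulist`, `node_cpus`, and `pin_to_node` are hypothetical helper names. Unlike `numactl`, this only restricts which CPUs the process runs on and does not set a memory policy.

```python
import os

def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus

def node_cpus(node=0):
    """Read the CPU ids belonging to one NUMA node from sysfs (Linux only)."""
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        return parse_cpulist(f.read())

def pin_to_node(node=0):
    """Restrict the current process, and children started later, to one node's CPUs."""
    os.sched_setaffinity(0, node_cpus(node))

# Parsing only; nothing is pinned unless pin_to_node() is called.
print(parse_cpulist("0-3,8-11"))
```

Calling `pin_to_node(0)` before launching the plotter from the same script would keep its threads on one socket, which is the part of the setup described above that this sketch covers.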

your cpuz scores also look lowish at least for multi. but iam on the 53-04 OEM cpu thats pretty close to stock+unlocked.

what NPS mode do you run at? raise the max TDP in bios if possible maybe they are TDP limited or something.
and check out these links for more info on bios settings.

 

boomheadshot

Member
Mar 20, 2021
64
4
8

Thanks for the links. I've just decided to apply the Best single core preset in the program, getting about 340 single core in CPU-Z now and 21k multithread (3.4 GHz) Seems stable so far. I'll leave it overnight and see how it goes.

Did you apply any specific settings? I left NPS at auto, and I've also tried NPS1; no difference. I tried disabling memory interleaving, and it turned out worse. I don't really know what else to do on such platforms; I'm only familiar with overclocking desktop platforms, nothing like this.
 

mirrormax

Active Member
Apr 10, 2020
136
58
28
Yes, I haven't found any settings that make a massive difference. I leave SMT on, though I disabled most memory encryption, set CPU TDP to 280, and set determinism to power.
How many kh did you get mining