THE CHIA FARM

msg7086

Active Member
May 2, 2017
378
134
43
34
Looks like I wasn't clear enough. There are rumors that K32s are already broken because an institution can already make a plot that meets a challenge's requirements. That means you can create a plot fast enough that every challenge answer from you is accepted and you receive XCH. I would call this 'plotting on the fly', or just jumping the queue. This would also mean that someone has figured out the hash algorithm ... isn't that so?

But they are rumors.
You were clear enough, and those are essentially PoW. However it's still unclear how many plots it would need to answer a challenge.

It's easy to craft a plot that passes the plot filter, but finding a plot that actually answers the challenge would be a lot more difficult. It also depends on how much it would cost to PoW the answer. If it requires thousands of cores and hundreds of terabytes of memory to solve many challenges, PoW may not make sense if each block is cheap enough: you'd be paying more to win less value.
 

RimBlock

Active Member
Sep 18, 2011
838
28
28
Singapore
Nice dashboard.
What software is this ?
Thanks. Built using Splunk, on the trial version (which becomes the free version after 60 days).

I have other items below those in the screenshot, mainly for log warning and error messages.

I would have preferred to have a candlestick visualisation for the challenge response timings but have not found a decent one for Splunk yet.

Key reasons to use Splunk for my use case
  • Single install for main server.
  • Single install for each remote reporting server.
  • Some experience using at work so knowledge benefits work and home.
  • Free version limited to 500MB/day which I am well under (20MB/day) pulling from 4 debug logs at INFO level.
  • The only items I would use from those removed are lookup tables and the deployment server, which I can work around.
Whilst I would have liked to use the ELK stack, after having a play the install was nowhere near as easy as it is for Splunk. The install instructions have bits missing (i.e. setting up accounts in each component so they can talk to each other). I was pretty surprised there was no unified install script to grab, install, and connect all the required pieces for an initial basic install.

I may go back to this but it is not my current priority.

Next on the list is setting up Gantt charts for plotter runs.
 
Reactions: Marsh

boomheadshot

Member
Mar 20, 2021
63
3
8
Can anybody help me out?

I've been trying to use the MadMax plotter on my 64-core Rome ES, but it's really slow.

2 x 300 GB 10k SAS drives = 7.5-hour plots with 128 threads (it used to be ~15 hours in the GUI/PowerShell, but with one plot per drive, so no real improvement).

Sloth Tech TV churned out a plot with 2 x 300 GB 10k SAS drives in 73 minutes on DDR3 memory; I'm nowhere near that.

I even tried to make a RAM disk with ImDisk (when I made a 110 GiB drive, the plots crashed; I have 128 GB total). So: a 110 GiB tmp2 drive, plus an NVMe SSD for tmp1 (Corsair Force MP600 1TB; yes, it's not great, but it's not supposed to be slowing me down this much). I tried giving a few more gigs to the RAM disk and setting fewer threads, but it was still TOO slow.

The first thing I immediately noticed is how long it takes to create Table 1. A Threadripper with a RAM disk took 6 seconds to make Table 1. When I tried it with the RAM disk overnight, it was around 140 seconds, and then it crashed. On SAS drives it takes ~300 seconds; on HDDs, 400+ seconds.


Here is my current attempt, on 2 x 300 GB 10k SAS drives, but it's still way slower than it should be, and I'm running out of ideas:
1623859725824.png


What I have already tried:
  1. Different versions of the Windows Chia plotter
  2. Different OSes (right now on Windows Server 2019; tried Win 10 Pro and Ubuntu)
    On Ubuntu I couldn't get the overclock tool to work, and at default clocks it was still really slow in phase 1 (same times as Windows)
  3. 256 buckets seem to be faster than 128, but not a drastic difference
I've tried the Windows MadMax plotter on my i9-7900X; I just tested Table 1 on a SATA SSD, and it took 79 seconds with 20 threads.

Another interesting thing I noticed: when I launch the plotter with just 1 thread on the 2S CPU, the table gets made in ~130 seconds, but when I add more threads, it actually gets slower. Btw, Task Manager shows 2 sockets instead of 1; on Win 10 Pro it showed one, as it should, but the times were the same. So when I set 128 threads, I see only half of them being used. But it's still way slower than it should be.

So there really must be a problem with my EPYC platform.
My setup:

ASUS KRPA-U16 version 0302
2s1404e2vjug5 64-core CPU
M393A2K43DB3-CWE x 8 sticks
Windows Server 2019 version 1809 build 17763
HP Smart Array P410 1GB for the SAS drives, so yeah, it might be shit, BUT what's up with the RAM disk tmp2 + NVMe tmp1? It's still not supposed to be that slow.
The BIOS is pretty much stock.

My only wild guess is that maybe there's a bug with 8 channels or something.

$20 PayPal to the first who figures out the culprit.

Thanks in advance.
 
Last edited:

NateS

Active Member
Apr 19, 2021
159
86
28
Sacramento, CA, US
You were clear enough, and those are essentially PoW. However it's still unclear how many plots it would need to answer a challenge.

It's easy to craft a plot that passes the plot filter, but finding a plot that actually answers the challenge would be a lot more difficult. It also depends on how much it would cost to PoW the answer. If it requires thousands of cores and hundreds of terabytes of memory to solve many challenges, PoW may not make sense if each block is cheap enough: you'd be paying more to win less value.
You don't necessarily have to construct a winning plot reliably every time to still have a huge advantage if you can make them on the fly. Just being able to always pass the plot filter means that the one plot you're creating is equivalent to having created and stored 512 plots. At that point, it becomes an economic question of whether it's cheaper to create and store 512 plots, or cheaper to run the hardware to generate 1 plot on the fly. Currently, storage still wins out, but if someone comes out with a GPU plotter that can generate plots very cheaply, that may change.
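For readers unfamiliar with the filter mechanics: Chia's plot filter passes a plot when a hash involving the plot id and the current challenge lands in a 1-in-512 window, which is where the "1 plot = 512 plots" equivalence above comes from. A minimal Python sketch of that idea (function and constant names are illustrative; the real filter also hashes the signage point):

```python
import hashlib
import os

FILTER_BITS = 9  # 2**9 = 512, matching the 1-in-512 plot filter discussed above

def passes_plot_filter(plot_id: bytes, challenge: bytes) -> bool:
    """A plot passes when the top FILTER_BITS bits of
    sha256(plot_id || challenge) are all zero (a 1/512 chance)."""
    digest = hashlib.sha256(plot_id + challenge).digest()
    top_bits = int.from_bytes(digest, "big") >> (256 - FILTER_BITS)
    return top_bits == 0

# Empirically, roughly 1 in 512 random plot ids pass for a given challenge.
challenge = hashlib.sha256(b"example-challenge").digest()
trials = 100_000
passing = sum(passes_plot_filter(os.urandom(32), challenge) for _ in range(trials))
print(passing / trials)  # ≈ 1/512 ≈ 0.002
```

Being able to *construct* a plot id that passes this check on demand, rather than waiting for 1 in 512 of your stored plots to pass, is exactly the 512x shortcut described above.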

Can anybody help me out?

I've been trying to use the MadMax plotter on my 64-core Rome ES, but it's really slow.

2 x 300 GB 10k SAS drives = 7.5-hour plots with 128 threads (it used to be ~15 hours in the GUI/PowerShell, but with one plot per drive, so no real improvement).

Sloth Tech TV churned out a plot with 2 x 300 GB 10k SAS drives in 73 minutes on DDR3 memory; I'm nowhere near that.

I even tried to make a RAM disk with ImDisk (when I made a 110 GiB drive, the plots crashed; I have 128 GB total). So: a 110 GiB tmp2 drive, plus an NVMe SSD for tmp1 (Corsair Force MP600 1TB; yes, it's not great, but it's not supposed to be slowing me down this much). I tried giving a few more gigs to the RAM disk and setting fewer threads, but it was still TOO slow.

The first thing I immediately noticed is how long it takes to create Table 1. A Threadripper with a RAM disk took 6 seconds to make Table 1. When I tried it with the RAM disk overnight, it was around 140 seconds, and then it crashed. On SAS drives it takes ~300 seconds; on HDDs, 400+ seconds.


Here is my current attempt, on 2 x 300 GB 10k SAS drives, but it's still way slower than it should be, and I'm running out of ideas:
View attachment 19068


What I have already tried:
  1. Different versions of the Windows Chia plotter
  2. Different OSes (right now on Windows Server 2019; tried Win 10 Pro and Ubuntu)
    On Ubuntu I couldn't get the overclock tool to work, and at default clocks it was still really slow in phase 1 (same times as Windows)
  3. 256 buckets seem to be faster than 128, but not a drastic difference
I've tried the Windows MadMax plotter on my i9-7900X; I just tested Table 1 on a SATA SSD, and it took 79 seconds with 20 threads.

Another interesting thing I noticed: when I launch the plotter with just 1 thread on the 2S CPU, the table gets made in ~130 seconds, but when I add more threads, it actually gets slower. Btw, Task Manager shows 2 sockets instead of 1; on Win 10 Pro it showed one, as it should, but the times were the same. So when I set 128 threads, I see only half of them being used. But it's still way slower than it should be.

So there really must be a problem with my EPYC platform.
My setup:

ASUS KRPA-U16 version 0302
2s1404e2vjug5 64-core CPU
M393A2K43DB3-CWE x 8 sticks
Windows Server 2019 version 1809 build 17763
HP Smart Array P410 1GB for the SAS drives, so yeah, it might be shit, BUT what's up with the RAM disk tmp2 + NVMe tmp1? It's still not supposed to be that slow.
The BIOS is pretty much stock.

My only wild guess is that maybe there's a bug with 8 channels or something.

$20 PayPal to the first who figures out the culprit.

Thanks in advance.
A 64-core Rome is effectively a NUMA machine, right? You might try running a few processes in parallel, giving all the cores and memory from one node to each process, to make sure you're not bottlenecked by cross-node communication.

And I'm not surprised it crashed with a 110G RAM disk if you've only got 128G of RAM and you're running 128 threads. The RAM required outside of the RAM disk scales with the number of threads, and I don't think 18G is nearly enough for 128 threads.

Lastly, it can help to leave a few resources available for the OS. On my 8-core/16-thread machine, for example, running with 14 threads was faster than with 16.
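A sketch of the one-process-per-node idea, assuming a hypothetical 2-node layout where each node owns 64 contiguous CPU ids (check `lscpu` or `numactl --hardware` for the real topology; from the shell, `numactl --cpunodebind=N --membind=N` is the usual tool). All names and paths here are illustrative, Linux only:

```python
import os

def numa_node_cpus(node: int, cpus_per_node: int = 64) -> set[int]:
    """CPU ids belonging to one NUMA node, assuming nodes own contiguous
    ranges (true for this hypothetical 2-node/128-thread layout; verify
    the real mapping with `lscpu` before relying on it)."""
    start = node * cpus_per_node
    return set(range(start, start + cpus_per_node))

def run_plotter_on_node(node: int, plotter_cmd: list[str]) -> None:
    """Launch one plotter instance with its threads pinned to a single
    NUMA node (Linux only). Note this pins CPUs, not memory; binding
    memory too needs numactl --membind or libnuma."""
    pid = os.fork()
    if pid == 0:  # child: restrict affinity, then exec the plotter
        os.sched_setaffinity(0, numa_node_cpus(node))
        os.execvp(plotter_cmd[0], plotter_cmd)

# e.g. one chia_plot process per node (paths and flags are illustrative):
# run_plotter_on_node(0, ["chia_plot", "-t", "/mnt/tmp0/", "-r", "32"])
# run_plotter_on_node(1, ["chia_plot", "-t", "/mnt/tmp1/", "-r", "32"])
```

Running one plotter per node this way avoids each process bouncing data across the socket interconnect, which is one plausible explanation for threads making Table 1 slower instead of faster.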
 

boomheadshot

Member
Mar 20, 2021
63
3
8
You don't necessarily have to construct a winning plot reliably every time to still have a huge advantage if you can make them on the fly. Just being able to always pass the plot filter means that the one plot you're creating is equivalent to having created and stored 512 plots. At that point, it becomes an economic question of whether it's cheaper to create and store 512 plots, or cheaper to run the hardware to generate 1 plot on the fly. Currently, storage still wins out, but if someone comes out with a GPU plotter that can generate plots very cheaply, that may change.



A 64-core Rome is effectively a NUMA machine, right? You might try running a few processes in parallel, giving all the cores and memory from one node to each process, to make sure you're not bottlenecked by cross-node communication.

And I'm not surprised it crashed with a 110G RAM disk if you've only got 128G of RAM and you're running 128 threads. The RAM required outside of the RAM disk scales with the number of threads, and I don't think 18G is nearly enough for 128 threads.

Lastly, it can help to leave a few resources available for the OS. On my 8-core/16-thread machine, for example, running with 14 threads was faster than with 16.
I tried 16 threads with a RAM disk, but it was still incredibly slow.

I'm actually a noob at this kind of stuff, but I came across this video, and there have been really fast Threadripper plot times with all threads dedicated to 1 plot. Can't I do the same thing?
 
Last edited:

msg7086

Active Member
May 2, 2017
378
134
43
34
Currently, storage still wins out, but if someone comes out with a GPU plotter that can generate plots very cheaply, that may change.
That's when PoW becomes cheaper than PoST, which is against the original purpose of the project (that PoST runs cheaper and greener than PoW). If that happens, the whole project goes in the trash bin.


You don't necessarily have to construct a winning plot reliably every time to still have a huge advantage if you can make them on the fly. Just being able to always pass the plot filter means that the one plot you're creating is equivalent to having created and stored 512 plots. At that point, it becomes an economic question of whether it's cheaper to create and store 512 plots, or cheaper to run the hardware to generate 1 plot on the fly.
That's correct. Each plot created on the fly has an equal chance to win versus 512 stored plots.

Assuming 4 drives are used to store 512 plots, that's about 40W, and each block is 18.75 seconds apart. If a GPU can craft a plot using 200W of power, it has to do it in 3.75 seconds to break even. And 51TB at $20/TB is $1020.

The GPU has to be sold for close to $1020, has to plot in 3.75 seconds, and probably has to have 256GB of VRAM on the card. I doubt we'll have this in the very near future.

Also, the Chia team can fork and change the plot filter to also consider the content of the file. In that case it would be very difficult for a GPU to get around the plot filter.
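The 3.75-second figure above is just an energy-per-challenge comparison; a quick sketch of the arithmetic, using only the post's assumed wattages and prices:

```python
# Back-of-envelope check of the numbers above (all inputs are the
# post's assumptions, not measurements).
HDD_WATTS = 40            # 4 drives holding 512 plots
GPU_WATTS = 200           # hypothetical on-the-fly plotter
CHALLENGE_INTERVAL_S = 18.75

# To spend no more energy per challenge than the drives do, the GPU
# must finish its plot within the same energy budget:
break_even_s = CHALLENGE_INTERVAL_S * HDD_WATTS / GPU_WATTS
print(break_even_s)       # 3.75 seconds

# Capital cost of the storage being replaced:
print(51 * 20)            # 51 TB at $20/TB = $1020
```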
 

gb00s

Active Member
Jul 25, 2018
686
242
43
Poland
That's correct. Each plot created on the fly has an equal chance to win versus 512 stored plots.

Assuming 4 drives are used to store 512 plots, that's about 40W, and each block is 18.75 seconds apart. If a GPU can craft a plot using 200W of power, it has to do it in 3.75 seconds to break even. And 51TB at $20/TB is $1020.

The GPU has to be sold for close to $1020, has to plot in 3.75 seconds, and probably has to have 256GB of VRAM on the card. I doubt we'll have this in the very near future.

Also, the Chia team can fork and change the plot filter to also consider the content of the file. In that case it would be very difficult for a GPU to get around the plot filter.
What am I missing here? How much time do you need to generate the 512 plots, and how much energy do you need for them? What extra hardware do you need to generate the plots, compared with a GPU generating plots just on the fly, and where are these plots coming from? And why is a GPU with 256GB mandatory? Because of the max plot size? What if just 1 disk dies? You have to generate 128 plots again before you are equal with a GPU plotting on the fly. A GPU plotter is already in the works and not so far away. A protocol change makes you generate 512 plots again. On the fly, that's no problem ...
 

NateS

Active Member
Apr 19, 2021
159
86
28
Sacramento, CA, US
That's when PoW becomes cheaper than PoST, which is against the original purpose of the project (that PoST runs cheaper and greener than PoW). If it happens, then the whole project goes trash bin.



That's correct. Each plot created on the fly has an equal chance to win versus 512 stored plots.

Assuming 4 drives are used to store 512 plots, that's about 40W, and each block is 18.75 seconds apart. If a GPU can craft a plot using 200W of power, it has to do it in 3.75 seconds to break even. And 51TB at $20/TB is $1020.

The GPU has to be sold for close to $1020, has to plot in 3.75 seconds, and probably has to have 256GB of VRAM on the card. I doubt we'll have this in the very near future.

Also, the Chia team can fork and change the plot filter to also consider the content of the file. In that case it would be very difficult for a GPU to get around the plot filter.
Agreed, except you don't necessarily need 256GB on the GPU: CXL solves the problem of attaching a lot of storage to GPUs very elegantly, and it's coming very soon.

And as for whether a $1020 GPU could create a plot in 3.75 seconds, that remains to be seen. But the Chia devs didn't think the plotting algorithm was highly parallelizable until someone made one, and it cut plot times from ~6hrs to ~30min on standard hardware. And the same dev who made that parallel plotter says he plans to port it to run on GPUs in the future, which should give another good speedup, though how much is not known. It's possible today's sub-$1020 GPUs can already do it with a sufficiently optimized plotter, but even if not, future GPUs a few years down the road might be able to.

The filter could in theory be changed, but the whole purpose of it was to avoid having to spin up a hard drive to check a plot that has no chance to win. If you're having to do reads in the plot anyway, that defeats the purpose of it. Other possible countermeasures could include shortening the response time, which would negatively affect users on slower networks, or increasing the required plot size, which would invalidate most currently existing plots.

In general, since compute power is growing much much faster than storage density, I expect at least some of these countermeasures will have to be deployed eventually. How soon is hard to say. But I agree that just letting PoST devolve into PoW takes away most of what's unique about this coin, so I'm sure the devs will do everything they can to prevent that from happening.

Of course, if it does effectively devolve into PoW via GPU plot generation, then you also have to consider whether it's worth it to use your GPUs for that vs just mining a different PoW coin directly, and at least at the moment, it's probably better to just mine Eth on the GPUs.
 

msg7086

Active Member
May 2, 2017
378
134
43
34
CXL solves the problem of attaching a lot of storage to GPUs very elegantly, and it's coming very soon.
True. Let's see how it goes!

But the Chia devs didn't think the plotting algorithm was highly parallelizable until someone made one, and it cut plot times from ~6hrs to ~30min on standard hardware.
I mean, unless the plotting process is strictly sequential, there will be ways to parallelize it. I can't think of why they believed it couldn't be parallelized. Very strange to me.

If you're having to do reads in the plot anyway, that defeats the purpose of it.
Yes. Maybe there's a way to integrate certain data from the plot file into the plot filter, data that can be proven but does not require any reads while idle. We could store that part of the data with the plot ID in the plot index to avoid unnecessary reads, and if that data can be verified by other clients, then we can prevent the plot-ID crafting attack.
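One very rough way to read that idea, sketched in Python: cache a content digest next to the plot id when the harvester first reads the plot, and make the filter hash bind that digest, so passing the filter depends on real plot content but needs no per-challenge disk read. The whole scheme and every name here are hypothetical, not how Chia actually works:

```python
import hashlib

FILTER_BITS = 9  # keep the 1-in-512 pass rate

def register_plot(plot_data: bytes) -> tuple[bytes, bytes]:
    """At harvester startup, read the plot once and cache a content
    digest next to the plot id in the index (hypothetical scheme;
    domain-separation prefixes are illustrative)."""
    plot_id = hashlib.sha256(b"plot-id:" + plot_data).digest()
    content_digest = hashlib.sha256(b"content:" + plot_data).digest()
    return plot_id, content_digest

def passes_filter(plot_id: bytes, content_digest: bytes, challenge: bytes) -> bool:
    """The filter hash binds the content digest, so crafting a passing
    plot_id alone is not enough; no disk read happens per challenge
    because the digest is already cached in the index."""
    h = hashlib.sha256(plot_id + content_digest + challenge).digest()
    return int.from_bytes(h, "big") >> (256 - FILTER_BITS) == 0
```

The hard part this sketch glosses over, as the post notes, is letting other clients verify the digest actually came from a full, valid plot without re-reading it.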
 

msg7086

Active Member
May 2, 2017
378
134
43
34
What am I missing here? How much time do you need to generate the 512 plots, and how much energy do you need for them? What extra hardware do you need to generate the plots, compared with a GPU generating plots just on the fly, and where are these plots coming from?
You are missing the point that when a plot is stored, you don't have to generate it again every time a new challenge appears. We are trading space for time.

What we are doing is: we generate 512 plots and store them on hard drives. The same plots can be reused 4,608 times a day (one challenge every 18.75 seconds) and have an equal chance to win a block every single time. This physically uses 52TiB of drives and consumes about 40W of power, ignoring the one-time cost of generating the 512 plots.

If you create plots using a GPU and discard each one after its challenge, then you'll be generating 4,608 times however many plots you can generate per 18.75 seconds. This physically uses 1 GPU and consumes about 350W (an RTX 3090, for example).

Let's do a TCO analysis.

Assumptions: profit is currently $120/PiB/day. An RTX 3090 is $1500 MSRP. Hard drives are $20/TB. The power bill is calculated at $1/W/year. A GPU can produce α plots per second. Numbers are for farming for 1 year.

HDD farming
- setup: $20 * 52 = $1040
- operation: $40 / year
- profit: $120 * 0.052 * 365 = $2277.6

GPU mining
- setup: $1500
- operation: $350 / year
- profit: $120 * 0.052 * 365 * (18.75 * α)

To compensate for the ~$310 difference in the power bill (350W vs 40W), you need to generate a bit more than 1 full plot per 18.75 seconds. To break even, you need to generate a plot in 16.5 seconds, or α > 0.0606.

If the profit drops to $20/PiB/day, then it's 10.3 seconds, or α > 0.097.

Of course, in reality you'd have to generate 2 full plots within 25 seconds to squeeze 2 attempts onto 1 challenge.
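The break-even figures above can be reproduced directly, reading the "$120/PiB" assumption as per day (which matches the $2277.6 yearly-profit arithmetic) and charging the GPU only for its extra power bill:

```python
# Reproducing the TCO break-even above. All inputs are the post's
# assumptions: $120/PiB/day profit, $1/W/year power, 512 plots ≈ 0.052 PiB.
PIB_FARMED = 0.052
CHALLENGE_INTERVAL_S = 18.75
POWER_DIFF_PER_YEAR = 350 - 40   # GPU watts minus HDD watts, at $1/W/year

def break_even_plot_seconds(profit_per_pib_day: float) -> float:
    """Seconds per plot at which on-the-fly GPU plotting earns enough
    extra to cover its higher power bill (setup costs ignored, as in
    the post's comparison)."""
    yearly_profit = profit_per_pib_day * PIB_FARMED * 365
    # GPU yearly profit = yearly_profit * (18.75 * alpha); it must
    # match the HDD profit plus the extra power cost:
    alpha = (yearly_profit + POWER_DIFF_PER_YEAR) / (yearly_profit * CHALLENGE_INTERVAL_S)
    return 1 / alpha

print(round(break_even_plot_seconds(120), 1))  # 16.5 s at $120/PiB/day
print(round(break_even_plot_seconds(20), 1))   # 10.3 s at $20/PiB/day
```

Note how the bar gets harder as profitability falls: at lower $/PiB the fixed power bill looms larger, so the GPU has to plot faster to stay even.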

On the fly, shall be no problem ...
We can always use the same GPU plotter to generate plots onto hard drives. You can't just compare a fancy GPU plotter on a futuristic GPU with an old Xeon E5 CPU. If you can do it 4,608 times per day, I can surely do it 128 times per day and dump the results onto hard drives.
 

Bert

Active Member
Mar 31, 2018
458
169
43
43
If a system can be built to generate plots on demand, Chia will be broken, because the same system can announce 100 exabytes of available plots and take over the network. The devs will never let this happen, and we are nowhere close to generating plots on the fly. AFAICS, plotting is not infinitely scalable.
 

Bert

Active Member
Mar 31, 2018
458
169
43
43
Can anybody help me out?

I've been trying to use the MadMax plotter on my 64-core Rome ES, but it's really slow.

2 x 300 GB 10k SAS drives = 7.5-hour plots with 128 threads (it used to be ~15 hours in the GUI/PowerShell, but with one plot per drive, so no real improvement).

Sloth Tech TV churned out a plot with 2 x 300 GB 10k SAS drives in 73 minutes on DDR3 memory; I'm nowhere near that.

I even tried to make a RAM disk with ImDisk (when I made a 110 GiB drive, the plots crashed; I have 128 GB total). So: a 110 GiB tmp2 drive, plus an NVMe SSD for tmp1 (Corsair Force MP600 1TB; yes, it's not great, but it's not supposed to be slowing me down this much). I tried giving a few more gigs to the RAM disk and setting fewer threads, but it was still TOO slow.

The first thing I immediately noticed is how long it takes to create Table 1. A Threadripper with a RAM disk took 6 seconds to make Table 1. When I tried it with the RAM disk overnight, it was around 140 seconds, and then it crashed. On SAS drives it takes ~300 seconds; on HDDs, 400+ seconds.


Here is my current attempt, on 2 x 300 GB 10k SAS drives, but it's still way slower than it should be, and I'm running out of ideas:
View attachment 19068


What I have already tried:
  1. Different versions of the Windows Chia plotter
  2. Different OSes (right now on Windows Server 2019; tried Win 10 Pro and Ubuntu)
    On Ubuntu I couldn't get the overclock tool to work, and at default clocks it was still really slow in phase 1 (same times as Windows)
  3. 256 buckets seem to be faster than 128, but not a drastic difference
I've tried the Windows MadMax plotter on my i9-7900X; I just tested Table 1 on a SATA SSD, and it took 79 seconds with 20 threads.

Another interesting thing I noticed: when I launch the plotter with just 1 thread on the 2S CPU, the table gets made in ~130 seconds, but when I add more threads, it actually gets slower. Btw, Task Manager shows 2 sockets instead of 1; on Win 10 Pro it showed one, as it should, but the times were the same. So when I set 128 threads, I see only half of them being used. But it's still way slower than it should be.

So there really must be a problem with my EPYC platform.
My setup:

ASUS KRPA-U16 version 0302
2s1404e2vjug5 64-core CPU
M393A2K43DB3-CWE x 8 sticks
Windows Server 2019 version 1809 build 17763
HP Smart Array P410 1GB for the SAS drives, so yeah, it might be shit, BUT what's up with the RAM disk tmp2 + NVMe tmp1? It's still not supposed to be that slow.
The BIOS is pretty much stock.

My only wild guess is that maybe there's a bug with 8 channels or something.

$20 PayPal to the first who figures out the culprit.

Thanks in advance.

1. Try lowering the thread count to 16.
2. Use Ubuntu; Windows is not good for plotting.
3. On Ubuntu, get iostat and check whether you are I/O-bound. It's not clear to me whether you have the right temp drives and I/O capacity. If you are not I/O-bound, we can dig deeper to find the bottleneck.


BTW, the MadMax plotter is not going to increase throughput a lot more, but your plotting setup is extremely imbalanced, so the MadMax plotter seems to be the best option.
 
Last edited:

Bert

Active Member
Mar 31, 2018
458
169
43
43
Well, when it comes to plotting, I pretty much discovered the approach covered here:

https://www.reddit.com/r/chia/comments/o1ggyf
One big difference is that I use servers, not workstations. I just made a new build using a CSE-216 case, and the total cost must be a lot less than $1000. I used old Opteron CPUs and was able to generate close to 40 plots a day. I can't get an exact number since I didn't have hard drives to store the plots, so I couldn't run a full test.
 

gb00s

Active Member
Jul 25, 2018
686
242
43
Poland
-80+% from ATH in 1.5 months .... ;)

XCH_BTC_USD.png

So you need 600% from here to be back at the ATH. In crypto, no problem ....

1873 plots .... Still 0 rewards.
 

mirrormax

Active Member
Apr 10, 2020
168
72
28
I'm kinda over Chia; there are now more interesting copies like Flax and Chaingreen, both decentralized and without a premine.
 

Marsh

Moderator
May 12, 2013
2,534
1,380
113
June 12: won 2 XCH.
June 18: stopped plotting; 3,800 plots total.
June 22: won 2 XCH. 8 XCH won in total since mid-May; the calculator shows 17 days to win.

Starting to plot again, committing another 100TB.
I've already run out of high-density drives, so I'm starting to use 3TB to 6TB drives.