> Looks like I wasn't clear enough. There are rumours that K32s are already broken, because an institution is already able to make a plot that meets the requirements of a challenge. That means you can create a plot fast enough that every challenge answer from you is accepted and you receive XCH. I would call this "plotting on the fly", or just jumping the queue. It would also mean that someone has figured out the hash algo... Isn't it so?

You were clear enough, and those are essentially PoW. However, it's still unclear how many plots it would take to answer a challenge. But they are rumours.
> Nice dashboard. What software is this?

Thanks. Built using Splunk, trial version / free version (after 60 days).
> You were clear enough, and those are essentially PoW. However, it's still unclear how many plots it would take to answer a challenge.

You don't necessarily have to construct a winning plot reliably every time to still have a huge advantage if you can make them on the fly. Just being able to always pass the plot filter means that the one plot you're creating is equivalent to having created and stored 512 plots. At that point it becomes an economic question: is it cheaper to create and store 512 plots, or to run the hardware that generates one plot on the fly? Currently storage still wins out, but if someone releases a GPU plotter that can generate plots very cheaply, that may change.
It's easy to craft a plot that passes the plot filter, but finding a plot that actually answers the challenge would be a lot more difficult. It also depends on how much it would cost to PoW the answer: if it requires thousands of cores and hundreds of terabytes of memory to solve enough challenges, PoW may not make sense when each block is cheap enough. PoW would mean paying more to win less value.
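Rough filter arithmetic, assuming Chia's 1/512 plot filter and the 18.75 s average block interval used later in the thread (a numbers-only Python sketch):

```python
# How the 512x equivalence falls out of the plot filter.
PLOT_FILTER = 512
CHALLENGES_PER_DAY = 86400 / 18.75   # = 4608 blocks/day on average

def expected_passes(n_plots: int) -> float:
    """Expected number of stored plots that pass the filter for one challenge."""
    return n_plots / PLOT_FILTER

# On average one plot out of 512 stored gets to attempt each challenge...
print(expected_passes(512))              # 1.0
# ...so one always-passing plot crafted per challenge matches a 512-plot farm,
# while a single stored plot only participates about 9 times a day:
print(CHALLENGES_PER_DAY / PLOT_FILTER)  # 9.0
```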
Can anybody help me out?
I've been trying to use the MadMax plotter on my 64-core Rome ES, but it's really slow.
2 x 300 GB 10k SAS drives = 7.5-hour plots with 128 threads (it used to be ~15 hours via the GUI/PowerShell, but with one plot on each drive, so no real improvement).
Sloth Tech TV churned out a plot with 2 x 300 GB 10k SAS drives in 73 minutes on DDR3 memory; I'm nowhere near that.
I even tried making a RAMDisk with ImDisk: when I made a 110 GiB drive, the plots crashed. I have 128 GB total, so that was a 110 GiB tmp2 drive plus an NVMe SSD as tmp1 (Corsair Force MP600 1TB; yes, it's not great, but it's not supposed to be slowing me down this much). I tried giving a few more gigs to the RAMDisk and setting fewer threads, but it was still TOO slow.
The first thing I immediately noticed is how long it takes to create Table 1. A Threadripper with a RAMDisk took 6 seconds to make Table 1. When I tried it with the RAMDisk overnight, it was more like 140 seconds, and then it crashed. On the SAS drives it takes ~300 seconds, on HDDs 400+ seconds.
Here is my current attempt, on 2 x 300 GB 10k SAS drives, but it's still way slower than it should be, and I'm running out of ideas:
[attachment 19068]
What I have already tried:
- The Windows MadMax plotter on my i9 7900X: I just tested Table 1 on a SATA SSD and it took 79 seconds with 20 threads.
- Different versions of the Windows Chia plotter.
- Different OSes (right now Windows Server 2019; also tried Win 10 Pro and Ubuntu). On Ubuntu I couldn't get the overclocking tool to work, and at default clocks phase 1 was still really slow (same times as Windows).
- 256 buckets seem to be faster than 128, but not a drastic difference.
Another interesting thing I noticed: when I launch the plotter with just 1 thread on the 2S CPU, the table gets made in ~130 seconds, but when I add more threads it actually gets slower. By the way, Task Manager shows 2 sockets instead of 1; on Win 10 Pro it showed one, as it should, but the times were the same. And when I set 128 threads, I see only half of them being used. Either way, it's far slower than it should be.
So there really must be a problem with my EPYC platform.
My setup:
ASUS KRPA-U16 version 0302
2s1404e2vjug5 64-core CPU
M393A2K43DB3-CWE x 8 sticks
Windows Server 2019 version 1809 build 17763
HP Smart Array P410 1GB for the SAS drives, so yeah, it might be shit, BUT what's up with the RAMDisk tmp2 + NVMe tmp1? That's still not supposed to be this slow.
The BIOS is pretty much stock.
My only wild guess is that maybe there's a bug with running 8 memory channels or something.
$20 PayPal to the first who figures out the culprit.
Thanks in advance.
I tried 16 threads with a ramdisk, but it was still incredibly slow.
64-core Rome is effectively a NUMA machine, right? You might try running a few processes in parallel, giving all the cores and memory of one node to each process, to make sure you're not bottlenecked by core-to-core communication.
And I'm not surprised it crashed with a 110G ramdisk if you've only got 128G of RAM and you're running 128 threads. The RAM required outside the ramdisk scales with the number of threads, and I don't think 18G is nearly enough for 128 threads.
Lastly, it can help to leave a little of the machine's resources for the OS. On my 8-core/16-thread machine, for example, running with 14 threads was faster than with 16.
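Something like this launcher is what I mean (a sketch: the `chia_plot` path, temp dirs, and thread counts are placeholders; `numactl` pins each process to one node's cores and memory):

```python
# Run one MadMax process per NUMA node via numactl (Linux).
import subprocess

NUMA_NODES = 2          # adjust to what `numactl --hardware` reports
THREADS_PER_NODE = 32   # leave a few cores free for the OS

procs = []
for node in range(NUMA_NODES):
    cmd = [
        "numactl", f"--cpunodebind={node}", f"--membind={node}",
        "./chia_plot",                # MadMax binary; path is a placeholder
        "-r", str(THREADS_PER_NODE),  # worker threads
        "-t", f"/mnt/tmp{node}/",     # separate temp dir per process
        "-d", "/mnt/farm/",           # destination
    ]
    procs.append(subprocess.Popen(cmd))

for p in procs:
    p.wait()
```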
> Currently storage still wins out, but if someone releases a GPU plotter that can generate plots very cheaply, that may change.

That's when PoW becomes cheaper than PoST, which goes against the original purpose of the project (that PoST runs cheaper and greener than PoW). If that happens, the whole project goes in the trash bin.
> Just being able to always pass the plot filter means that the one plot you're creating is equivalent to having created and stored 512 plots.

That's correct. Each plot created on the fly has an equal chance to win against 512 stored plots.
> That's correct. Each plot created on the fly has an equal chance to win against 512 stored plots.

What am I missing here? How much time do you need to generate the 512 plots, and how much energy do you need for them? What extra hardware do you need to generate those plots, compared with a GPU generating plots on the fly? And where are these plots coming from? Why would a GPU with 256GB be mandatory? Because of the max plot size? What if just one disk dies? You have to generate 128 plots again before you're back to equal with a GPU plotting on the fly. A GPU plotter is already in the works and not so far away. A protocol change makes you generate 512 plots again; on the fly, that's no problem...
Assuming 4 drives are used to store the 512 plots, that's about 40W, and blocks are 18.75 seconds apart on average. If a GPU can craft a plot using 200W of power, it has to do it in 3.75 seconds to break even. And 51TB at $20/TB is $1020.
The GPU would have to sell for close to $1020, plot in 3.75 seconds, and probably carry 256GB of VRAM on the card. I doubt we'll have this in the very near future.
Also, the Chia team can fork and change the plot filter to also consider the content of the file. In that case it would be very difficult for a GPU to get around the plot filter.
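Sanity-checking that arithmetic (all figures are the assumptions from this post, not measurements):

```python
# Energy break-even between storing 512 plots and plotting on the fly.
BLOCK_INTERVAL_S = 18.75   # average seconds between blocks
HDD_POWER_W = 40           # ~4 drives holding 512 plots (assumed)
GPU_POWER_W = 200          # assumed draw of an on-the-fly GPU plotter

hdd_joules_per_block = HDD_POWER_W * BLOCK_INTERVAL_S    # 750 J
breakeven_seconds = hdd_joules_per_block / GPU_POWER_W   # 3.75 s per plot

# Capital cost of the stored plots: 51 TB at $20/TB.
storage_cost_usd = 51 * 20                               # $1020
print(breakeven_seconds, storage_cost_usd)
```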
> That's when PoW becomes cheaper than PoST, which goes against the original purpose of the project.

Agreed, except you don't necessarily need 256GB on the GPU: CXL solves the problem of attaching a lot of storage to GPUs very elegantly, and it's coming very soon.
> CXL solves the problem of attaching a lot of storage to GPUs very elegantly, and it's coming very soon.

True. Let's see how it goes!
> But the Chia devs didn't think the plotting algorithm was highly parallelizable until someone made one, and it cut plot times from ~6hrs to ~30min on standard hardware.

I mean, unless the plotting process is strictly sequential, there will be ways to parallelize it. I can't think of why they believed it couldn't be parallelized. Very strange to me.
> If you're having to do reads in the plot anyway, that defeats the purpose of it.

Yes. Maybe there's a way to integrate certain data from the plot file into the plot filter that can be proved but doesn't involve any reads while idle. We could store that piece of data alongside the plot id in the plot index to avoid unnecessary reads, and if other clients can verify it, we can prevent the plot-id-crafting attack.
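A sketch of that idea (entirely hypothetical, not Chia's actual protocol): fold a content commitment, cached next to the plot id in the plot index, into the filter hash, so farming needs no disk read, while peers can still verify the commitment against the full plot whenever a proof is submitted.

```python
import hashlib

FILTER_BITS = 9   # 1/512, like the current plot filter

def passes_filter(plot_id: bytes, content_commitment: bytes, challenge: bytes) -> bool:
    """Hypothetical filter over (plot id, content commitment, challenge).

    The commitment lives in the plot index, so no plot read happens while
    farming; peers check it against the plot when a proof is claimed.
    """
    h = hashlib.sha256(plot_id + content_commitment + challenge).digest()
    # Pass when the top FILTER_BITS bits are zero (probability 1/2**FILTER_BITS).
    return int.from_bytes(h[:4], "big") >> (32 - FILTER_BITS) == 0
```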
> How much time do you need to generate the 512 plots, and how much energy do you need for them?

You're missing that once a plot is stored, you don't have to generate it again every time a new challenge appears. We're trading space for time.
> A protocol change makes you generate 512 plots again; on the fly, that's no problem...

We can always use the same GPU plotter to generate plots onto hard drives. You can't just compare a fancy GPU plotter on a futuristic GPU with an old Xeon E5 CPU. If you can do it 4608 times per day, I can surely do it 128 times per day and dump the results onto hard drives.
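In rough numbers (4608 challenges/day as above; the 128 plots/day rate is the one claimed in this post):

```python
CHALLENGES_PER_DAY = 4608    # one per 18.75 s block, on average
PLOTTER_RATE_PER_DAY = 128   # claimed sustainable rate to hard drives

# On the fly: a fresh plot for every challenge, thrown away each time.
plots_burned_per_day = CHALLENGES_PER_DAY                 # 4608, forever

# Stored: the same plotter fills a 512-plot farm once, then just farms.
days_to_fill_512_plot_farm = 512 / PLOTTER_RATE_PER_DAY   # 4 days
print(plots_burned_per_day, days_to_fill_512_plot_farm)
```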