Does anyone have a link to Bram Cohen tweet where he shits all over NoSSD and calls them scammers? That tweet did not age well!
Now he is doing the same “it’s a scam” routine with this new GrinderPro team on Reddit. The team is not even asking for any money, they just need help getting hardware made! He simply can’t help himself.
Quick update for you guys. I found out some important information. The GrinderPro team had built a grand total of ONE prototype system with their current partner before the partner said they could not build any more due to an FPGA shortage. The team was in debt to said partner and is only receiving revenue sharing from that one 297 PiB system.
I have no evidence, but I think they are being taken advantage of, and their partner is probably building more systems behind their back (stealing revenue). I think they sort of suspect this, which is why they want a US-based partner.
They think it will be easier to make the hardware in the US but I told them all our quickturn PCBs are actually made in Shenzhen…
I don’t think ANYTHING else in crypto is using FPGAs at all, other than those CVP-13 (first time I’ve even heard of them though).
ASIC all the way as far as I know, for the last 3-5 years or so.
Thanks for the link. Sourcing the VU13P is not an issue, but that is not the direction we are heading. I’m hoping we can use new off-the-shelf PCIe 5.0 FPGA accelerator cards, instead of custom boards, to prove out the concept.
My company mines bitcoin, but our main business is ERCOT trading in the Day-Ahead Market. We don’t actually make money mining bitcoin, except during periods of negative pricing. We have a team of PhD quants who just try to predict the weather. All I need is access to GrinderPro’s code and we can take it from there.
Looks like the FPGA grinding is real. Over on the madmax Discord they found a farm with the same signature GrinderPro described, around 300 PB, that ramped up in a few days and then ran steady state. Wow, Bram just got owned.
The attack is possible by simply storing only the last table in each plot so you effectively have 98% compression. The problem is this would require actually plotting 297 PiB so it would not be efficient.
BUT, if you have a much more efficient novel way to plot and then just store the last table you CAN pull it off.
In any case, if this works the real efficiency is less than 203 TiB/W, because you have to count the pre-plotting energy spent to generate the last table. But once you have pre-plotted, the system would spoof 297 PiB using 2500 W or whatnot.
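Just to make the headline number concrete, here is a minimal back-of-the-envelope sketch (Python; the 297 PiB and wattage figures are the ones quoted in this thread, not verified):

```python
# Apparent steady-state "efficiency" of a spoofing rig, ignoring pre-plot energy.
PIB_TO_TIB = 1024  # binary prefix conversion

def apparent_efficiency_tib_per_w(claimed_netspace_pib: float, power_w: float) -> float:
    """TiB of spoofed netspace per watt of steady-state power draw."""
    return claimed_netspace_pib * PIB_TO_TIB / power_w

print(round(apparent_efficiency_tib_per_w(297, 2500)))  # ~122 TiB/W at 2500 W
print(round(apparent_efficiency_tib_per_w(297, 1500)))  # ~203 TiB/W, i.e. the headline figure
```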
I am the Dr. Yang they are talking about. I live in Shenzhen and I can come to the campus to verify your claim if you want more people to believe you. I have worked on the code for plotting and farming, so I know the process. But I don’t know anything about FPGAs except the fact that they are very fast, so I have no capacity to steal your intellectual property.

The verification process may involve getting a recent challenge, then you disconnect your system from the Internet and show how fast you can grind to farm the challenge. You can also convince me if you can explain well, in simple terms, the weakness of the plot format, how it is plotted, and why you can efficiently attack it. I may not understand all the FPGA details, but I will probably understand enough to believe your claim if I find it reasonable.
I found this compression technique over a year ago: you only store the final Y values where you have multiple equal Ys, like 3, 4 or 5. DrPlotter uses 4, and I think NoSSD 3.0 uses 3, but they include more than just the final table.
If you only store the final Y, then you need to do a full re-plot to get the proofs back. However, this attack is not grinding, since you actually need to plot all the effective netspace first. And if you try to get the 3x, 4x or 5x gain, you need to plot 4 times, 12 times, and I think around 32 times more plots.
And you also need storage to hold this data, although admittedly it’s not much.
But having to plot first makes this much, much less efficient financially. Sure, FPGA plotting does help, but it’s a monumental effort to implement. I can’t really believe someone actually did this; GPU plotting is so easy compared to that.
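Roughly, the trade-off described above looks like this (a sketch using the multipliers quoted in this post, which are my own estimates rather than measured numbers):

```python
# Space gain vs. extra up-front plotting work, per the estimates in this post.
# Key: how many equal final-Y values are required; value: (space gain, how many
# times more plots you have to generate to find enough of those collisions).
collision_modes = {
    3: (3, 4),    # ~3x gain, ~4x plotting work (reportedly what NoSSD 3.0 uses)
    4: (4, 12),   # ~4x gain, ~12x plotting work (reportedly what DrPlotter uses)
    5: (5, 32),   # ~5x gain, ~32x plotting work (rough guess)
}

for equal_ys, (gain, plot_work) in collision_modes.items():
    print(f"{equal_ys} equal Ys: ~{gain}x space gain for ~{plot_work}x plotting work")
```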
Overall, if you ignore plotting cost, you can get a 3*5 = 15x efficiency boost over simple grinding. So to spoof 297 PiB you would need to plot a k32 phase 1 in about 11 ms, which is still unbelievably fast given just 3 FPGAs with external RAM.
As a reference, 16x A100 would need around 500 ms. They have HBM, but an FPGA doesn’t need as much bandwidth since more work can be done on-chip without temporary storage in RAM. I think those pretty much cancel each other out (external RAM + FPGA == GPU with HBM).
TL;DR: I think we’re still missing a factor of 50x to get the claimed efficiency, assuming 16x A100 equals 3x VU13P in performance, which is generous towards the FPGAs, given that the A100s have double the interconnect bandwidth (300 GB/s vs 150 GB/s).
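To spell out where that 50x comes from, a quick sanity check (Python; the 500 ms and 11 ms figures are the estimates from this post, and 3x VU13P ≈ 16x A100 is the stated assumption):

```python
# Gap between a plausible phase-1 time and the time the claim would require.
plausible_phase1_ms = 500   # rough estimate for 16x A100 doing a k32 phase 1
required_phase1_ms = 11     # what spoofing 297 PiB with the ~15x trick would need

missing_factor = plausible_phase1_ms / required_phase1_ms
print(f"missing speedup: ~{missing_factor:.0f}x")  # ~45x, i.e. roughly the 50x gap
```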
Hi, thank you for the insights. So in your opinion FPGA grinding is not possible, but 98% compression is. Is this 98% compression possible for everybody today? Because if it isn’t, one way forward I see is to just modify the plotters to save the 2% that is required. It would make the plotting process a lot longer to store the same quantity of data, but it would mean the same for everybody, and at the end of the day, with the same space, everybody would have the same chances. It is hard to believe compression beyond 98% will ever be possible, so theoretically that would make this the very bottom from which Chia could then recover in price. What do you think? Thanks
A little bit off-topic. I just asked ChatGPT whether it would be feasible to grind the tables using an ASIC if a memory-heavy FPGA can be developed. The answer was yes, and the claimed efficiency gain is massive:
In cryptocurrency mining, where similar FPGA to ASIC transitions have occurred, ASICs have been observed to be up to 100x more power-efficient and several times faster. For instance, Bitcoin mining ASICs can achieve performance improvements of 5x to 10x in terms of hashing power over FPGA designs, with energy efficiency gains in the range of 20x to 50x.
Does this mean that using an ASIC to grind the plots would already be very efficient, if such an ASIC grinder were developed? How many times more efficient could it be compared to a GPU?
Yes, but it’s highly inefficient, since you will have PCIe and host RAM bottlenecks.
It’s better on an FPGA if you can fit 256 GB of RAM on one board, or manage to interconnect multiple FPGAs.
Plotting a k32 phase 1 takes around 800 GiB of total RAM traffic in the best case, that’s reads + writes added together.
So to do phase 1 in 11 ms, to spoof ~300 PiB with “98%” compression, you need 72 TiB/s of RAM bandwidth. For reference, the best HBM FPGA you can buy today has around 820 GiB/s, so you would need 90 of those. Or around 720 VU13P, not 3.
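Spelled out, that bandwidth arithmetic looks like this (Python; the 800 GiB traffic figure and the ~820 GiB/s HBM number are my estimates above, and the ~100 GiB/s per VU13P is what the “around 720” figure implies):

```python
# RAM-bandwidth requirement for doing a k32 phase 1 in 11 ms.
GIB_PER_TIB = 1024

phase1_traffic_gib = 800      # total reads + writes for one k32 phase 1 (estimate)
target_time_s = 0.011         # required per-plot time to spoof ~300 PiB

required_bw_gib_s = phase1_traffic_gib / target_time_s
print(f"required: ~{required_bw_gib_s / GIB_PER_TIB:.0f} TiB/s")        # ~71 TiB/s

hbm_fpga_bw_gib_s = 820       # best HBM FPGA available today (estimate)
print(f"HBM FPGAs: ~{required_bw_gib_s / hbm_fpga_bw_gib_s:.0f}")       # ~89, i.e. "around 90"

vu13p_ddr_bw_gib_s = 100      # assumed external DDR bandwidth per VU13P board
print(f"VU13Ps: ~{required_bw_gib_s / vu13p_ddr_bw_gib_s:.0f}")         # ~727, i.e. "around 720"
```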
ChatGPT is quite dumb for corner cases; no, it won’t scale anywhere near the same way as it did for Bitcoin. For Chia plot grinding your main problem is RAM size and bandwidth, whereas for Bitcoin it’s purely on-chip compute.