Ultimate Single Socket (x86) Chia Plotter - my endless adventure!

FirstEver · August 26, 2021, 12:51pm

Hello Chia community!

In May this year I started building the Ultimate Single Socket (x86) Chia Plotter.

Here is the specification:

AMD Ryzen Threadripper PRO 3995WX
ASUS Pro WS WRX80E-SAGE SE WIFI
8 x 32GB Corsair Vengeance LPX Black DDR4 DRAM 3200MHz C16 (non ECC)
18 x Corsair MP600 Gen4 PCIe SSD (only 10 in use - AMD RAID limitation)
2 x ASUS Hyper M.2 X16 PCIe 4.0 X4 Expansion Card
EVGA SuperNOVA 1600 T2

At the beginning, I would like to mention that the MadMax and BladeBit plotters were not available in May, so I used Swar Chia Plot Manager and was creating 40 plots in parallel. This generated 77 plots per day, or 1 plot every 18.7 minutes (CPU load was between 30-65% and temperatures between 60-77°C). Unfortunately, plotting 40 in parallel was not an ideal solution.

The problems I found.

The motherboard did not support AMD DOCP DRAM before BIOS version 0504 - that was a nightmare.
After shutting down the computer, it cannot be turned back on - you need to reset the BIOS and set everything up again - seriously ASUS? I reported it twice and didn’t get any reply from ASUS technical support.
I have 18 x NVMe SSDs and can only use 10? This is because you must use PCIe RAID mode for the board to detect all 4 SSDs in each PCIe adapter. But the computer will not boot if it detects more than 10 NVMe SSDs, because AMD RAID does not support that number of drives. By switching the PCIe RAID mode to 8x/8x you can bypass this limitation, but each PCIe adapter will then only detect 2 SSDs. Theoretically, it is possible to mount 7 x PCIe M.2 adapters with 2 x SSDs each (in 8x/8x mode) + 3 x SSDs in the built-in M.2 ports = 17 SSDs in total. But this guy uses the same motherboard and is able to boot with 23 x NVMe drives - how does he do that ??

I have switched to MadMax in parallel.

1 plot = 28 min
2 plots = ~35 min each = ~17.5 min / plot (+25% longer)
4 plots = I need 512GB of RAM, but I guess it will be ~45 min each = ~11 min / plot
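
The per-plot figures above come from dividing the wall-clock time of one batch by the number of plots running in parallel. A minimal Python sketch of that arithmetic (the function names are illustrative only, and the 45-minute figure for 4 plots is the guess from the list, not a measurement):

```python
MINUTES_PER_DAY = 24 * 60

def minutes_per_plot(batch_minutes, parallel_plots):
    """Effective minutes per finished plot when several plots run side by side."""
    return batch_minutes / parallel_plots

def plots_per_day(batch_minutes, parallel_plots):
    """Plots finished in 24 hours, assuming batches run back to back with no gaps."""
    return MINUTES_PER_DAY / minutes_per_plot(batch_minutes, parallel_plots)

# (parallel plots, minutes per batch) taken from the list above
for n, batch in [(1, 28), (2, 35), (4, 45)]:
    print(f"{n} parallel: {minutes_per_plot(batch, n):.2f} min/plot, "
          f"{plots_per_day(batch, n):.0f} plots/day")
```

Under the back-to-back assumption this works out to roughly 51, 82 and 128 plots per day; copying finished plots to the hard drives is not included.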

Benchmark - best RAM Disk software for Windows that supports 110GB+

ImDisk Toolkit 7.6.0.0
Passmark OSFMount 3.1.1000
Primo Ramdisk Server 6.5.0
Qiling Disk Master Free 5.5.1
SoftPerfect RAM Disk 4.2.0
StarWind RAM Disk 5.6
Ultra RAMDisk Pro 1.70

Plans for this month.

I am planning to upgrade to 512GB RAM, switch to the BladeBit plotter, and buy another 50-70 hard drives.

Does anyone know?

How to bypass the 10 NVMe SSD limitation on the ASUS motherboard?
Can DDR4-3200 CL22 ECC memory be overclocked to CL16 (for example, if I increase the voltage from 1.2V to 1.35V)?
What is the optimal system (Linux) configuration for BladeBit?
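
For context on the CL22 vs CL16 question, the gap in absolute terms can be worked out from the usual CAS latency formula. A small sketch (the function name is made up for illustration, and it says nothing about whether a given ECC kit can actually reach CL16):

```python
def cas_latency_ns(cl, data_rate_mts):
    """First-word CAS latency in nanoseconds for DDR memory.

    DDR transfers twice per clock, so the clock in MHz is data_rate / 2
    and one cycle lasts 2000 / data_rate nanoseconds.
    """
    return cl * 2000 / data_rate_mts

print(cas_latency_ns(22, 3200))  # 13.75
print(cas_latency_ns(16, 3200))  # 10.0
```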

Finally, in the last 3 months I have been VERY LUCKY in farming. I got 2.4 times more coins than Chia calculator shows.

Good luck with Chia!
Marcin aka First Ever

Bonus
My EXPERIMENTAL external 8 x M.2 NVMe connected via 8 x USB 3.2 Gen2 10Gbps

Fuzeguy · August 26, 2021, 1:16pm

Talk about overkill but 11 min is BladeBit-like …that all must have cost a pretty penny!

Couple questions… is Primo Ramdisk Server 6.5.0 actually x2-x3 faster than any other RAM disk you tried? That’s truly amazing if so!! And how do you attach these external SSDs to the motherboard? I see your ext. heatsink (awesome), but what is the device below it where all wires go? Last, for Madmax and 4 plots parallel, why do you need 512GB RAM? For me, with a 3955RX and two parallel I never see more than ~15GB used?

Now - heehee – what next to go faster still (in Windows)?

bambinone · August 26, 2021, 1:33pm

I would suggest asking on the Level1Techs forum.

rfc2324 · August 26, 2021, 1:36pm

I also tested different RAM disk software on my system: ImDisk, OSFMount, Primo. The latter shows best performance in synthetic benchmark, but ImDisk provided best times in real world chia plotting.

3995WX has 64 cores… That’s insane. But you don’t have enough RAM to really saturate it and make use of MadMax, unfortunately.

I would try just 2 MadMax processes with 2 RAM disks, allocate affinity of half of the cores to each. You won’t even need more than two NVME for this setup. You should try running 512 buckets on phase 1-2 and 256 on phase 3-4, with 32 threads per each process (if running out of RAM, reduce it to 16 or so).
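
A rough sketch of what those two invocations could look like, written as a small Python helper that only prints the command lines. Assumptions not from this thread: a Linux-style environment where `taskset` is available for pinning, hypothetical mount points for the NVMe and RAM disk, placeholder keys, and a CPU numbering where each contiguous half of the logical CPUs maps to half of the physical cores (check the topology on the real machine first).

```python
def cpu_range(process_index, logical_cpus, processes):
    """Return the 'first-last' range of logical CPUs for one process."""
    per_process = logical_cpus // processes
    first = process_index * per_process
    return f"{first}-{first + per_process - 1}"

def madmax_command(process_index, logical_cpus=128, processes=2, threads=32):
    # Paths are hypothetical placeholders: one NVMe and one RAM disk per process.
    tmp_nvme = f"/mnt/nvme{process_index}/"
    tmp_ramdisk = f"/mnt/ram{process_index}/"
    return [
        # Pinning via taskset is an assumption; the CPU numbering must be verified.
        "taskset", "-c", cpu_range(process_index, logical_cpus, processes),
        "chia_plot",
        "-r", str(threads),   # threads per process
        "-u", "512",          # buckets for phase 1-2
        "-v", "256",          # buckets for phase 3-4
        "-t", tmp_nvme,       # tmpdir on NVMe
        "-2", tmp_ramdisk,    # tmpdir2 on the RAM disk
        "-d", "/mnt/farm/",   # hypothetical final directory
        "-f", "<farmer_key>", "-p", "<pool_key>",  # placeholders
    ]

for i in range(2):
    print(" ".join(madmax_command(i)))
```

If memory runs short, pass `threads=16` as suggested above; the rest of the command stays the same.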

Some tips for this setup:

MadMax documentation says it needs 110 GB RAM disk, but in reality you can make 107 GB and it will work
Use Ubuntu on WSL v1 (Windows Subsystem for Linux) and you will get +15% raw performance
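
As a back-of-the-envelope check on RAM sizing, here is a small sketch that multiplies the per-process footprint by the number of parallel MadMax processes. The 110 GB RAM disk is the documented figure quoted above; the 15 GB of working memory per process is only an assumption borrowed from the usage reported earlier in this thread, and it may well be higher with 512 buckets and 32 threads.

```python
RAMDISK_GB = 110   # RAM disk per MadMax process (documented figure quoted above)
PROCESS_GB = 15    # ASSUMED working memory per process, from usage reported in this thread

def ram_needed_gb(parallel_plots, ramdisk_gb=RAMDISK_GB, process_gb=PROCESS_GB):
    """Rough RAM needed for N parallel plots, excluding the OS itself."""
    return parallel_plots * (ramdisk_gb + process_gb)

for n in (1, 2, 4):
    print(n, "parallel:", ram_needed_gb(n), "GB")
```

With these inputs, 2 processes come to 250 GB, which only just fits in 256 GB, and 4 processes come to 500 GB, which is why 512 GB comes up for 4 plots in parallel.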

My setup is Ryzen 5900x + 128 GB DDR4 3600 CL16 and I run only 1 MadMax process with 12 threads, and getting plot times consistently sub-26 minutes.

dctech · August 27, 2021, 2:23am

If you are on Windows with MadMax, even with emulated Linux, you take a 15-30% performance hit from my observation. I’m on a 5950X, 128GB 3600 Ballistix, and 1x 2TB NVMe 3.0 on a Dark Hero, which does ~20min per plot under Pop!_OS 21.04 (Ubuntu 21.04). I used to run SWAR on multiple NVMes and I’m glad MadMax came along; huge thanks to the dev, as it saves the life of my NVMe!

Grandmast3rr · August 27, 2021, 5:57am

Howdy! I’ve been working on the 3995x for a while now. Given your setup, with a few minor tweaks you could get 4x plots in parallel on MadMax at around 31min/37min per set. PM me if you wish to go over my build and compare notes!

Astelith · August 27, 2021, 8:42am

For a high-end plotter, forget NVMe; that is a lot of money going to waste. In my setup I have an HPE DL-580 G8 with 1TB of DDR3 acting as a RAM disk for the temp1 folder. The plotter uses the same motherboard as yours, the ASUS Sage, with a TR Pro 3975 and 256GB of RAM (fully manually tuned 3200), and temp2 resides on a local RAM disk. Between the RAM disk server and the plotter I have a direct 100Gbit InfiniBand connection. I can plot in 18 minutes. I was about to try 2 plots simultaneously but I ran out of space; soon I will be expanding the farm and doing more tests. With your CPU and 512GB you can probably do 4 at a time at most likely 20 min each, giving a beastly output, and most importantly you can plot an infinite number of plots for less money than all those NVMe drives.

FirstEver · August 27, 2021, 9:37am

Hi, can you tell me what RAM you are using and with what settings? Thanks.

I am testing the latest Windows 11 build with the latest AIDA64 Beta build: https://download.aida64.com/aida64extreme_build_5754_tj5kp8hxfz.zip

The system is optimally tuned (max boost 4300MHz for one core and 4200MHz for all cores!)

mehdital · August 27, 2021, 12:37pm

Nice system, but can you share with us your dollar per plot/day (Basically total system costs without the hard drives divided by the number of plots it can do in 24h)? It seems to be extremely high.

FirstEver · August 30, 2021, 9:05am

Hi,
This is a long term investment and I am not going to sell any XCH coins.

You have my full rig specs, you can check prices on Google.

with 256GB RAM you can get approx. 80-140 plots/day
with 512GB RAM you can get approx. 130-200* plots/day

P.S
For Windows DO NOT INSTALL the latest AMD Chipset drivers 3.08.17.735 (-33% performance)

mehdital · August 30, 2021, 11:26am

Still, how much do you pay per 1 plot/day ? Seems unnecessary to be honest and wasteful. Even considering power costs, having few ryzen based systems might do the job in a much cheaper way.

FirstEver · August 30, 2021, 11:54am

This topic was created to share knowledge on how to optimally configure Threadripper Pro platforms. You can calculate the profitability yourself and please do not spam here.

Fuzeguy · August 30, 2021, 1:12pm

I just installed those AMD drivers you mention a couple days ago. Along w/revised SSDs configuration, I just got the BEST EVER plotting times w/my AMD ThreadRipper Pro 3955RX 64GB RAM.

MadMax ~70 PPD or 20.5 minutes/plot filling a 14TB drive.

What problem did you experience with the new drivers? Maybe only BB related?

bambinone · August 30, 2021, 2:35pm

I appreciate the fact that you built this system before madMAx/BladeBit, but it might be time to liquidate and build some cheaper plotters. I’m getting 76 PPD with my bargain-bin X299 system: i9-7960X, 128GB DDR4-3200, Optane 900P 280GB. Those four parts including the motherboard cost about $1500 after tax and shipping. You could build two similar systems complete with PSUs, cheap GPUs, etc. for less than the cost of the 3995WX.
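
To put that in the dollars per plot/day terms asked about earlier in the thread (system cost without hard drives divided by plots per 24h), a minimal sketch using only the figures above. The function name is made up for illustration, and the $1500 covers just the four parts listed, not a complete machine:

```python
def dollars_per_daily_plot(system_cost_usd, plots_per_day):
    """System cost divided by the number of plots it produces in 24 hours."""
    return system_cost_usd / plots_per_day

# X299 figures from this post: about $1500 for 76 plots per day
print(round(dollars_per_daily_plot(1500, 76), 2))  # 19.74
```

Plug in your own system cost and measured PPD to compare builds on the same basis.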

Grandmast3rr · August 31, 2021, 11:26am

185 PPD, still in testing. A 5900X build is less than $1500 and easily outperforms the 7960X, has PCIe 4.0 and many other fun upgrades. The point is that optimizing and applying more than just Chia itself to the system is the goal. There’s literally nothing you can’t do with a nice 39xx build. The price is high, but take into account the extensive list of items you will not have to get, since you have 128 PCIe 4.0 lanes to add stuff to… networking, JBOD, plotter, fork farms, etc. all in one package. You’d be lucky to do 10% of what these systems are capable of with what’s currently on the market.

bambinone · August 31, 2021, 1:27pm

My board and CPU cost about $700, which is more-or-less what you’d pay for a half-decent X570 board and 5900X. X299 has 44 or 48 usable PCIe Gen 3.0 lanes and quad-channel memory, X570 has 20 usable PCIe Gen 4.0 lanes and dual-channel memory. I think a Ryzen 5000-series system would make a great plotter and general-use system when you’re done plotting. (I’d have gone that route if I wasn’t doing VFIO.)

To be absolutely clear, I am not saying that Skylake-X is just as good or even comparable to Threadripper (Pro) 3000-series. The 7960X isn’t even in the same galaxy as the 3995WX. My point is that the 7960X (or the 5900X) is about 1/10th the price of the 3995WX, and you can run madMAx with 128GB DDR4 and one NVMe. Or you can throw together a cheap C602 or C612 dual-socket system and run BladeBit with 416+GB DDR3 or DDR4. If you’re building a system for a finite amount of plotting and don’t have other use-cases in mind, the cost per plot of the 3995WX is way too high.

I get this. The tinkering is most of the fun. And the learning opportunities are golden.

I get this, too. I built my X299 workstation with other use-cases in mind, it just happened to be a good fit for madMAx. And it was fun to overclock.

I think you could just as well do all that stuff with 44 PCIe Gen 3.0 lanes, and again, at a fraction of the cost. (As long as you’re not running a whole bunch of NVMe drives, which is no longer necessary for plotting.) But maybe I’m not thinking big enough.

At some point it just comes down to budget and preference. If you can afford a $5K CPU and you’ll get lots of use out of it when you’re done plotting, go nuts. I just hate the perception that you have to spend that much to start plotting. I went with a $400 CPU and bought more hard drives with the $4,600 I saved. To each his or her own…

Grandmast3rr · August 31, 2021, 2:35pm

I understand what you’re saying. I originally went with the 3955x over other AMD/Intel builds based on the TDP and expansion. Now that I’ve climbed into the 95, XCH is child’s play compared to what you can actually accomplish with it. I wouldn’t say go out and buy high-end 64c CPUs just to plot. You can get a k32 a day on an RPi lmao. 20min plots on a headless 2697 v3 with 128GB+ of DDR4 RAM come in well under $2K.

Fuzeguy · August 31, 2021, 3:36pm

You both are rational & pretty spot on re: choices between systems. I might add that I found the 3955x underwhelming using the authentic chia plot GUI in Windows. I went from 11 PPD on a Ryzen 3600 to only 30 PPD on the TR 3955 pro, even with 144GB memory. It made more plots possible, but not so very impressive considering the TR was >$2K. Plus too much effort to manage getting even that.

But I’ve grown very fond of the 3955RX now w/Madmax @70 PPD. TR pro is a workhorse, and is stable as a boulder. MM barely used 16GB, so I knocked it down to 64GB. I can punch out chia drives easily, just leave the thing for days on end, plugging away. With its built in 10Gbps network card, I connect to my 3600 farmer no issues (at 2.5Gbps).

The last thing is stealth: it looks like an HP desktop from Staples, low-key power, without all the hardware mess and noise. I’m all for that, as is the wife.

lihp · September 1, 2021, 9:04am

Seriously: I just joined the forum and read weird consumer grade posts?

Nice you checked benchmark bandwidth. Did you check IOPS and latency during operations for the nvmes? Did you compare ram disk latency?
ASUS boards are still consumer boards for WX.
Seriously: non-ECC RAM in a memory heavy application on crypto keys?
AMD RAID is one of the worst ideas possible: performance, reliability, migration,… Broadcom Raid controller, sw-raid are the way to go, but then with enterprise grade nvmes. Under Linux mdadm or even better a decent sw-raid outperforms this setup without questions asked - by leaps and bounds. Decent sw-raid does 30GB/s read and 15GB/s write on 4x enterprise grade PCIe 4.0 nvmes at ~150 us latency, up to ~400 us (4x Kioxia CM6-V 3.2TB or Samsung PM1735 3.2 TB - the Samsung is bad under sustained load, so the Kioxia is way better). Take 4x Intel P5800X and you get down to ~14 - 130 us with 30GB/s read and write. This “ultimate system” - except the CPU - is a piece of consumer sh…, errr crapola.
Is Windows really the ideal platform to mine? Maximum system performance? Even heavily tuned and with optimized code, Windows users are lucky to achieve 92% of comparable Linux performance. But that’s only for real Windows experts setting up the system, and there are no tuning guides for that performance in some weird howtos over the Internet…
…

To put things in perspective: a 24C/48T Epyc Milan under Linux does approx. 100 PPD with 4 plot NVMes on the classic Chia GUI at 36 - 44 plots in parallel. Considering the TR Pro specs, that 64C TR Pro CPU can easily do 200+ PPD, probably even 300 PPD. As said above: with 4 NVMe drives you can achieve way higher performance than with your RAM disks. This “Ultimate Single Socket Chia Plotter” on Windows is like beating a dead horse to win the Olympics - not gonna happen.

Best option: maybe madmax plotter with 512GB RAM, running 2 plots fully in ramdisk for tmp and tmp2 may save some of your investments. Still you can put 14 of your 18 nvmes on hold anyway.

So with all due respect, sorry to burst your bubble: besides the CPU I consider your “ultimate build” a fan project - far away from performance reality. I suggest adding some LED strips.

PS-Edit: All the above numbers are valid as of August 2021.

Fuzeguy · September 1, 2021, 12:17pm

Seriously: no respect - at all. I’ll suggest you put your holier-than-thou ‘expertise’ on the shelf, troll. When you have something created in this, our reality, to show for yourself, you might be slightly worth listening to, but I’ll say it’s not looking good.