The bottleneck in plotting is not the CPU, the SSD, or the amount of RAM!

Damn, I need a faster CPU!

Can you please tell me your times for phases 1, 2, 3, and 4?

Well, if you can throw money at the problem, then there are no problems :rofl: just buy a ton of fast CPUs and NVMe drives and go to town.


I am plotting on a Ryzen 3800X machine. I have 32 GB RAM and two 1 TB NVMe drives.
I plot using the CLI. I have 3 instances of the plotter running and do one plot on each about every 8 hours.
The first NVMe drive is for the initial plotting, the second NVMe drive is the intermediate drive, and the completed plot then transfers over the network to my share (mapped drive “Y:”).

My command is this:

chia plots create -k 32 -b 5120 -r 4 -n 10 -t D:\Chia-temp\Chia1 -2 F:\Chia1 -d Y:\

The above is for my first instance; I change the folders for instances 2 and 3 respectively.
Example: … -t D:\Chia-temp\Chia2 -2 F:\Chia2 -d Y:\ and so on.

Edit: I totally forgot to state that before I started using the intermediate drive, it was taking about 12 hours to do a plot.


Damn, what cores are those?

That’s slower than what I got onto SSDs in a modest dual-CPU system (Xeon Silver 4114, 2x 10 cores @ 2.2 GHz, Skylake-SP): 24332 s on my first plot, and that wasn’t solo either, as other plots started well before it finished.

Are you sure your processes aren’t allocating memory from the wrong node? I’m not sure core pinning prevents this. Check with numastat -p chia.

And on that note, are your ramdisks allocated on the right nodes too (tmpfs mpol option)?
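For example (the node number, size, and mount point below are just placeholders for whatever matches your layout), you can bind a ramdisk to a specific node and then confirm where the plotter’s pages actually landed:

mkdir -p /mnt/ram0
mount -t tmpfs -o size=110G,mpol=bind:0 tmpfs /mnt/ram0   # keep this ramdisk’s pages on node 0
numastat -p chia   # per-node memory for the plotter processes
numastat -m        # per-node breakdown, including tmpfs/shmem pages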


Intel(R) Xeon(R) Gold 6248 CPU @ 2.50 GHz, which is about half the speed of the CPUs that appear to be getting 2x the performance, so it now makes sense to me.

As for tmpfs, I didn’t set mpol because I approached it from the other side, i.e., pin the cores while using the default mpol=local. And that is what I saw happening with numastat, with vmstat not showing many soft faults. In fact, with so much memory, I then noticed it hardly hit the filesystem anyway.
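Roughly what I mean by approaching it from the pinning side (the node number and paths are just an example, not my exact setup): bind the plotter to one node’s cores and let the default local policy keep its allocations on that node:

numactl --cpunodebind=0 chia plots create -k 32 -r 4 -t /mnt/ram0 -d /mnt/farm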

A change in the software may get the kind of speedup I was thinking of, especially for phase 1 (whereas only so much can be done about sorting random numbers).

Not that any of it matters at this point; I filled the PB I had hanging around, and I’m in it now only out of curiosity. :slight_smile:


I would look at how much disk queuing is going on on the temp drives. I think that will tell you where the lag is happening for most everyone. I have a fast rig with PCIe 4 SSDs rated at 500-600k R/W IOPS, and parallel plotting almost always causes disk queuing. It likely doesn’t have anything to do with RAM or CPU usage, as it’s the SSD telling the OS to back off.

Generally, I see significant increases in the disk queue when I go above two parallel plots in Phase 1 on the same SSD.
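If you want to see it directly on Linux (the interval is arbitrary; Windows shows the same thing in Performance Monitor as Avg. Disk Queue Length), iostat from the sysstat package will show it:

iostat -x 5   # watch the queue-size column (aqu-sz / avgqu-sz) and %util for the temp SSD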