I’ve read that it’s possible to run MadMax with a RAM disk for both temp1 and temp2 using only 256GB of RAM. Has anyone tried this? I want to build a new plotting machine and am trying to decide whether it’s worth going with 256GB of RAM or not.
The problem I see is when it completes a plot and begins copying it to the final destination while the next plot starts. Wouldn’t that cause it to run out of space?
Yes, but you can pass -w to force it to wait for the copy to finish before starting the next plot.
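For reference, a full-RAM invocation with -w might look like the sketch below. The mount point, tmpfs size, and key placeholders are illustrative, not from the thread; whether a single RAM disk fits both temp dirs depends on your peak temp usage, so treat the 240G figure as an assumption to verify.

```shell
# Create one tmpfs RAM disk shared by temp1 and temp2
# (size is illustrative; check peak temp usage for your k-size)
sudo mkdir -p /mnt/ram
sudo mount -t tmpfs -o size=240G tmpfs /mnt/ram

# -w makes chia_plot finish copying to the destination (-d) before
# starting the next plot, so the RAM disk never holds two plots' temps
./chia_plot -n -1 -r 32 -t /mnt/ram/ -2 /mnt/ram/ -d /mnt/hdd01/ -w \
    -f <farmer_key> -p <pool_key>
```

Without -w, copying to a slow HDD overlaps the next plot and the RAM disk can fill up, which is exactly the failure mode described above.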
256GB of RAM is not worth it; you can plot with two 128GB machines instead. My 5950X with an SN550 gets 26–28 minute plots. You’d see a small improvement with 256GB of RAM, but with two 5950X machines at 128GB each you’d double your throughput.
It is true that using all RAM doesn’t speed things up much. If temp2 is a 110GB RAM drive and temp1 is an SSD/NVMe (or a fast RAID 0 array of HDDs), then you could spread the RAM over two machines and plot more per day.
I was doing this with two servers, but have now switched to one machine plotting entirely in RAM, because it’s less hands-on and uses less power. I’m happy to plot a little slower since netspace isn’t rising much, payouts are steady, etc.
You do not need to wait for the copy before starting the next plot. You just need to write the plot to something fast. No problem with 20 threads and RAID 0 (E5-2690v2 and 2 x 960GB SSDs).
May I ask: What are your plot times on madMAx with SSD+RAMdisk vs. all RAMdisk?
Just to give more context: I’m currently plotting with two different machines and would like to transition to just one. Both of my current machines have 128GB of DDR4 RAM and a RAM drive as temp2 (one is a dual Xeon E5 v3 doing about 24-minute plots, the other a dual Xeon E5 v4 doing about 25-minute plots), and both use an SK hynix Gold 1TB NVMe as temp1.
I was fortunate enough to get my current machines for great prices and should make out pretty decent selling them once I replace them with one. For me, Chia is more of a hobby than anything else, and making money is just a benefit. I love tinkering with Hardware and wanted to try using a straight Ram drive for both temp destinations as I have not done this yet. Plus it will be so much easier just to manage one machine (and keep the insane heat caused by my farm and plotting machines down a little).
My only concern is having to wait for the copy of a plot to finish before starting another, since that wastes time. I also want to write to a regular 7200 RPM HDD rather than an SSD (so I don’t have to transfer things over later). If it’s possible to skip the wait-for-copy and still write directly to a regular HDD, I think I’m going to go for it; but if the only options are writing to an SSD as the final destination or waiting for the copy, I probably won’t try 256GB of RAM for both temps.
If the goal is to consolidate hardware from two machines into one, then you can make use of 256GB of RAM by running two plotter processes in parallel, each using its own 107GB temp2. You can even use a single NVMe as temp1, but stagger the processes so that its load is spread out evenly.
I observed temp1 drive usage during different phases and noticed sometimes the drive is at very low saturation level (so my particular system was bottlenecking somewhere else).
Each plotter process should use affinity to make sure they do not attempt to hijack each other’s threads.
I’d expect your overall plotting capacity to decrease, but it will achieve the goal of more efficient hardware use.
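The staggered two-process setup described above could be sketched like this. Mount points, tmpfs sizes, destination paths, key placeholders, and the 15-minute stagger are all illustrative assumptions, not values from the thread:

```shell
# Two independent tmpfs RAM disks, one per plotter process
sudo mkdir -p /mnt/ram1 /mnt/ram2
sudo mount -t tmpfs -o size=110G tmpfs /mnt/ram1
sudo mount -t tmpfs -o size=110G tmpfs /mnt/ram2

# First plotter: NVMe as temp1, its own RAM disk as temp2
./chia_plot -n -1 -r 16 -t /mnt/nvme1/ -2 /mnt/ram1/ -d /mnt/hdd01/ \
    -f <farmer_key> -p <pool_key> &

# Stagger the second plotter so the two don't hit the same phase
# (and contend for the same temp1 bandwidth) at the same time
sleep 900
./chia_plot -n -1 -r 16 -t /mnt/nvme1/ -2 /mnt/ram2/ -d /mnt/hdd02/ \
    -f <farmer_key> -p <pool_key> &
```

The stagger interval is a tuning knob: ideally one process is in a CPU-heavy phase while the other is in an I/O-heavy one.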
Thanks for the reply, I think this is probably my best bet.
I will have two spare 1TB NVMe drives, so I will run two separate plotter processes, one on each drive. The processors I am going to pick up are 2x E5-2697A (16 cores each), and I will just run -r 16 for each plotter process. Do you think -r 16 will be sufficient so they won’t hijack each other’s threads?
I think it should do pretty well setup this way and may even match or beat my current times with using 2 different machines.
Seems like it would be super easy and fun to try…
I hook my HDDs right up to the plotter too, it’s a lot easier than trying to move plots over the network. However, I still use a fast SSD in-between to free up temp space ASAP.
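A minimal mover loop for that SSD staging setup might look like the sketch below. The paths are hypothetical, and it assumes finished plots appear in the staging directory as `*.plot` files:

```shell
#!/bin/bash
# Move finished plots from a fast SSD staging dir to slow HDD storage,
# freeing up SSD space as soon as each plot lands there.
move_plots() {
    local src="$1" dst="$2"
    for plot in "$src"/*.plot; do
        [ -e "$plot" ] || continue   # glob matched nothing: no plots yet
        mv "$plot" "$dst"/
    done
}

move_plots /mnt/ssd-staging /mnt/hdd01
```

In practice you’d wrap the call in something like `while true; do move_plots …; sleep 60; done` so it drains the staging SSD continuously while the plotter runs.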
Even better, turn off memory interleave and use numactl to bind each madMAx instance to its own NUMA node.
Thanks! This part is a bit over my head lol. Any chance you could explain exactly what I would need to do?
Well, it’s going to vary from system to system, but because you have a dual-socket system, half the memory is assigned to each socket. Most scalable systems are set up out of the box to interleave memory between the sockets for compatibility with non-NUMA workloads. In your BIOS, there should be an option to disable memory interleave (or enable NUMA). Then when you boot into Linux and run numactl --hardware, you should see two NUMA nodes.
(Note for anybody reading this with newer systems: some architectures are “sub-NUMA” and can present multiple NUMA nodes per socket to the operating system.)
Now what you can do is use numactl to isolate each madMAx instance to one socket and the memory assigned to it. This will (in theory) perform better because no traffic has to pass over QPI, i.e. between sockets. Also, when you create the two RAM disks you can pass a mount option to indicate which NUMA node each should use. And you can use lstopo to figure out which NVMe is attached to which socket so that traffic stays local as well.
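Putting those pieces together, a NUMA-pinned setup might look like this. Node numbers, sizes, and paths are illustrative; the `mpol=bind:N` tmpfs mount option is what binds each RAM disk’s pages to a node:

```shell
# One RAM disk per NUMA node, with memory bound to that node
sudo mkdir -p /mnt/ram0 /mnt/ram1
sudo mount -t tmpfs -o size=110G,mpol=bind:0 tmpfs /mnt/ram0
sudo mount -t tmpfs -o size=110G,mpol=bind:1 tmpfs /mnt/ram1

# Pin each madMAx instance to one socket's CPUs and local memory;
# pair each with the NVMe attached to that socket (check with lstopo)
numactl --cpunodebind=0 --membind=0 \
    ./chia_plot -r 16 -t /mnt/nvme0/ -2 /mnt/ram0/ -d /mnt/hdd01/ &
numactl --cpunodebind=1 --membind=1 \
    ./chia_plot -r 16 -t /mnt/nvme1/ -2 /mnt/ram1/ -d /mnt/hdd02/ &
```

With this layout, each plotter’s CPU, RAM disk, and temp1 NVMe all sit on the same socket, so nothing has to cross the inter-socket link.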
Full disclosure: I haven’t actually tried this myself because I don’t have any dual-socket systems available for personal use.
EDIT: Here’s an example from a PowerEdge R630 with two E5-2637v3 processors and 64GB RAM:
$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 32092 MB
node 0 free: 2355 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 32229 MB
node 1 free: 1137 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
Generally speaking, -r equal to the number of physical cores is a good place to start, then tweak based on experimental results. But -r is just a thread-count multiplier; it doesn’t tell your system to allocate processes to certain logical processors.
To control this allocation, you need to use system tools (depending on hardware and OS). As @bambinone mentioned above, numactl is one of those tools (on Linux). The more general name for this is “CPU affinity”.
In this video, the concept is explained on a Windows example: Chia's Harry Plotter & The Affinity Wars! - YouTube
I plot faster using two madMAx instances with 110GB of RAM each and RAID 0 as temp1 than with a single 244GB RAM disk.
One madmax + 128GB and RAID0 = 39min
One madmax + 256GB = 36min
Two madmax + 128GB each = 32min
The only reason for me to plot on 256GB of RAM is to save on SSD life. If you want to plot as fast as possible, use two madMAx instances + 2 x 110GB RAM disks as temp2 + SSD as temp1 and HDD as destination. (In my case, anyway.)
Very cool. Thanks for sharing!
Thank you all for the information and insight! I’m definitely going to implement everything mentioned and run 2 plotters on one machine.
I plot entirely with RAMDisk using MadMax once in a while. I got a bunch of old servers that we were getting rid of from work, so I have 3 servers with 1TB of RAM each as my plotters. (I actually have more than that, but I trip circuit breakers if I try to run more than that) I usually only use the RAMDisk for temp2, put temp1 onto some old SSDs I have and write the end result to cheap spinning disk. (then I run a script that pulls those plots from the spinning disk over to cheap SATA drives I have in my farmer) Once I fill up all that spinning disk, though, I switch over to putting temp1 and 2 onto the RAMDisk and writing the final plot files to those old SSDs. I can’t run nearly as many in parallel, but it lets me continue to plot a little more and I turn my plotters into remote harvesters.
Once I decide to buy more SATA drives for my farmer, I run a script that moves all of those existing plots from my plotters over to those and start back with some aggressive plotting.
If you have a TB of RAM in each server, why aren’t you using Bladebit??? Have you tried it?
How is that possible?
My plot time with madMAx using one 110GB RAM disk and RAID 0 SSDs was 19 minutes.
What CPU are you using? That must be your bottleneck. I had a 5950X.
Dang. I didn’t even know about Bladebit. I wish there was a Windows version of it. I’m not sure what performance I’d get because my servers have 4x Xeon E5-4650s, and they’re 9 years old at this point. I’m also not sure how a 4-socket setup will perform, since NUMA apparently starts to factor into it. I’m curious to try it.