I don't understand how to set up NUMA correctly

vasiliinorris · May 13, 2023, 10:15am

I have ordered a Supermicro X9DRi-LN4F+ with 2 CPUs and 512 GB of 14900 LRDIMM, and I plan to put a pair of A4000 GPUs in it.

I want to plot with 2 GPUs in parallel because I’ve ordered too much storage and my current setup (256 GB + a single A4000) doesn’t plot fast enough: 160-180 secs per C7 plot, which is around 500 plots per day, and I need two to three times that. I’m changing the board because my current motherboard, Supermicro X9DRE-TF+, has two x16 slots that are both linked to the same CPU.
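As a quick sanity check on those numbers, here is a minimal Python sketch of the throughput arithmetic. It assumes plots run back to back with no idle time between them; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def plots_per_day(seconds_per_plot: float) -> float:
    """Plots finished in 24 hours if plots run back to back."""
    return SECONDS_PER_DAY / seconds_per_plot


def seconds_per_plot_needed(target_plots_per_day: float) -> float:
    """Average time per plot (across all GPUs) needed to hit a daily target."""
    return SECONDS_PER_DAY / target_plots_per_day


if __name__ == "__main__":
    for secs in (160, 180):
        print(f"{secs} s/plot -> {plots_per_day(secs):.0f} plots/day")
    for target in (1000, 1500):
        needed = seconds_per_plot_needed(target)
        print(f"{target} plots/day -> {needed:.1f} s/plot overall")
```

So 160-180 secs works out to 480-540 plots per day, and doubling or tripling 500 plots per day means an overall average of roughly 86 or 58 secs per plot across both GPUs.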

IIRC, for NUMA parallel plotting to work correctly, in my case the first GPU should go into an x16 slot that is connected to CPU0, and the second GPU into a slot that is connected to CPU1.
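Once the cards are installed, the slot-to-CPU mapping can be checked from Linux instead of relying on the manual. A rough Python sketch, assuming a Linux host that exposes the usual sysfs files (`class` and `numa_node` under `/sys/bus/pci/devices`); the function name is made up for illustration:

```python
from pathlib import Path


def gpu_numa_nodes(pci_root="/sys/bus/pci/devices"):
    """Map PCI address -> NUMA node for display-class PCI devices."""
    result = {}
    for dev in sorted(Path(pci_root).iterdir()):
        class_file = dev / "class"
        node_file = dev / "numa_node"
        if not (class_file.is_file() and node_file.is_file()):
            continue
        # PCI class codes beginning with 0x03 are display controllers.
        if not class_file.read_text().strip().lower().startswith("0x03"):
            continue
        # The kernel reports -1 when it does not know the node.
        result[dev.name] = int(node_file.read_text().strip())
    return result


if __name__ == "__main__":
    for address, node in gpu_numa_nodes().items():
        print(f"{address}: NUMA node {node}")
```

Any onboard video chip will show up in this list too, since it is also a display-class device. For the layout described above, the two A4000s should report different nodes.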

Also, I need to make sure that I put an equal amount of RAM into the corresponding slots so that each CPU gets 256 GB.
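The RAM split can be verified the same way after boot. A small Python sketch, assuming Linux with the per-node `meminfo` files under `/sys/devices/system/node`; the function name is made up for illustration:

```python
import re
from pathlib import Path


def node_mem_total_gib(node_root="/sys/devices/system/node"):
    """Return {node_id: total memory in GiB} for every NUMA node found."""
    totals = {}
    for node_dir in sorted(Path(node_root).glob("node[0-9]*")):
        meminfo = node_dir / "meminfo"
        if not meminfo.is_file():
            continue
        # The line of interest looks like "Node 0 MemTotal:  268435456 kB".
        match = re.search(r"MemTotal:\s+(\d+)\s+kB", meminfo.read_text())
        if match:
            node_id = int(node_dir.name[len("node"):])
            totals[node_id] = int(match.group(1)) / 1024**2
    return totals


if __name__ == "__main__":
    for node_id, gib in sorted(node_mem_total_gib().items()):
        print(f"node {node_id}: {gib:.1f} GiB")
```

With the DIMMs split evenly, both nodes should report roughly the same figure, a little under 256 GiB each because the kernel reserves some memory.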

Am I right? Or are my thoughts incorrect somewhere? If so, what shall I change?

And how do I start parallel plotting correctly? There’s little written on MadMax’s GitHub about it.

Whatshisname · May 13, 2023, 3:41pm

As for the physical setup, both Gen 3 x16 slots are linked to the same CPU… this is not optimal for using both nodes of your NUMA setup, since CPU2 will need to reach across to the x16 slot. If you do go this route, then yes, put 256 GB of RAM on each CPU; the commands in Linux would be the first examples listed here:

A better option may be to put all 512 GB on the first CPU, if you can, and remove the second CPU. In this case, it may help to have the best CPU that board can take, since it’s v2 and it’ll be the traffic cop for both GPUs as well as handling the 10Gb network if you move final plots to other machines. This setup would forgo any need for the NUMA commands; you would just need to add -g 0 and -g 1 for the respective GPUs to your command line options.
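To make the difference between the two launch styles concrete, here is a hedged Python sketch that only builds and prints the command lines. Several things in it are assumptions and not taken from the MadMax page: the plotter path `./cuda_plot_k32`, the idea that GPU 0 sits on node 0 and GPU 1 on node 1, and the use of `numactl --cpunodebind/--membind` for the pinning. The rest of the plotter options (keys, temp and destination dirs) are left out on purpose.

```python
import shlex

# Assumption: adjust to wherever your plotter binary actually lives.
PLOTTER = "./cuda_plot_k32"


def single_socket_cmd(gpu, plotter_args=()):
    """Plain launch: only select the GPU with -g."""
    return [PLOTTER, "-g", str(gpu), *plotter_args]


def numa_pinned_cmd(gpu, node, plotter_args=()):
    """Same launch, with CPU and memory pinned to one NUMA node via numactl."""
    return [
        "numactl",
        f"--cpunodebind={node}",
        f"--membind={node}",
        *single_socket_cmd(gpu, plotter_args),
    ]


if __name__ == "__main__":
    # One process per GPU; GPU index -> node is assumed, check it first.
    for gpu, node in ((0, 0), (1, 1)):
        print("no pinning :", shlex.join(single_socket_cmd(gpu)))
        print("with NUMA  :", shlex.join(numa_pinned_cmd(gpu, node)))
```

Each printed line would be run in its own terminal (or handed to `subprocess.Popen`) so that both plotters work in parallel.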

vasiliinorris · May 13, 2023, 6:39pm

Unfortunately, with my current motherboard that is not possible, because both x16 slots are on CPU1, but SSB is on CPU0.
If I remove CPU0, the motherboard will not work.
If I remove CPU1, the x16 slots won’t work.

That’s why I’ve ordered a different motherboard (X9DRi-LN4F+), which has x16 slots on both CPUs. I was asking about that one, not about my current one.

Whatshisname · May 13, 2023, 7:31pm

I see, I misread and looked up the wrong mobo. With your new mobo, you are correct with the following statements:

Other than that, make sure to set any BIOS options that may be present, as mentioned on the MadMax page, such as disabling memory channel interleave for NUMA.

The specs I’m seeing don’t mention LRDIMM modules. They may work fine (I really don’t know), but if you have issues getting it to POST, the first thing I would try is testing with non-LR modules.

vasiliinorris · May 13, 2023, 7:38pm

Thank you for your help!

As to the LRDIMM, the manual says: “Integrated memory controller supports up to 1.5 TB of Load Reduced (LRDIMM)…”

So, I think I will be fine with that type of RAM.

hajes29a · May 19, 2023, 8:44pm

I have a similar MB… both PCIe x16 slots are linked to CPU2. You won’t be able to use two GPUs… just one GPU can hog the PCIe 3.0 bandwidth. You will slow down the whole process.

If I use NUMA pinning, I get slower plots.