Sometimes plots take 10 minutes longer to complete

I have the following hardware:

Dell T7910 (it just replaced a T5810 with the same components, which showed the same behaviour); it's a dedicated plotting machine.

1 x E5-2699 v3 18-core CPU
256GB RAM (250GB ramdisk)
4 x 200GB Intel SSDs in RAID 0
Linux Mint
Plotting from the CLI

Sometimes my K33 plots complete in around 65 minutes; other times they take around 75 minutes. I've never been able to work out why, and it doesn't help that I'm a Linux noob.

I had originally thought it changed whenever I shut down the PC, but today I had a self-inflicted glitch and it ran out of storage space. The first two plots completed in around 75 minutes before the space ran out. After I cleared the backlog and started plotting again, every plot so far has completed in around 65 minutes. The duration stays the same for the whole batch: if it's 65 minutes they are all 65, and if it's 75 they are all 75; it doesn't seem to change halfway through a batch.

Today I've noticed it's only phase 2 that takes longer; the timings for the two plots are below.

Does anyone have any idea what could be causing it?

Edit: I should add that I'm using exactly the same settings each time; I just hit the up arrow in the console.
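For reference, the settings in the log headers below (18 threads, 2^8 buckets for P1, 2^7 for P3+P4, the RAID-0 as working directory and the ramdisk as working directory 2, and -n 27 matching "plot 4 out of 27") correspond to a madmax command along these lines. This is reconstructed from the logs rather than an exact copy of the command; the keys are placeholders, and the k33-capable binary name and -k flag depend on the madmax build, so treat it as an approximation:

./chia_plot_k34 -k 33 -n 27 -r 18 -u 256 -v 128 \
    -t /media/chiamining/Raid0/ \
    -2 /mnt/ramdisk/ \
    -d /media/chiamining/DiskBay6SAS10TB/ChiaNFTPlots/ \
    -c <pool contract address> -f <farmer public key>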

66-minute plot:

Crafting plot 4 out of 27 (2022/03/05 14:30:30)
Process ID: 12081
Number of Threads: 18
Number of Buckets P1:    2^8 (256)
Number of Buckets P3+P4: 2^7 (128)
Pool Puzzle Hash:  <snip>
Farmer Public Key: <snip>
Working Directory:   /media/chiamining/Raid0/
Working Directory 2: /mnt/ramdisk/
Plot Name: plot-k33-2022-03-05-14-30-eb1b093c53fb8be13f943506f711b17b0ab4f262786e1e90e746978c8e91fcd5
[P1] Table 1 took 35.7205 sec
[P1] Table 2 took 201.269 sec, found 8589864034 matches
[P1] Table 3 took 259.654 sec, found 8589840559 matches
[P1] Table 4 took 290.663 sec, found 8589779461 matches
[P1] Table 5 took 286.452 sec, found 8589676618 matches
[P1] Table 6 took 259.396 sec, found 8589509516 matches
[P1] Table 7 took 233.741 sec, found 8589166604 matches
Phase 1 took 1566.91 sec
[P2] max_table_size = 8589934592
[P2] Table 7 scan took 45.7622 sec
[P2] Table 7 rewrite took 78.1122 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 54.534 sec
[P2] Table 6 rewrite took 100.178 sec, dropped 1162575443 entries (13.5348 %)
[P2] Table 5 scan took 63.5149 sec
[P2] Table 5 rewrite took 79.6154 sec, dropped 1523945922 entries (17.7416 %)
[P2] Table 4 scan took 65.0739 sec
[P2] Table 4 rewrite took 78.0684 sec, dropped 1657688927 entries (19.2984 %)
[P2] Table 3 scan took 69.7358 sec
[P2] Table 3 rewrite took 77.5137 sec, dropped 1710077829 entries (19.9081 %)
[P2] Table 2 scan took 66.6757 sec
[P2] Table 2 rewrite took 75.665 sec, dropped 1731083900 entries (20.1526 %)
Phase 2 took 887.372 sec
Wrote plot header with 252 bytes
[P3-1] Table 2 took 89.4178 sec, wrote 6858780134 right entries
[P3-2] Table 2 took 80.1316 sec, wrote 6858780134 left entries, 6858780134 final
[P3-1] Table 3 took 136.35 sec, wrote 6879762730 right entries
[P3-2] Table 3 took 86.6875 sec, wrote 6879762730 left entries, 6879762730 final
[P3-1] Table 4 took 138.419 sec, wrote 6932090534 right entries
[P3-2] Table 4 took 87.0621 sec, wrote 6932090534 left entries, 6932090534 final
[P3-1] Table 5 took 135.874 sec, wrote 7065730696 right entries
Copy to /media/chiamining/DiskBay6SAS10TB/ChiaNFTPlots/plot-k33-2022-03-05-13-24-161f3dd250b1dcb3e49215a91285689472cbed96ee38a72fb187bddd5d023bf9.plot finished, took 3277.77 sec, 65.2344 MB/s avg.
[P3-2] Table 5 took 88.7771 sec, wrote 7065730696 left entries, 7065730696 final
[P3-1] Table 6 took 138.697 sec, wrote 7426934073 right entries
[P3-2] Table 6 took 91.9238 sec, wrote 7426934073 left entries, 7426934073 final
[P3-1] Table 7 took 162.496 sec, wrote 8589166604 right entries
[P3-2] Table 7 took 107.125 sec, wrote 8589166604 left entries, 8589166604 final
Phase 3 took 1353.5 sec, wrote 43752464771 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 149.794 sec, final plot size is 224225084917 bytes
Total plot creation time was 3957.69 sec (65.9614 min)

74-minute plot:

Crafting plot 1 out of 29 (2022/03/05 00:30:37)
Process ID: 2576
Number of Threads: 18
Number of Buckets P1:    2^8 (256)
Number of Buckets P3+P4: 2^7 (128)
Pool Puzzle Hash:  <snip>
Farmer Public Key: <snip>
Working Directory:   /media/chiamining/Raid0/
Working Directory 2: /mnt/ramdisk/
Plot Name: plot-k33-2022-03-05-00-30-83028f34f0b27c67fdb82f07a98bceef563c6329a5efc12b5131f1777ac199b4
[P1] Table 1 took 34.384 sec
[P1] Table 2 took 194.439 sec, found 8589850456 matches
[P1] Table 3 took 253.352 sec, found 8589867008 matches
[P1] Table 4 took 283.601 sec, found 8589849728 matches
[P1] Table 5 took 278.964 sec, found 8589799812 matches
[P1] Table 6 took 250.026 sec, found 8589694079 matches
[P1] Table 7 took 228.02 sec, found 8589342556 matches
Phase 1 took 1522.8 sec
[P2] max_table_size = 8589934592
[P2] Table 7 scan took 43.353 sec
[P2] Table 7 rewrite took 77.0855 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 40.8711 sec
[P2] Table 6 rewrite took 205.402 sec, dropped 1162536740 entries (13.5341 %)
[P2] Table 5 scan took 51.281 sec
[P2] Table 5 rewrite took 196.804 sec, dropped 1523931872 entries (17.7412 %)
[P2] Table 4 scan took 59.3791 sec
[P2] Table 4 rewrite took 183.439 sec, dropped 1657675510 entries (19.2981 %)
[P2] Table 3 scan took 61 sec
[P2] Table 3 rewrite took 232.112 sec, dropped 1710071639 entries (19.908 %)
[P2] Table 2 scan took 59.3086 sec
[P2] Table 2 rewrite took 184.702 sec, dropped 1731047063 entries (20.1522 %)
Phase 2 took 1426.65 sec
Wrote plot header with 252 bytes
[P3-1] Table 2 took 88.2685 sec, wrote 6858803393 right entries
[P3-2] Table 2 took 79.7635 sec, wrote 6858803393 left entries, 6858803393 final
[P3-1] Table 3 took 133.444 sec, wrote 6879795369 right entries
[P3-2] Table 3 took 84.5177 sec, wrote 6879795369 left entries, 6879795369 final
[P3-1] Table 4 took 136.256 sec, wrote 6932174218 right entries
[P3-2] Table 4 took 85.6728 sec, wrote 6932174218 left entries, 6932174218 final
[P3-1] Table 5 took 137.01 sec, wrote 7065867940 right entries
[P3-2] Table 5 took 86.9061 sec, wrote 7065867940 left entries, 7065867940 final
[P3-1] Table 6 took 141.836 sec, wrote 7427157339 right entries
[P3-2] Table 6 took 92.2599 sec, wrote 7427157339 left entries, 7427157339 final
[P3-1] Table 7 took 158.31 sec, wrote 8589342556 right entries
[P3-2] Table 7 took 105.892 sec, wrote 8589342556 left entries, 8589342556 final
Phase 3 took 1340.45 sec, wrote 43753140815 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 148.836 sec, final plot size is 224228717351 bytes
Total plot creation time was 4438.85 sec (73.9809 min)

Could you try a couple of short (maybe even -n 1) batches of K32s in RAM only to see if this gives consistent timings?
Then a couple of similar K32 batches on the RAID-0 to check those timings.

It would be interesting to know whether it happens when working from RAM only, RAID only, or both, for further investigation.
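A minimal version of that test, assuming the standard madmax chia_plot binary and reusing the temp paths from the logs above (the destination path and keys are placeholders), might look like:

# K32, RAM only: both temp dirs on the tmpfs ramdisk
./chia_plot -n 1 -r 18 -t /mnt/ramdisk/ -2 /mnt/ramdisk/ -d /path/to/dest/ -c <pool contract> -f <farmer key>

# K32, RAID only: both temp dirs on the SSD RAID-0
./chia_plot -n 1 -r 18 -t /media/chiamining/Raid0/ -2 /media/chiamining/Raid0/ -d /path/to/dest/ -c <pool contract> -f <farmer key>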

Thanks, I'll test it once it's finished the current batch; it's got another 19 K33s to go.

I'm fairly sure it did it with K32s, but it's been a while since I plotted any, so I don't have a clear memory. I know I got them down to 22 minutes, and I think sometimes they would take 25, but I can't be certain.

My guess is that you are running out of RAM and using swap space, since you only have about 6GB free to do the plotting and run the OS.
Keep an eye on your swap usage with top.
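A few stock commands for keeping an eye on that from a terminal (standard Linux tools, nothing plotter-specific):

free -h                               # snapshot of RAM, swap and cache usage
vmstat 5                              # the si/so columns show swap-in/swap-out activity every 5 seconds
watch -n 5 'free -h; swapon --show'   # refresh memory and swap usage every 5 seconds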

I found that the System Monitor has a nice graphical interface: during phase 1, memory usage got to 96% and only 25% of a 2GB swap file was in use.

In phase 2, memory usage was around 38% but 100% of the 2GB swap file was used; it does fluctuate, but phase 2 didn't appear to use anywhere near as much memory as phase 1. Although I don't quite understand what is displayed: it says Memory 95.7 GiB (38%) of 251 GiB, but then it says Cache 227.5 GiB. So if it's only using 95.7 GiB, where does the 227.5 GiB of cache come from?
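One likely explanation for that cache figure, assuming the 250GB ramdisk is a tmpfs mount: tmpfs contents live in the kernel page cache, so everything currently sitting on /mnt/ramdisk is reported under cache/shared rather than under the memory used by the plotter process. free shows the same breakdown:

free -h
#               total        used        free      shared  buff/cache   available
# "shared" and "buff/cache" include whatever is stored on the tmpfs ramdisk,
# while "used" only counts memory owned by processes such as chia_plot.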

This is while it's completing plots in 65 minutes, though. But if it were running out of memory, wouldn't it happen every time rather than only on some runs? Or some plots in a batch would take 65 minutes and some 74, instead of all one or the other?

Just to update this thread: I'm still seeing the issue, even with 512GB of RAM and dual E5-2699 v3 CPUs.

When I upgraded to the above, my K33 plot time dropped to around 59 minutes, but the batch I ran overnight took around 70 minutes per plot, and it was phase 2 that took the extra 10 minutes.

I had the issue on a Dell T5810 and now on the Dell T7910, both with Linux Mint; I'm not sure if it happened when I was using Ubuntu.

I've nearly finished plotting (unless more disks come up at the right price), so I will be able to devote some time to trying to locate the cause, but it's way beyond my abilities.

Hi,

I have the same effect. I only do K34 plots in a row (-n 999) with two different plotters (a Ryzen 8/16 with 64GB and a Ryzen 16/32 with 64GB, running only 22 threads because of the RAM), and I use two M.2 drives on each to plot.
Both have the same problem in phase 2 at the table rewrite: sometimes it takes double the time. If this happens, all plots after it take that long (190 min and 220 min). But if I restart the serial plotting until phase 2 shows the normal timings, all plots after that have normal timings (160 min and 190 min). Normally I need 0-4 tries after restarting the system or stopping the serial plotting to get the normal timings. :frowning:
I never have memory swapping, so that can't be the problem. It can't be a multi-CPU/multi-die issue either, because my 8/16 Ryzen has only one die.
I think it's a problem with MadMax and plots greater than K32, something to do with memory allocation/management.

Edit: I use Ubuntu Server 21.10.

I've experienced somewhat similar timing anomalies on my Threadripper 16C/32T, first using the OG plotter and then MadMax after it appeared. Parallel instances would change the overall times to complete. Finally, with MadMax I wanted to try to solve the issue and decided to play around with CPU affinity.

To test that theory, I split the cores, assigning 16 cores each to two MM instances with Win10 affinity settings. Each MM instance had its own two SSDs for T1 and T2, so they were virtually separate operations. The result was that rather than having dissimilar or variable timings between parallel instances or phases of plotting, I could keep them nearly identical over long periods of time.

I've chalked the original variances up to the Windows scheduling algorithm. If you let the OS decide which cores get what, it seems that can happen. In the end, I'm not sure it really matters too much, as the times mostly seem to average out anyway.
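On Linux, the equivalent of those Win10 affinity settings would be taskset; a sketch of splitting a 32-thread CPU between two instances (the core ranges, temp dirs and output path below are illustrative, not a tested recipe):

# Pin one plotter instance to cores 0-15 and a second to cores 16-31 (farmer/pool key flags omitted)
taskset -c 0-15  ./chia_plot -r 16 -t /mnt/ssd1/ -2 /mnt/ssd1/ -d /plots/ &
taskset -c 16-31 ./chia_plot -r 16 -t /mnt/ssd2/ -2 /mnt/ssd2/ -d /plots/ &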

This was also my first thought (CPU affinity), but my Ryzen 5700G has only one die, I plot with -r 16 (16 threads) on it, and the same problem appears, so it can't be CPU affinity when I'm using the full CPU. :frowning:

That's exactly the issue I have, using Linux Mint. I stopped and restarted plotting last night, but it was late so I left it going; I checked this morning and all the plots had taken longer.

I think, but I'm not certain, that some of my K32 runs took longer too.

The Dell T5810 is a single-CPU machine and it happened on there. I'm using Linux; I have no idea if it happens on Windows, but I can test. When it happens, all plots take longer, and when I've checked, it's always phase 2.

Plots always vary slightly: the first one is always quickest as there's no disk copy going on, but after that the others are two or three minutes slower. When phase 2 is slow, it's always about 10 minutes slower, with the same variance.
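One way to narrow it down during a slow batch (just a diagnostic idea, using the standard sysstat tools) is to watch the temp drives and CPU while the phase 2 rewrites are running, to see whether the extra 10 minutes is spent waiting on I/O or on something else:

iostat -x 10      # per-device await/%util; a jump on the RAID-0 or the copy target points at disk
mpstat -P ALL 10  # per-core usage; a high %iowait during the rewrites also points at disk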

The only solution for me at the moment is to use serial plotting with a big -n value and restart the plotting until the timings are fast; normally I need 0 to 4 restarts (terminate it once so the current plot is finished rather than aborted).
And once you've got good timings, all your plots will be fast until you abort the plotting, and then you have to do this all over again. :unamused:

Edit: I think if you plot K32s the difference is too small, so everybody assumes it's normal.

Still sounds like some type of CPU hardware or scheduling/timing anomaly… that is, unless the Chia code doesn't execute the same way each time on the CPU. In any case, very weird.

Yes, but we have totally different hardware setups and plotting configurations here (RAM drive vs SSD), and even the OS isn't the same. The only thing in common is that we're using MadMax, so I think the problem is probably located in MadMax itself.

I'm certain it's not related to CPU affinity, as you're on a single-CPU system and I've had it on both single- and dual-CPU systems. It could be the OS though; Linux Mint is based on Ubuntu.

I've seen reports that both Debian and CentOS 8 are faster at plotting. To quote someone on another forum (I've not had a chance to try either yet):

My plots took 13-14 mins on Ubuntu and on CentOS it’s down to 9 minutes per K32

PS: For K33 I use a RAM drive and a RAID 0 of SSDs.
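For anyone wanting to set up the RAM-drive part of that, a tmpfs ramdisk like the /mnt/ramdisk in the logs is typically created along these lines (the 250G size is taken from the hardware list at the top; add an equivalent line to /etc/fstab if you want it to survive reboots):

sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=250G tmpfs /mnt/ramdisk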

13-14 min & 9min i think are bladebit times and not madmax never heared about a 15 min madmax k32 plot time, maybe on a totaly overclocked system, but not on a normal madmax plotter. and with bladebit it could be that times are better than ubuntu, but with madmax i think its nearly the same but i dont know it exactly i never tested this by my self.

Excellent point…

Yes, you are right, the quote was with reference to Bladebit, but if that runs faster, hopefully MadMax will too; the comment I saw about Debian was referring to MadMax.

I've got a single MadMax K32 down to 18.6 minutes, and two in parallel down to 27 minutes for the pair, so an average of 13.5 minutes each.

OK, my last time comparisons for a solo plot were months ago; I can see plot times are shrinking very fast.

@DigitalSpaceport Have you got any thoughts on why phase 2 would take longer on some plot runs? I've seen it recently on K33s, and I think my K32s used to do it too, but it wasn't as bad.