Why is staggering important?

Is there a primer on why staggering is important?

From what I can tell, all 4 phases are heavily CPU bound (see the log excerpt below). The only phase that is not is the copy phase at the end.

In my case, I am on a 32C/64T Threadripper Pro with 256GB RAM. I run 40 plots in parallel, and when a batch finishes phase 4 and starts the copy phase, I start another batch of 40. I get about 90 plots/day on this machine using this simple setup.

Here is a summary of one of the plot files:

Time for phase 1 = 16580.790 seconds. CPU (146.350%) Mon May 10 14:45:56 2021
Time for phase 2 = 6068.944 seconds. CPU (97.340%) Mon May 10 16:27:05 2021
Time for phase 3 = 11178.198 seconds. CPU (92.180%) Mon May 10 19:33:23 2021
Time for phase 4 = 846.353 seconds. CPU (89.180%) Mon May 10 19:47:30 2021
Total time = 34674.286 seconds. CPU (118.910%) Mon May 10 19:47:30 2021
Copy time = 3024.519 seconds. CPU (2.730%) Mon May 10 20:37:54 2021
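As a rough sanity check on the plots/day figure, the numbers in this log can be turned into a throughput estimate. This is only a sketch: it assumes every plot in a batch of 40 takes the same time as this one log, and the function name `plots_per_day` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def plots_per_day(batch_size, cycle_seconds):
    """Plots finished per day if a batch of batch_size completes every cycle_seconds."""
    return batch_size * SECONDS_PER_DAY / cycle_seconds

# Values taken from the log above.
total_time = 34674.286   # phases 1-4, in seconds
copy_time = 3024.519     # final copy, in seconds
batch_size = 40

# Next batch starts as soon as phase 4 ends (copy overlaps the next batch).
overlapped = plots_per_day(batch_size, total_time)
# Next batch waits for the copy to finish as well.
serial = plots_per_day(batch_size, total_time + copy_time)

print(f"copy overlapped: {overlapped:.1f} plots/day")
print(f"copy not overlapped: {serial:.1f} plots/day")
```

The reported figure of about 90 plots/day sits near the lower of the two estimates, which would be consistent with the copy phase slowing down the start of the next batch, though one log is not enough to say for sure.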

How can I improve it with staggering?

See Comparing Plot parameters for threads, memory, and staggering - #12 by codinghorror

Thanks and sorry about the post deletes.

Just so I am clear, let's say you want to launch 30 plots with a 30 min stagger time.

If you are launching plotters from the GUI, do you choose “plot in parallels” with “Want to have a delay before the next plot starts?” set to 30 minutes?

If you are launching from the Windows CLI, can you share your script?

Time for phase 1 = 14392 seconds. 
Time for phase 2 = 5141 seconds. 
Time for phase 3 = 10836 seconds.
Time for phase 4 = 829 seconds. 
Total time = 31199 seconds. 
Copy time = 536 seconds. 

An example plot I just pulled from my logs. I stage 6 streams of queued plots with a 30 min delay between them… 5438MB, 2 threads, on a lowly Ryzen 3600. Looks like it improves on your times by 10% or so. Your copy time is very long; I copy to a USB 3 HD. Kudos - you get an awesome # of plots/day! This is with the GUI, BTW.


Yes, it is very simple… I put in a sleep / pause command before the plot, like so:

sleep -s 14400; ./chia.exe plots create -k 32 -r 4 -b 6144 -n 100 -t d:\chia-plotting -d e:\

The sleep -s 14400; part means: sleep for 14,400 seconds before executing the next command after the semicolon.


You have a 4 hour stagger time? How did you arrive at this number? I was thinking of using 1 hour based on the average copy time of my 40 parallel batch jobs.

How did you arrive at this number? I see your copy time is 9 minutes and, if I am not mistaken, the copy is the one part of the plot where there is a clear separation of CPU-bound vs I/O-bound work, so you can take advantage of that.

Maybe crazy, but I printed out the little graph floating around that someone made of CPU time / disk MB/s / HD space / memory used (showing them one above the other over time), cut them out, lined up 4 copies of each on a table, and eyeballed the best mix for overlapping multiple running instances. Halfway through phase 1 seemed about right, but starting them manually would take too long (I wanted to go to bed before 12PM that night!). I’d also manually recorded the plot phase timings of about 6 streams (several times) on a chart to try for the best phase timings. Turned out 6 streams 1/2 hr apart synced really well and did really well. Call it luck, dumb luck, perseverance, or crazy, whatever.

If you look at the logs, all phases except for the copy phase (after phase 4) are heavily CPU bound. So I am not convinced that the “halfway through phase 1” approach works, although I do see it mentioned a lot.

It seems to me that launching a new plotter process right after phase 4, when the plotter is in the “copy phase” and busy doing just I/O, makes sense. Right?

Not really. If you watch (literally) enough plots progress, you’ll learn how the phases affect your computer’s CPU, storage type & speed, transfer speed, memory use, and component temps over the course of the plot. Sit for hours with Task Manager, CPUID HWMonitor, and chia itself while plotting. Sure, in P4 it’s basically over, everything except copying - but that’s essentially queued execution. I want queue followed by queue followed by queue, etc., in a nice flow until I have 6 queues running. The load of one plot is a fraction of what the PC can do.

Soon, with a TR, I hope to have 12-16 queues running. Right now, my last queue, #6, is already 3 hrs into the flow; ergo, Q12 or Q16 on the TR might be 6 or 8 hrs into the flow. Hey! I’m already near the beginning of the next plot! Perfect. Of course, it all depends on your setup; record experimental timings and adjust accordingly.
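To make the queue arithmetic concrete, here is a small sketch of when each queue would start with a fixed 30 min stagger, compared with the length of one plot from the log posted earlier (31199 s plus 536 s of copy). The helper name `start_offsets_hours` is hypothetical, and the exact offsets come out slightly below the round "3, 6 or 8 hrs" figures because n queues only have n-1 gaps between them.

```python
def start_offsets_hours(queues, stagger_minutes):
    """Start time of each queue, in hours after the first queue starts."""
    return [i * stagger_minutes / 60 for i in range(queues)]

# One full plot including copy, from the log earlier in the thread.
plot_hours = (31199 + 536) / 3600

for queues in (6, 12, 16):
    last_start = start_offsets_hours(queues, 30)[-1]
    print(f"{queues} queues: last queue starts at {last_start} h, "
          f"first plot takes about {plot_hours:.1f} h")
```

With 16 queues the last one starts roughly when the first queue is getting close to finishing its first plot, which seems to be the "nice flow" being described. Plot times will change under heavier load, so the plot length used here is only a starting point.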

I am a little confused, so I am hoping you can clarify this.

What you want is something like this:

While (n-- > 0):

  • Launch plot create (1 plot) without blocking (i.e. in the background)
  • Sleep with blocking

Your script does this:

  1. Sleep with blocking
  2. Create n plots in sequence.

So I don’t see how it staggers all n plots.
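For reference, the "launch in the background, then sleep" loop could be sketched like this. It is written in Python rather than as a PowerShell one-liner only to keep it self-contained; the function names `launch_staggered` and `build_commands` are made up for illustration, and the chia flags are copied from the command earlier in the thread except for -n 1 (one plot per process), which is an assumption about what is wanted here.

```python
import subprocess
import time

def launch_staggered(commands, delay_seconds,
                     launch=subprocess.Popen, sleep=time.sleep):
    """Start each command without blocking, pausing delay_seconds between starts."""
    procs = []
    for i, cmd in enumerate(commands):
        procs.append(launch(cmd))      # Popen returns immediately; plotter runs in background
        if i < len(commands) - 1:      # no need to wait after the last launch
            sleep(delay_seconds)       # this is the blocking sleep that creates the stagger
    return procs

def build_commands(n, tmp_dir, dest_dir):
    """Build n identical single-plot commands (flags taken from the thread, -n 1 assumed)."""
    return [["./chia.exe", "plots", "create", "-k", "32", "-r", "4",
             "-b", "6144", "-n", "1", "-t", tmp_dir, "-d", dest_dir]
            for _ in range(n)]

# Example use (commented out so that nothing is launched by accident):
# procs = launch_staggered(build_commands(6, r"d:\chia-plotting", "e:\\"), 30 * 60)
# for p in procs:
#     p.wait()
```

The `launch` and `sleep` parameters exist only so the loop can be dry-run with stand-ins before pointing it at real plotters. The other way to get a stagger with the one-line script above would be to run it in several windows with a different sleep value in each, which may be what was intended.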