Understanding QS sort

I know uniform sort is faster than QS sort.
I’ve got 4 threads and 6750 MB of RAM set per plot.
64 GB of RAM in the rig. Used RAM is 24 GB; cache fills up the rest of the 64.

I’m seeing QS in every plot log file. I’ve reduced parallel plots to see if there’s any reduction in QS sort, and there isn’t.

Is QS sort used only when there isn’t enough RAM?
If it is happening, is that the likely bottleneck?

I’ve got a 5800X, 64 GB RAM, and 3 NVMes. I was getting 24 plots/day with 2 NVMes, with RAM and CPU not maxed.
I added a third and am only getting 27. Trying to find the new bottleneck.

I see QS at the end of some phases for no apparent reason. For example, I recently gave every plot 14 GB of memory, and there were still a couple of QS lines remaining in the log. Perhaps at some point QS is actually better than uniform sort. Unfortunately, I’m not a bucket sort guru.

On the 5800X with those specs, you should easily be getting 30+ per day.

Assuming 3 x quality 1 TB NVMes, I’d set up 3 jobs, 1 per NVMe: 12 max in parallel with “start early”, 10 max overall. You can easily use 6750 MB of memory with a 60-minute stagger. Make sure parallel jobs don’t start at the same time if they’re going to a single HDD.

You are CPU-limited here, as you have double the RAM needed, and you only need 2 x 1 TB to get the most out of this.
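
To make the stagger concrete, here’s a minimal sketch of the resulting schedule. It assumes 3 job lines (one per temp NVMe), a 60-minute stagger, and a hypothetical 7-hour plot time; none of these numbers come from your actual logs, so plug in your own:

```cpp
// Sketch: staggered plot schedule. Job-line count, stagger, plot time,
// and the parallel cap are illustrative assumptions, not measured values.
#include <algorithm>
#include <iostream>

int main() {
    const int job_lines = 3;       // one per temp NVMe
    const int stagger_min = 60;    // minutes between starts in a line
    const int plot_min = 7 * 60;   // assumed average plot time
    const int max_parallel = 12;   // global cap

    // Concurrent plots at steady state: each line holds
    // plot_min / stagger_min plots in flight, capped globally.
    int per_line = plot_min / stagger_min;                        // 7
    int concurrent = std::min(per_line * job_lines, max_parallel);
    std::cout << "steady-state concurrent plots: " << concurrent << "\n";

    // First few start times per line (all lines start together here;
    // offset them if they write to the same destination HDD).
    for (int line = 0; line < job_lines; ++line)
        for (int k = 0; k < 4; ++k)
            std::cout << "line " << line << " plot " << k
                      << " starts at t+" << k * stagger_min << " min\n";
    return 0;
}
```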

I’ve got 3x 2 TB:
2 Corsair MP600
1 Samsung 970 EVO Plus

Using plot manager.
Staggered 30 minutes. Limit of 2 in phase 1 and 4 total plots per NVMe. Two destination HDDs.

I’ve tried 8, 10, and 12 in parallel. They all result in 24-27 plots per day. I’ve got loads of data in Prometheus/Grafana; I just can’t pinpoint the bottleneck.

When you do a 60-minute stagger, doesn’t that mean you can never get more than 1 plot per hour, i.e. 24/day?

The algo uses a bit of quicksort regardless of RAM size. You will notice this in the logs as something like “forced_qs = 1” (can’t remember the exact text). This is normal; everything is fine.

You should only worry if it is doing a quicksort that says “forced_qs = 0”, as that means it had to abandon uniform sort because there wasn’t enough RAM.
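
For intuition, here’s a rough sketch of that per-bucket decision. This is a paraphrase of the behaviour described above, not the actual chiapos source; the identifiers, the 2x memory slack, and the small-bucket cutoff are all made up for illustration:

```cpp
// Rough paraphrase of the per-bucket sort choice; NOT chiapos code.
#include <cstdint>
#include <iostream>

enum class Sort { Uniform, QuickForced, QuickFallback };

Sort ChooseSort(uint64_t entries, uint32_t entry_size,
                uint64_t memory_budget, bool last_bucket) {
    const uint64_t kSmallCutoff = 1ull << 14;  // assumed threshold

    // Small trailing buckets get quicksorted on purpose: the harmless
    // "forced_qs = 1" case that appears even with plenty of RAM.
    if (last_bucket && entries < kSmallCutoff)
        return Sort::QuickForced;

    // Uniform (bucket) sort needs the whole bucket, with slack,
    // resident in RAM. The 2x slack factor is an assumption.
    if (entries * entry_size * 2 <= memory_budget)
        return Sort::Uniform;

    // Otherwise fall back to quicksort: the "forced_qs = 0" case,
    // the one that signals the memory budget was too small.
    return Sort::QuickFallback;
}

int main() {
    // A tiny last bucket is quicksorted even with a 6750 MB budget.
    bool forced =
        ChooseSort(1000, 32, 6750ull << 20, true) == Sort::QuickForced;
    std::cout << "forced_qs: " << forced << "\n";  // prints forced_qs: 1
}
```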

Ah. That’s super helpful!
Odd that force_qs=1 means it’s not because of running out of RAM, but I checked the code and it checks out.

OK. The bad kind of QS (forced_qs = 0) isn’t happening in my plotting at all. Not a single time.

So RAM isn’t my bottleneck.

No, the stagger is just the gap between the start of each plot, so the second one starts after 60 min, the 3rd after 120 min, etc. After a while you will have 12 plots running in different stages, spreading the workload as evenly as possible.

How many you do per day is a function of how many plots run at the same time and the average time per plot: plots/day = concurrent plots × 24 / average plot time in hours.
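
In code, the relationship looks like this. It’s a toy calculation: the 7-hour average plot time is an assumed figure, so substitute your own from the logs:

```cpp
// Toy throughput calculation: plots/day depends on concurrency and
// average plot time, not on the stagger itself. Plot time is assumed.
#include <iostream>

int main() {
    double parallel_plots = 12.0;  // plots in flight at steady state
    double avg_plot_hours = 7.0;   // assumed; read yours from the logs

    // Each concurrent slot completes 24 / avg_plot_hours plots per day.
    double plots_per_day = parallel_plots * 24.0 / avg_plot_hours;
    std::cout << plots_per_day << " plots/day\n";  // ~41 here
    return 0;
}
```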

Not quite. If you have 2 destination HDDs, you can kick off 2 jobs at once (still 12/10 in parallel overall), with one job on one SSD, one on another, etc. I’d actually do 3 jobs to take advantage of that third NVMe SSD too.

So the math says: 24 hours / 1-hour stagger = 24 plots per job line, × 2 lines = 48 max per day.

Realistically, the stagger will increase over time and you should see maybe 35-40 max, but 60 min is a good stagger to start with.

In short, kick off 2 jobs at once, plotting on 2 different temp drives and going to 2 different destinations. 4 threads each is optimal.

I suspect it’s because bucket/radix sorts work best on large data sets. If there’s a little data left over for a given stage, it’s not worth doing a radix sort on it; for such a small amount of data, quicksort is actually faster.
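
You can see this crossover with a standalone toy benchmark. It has nothing to do with the plotter’s actual sort code, and the sizes and timings are machine-dependent assumptions:

```cpp
// Toy benchmark: std::sort vs a simple LSD radix sort on 64-bit keys.
// Radix sort pays a fixed cost of 8 full passes plus counting buffers,
// so on a tiny leftover array a plain comparison sort usually wins,
// while radix pulls ahead on large arrays. Sizes here are arbitrary.
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

void RadixSort(std::vector<uint64_t>& v) {
    std::vector<uint64_t> buf(v.size());
    for (int shift = 0; shift < 64; shift += 8) {  // 8 passes, 1 byte each
        size_t count[257] = {0};
        for (uint64_t x : v) ++count[((x >> shift) & 0xFF) + 1];
        for (int i = 0; i < 256; ++i) count[i + 1] += count[i];
        for (uint64_t x : v) buf[count[(x >> shift) & 0xFF]++] = x;
        v.swap(buf);  // even number of passes: result ends up in v
    }
}

int main() {
    std::mt19937_64 rng(42);
    for (size_t n : {size_t(256), size_t(1) << 22}) {
        std::vector<uint64_t> a(n);
        for (auto& x : a) x = rng();
        auto b = a;

        auto t0 = std::chrono::steady_clock::now();
        std::sort(a.begin(), a.end());
        auto t1 = std::chrono::steady_clock::now();
        RadixSort(b);
        auto t2 = std::chrono::steady_clock::now();

        using ms = std::chrono::duration<double, std::milli>;
        std::cout << "n=" << n
                  << "  std::sort " << ms(t1 - t0).count() << " ms"
                  << "  radix "     << ms(t2 - t1).count() << " ms\n";
    }
    return 0;
}
```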