Four threads
Plot size is: 32
Buffer size is: 28000MiB
Using 128 buckets
Using 4 threads of stripe size 65536
Using optimized chiapos
Overall:
Time for phase 1 = 3630.957 seconds. CPU (230.460%)
Time for phase 2 = 2212.198 seconds. CPU (90.180%)
Time for phase 3 = 3170.369 seconds. CPU (139.560%)
Time for phase 4 = 299.807 seconds. CPU (102.430%)
Total time = 9313.333 seconds. CPU (162.080%)
It’s really faster than 2 threads for about 1380 seconds (~23 minutes). First phase is shorter, other phases the same as 2 threads. Now lets look inside…
Memory
Memory usage decreased to 4392 MiB at peak at first phase and a little bit for second one.
Disk spase
Temporary drive space usage is a same as 2 threads, but shorter in time.
CPU & IO
At first phase CPU usage is greater, but not at 400% as expected. (This is average values per minute, I would check it later).
I/O operations seems more agressive for first phase and same for others.
IOPS:
Total read and write:
Conclusions
It seems to be optimal parameters for me is -r 3 -b 4500
for 128 buckets