Here is my test result.
Using 16 threads is actually faster than using 32 threads. This could be the reason why MadMax introduced -K
to double the number of threads used in phase 2 which is the only phase benefiting from using all the threads available.