Performance problem

Hi guys,

I’m currently plotting with this configuration:
E5-2680 v3
16 GB RAM DDR4 2666 MHz
500 GB NVMe Pioneer APS-SE20G formatted as ext4
Ubuntu 20.04 using MadMax (-r 24 -u 256)

At the beginning (on Monday), I was able to produce a plot every 59 minutes, but during the night of Monday/Tuesday it slowed down to 95-100 minutes per plot without any change on my side. I clearly don’t understand it. Do you have any ideas?

It could be a TRIM problem. Ubuntu does not enable it by default on ext4, as far as I know.
You could try formatting the disk with f2fs, which is designed for SSDs and has TRIM enabled by default.

Also, I think that CPU should produce plots faster than 59 minutes, so your SSD might be holding you back here in terms of plotting speed.
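A minimal sketch of that reformat, assuming the NVMe partition is /dev/nvme0n1p1 (an assumption; check yours with lsblk) and bearing in mind it erases everything on the disk:

sudo apt install f2fs-tools
sudo umount /dev/nvme0n1p1
sudo mkfs.f2fs -f /dev/nvme0n1p1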


Thanks for your reply.
I followed your advice and formatted the disk with f2fs, but it’s still very slow; the first step took 2500 sec, as before…

[P1] Table 1 took 22.5172 sec
[P1] Table 2 took 160.495 sec, found 4295047374 matches
[P1] Table 3 took 357.596 sec, found 4295136887 matches
[P1] Table 4 took 531.71 sec, found 4295277892 matches
[P1] Table 5 took 681.809 sec, found 4295286391 matches
[P1] Table 6 took 456.398 sec, found 4295300113 matches
[P1] Table 7 took 324.713 sec, found 4295341710 matches
Phase 1 took 2535.33 sec

About the NVMe, I also guess it’s not the best one for plotting, but 59 minutes is acceptable for me for the moment; 100 or 110 min is not :sweat_smile:

Apparently, it really comes from TRIM not being enabled, but I’m having difficulty “activating” it; I’m not very familiar with Linux :grinning_face_with_smiling_eyes:

I triggered TRIM with this command: “sudo fstrim -v /media/harstof/NVME/”

/media/harstof/NVME/: 468,4 GiB (502921031680 bytes) trimmed

But it’s still very slow…

The increase in P1 times indicates throttling of some sort: either thermal, or you have a QLC or TLC drive that runs out of cache after a few minutes.

Ubuntu runs fstrim weekly, IIRC. You can run it manually after every plot. Alternatively, you can mount the volume with the discard flag.
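If you want to confirm that schedule, Ubuntu ships it as a systemd timer, which you can inspect like this:

systemctl list-timers fstrim.timer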

Try with -r 12 -K 2. Did you do the turbo unlock?

I did a quick Google search and it looks like this is a QLC drive with very low write endurance (800 TBW). If you apt install smartmontools and do a quick smartctl -a /dev/nvme<X>, it should tell you the remaining spare threshold and total bytes written.
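Spelled out, something like this (nvme0 is an assumption; check which device you have with ls /dev/nvme*):

sudo apt install smartmontools
sudo smartctl -a /dev/nvme0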

If you’re intent on using this drive, you could try over-provisioning it by resizing the NVMe namespace, but I’ve never had any success doing this and it’s not supported on most consumer-level drives. If the drive’s controller is partition-table-aware, you might be able to get away with creating a ~330GiB partition and leaving the rest of the space unpartitioned.
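A sketch of that partition approach with parted (the device name is an assumption, and this wipes the drive, so move any plots off first):

sudo parted /dev/nvme0n1 mklabel gpt
sudo parted /dev/nvme0n1 mkpart primary ext4 1MiB 330GiB
sudo mkfs.ext4 /dev/nvme0n1p1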

Apparently, my NVMe is an SLC drive.

  • Very strangely, after typing the command “sudo fstrim -v /media/harstof/NVME/” yesterday, the first 4 plots took a long time, something like 90-110 min, then the 5th and the following ones took between 57 and 59 min maximum, as at the beginning!
    Now I’m on the 20th plot and it’s still around 59 min.

So apparently this command worked, but I don’t understand why it only took effect from the 5th plot onwards…

  • About your suggestion to mount the volume with the discard flag, could you tell me more?
    I searched yesterday for this “discard” option (for enabling TRIM), but I couldn’t figure out which command to type because the examples were always for the XFS format.

  • Is turbo unlock a setting inside the BIOS? If so, what is it usually called? I have so many CPU-related settings in my BIOS…

  • Apparently, my NVMe has a dynamic SLC cache (Internal SSD(APS-SE20G) | Pioneer).
    Information about my NVMe:

=== START OF INFORMATION SECTION ===
Model Number:                       APS-SE20G-512
Serial Number:                      XXX
Firmware Version:                   ECFM13.3
PCI Vendor/Subsystem ID:            XXX
IEEE OUI Identifier:                XXX
Total NVM Capacity:                 512 110 190 592 [512 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512 110 190 592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            
Local Time is:                      Thu Aug 19 13:22:25 2021 CEST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005d):     Comp DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     75 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.87W       -        -    0  0  0  0        0       0
 1 +     5.15W       -        -    1  1  1  1        0       0
 2 +     4.30W       -        -    2  2  2  2        0       0
 3 -   0.0490W       -        -    3  3  3  3     2000    2000
 4 -   0.0018W       -        -    4  4  4  4    25000   25000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        48 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    22%
Data Units Read:                    583 308 955 [298 TB]
Data Units Written:                 534 999 671 [273 TB]
Host Read Commands:                 785 699 867
Host Write Commands:                852 525 055
Controller Busy Time:               22 029
Power Cycles:                       70
Power On Hours:                     336
Unsafe Shutdowns:                   41
Media and Data Integrity Errors:    0
Error Information Log Entries:      143
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Thermal Temp. 1 Transition Count:   50
Thermal Temp. 2 Transition Count:   19
Thermal Temp. 1 Total Time:         493512
Thermal Temp. 2 Total Time:         480877

Error Information (NVMe Log 0x01, max 63 entries)
No Errors Logged

Thanks all for your help :smiley:

It’s a QLC drive with SLC cache. Big difference.

QLC NAND stores four bits per cell. That makes it great for high densities, but the controller has to do “more work” to write each cell, so it’s slower. SLC cache picks a region of QLC NAND and uses only the first bit in each cell. This dramatically reduces the total capacity of that region (for example, 100 GB of QLC used this way holds only about 25 GB), but the region takes on write performance characteristics comparable to “true” SLC NAND. This has a few implications:

  1. The fuller the drive becomes, the less QLC is available to use as SLC cache.
  2. As SLC cache fills up, the controller works in the background to move data to QLC. Because the controller can’t write to QLC as quickly as it can write to SLC cache (obviously, that’s the point of SLC cache), this can take quite a while. And it only happens when the drive is mostly idle.
  3. When the SLC cache is full, subsequent writes to the drive go straight to QLC.

I’m generalizing here—not all QLC drives work exactly like I’ve described—and I’m not even sure I’m describing it 100% correctly in the first place because NAND flash is dark magic. But that’s the gist of it.


That’s great! I suspect that 1. the drive definitely needed to be trimmed, and 2. the SLC cache was full and the controller needed some time to move everything off to QLC as I described above. So as long as you keep the drive trimmed, you should only have to worry about the SLC cache filling up.

It’s a mount option. How are you mounting the NVMe in Linux? Typically you would do something like this:

sudo mount -t ext4 /dev/nvme0n1p1 /foo-bar

To add a mount option, pass the -o flag to the command:

sudo mount -t ext4 -o discard /dev/nvme0n1p1 /foo-bar
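To make the option permanent across reboots, the matching /etc/fstab entry would look something like this (device and mount point are the same placeholders as above; adjust to yours):

/dev/nvme0n1p1  /foo-bar  ext4  defaults,discard  0  2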

It’s a little more involved than that. You actually have to modify the BIOS image to remove some sections, then inject a driver that loads a different boost table. Then flash the modified BIOS image. There are tutorials online and on YouTube if you want to pursue this further.

Drive looks fine. Keep an eye on that Available Spare. If it drops below 100% you should probably stop using the drive—or at least pause plotting for a bit to see if it recovers. As Data Units Written approaches 800TB you might start to notice more issues… or sooner… or later… every NAND flash package is a beautiful and unique snowflake.

One other suggestion: make sure you’re using another volume as your final destination. This will help the drive keep enough free space to use as SLC cache. Good luck!
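With MadMax that just means pointing -t at the NVMe and -d at the HDD, e.g. (a sketch reusing paths from this thread; MadMax wants the trailing slashes):

chia_plot -r 24 -u 256 -t /media/harstof/NVME/ -d /media/harstof/SeagateDrive/plot/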

Many thanks for these explanations!

  • About the mount option, I was simply using “Disks” (the disk utility) and mounting the disk with the “play” button :sweat_smile: (I’m not used to Linux yet), but I’m going to try your command tomorrow.

  • For turbo unlock, I searched a bit and indeed it needs a BIOS flash, so I prefer to wait, because if I kill my motherboard I won’t be able to plot for a while…

  • For the last point, on the lifespan of the NVMe, I’m going to keep an eye on it.

  • I tried -r 12 -K 2, but for the moment it apparently changes nothing; I’ll see after a night whether it’s better or not.

Many thanks again for all your help :smiley:

One last question (a bit stupid, but I’m not familiar with Linux :sweat_smile:):
I use this script for moving plots from my NVMe to my HDD:

#!/bin/bash

# Where finished plots appear, and where they should end up.
sourcedirs=/media/harstof/nvme1/plot/
destdir=/media/harstof/SeagateDrive/plot/

# Succeeds if the glob passed in matches at least one existing file.
function filesMatch {
    local arg="$*"
    local files=($arg)
    [ ${#files[@]} -ge 1 ] && [ -e "${files[0]}" ]
}

while true
do
    for sourcedir in ${sourcedirs[@]}; do
        if filesMatch "$sourcedir/*.plot"; then
            for f in $sourcedir/*.plot; do
                date
                # Move under a .tmp name first so the destination never
                # holds a half-copied file with a .plot extension.
                echo Moving $f to $destdir/$(basename $f).tmp
                time mv $f $destdir/$(basename $f).tmp
                # Rename to the final name only once the move is complete.
                echo Renaming $destdir/$(basename $f).tmp to $destdir/$(basename $f)
                mv $destdir/$(basename $f).tmp $destdir/$(basename $f)
                echo Plot $(basename $f) done
            done
        fi
    done
    sleep 30
done

My question concerns the name of the destination HDD. In my example it’s “SeagateDrive”, but if it were “Seagate Drive” (with a space between the words), the script wouldn’t work, because of the space (I guess). How can I fix this without renaming the HDD?
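The usual fix is to double-quote every variable expansion so that paths with spaces survive the shell’s word splitting. A minimal sketch of the affected lines (the spaced drive name here is hypothetical):

destdir="/media/harstof/Seagate Drive/plot/"

for f in "$sourcedir"/*.plot; do
    mv "$f" "$destdir/$(basename "$f").tmp"
    mv "$destdir/$(basename "$f").tmp" "$destdir/$(basename "$f")"
done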

I’m assuming with -r at 24 you have dual processors? Something is very wrong with your times, and as others have mentioned it is without a doubt your Pioneer NVMe drive. I have the same processors in one of my machines, and when I tested plotting with MadMax with both temp directories on NVMe, I was getting about 30 minutes per plot.

I’d recommend investing in a better NVMe drive if you want to go faster. My personal favorite bang-for-buck NVMe drive is the SK Hynix Gold P31 1 TB. Then just set up an hourly TRIM. If you get two drives it will be even faster, and if you use a RAM disk, expect under 25-minute plot times with those processors.
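One way to get the hourly TRIM is a root cron job (the mount point is reused from earlier in the thread; adjust to yours):

sudo crontab -e
# then add this line:
0 * * * * /usr/sbin/fstrim /media/harstof/NVME/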

I don’t have dual processors; I use -r 24 for 24 threads (but 12 cores), which is why I plot in about 1 hour. But thanks for your advice.

From my understanding, with MadMax you are supposed to set the thread count to the physical core count, not the logical thread count. Whenever I tried running more than 3 or 4 threads beyond the core count, it ran worse. Have you tried testing with 12 or 14 threads?

I am sorry to tell you, but you are wrong. Just enable SMT in your BIOS.
But…

My setup with MadMax (Windows 10 or Linux Ubuntu 20.04):

  • dual-socket EPYC 7551 CPUs (128 threads total)
  • 512 GB RAM (16 × 32 GB sticks of 2666 MHz DDR4)

MadMax (v0.1.5), tried with 8, 16, 32, 64, 96, 128 threads (in Windows and Ubuntu):

  • the sweet spot is 32 threads (-r 32) in MadMax
  • -K is set to 2 (a 2× multiplier on the -r 32 thread count)

Using -r 64 doesn’t give you more speed (frankly, you even get lower plotting speed with -r 64); I don’t know why.

Plotting in Windows 10 takes about twice as long as in Ubuntu:
Windows 10 takes 40 min per plot;
Ubuntu 20.04 takes 22 min per plot.

Have fun!

SMT is hyperthreading ?

Yes, SMT is hyperthreading; “Hyper-Threading” is Intel’s term for it.