Significantly slower plots after moving a plotting drive to Linux, looking for NVMe configuration advice for Linux

I have moved my 500GB 960 EVO drive from my Windows system into my Linux system. The Linux system is a bit less powerful, so the comparison isn't identical, but the times I'm seeing are almost double what I was getting on Windows.

I'm currently stumped as to why. The drive was starting to get up in wear on the Windows system, with its used percentage at 0x0020 (so 32%, as I took it). However, it was still performing fine and never had significant slowdowns. This was at about 140 TBW of its rated 200 TBW (wear rates don't translate 1:1 with TBW given how Chia uses the drive, with largely 64k writes, from my understanding).

I moved the drive to my Linux system and at first left the NTFS file system on it, since I didn't understand how to set up ext4 or XFS. My times were almost triple those of the Windows machine (17k seconds dual plotting on it under Windows, 55k on Linux).

I then learned how to format it as ext4 and trim it, and the time for dual plots dropped to 34k seconds, but that's still obscenely high considering the specs of the system, in my opinion. A single plot was closer to 23k seconds, which I thought was more reasonable (Windows did a single plot in 12k on this drive). These were always staggered, of course, with a 2-hour stagger, although given the slowdowns it might need more.
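For reference, the formatting and trim steps I did were roughly the following (the device name here is just a placeholder; substitute your own after checking with lsblk):

```shell
# /dev/nvme0n1 and /mnt/plotssd are placeholder names -- check lsblk first!
# This destroys everything on the drive.
sudo mkfs.ext4 /dev/nvme0n1            # format the whole drive as ext4
sudo mkdir -p /mnt/plotssd
sudo mount /dev/nvme0n1 /mnt/plotssd   # mount it (add an fstab entry to persist)
sudo fstrim -v /mnt/plotssd            # manually TRIM the mounted filesystem
```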

So I'm not sure what to do or how to address this. I'm hoping for some clear instructions on anything I might not have realized is a requirement for configuring this drive correctly in a Linux environment. I've done my best to search but am still left unsure.

The Linux system had already been used for plotting with hard drives, albeit everything was on NTFS, and I'm not sure how much that was holding back performance. I also used a Teamgroup MS30 512GB SATA M.2 for a while before making this switch, and it performed very poorly as well, slower than the hard drives, which I attributed to it just being low quality, but maybe there's something I wasn't doing correctly. It could only single plot and usually took 48k seconds or so; dual plots basically doubled the time. The hard drives I was plotting on were 8TB Seagate IronWolfs, one plot per drive, with 42k-second times.

Specifications of the systems and the commands being used:

Linux system: (Ubuntu 20.04 LTS)
CPU: Ryzen 7 1800X (running at stock config)
Mobo: Gigabyte X370 K7
Memory: Corsair Vengeance 16GB 3000MHz C16 x1, Patriot 8GB 3000MHz C16 x1
SSD: Samsung 960 Evo 500GB running at PCIe x4
System also has storage on it and is currently my farming system.
Plots on the SSD were tested with no plots running on the hard drives.

Windows: (Windows 10)
CPU: Ryzen 7 5800X (stock)
Mobo: B550 Tomahawk
Memory: 4x G.Skill 8GB 3600MHz C16 B-die
SSD: 960 Evo at the time, now RAID 0 1TB 970 Evos

Both systems plot using a manually configured CLI; the Linux system outputs its logs to a text file.

960 Evo plots on both systems used k32, 4GB memory, 4 threads, and no secondary temp drive.
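Roughly, the plot command looks like this on both systems (the temp and destination paths here are placeholders):

```shell
# k32 plot, ~4GB buffer (-b, in MiB), 4 threads (-r), no -2 secondary temp.
# /mnt/plotssd and /mnt/farm are placeholder paths.
chia plots create -k 32 -b 4000 -r 4 -t /mnt/plotssd -d /mnt/farm
```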

I expected the Linux system to be a bit slower, since the 5800X is about 50% faster in single-threaded performance, at least I believe. However, I didn't expect it to be this much slower. Maybe I'm overestimating the system and its current performance is about all I should expect; please let me know if I'm expecting too much. With Linux's benefits factored in, I figured it would be about 25%-30% slower; 100% slower is a surprise to me.

Not sure why you’re seeing things be so much slower. It looks like you’re not running RAID on Linux - I wouldn’t, so that’s good.

If the system is idle (or even if it isn't, though then the results become a bit harder to interpret), you can do some raw debugging with the following:

echo 3 >/proc/sys/vm/drop_caches

(to remove anything in Linux's buffer cache; this needs root, so under sudo use: echo 3 | sudo tee /proc/sys/vm/drop_caches)

dd if=/dev/sda of=/dev/null bs=1000k count=10000
dd if=/dev/sdb of=/dev/null bs=1000k count=10000
dd if=/dev/sdc of=/dev/null bs=1000k count=10000
dd if=/dev/sdd of=/dev/null bs=1000k count=10000

Run these sequentially, which will show you raw read throughput per drive.

If you want to see parallel speed just run them all in the background:

dd if=/dev/sda of=/dev/null bs=1000k count=10000 &
dd if=/dev/sdb of=/dev/null bs=1000k count=10000 &
dd if=/dev/sdc of=/dev/null bs=1000k count=10000 &
dd if=/dev/sdd of=/dev/null bs=1000k count=10000 &

If there's nothing on the disks that you need, you can use dd to write directly to the raw partition or disk. Or, if there is data and the filesystems are mounted on, say, /disk1, /disk2, /disk3, and /disk4, you can do:

time (dd if=/dev/zero of=/disk1/writetest1 bs=1000k count=10000 ; sync)

for each of the mount points.

Then divide 10GB by the total time to get throughput per second, because the total time also factors in how long it took to sync the data being written.
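If it helps, that write test and the division can be wrapped up in a small shell function (a sketch; the mount points in the usage comment are placeholders):

```shell
# write_throughput DIR COUNT -- write COUNT blocks of 1000k to a test file
# in DIR, sync, delete the file, and print approximate MB/s.
write_throughput() {
    dir=$1; count=$2
    start=$(date +%s)
    dd if=/dev/zero of="$dir/writetest" bs=1000k count="$count" 2>/dev/null
    sync
    end=$(date +%s)
    elapsed=$(( end - start ))
    [ "$elapsed" -eq 0 ] && elapsed=1           # avoid divide-by-zero on fast disks
    rm -f "$dir/writetest"
    # bs=1000k is 1,024,000 bytes per block; report in MB (10^6 bytes) per second
    echo $(( count * 1024000 / 1000000 / elapsed ))
}

# Usage, e.g.:
#   for mp in /disk1 /disk2 /disk3 /disk4; do
#       echo "$mp: $(write_throughput "$mp" 10000) MB/s"
#   done
```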