Sudden increase in IOWait - plots stuck at phase 3

I’ve been happily plotting for weeks on a 5900X with 3x2TB NVMe drives in RAID 0 when suddenly my IO wait climbed above 30%. I am running Ubuntu Server on the 5.11 kernel with chia 1.1.5.

I’ve checked the drive temperatures: they are running a bit hot, but nothing unusual. I also let them cool off overnight with the window open, which didn’t seem to make a difference. They are relatively new, with percentage_used at around 30%.

I switched to Seagate external drives for the final destination, but I don’t think that can be the problem. For now I have increased the stagger time by 10 minutes and decreased parallel plots from 12 to 10. I’m really puzzled by this. My system did crash before this started, but that has happened before (because of RAM) without causing any problems.

Did you mount with “discard”, or if not, do you run “fstrim -v <mountpoint>” periodically?

No, I did not. Do you think that can make such a difference? Also, isn’t that run automatically on Ubuntu? Just FYI, most of the plots are stuck in phase 3.

I put this in my root crontab (as root type “crontab -e”):

33 * * * * sh -c "fstrim -v /media/pro980.1 >> /var/tmp/fstrim.1.out"

44 * * * * sh -c "fstrim -v /media/pro980.2 >> /var/tmp/fstrim.2.out"

That runs fstrim once every hour (per drive). Even if Ubuntu is running fstrim automatically, it’s probably running it once every two weeks or something like that!
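If you have more than two temp drives, a small helper can print staggered entries like the ones above instead of typing them by hand. This is only a sketch: the function name cron_lines, the log file naming, and the 11-minute step are made up for illustration. It writes /sbin/fstrim as an absolute path on the assumption that cron runs with a minimal PATH; check where fstrim lives on your system with "command -v fstrim".

```shell
# cron_lines START_MINUTE STEP MOUNTPOINT...
# Print one hourly crontab entry per mountpoint, shifting the minute by STEP
# each time so the trims do not all start together.
# Hypothetical helper: log paths and the /sbin/fstrim location are assumptions.
cron_lines() {
    minute=$1
    step=$2
    shift 2
    n=1
    for mp in "$@"; do
        printf '%d * * * * /sbin/fstrim -v %s >> /var/tmp/fstrim.%d.out 2>&1\n' \
            "$minute" "$mp" "$n"
        minute=$(( (minute + step) % 60 ))
        n=$((n + 1))
    done
}

# Example (prints text only; paste the result into "crontab -e" as root):
#   cron_lines 33 11 /media/pro980.1 /media/pro980.2
```

The function only prints lines and never touches the crontab itself, so you can review the output before installing it.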

You can run fstrim right away. You don’t need to wait for any phases to finish. Just run it against all your SSDs. You should see pretty soon afterwards whether that helps!

P.S. I have no idea how fstrim works with RAID. I never RAID anything!

Yeah, thanks, although is it safe to run fstrim? Won’t I lose temp data?

Yes it’s safe and it won’t touch existing data.

By not using continuous TRIM from day one (discard mount), your SSDs have not been operating efficiently, which causes unnecessary NAND wear and reduced performance.

Also:

Does it work with RAID? And if so, can I just run fstrim -v /mnt/temp-chia?

Yes it works for RAID (mdadm software RAID I assume?) - add discard (ext4, XFS) or discard=async (Btrfs) to the mount options.

And yes, run that fstrim command on the mountpoint.
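After remounting, it is worth confirming that the discard option actually took effect. A minimal sketch, assuming the options string comes from "findmnt -no OPTIONS <mountpoint>"; the helper name has_discard is made up for illustration, and /mnt/temp-chia is just the mountpoint mentioned above.

```shell
# has_discard OPTIONS
# Succeed if a comma-separated mount options string contains "discard"
# or "discard=<mode>" (for example discard=async on Btrfs).
# Hypothetical helper; "nodiscard" is deliberately not treated as a match.
has_discard() {
    case ",$1," in
        *,discard,*|*,discard=*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if has_discard "$(findmnt -no OPTIONS /mnt/temp-chia)"; then
#       echo "continuous TRIM is on"
#   else
#       echo "no discard option; rely on periodic fstrim instead"
#   fi
```

"lsblk --discard" is a useful companion check: non-zero DISC-GRAN and DISC-MAX values on the RAID device suggest discards are being passed through to the underlying drives.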

Harris, thanks a ton! That helped. The issue was that fstrim runs weekly, but it only ran on the main SSD, not on the temp drives!
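One possible explanation, offered as an assumption rather than something confirmed in this thread: the weekly fstrim.service on Ubuntu may call fstrim with --fstab, which only trims filesystems listed in /etc/fstab, so a temp array mounted by hand or from a script would be skipped. "systemctl cat fstrim.service" shows the exact command on your machine. The sketch below checks whether a mountpoint is listed; the helper name in_fstab is made up for illustration.

```shell
# in_fstab FILE MOUNTPOINT
# Succeed if MOUNTPOINT appears as the second field of a non-comment line
# in FILE (normally /etc/fstab). Hypothetical helper for illustration.
in_fstab() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit !found }
    ' "$1"
}

# Example:
#   if in_fstab /etc/fstab /mnt/temp-chia; then
#       echo "listed in fstab"
#   else
#       echo "not in fstab; a --fstab based weekly trim would skip it"
#   fi
```

If the mountpoint is missing, either add it to /etc/fstab or keep trimming it explicitly, for example from cron.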