Thanks in advance to everyone who posts voluntary support on the forum! In my quest to learn and solve the problems on my chia farm, all the help has been very valuable!
After a lot of work and hours of learning over many weeks, everything is OK on my chia farm, with the exception of the plots/day performance of my plotting machine.
After searching this entire forum for answers, and learning every detail, today I finally found that I have two issues:
- I/O performance is far below the MINIMUM expected from the NVMe SSD (Adata XPG S40G 4TB M.2): its factory spec is 3500MB/s write, and a Tom's Hardware article explains that this drops to as low as 500 MB/s when the drive is more than half full. In my case, it never manages to cross the 300 MB/s write line.
- This I/O rate, which is already low, is not shared well between multiple plots running in parallel. The older plots, which have reached phase 2 or 3, are starved, while all the I/O goes to the newer plotting processes in phase 1. Then they all pile up in phases 2 or 3, and the accumulated I/O wait adds up to hours in total.
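The 300 MB/s figure can be cross-checked independently of the plotter. Below is a minimal sketch, not a proper benchmark, that writes incompressible data to a directory and reports the sustained rate after forcing it to disk. The function name `measure_write_mb_s` and the mount point `/mnt/nvme_tmp` in the comment are hypothetical, chosen only for illustration.

```python
import os
import time


def measure_write_mb_s(directory, total_mb=256, chunk_mb=4):
    """Write about total_mb of data into `directory` and return the rate in MB/s."""
    path = os.path.join(directory, "write_test.tmp")
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # random, so it cannot be compressed
    written = 0
    start = time.perf_counter()
    try:
        with open(path, "wb") as f:
            while written < total_mb:
                f.write(chunk)
                written += chunk_mb
            f.flush()
            os.fsync(f.fileno())  # count the time to reach the disk, not just the page cache
        elapsed = time.perf_counter() - start
    finally:
        if os.path.exists(path):
            os.remove(path)  # always clean up the test file
    return written / elapsed


# Example call (hypothetical mount point of the plotting temp disk):
# print(f"{measure_write_mb_s('/mnt/nvme_tmp', total_mb=8192):.0f} MB/s")
```

A small `total_mb` probably measures burst speed only, so a run of several GB is likely needed to see the sustained rate that matters for plotting.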
In case anyone is wondering, believe me: I’ve read all the documentation related to plotting that I could find, official and unofficial, on this forum and on the web in general.
I’ve tried several things to resolve the very high I/O wait times:
- Added fstrim every hour.
- Changed disk mount options in fstab.
- Monitored the temperature of all HW components.
- Reduced the RAM space that the system reserves for swapping to almost nothing.
- And many more attempts that are not even worth mentioning.
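To judge whether any of these attempts changes anything, the I/O wait can be put into a single number. The sketch below assumes a Linux system, where the aggregate `cpu` line of `/proc/stat` lists cumulative counters with iowait as the fifth value. The function names are hypothetical, and the snapshots are passed in as text so the calculation can be tried without touching the system.

```python
def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    # Sum user..steal only; guest time is already included in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


# Example use on a live system:
# import time
# first = open("/proc/stat").read()
# time.sleep(60)
# second = open("/proc/stat").read()
# print(f"iowait: {iowait_percent(first, second):.1f}%")
```

Taking one reading with a single plot running and another with two plots in parallel would give a before/after comparison for each change.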
Only thing not yet tried:
- I confess I haven’t yet tried reformatting the temporary NVMe disk, and maybe replacing the current xfs filesystem with ext4.
REQUEST 1: paid assistance:
If there is anyone here interested in providing PAID assisted technical advice to help me with this problem, I would accept the service.
REQUEST 2: voluntary aid:
If a paid service commitment is not in anyone’s interest, but you want to help voluntarily, all help is welcome. Judging from the help request posts I have read, I believe many other people here have the same question and difficulty!
Some details of my plotting system:
- Ubuntu desktop 20 LTS.
- All the work is being done via CLI; I have never even opened Chia’s GUI on the machine.
- All work is done via SSH on LAN.
- All plots are made automatically by the plotman tool.
- Tools I used to monitor resource consumption:
- Setup of plots:
- r=4; b=6000; stagger between plots=50 min; (tmp disk is SSD NVMe).
HW: (inspired by a Tom's Hardware post which states this could reach 30 plots/day)
- CPU: Intel Core i9 10850K (10 cores/20 threads, operating at the default clock of 3.6GHz).
- Motherboard: MSI MEG Z490 Gaming Carbon Wifi.
- Case Coolers: 5 FAN Corsair AF120 Coolers.
- CPU Cooler: CoolerMaster Hyper T4 RR-T4-18PK-R1.
- RAM: 2x32GB Team T Force Zeus 3200 MHz (64 GB RAM in total).
- MOST IMPORTANT: 1x SSD Adata XPG S40G 4 TB M.2.
- OS and all SW are in another SSD: 512 GB Sandisk.
- PSU: 600W Thermaltake, Smart Series, 80 Plus White.
- Case: Corsair Obsidian 750D Full Tower.
- Destination HDDs: External HDD: 8x Seagate Portable 5TB.
There is no RAID between the 2 disks: the regular SSD is just for the OS and software, and the NVMe is 100% dedicated to the plotting process.
IMPORTANT NOTE: with just ONE plot it finishes the full job in 3.7 hours (13600 s), using 4-8 CPU threads and 8 GB RAM. So, it can indeed make 6 plots/day currently.
BUT Tom's Hardware states that this system can run 10 plots in parallel, and in my case I cannot run even 2 plots in parallel without losing all performance to endless I/O waits. So, it’s supposed to reach 30/day, although I would already be quite happy with 20/day.
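The numbers in this post can be tied together with a little arithmetic. The sketch below assumes exactly one new plot is started per stagger interval (a single global stagger, which is my assumption about how the scheduler behaves) and that parallel plots do not slow each other down, which is clearly not true on this machine. It only gives upper bounds. All function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60


def plots_per_day_serial(plot_seconds):
    """Plots per day when plots run strictly one after another."""
    return SECONDS_PER_DAY / plot_seconds


def plots_per_day_staggered(stagger_minutes):
    """Upper bound on plots per day if one plot starts every stagger_minutes."""
    return MINUTES_PER_DAY / stagger_minutes


def stagger_for_target(target_plots_per_day):
    """Longest stagger (minutes) that still allows the target plots per day."""
    return MINUTES_PER_DAY / target_plots_per_day


def plots_in_flight(plot_seconds, stagger_minutes):
    """Average number of plots running at once (Little's law), with no slowdown."""
    return plot_seconds / (stagger_minutes * 60)


print(round(plots_per_day_serial(13600), 2))   # 6.35
print(round(plots_per_day_staggered(50), 2))   # 28.8
print(round(stagger_for_target(30), 2))        # 48.0
print(round(stagger_for_target(20), 2))        # 72.0
print(round(plots_in_flight(13600, 50), 2))    # 4.53
```

Under these assumptions, a 50 min stagger caps output at 28.8 plots/day even with perfect I/O, so 30/day would need a stagger of 48 min or less, while 20/day would be reachable with a stagger of up to 72 min.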
IF YOU READ ALL THIS POST, THANK YOU SOO MUCH!!!