I’m a new farmer who has just joined the community. I’ve been following the “Build a Budget Chia Cryptocurrency Plotting Rig” guide from Chia Decentral to set up my plotting machine. I’m using CLI commands instead of the GUI to set up my plotting jobs. Everything went smoothly and my machine has started farming.
However, after about 3 hours, when I use glances to check on my system status, at the bottom it shows “High CPU I/O waiting” and “CRITICAL on CPU_IOWAIT”. The top of glances shows my current system usage:
CPU: fluctuating around 38%
iowait: bouncing between 3% and 14%
Mem: 37%
Currently the machine has 5 plotting jobs running, with a 1-hour separation between them. The total number of plotting jobs was set to 9, so the last job should start after the 9th hour.
Is this High CPU IO Waiting normal? Or is there anything I can tweak to solve this problem?
Below are my plotting machine specs:
CPU: Intel i9-10900
MB: Gigabyte Z590 VISION G
RAM: 4 * 16GB D4-3200 Kingston
HDD: 9 * WD 10TB Ultrastar
SSD: 250GB WD Blue (OS installed here), 2TB WD Blue (plotting second temp disk)
M.2: 2 * 2TB WD Black SN750 NVMe
PSU: 850W Corsair Gold
There are a total of 9 commands in one bash script. The only change between commands is which HDD each M.2 SSD is assigned to plot to: nvme1 plots to hdd001 through hdd004 (4 commands), and nvme2 plots to hdd005 through hdd009 (5 commands).
There are only these 9 plots running in parallel, with a 1-hour delay to stagger them.
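For reference, the staggered launcher described above can be sketched as a small bash loop. The mount points (/mnt/nvme1, /mnt/hdd001, etc.) are assumptions, and the script only echoes each command instead of running it, so you can review the output first:

```shell
#!/usr/bin/env bash
# Dry-run sketch of the staggered plotting launcher described above.
# Mount points are hypothetical: nvme1 feeds hdd001-004, nvme2 feeds hdd005-009.
cmds=()
for i in $(seq 1 9); do
  if [ "$i" -le 4 ]; then tmp=/mnt/nvme1; else tmp=/mnt/nvme2; fi
  dest=$(printf '/mnt/hdd%03d/plots' "$i")
  cmd="chia plots create -t $tmp -d $dest"
  cmds+=("$cmd")
  echo "$cmd"     # echo only; to actually launch, run it with & in the background
  # sleep 3600    # uncomment for the 1-hour stagger between launches
done
```

Swapping the echo for an actual background launch plus the sleep gives the 1-hour stagger between the 9 jobs.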
My strategy for iowait issues has just been to continuously add more SSDs, and to avoid M.2 slots that go through the chipset. Have you also tried changing the sector size of the SSDs to the highest possible, and changing the filesystem to XFS?
fireboy0526, if it’s not a problem I’ll join your topic, because I have this same problem with I/O waiting.
Setup:
i7 - 10700k
32GB 3600 G.Skill
Gigabyte Z490 UD
system - SSD1 - Kingston HyperX
temp - Samsung 1TB 980
hdd - 2x 10TB WD, 1x 3TB Toshiba
For now I have 2 parallel plots (4 threads / 2400 MB each), and after 6% I got a few “High CPU I/O waiting” and “CRITICAL on CPU_IOWAIT” warnings.
I see you are running with “-e”, which means bitfield is disabled. You don’t need to do that anymore: there have been many improvements to bitfield, to the point where it’s actually faster than no bitfield. With bitfield you only need 3408 MiB of RAM with 4 threads.
I see you are also using “-2”. You have two 2TB NVMes, so you don’t really need “-2”. I would use the SSD temp drive as the destination drive for your plotters, and from there transfer to the HDDs.
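Putting those two suggestions together, the create command would look roughly like the sketch below. The paths are placeholders, and it only builds and prints the command so nothing is launched by accident:

```shell
# Bitfield stays enabled (no -e); 3408 MiB RAM (-b) with 4 threads (-r).
# No -2: temp stays on the NVMe, destination is the drive the plot lands on.
# /mnt/nvme1/tmp and /mnt/hdd001/plots are placeholder paths.
CMD="chia plots create -k 32 -b 3408 -r 4 -t /mnt/nvme1/tmp -d /mnt/hdd001/plots"
echo "$CMD"   # review before running
```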
Thank you very much for your suggestions. I will be sure to give them a try. It seems that in patch 1.0.4 they changed the bitfield mode to be faster and smaller, which should save me some extra RAM and space while plotting.
Also, monitoring iotop, it seems that kworker is causing all the iowait. Is it phase 1 of plotting that causes the high kworker usage? https://imgur.com/a/pMcLMc1
Couldn’t remember how iotop displays its data, so I had to look it up. In your image, the IO column only tells you the percentage of time a process spends waiting on I/O. So that kworker isn’t causing high I/O wait; it is experiencing high I/O wait because of all the plotters. 99.9% means it spends 99.9% of its time waiting on I/O. But I’m sure these lines jump around a lot, right? What we can say for certain is that yes, some of your plotters are experiencing some I/O wait. Adding another NVMe and putting them in RAID would probably remove the I/O wait.
You said that “having another NVMe and ‘raiding’ it would probably remove the iowait”. My question is: must we RAID it? Personally I have had bad experiences with RAID, which is why I want to avoid it if possible.
I am having the same issue with a slightly slower CPU: after 2 plots reach phase 1, out of a total of 3-4 plots, it causes high iowait.
I have 3 x 1 tb ssd on raid0, xfs file system.
I am thinking of keeping them separate: 2 for temp and 1 for temp2.
What do you think?
Thanks
Using Ubuntu 20.04 on ESXi on a Dell R720XD, with the temp directory on a 3x1TB SSD RAID 0 through the PERC H710P RAID controller and the virtual disk attached to the VM as a raw disk. Same thing, despite the completely different hardware.
I have the same issue (I/O wait blocking), but not when plotting. I have 8 plots to try, and now that I’m trying to sync the full node it’s a pain, because the iowait is nearly freezing the complete machine. Any chance to get around this? All the plots are on another SSD dedicated to that, if that helps.
If you’re just having issues copying plots around, try rsync or scp and play with the bandwidth limits. Copying will take longer, of course, but it puts less strain on the system.
example to limit bandwidth to 1000 KBytes per second with rsync (--bwlimit is in KBytes/s):
rsync --bwlimit=1000 /path/to/source /path/to/dest/
example for scp (note that its -l flag is in Kbit/s, so the equivalent limit is 8000):
scp -l 8000 file user@remote:/path/to/dest/