Performance loss plotting on multiple HDDs in parallel

Signed up to confirm this.

As NinjaDummie says, “It’s as if there were a software lock saying not to make 2 simultaneous accesses over SATA.”

I can see it clearly: my Z drive is at 100% whilst my X drive is at 0%. A hard 0. Nada. And then they swap and alternate.

The machine is 8 cores with 32 GB of RAM.

The issue is less pronounced with 1 SATA and 1 USB drive (the USB drive slows to 0-40%), but the issue is still there.


I also find that pasting a large file from my OS SSD to the disk that is currently at 0% results in no write activity on that disk as long as the other disk is getting 100% activity from Chia!

It’s very odd: Windows still reads slowly from the source, but simply doesn’t perform any writes to the destination. It has just decided for itself that it doesn’t want to.

If I run a CrystalDiskMark write benchmark on the drive, it’s fine; the disk gets busy doing the write benchmark.

This type of Chia blocking does not happen on USB HDDs.
I tested putting 3 plots in parallel on separate USB HDDs and they flowed smoothly.

I’ve worked out what is happening. I do not yet know the fix.

Basically, Job 1 fills up the write cache and keeps it full faster than Windows can flush it to disk.

Job 2 comes along and Windows tells it to wait because the write cache is full!
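A minimal sketch of the scenario described above (the file paths are hypothetical; in practice you would point them at two different physical drives, e.g. X: and Z:, and watch Resource Monitor while it runs). Each thread streams data through a buffered stdio `FILE*`, the same mechanism the plotter uses, so both writers compete for the same system write cache:

```cpp
#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>

// Stream `total_bytes` to `path` in 1 MiB chunks through the OS write cache.
// Returns the number of bytes actually written, or -1 if the file cannot open.
long long stream_to(const char* path, long long total_bytes) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return -1;
    std::vector<char> chunk(1 << 20, 'x');   // 1 MiB of filler data
    long long written = 0;
    while (written < total_bytes) {
        size_t want = static_cast<size_t>(std::min<long long>(
            static_cast<long long>(chunk.size()), total_bytes - written));
        written += static_cast<long long>(std::fwrite(chunk.data(), 1, want, f));
    }
    std::fclose(f);
    return written;
}

// Run one writer per path concurrently, mimicking two parallel plot jobs.
void run_parallel_writers(const char* path_a, const char* path_b,
                          long long bytes_each) {
    std::thread a([=] { stream_to(path_a, bytes_each); });
    std::thread b([=] { stream_to(path_b, bytes_each); });
    a.join();
    b.join();
}
```

With both paths on the same cache-saturated system, the drive alternation shows up in Resource Monitor even though both threads are always "writing".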

Regarding the “software lock”: using Process Explorer (procexp), I can confirm this is not the case. The process is waiting on ntkernel:write; it is not waiting on a mutex-type API call.


Good job!
And why doesn’t that happen when the HDD is on USB?


It did for me, a little. It will come down to the deep internals of the Windows cache scheduler. You will notice that when running 4 SATA jobs, the OS does alternate which drive it decides to start writing to.

On a happier note, I’ve improved performance by modifying the code to flush on file close.

disk.hpp - FileDisk::Close
fflush(f_); // add this line before the existing fclose
::fclose(f_);
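To show the pattern in isolation, here is a stand-in sketch (a made-up `BufferedFile` class, not the real chiapos `FileDisk`) of what flush-on-close does: `fflush()` drains stdio's user-space buffer to the OS at a point we control, rather than letting `fclose()` do it implicitly whenever it happens to run:

```cpp
#include <cstdio>
#include <string>

// Stand-in for the plotter's file wrapper; illustrates flush-on-close only.
class BufferedFile {
public:
    explicit BufferedFile(const std::string& path)
        : f_(std::fopen(path.c_str(), "wb")) {}

    void Write(const char* data, size_t len) {
        if (f_) std::fwrite(data, 1, len, f_);
    }

    void Close() {
        if (!f_) return;
        std::fflush(f_);   // the added line: drain the stdio buffer explicitly
        std::fclose(f_);
        f_ = nullptr;
    }

    ~BufferedFile() { Close(); }   // make Close idempotent and automatic

private:
    std::FILE* f_ = nullptr;
};
```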

Build.bat
mkdir build-win
pushd build-win
cmake ..
cmake --build . --config Release -j 6
popd

(make sure you have only Python 3.7 installed before building)

Update chia gui with new plotting binary
copy /y "C:\DevJunk\chiapos\build-win\release\chiapos.cp37-win_amd64.pyd" C:\Users\janedo\AppData\Local\chia-blockchain\app-1.1.6\resources\app.asar.unpacked\daemon\

I have code to optimise Phase 1 that doubles the speed of 4 parallel jobs, but it is only going to save 5 minutes at most.

An easier solution is to launch the jobs 5 minutes apart.

Job time for me, with 4 HDD jobs, has gone from 15 hr to 11 hr.


Another option MIGHT be something along these lines. The idea is that the long-running flush-all call does not block the computing threads.

But I have not tried it.

            // declare these globally, outside of FileDisk
            std::atomic&lt;bool&gt; done(true);
            std::thread t1;

            // place this code in FileDisk::Close
            if (done) {
                done = false;
                if (t1.joinable())
                    t1.join();
                t1 = std::thread{ [] { _flushall(); done = true; } };  // MSVC CRT: flush all open streams
            }
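A self-contained version of that pattern (function names here are mine, and I have swapped the Windows-only `_flushall()` for the portable `fflush(nullptr)`, which the C standard defines as flushing every open output stream). A single background thread does the slow flush while the worker carries on, and a new flush only starts once the previous one has finished:

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

// Globals, mirroring the snippet above. Intended to be driven from one
// worker thread (the plotter's Close path), not from many threads at once.
std::atomic<bool> flush_done(true);
std::thread flusher;

// Call where FileDisk::Close would; returns true if a new background
// flush was actually started, false if one is still in flight.
bool request_background_flush() {
    if (!flush_done) return false;   // previous flush still running
    flush_done = false;
    if (flusher.joinable())
        flusher.join();              // reap the finished thread
    flusher = std::thread([] {
        std::fflush(nullptr);        // portable: flush all open output streams
        flush_done = true;
    });
    return true;
}

// Call once at shutdown so the final flush completes before exit.
void finish_background_flush() {
    if (flusher.joinable()) flusher.join();
}
```

The point of the `flush_done` flag is back-pressure: if flushing is slower than closing files, extra flush requests are simply dropped rather than queued.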

On an i5 (a refurbished OptiPlex 7020) I almost max out my CPU running 6 parallel K32s or 3 K33s. If I push it any harder the CPU starts to lose MHz (overheating). I already have all fans set to always-on in the BIOS. Everything gums up and plotting speed drops to a crawl.

So, my first question to miguel would be: “Is your CPU usage at 100% as your parallel jobs are added?”

Yes, I have this on some older machines, and there the HDD cache bottleneck is of no concern.

But in this case the machine is a Ryzen 3700X, and the CPU is most certainly not the bottleneck.

Ideally, the plotter would simply write sequentially to disk and not fill up the write cache. But the way it works is by using many temp files simultaneously. If we were to flush too often, Windows would never get a chance to optimise the order in which data is written to the disk, the writes would become incredibly random, and, as we all know, random writes slow a hard drive to a crawl.
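One way to illustrate that trade-off (this is my own sketch, not anything from chiapos; the helper name and the buffer size are illustrative guesses) is to enlarge the stdio buffer with `setvbuf`, so each temp file hands the OS data in big sequential runs instead of many small interleaved writes:

```cpp
#include <cstdio>
#include <vector>

// Open `path` for writing with a `buf_mib`-MiB user-space stdio buffer.
// `storage` must outlive the returned FILE*, since setvbuf does not copy it.
// Returns the FILE* (caller closes it), or nullptr on failure.
std::FILE* open_with_big_buffer(const char* path, size_t buf_mib,
                                std::vector<char>& storage) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return nullptr;
    storage.resize(buf_mib << 20);
    // _IOFBF = fully buffered: nothing reaches the OS until the buffer
    // fills, or the stream is flushed or closed.
    if (std::setvbuf(f, storage.data(), _IOFBF, storage.size()) != 0) {
        std::fclose(f);
        return nullptr;
    }
    return f;
}
```

The larger each user-space buffer, the fewer and more sequential the writes the OS sees per temp file, at the cost of RAM and of more data lost if the process dies before a flush.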

A good option would be to increase the Windows write cache size, but I can’t find out how.

@NinjaDummie @miguel I’ve just confirmed: running Ubuntu in VirtualBox, with two Phase 1 jobs writing to two disks at the same time, each runs at full pace, 100+ MB/s.

You will want to empty the HDD, create a 320 GB fixed-size VDI on each drive, and use a program like Defraggler to make sure the VDI is contiguous and located at the very start of the disk.

I know I am late to the party on this, but they are not talking about RAM disks here; they are talking about HDD caching. Think of it as RAM-backed swap or USB-drive-backed swap (at the most basic level), but you can do more with it than just that. I don’t know all about it and I won’t do it personally. I tried something similar with a PCIe SSD and didn’t notice enough gains to justify keeping it going.


This kind of stuff (using RAM as cache, and the swapfile, not to mention RAM compression) is usually handled transparently by the operating system, so the software people peddling this solution are saying “we are smarter than the people writing the operating system”… unless it is some extremely specialized use case, I highly doubt that!


Basically, as I understand it, and I really don’t, some people think that high speed storage or ram can speed up the page file or swap file. Some programs can “prebuffer” (kinda like internet videos buffer) your data to RAM or SSD and make your storage seem faster but in reality, it isn’t. My experience with trying it on a PCIe SSD (granted, only 256GB and only at full 4x interface) was what most people would consider a waste of time and energy.

After reading and studying this long thread, it seems there is no clear solution to slow multiple-HDD plotting. I am running two HDDs for plotting; HDD1 is at 100% while HDD2 waits at 0%. Then the drives alternate. (I have a fast CPU barely touching 10% utilization, and 128 GB of RAM.) There is some fancy caching software like PrimoCache that seems to help, but not by much. Is this about right? Or is there a solution out there somewhere that I am missing? So confused…

Install PrimoCache and enable write caching only. This solves the SATA HDD queue problem. Or you can just connect the HDDs via USB (strangely, that doesn’t trigger the same SATA problem).

Thanks! I’ll try that!

I never saw how using HDDs to plot would be useful in a process that depends on speed.
Splitting the process between HDDs instead of one NVMe (or SSD) drive does not appear to me to have any chance of speeding up the process.
I have read some pretty intricate scenarios for how to accomplish fast HDD plotting.
If it is possible, it is tricky and expensive.

Hi Ninja,
Looks like PrimoCache is working well. I ran 4 parallel plots on four 300 GB 2.5″ 10K drives at the same time. All four finished in 11.15 hours. I’m going to test 8 parallel plots on eight 300 GB 2.5″ 10K drives next. But then I just saw Sloth’s video doing 24-minute plots… why even bother… I should just give up now.

I will have all the storage I have bought filled inside of two months. Unless things change drastically I will not be buying any more storage for Chia plots.

Unless you can afford a stack of storage, there is no need to compete in the Chia plot-speed Olympics.
