The damn Phase 3 freeze! What to do?

Hello all!

Once a day one of my plotters gets stuck somewhere in phase 3… ALWAYS in phase 3.
Sometimes with high CPU load and no HDD load, sometimes with zero load at all.
I know that many people out there are asking about this problem, but I haven't read a solution yet.

Symptoms:

  • Plot hangs in Phase 3, anywhere from step 1 to 6, totally at random. No more progress
  • Sometimes no load, sometimes high load on the CPU

For me:
I'm plotting on Windows 10 with several older machines. The problem occurs with the chia plotter as well as with the Madmax/Stotik plotter.

What i did so far:

  • ran the machine without OC
  • double/triple-checked all disk SMART data
  • swapped SATA cables. Sometimes it seems to work, a few days later… not
  • ran memtests / swapped memory
  • changed drivers

No solution has worked so far.

All of my plotters use SW RAID 0, normally set up with “Windows Storage Spaces”. I have a feeling that the problem COULD be here.
I changed the config of one of my plotters from SW to HW RAID and suddenly it runs smoothly… but that could be random luck.

On some machines I split up the RAIDs again and plotted in parallel… one plot per SSD. I thought it would be a good idea: the plot that crashes would point to a defective drive… but then the plots crashed randomly… spread over all drives.

Any suggestions? No hints in the Windows logs etc… :frowning:

Thank you,
Grodahn

I’m chasing the same problem on Ubuntu. Random P3 freezes.

Same here, ran out of ideas. Thought I had it cleared up after a chipset driver update… but yesterday another plot got stuck.

I did run for quite a few days without incident though. The only thing I changed since then was not having a separate temp_2 directory set. When I was running with temp2 set to another drive, I had no plots getting stuck. But I doubt that is the difference; it was probably just random luck.

Bump. Any suggestions here?

Mine has hung twice in various phases. I’ve put it down to me moving my mouse over the PowerShell window and clicking a button, as I see the cursor square in the wrong place. Now I wake the screen with the keyboard. Just a thought.

I’ve had this problem too. You can pause and resume a process running in PowerShell.

Unfortunately this is not the same problem as the phase 3 freezes.
In Swar it was annoying, but now with Madmax it just stops plotting altogether so that’s much worse

Maybe we can gather some background info and try to figure out a reason?

Maybe we take a count:

  • who gets P3-Freezes with SW-Raid
  • who gets P3-Freezes w/o SW-Raid

I’ve had it on both of my systems

One has 3x 1TB NVMe in RAID 0 (Windows storage pool),
the other has 1x external SSD only.

I’m going to switch to Ubuntu this weekend, but I see @69chargr also had this issue on Ubuntu, still got that?
Right now, even though Madmax is much faster, +40% on my total output based on plot times, I don’t get anywhere near that number because it keeps just stopping in phase 3 every now and then.

So I’m thinking of setting up a scheduled task but not sure if this will work, or how to do it.

With Madmax, when it crashes in phase 3, it just returns to the PowerShell command prompt.

So would it be possible to use task scheduler to detect if chia_plots.exe is running or not, and if not running start a new process?

The scheduler has the option to repeat every so many minutes and “not run new instance if already running”
But I think the task will not actually stop because the powershell/cmd window will stay open after the plotter crashes…

Any thoughts on how to do this?

Ideally a task/script would do this:

  • detect if the plotter is running
  • if not, delete data from specified temp drives
  • restart the plotter
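
The three steps above could be sketched roughly like this in Python (standard library only). This is just a sketch under assumptions, not a tested solution: the image name is taken from the post above, while the temp directory, the restart command and the check interval are placeholders you would have to replace. It only covers the case where the plotter process exits (as described for Madmax); a plot that hangs while the process stays alive would not be detected.

```python
import shutil
import subprocess
import time
from pathlib import Path

# Everything below is a placeholder (assumption); adjust to your own setup.
PLOTTER_EXE = "chia_plots.exe"        # image name as given in the post; check Task Manager
TEMP_DIRS = [Path(r"D:\plot_tmp")]    # hypothetical temp directories to wipe
PLOTTER_CMD = [PLOTTER_EXE]           # append your usual plotter arguments here
CHECK_INTERVAL_S = 300                # how often to check, in seconds


def image_in_tasklist(output: str, image_name: str) -> bool:
    """Return True if the tasklist output mentions the given image name."""
    return image_name.lower() in output.lower()


def plotter_is_running(image_name: str = PLOTTER_EXE) -> bool:
    """Ask the Windows 'tasklist' command whether such a process exists."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}"],
        capture_output=True, text=True, check=False,
    )
    return image_in_tasklist(result.stdout, image_name)


def clear_temp_dir(temp_dir: Path) -> int:
    """Delete everything inside temp_dir (not the directory itself).

    Returns the number of top-level entries removed.
    """
    removed = 0
    for entry in temp_dir.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


def watchdog_loop() -> None:
    """Check periodically; if the plotter is gone, clean up and restart it."""
    while True:
        if not plotter_is_running():
            for temp_dir in TEMP_DIRS:
                if temp_dir.exists():
                    clear_temp_dir(temp_dir)
            subprocess.Popen(PLOTTER_CMD)  # start the plotter without waiting for it
        time.sleep(CHECK_INTERVAL_S)
```

To use it, call `watchdog_loop()` at the end of the script and leave it running in its own window. Because the script looks at the process list rather than at the console window, the PowerShell/cmd window staying open after a crash should not matter. Be careful with `TEMP_DIRS`: everything inside those folders is deleted before each restart.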

I no longer have this problem (for now). What I believe solved it was properly partitioning the two NVMe drives via the command line instead of using the GUI on Ubuntu. I was able to step it back up to 256 buckets and it has run fine for about 100 plots now. It even persisted after a reboot, which I was afraid it wouldn't. Seems all is well so far.

I got this reply in another topic which is worth a shot for sure.

The only time I was running without incident was when I had CPU affinity set in Swar.
Now I am running Madmax with 22 threads instead of 24 (3900X),
and it’s now been running for a day without any plots getting stuck.
So to me it does seem likely that it’s a CPU issue. I hope the solution below will solve it.

Any idea how to get to that setting?

Edit.

Never mind, found it.