OK so this is weird.
I know I have done the work on this one, so bear with me. I recently scanned EVERY drive in a datacenter-hosted JBOD with the following command:
.\chia plots check -n 5 -g C:\mounts\hd01
… just to make sure all the plots were good. That 4U JBOD has a number of drives, you get the idea, like
C:\mounts\hd01
C:\mounts\hd02
etcetera. Real standard computing stuff – Windows, Mac, Linux, basic drive mount points on any OS. That 4U JBOD is driven by a 1U brain server.
So, I scanned each drive with plots check -n 5, and I removed any invalid plots. Each drive had 0 to 2 invalid plots. This was 3 days ago, and I did it for every single drive.
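For anyone who wants to repeat the per-drive scan without typing the command once per mount point, here is a minimal sketch in Python. It only rebuilds the same command shown above for each directory under the mount root. The helper names (build_check_commands, run_checks) are made up for illustration, and it assumes every subdirectory of C:\mounts is a plot drive and that the chia executable can be launched from the current directory.

```python
import subprocess
from pathlib import Path


def build_check_commands(mount_root, chia_exe="chia", challenges=5):
    """Return one `chia plots check` argument list per directory under mount_root.

    Assumption: every subdirectory of mount_root is a drive mount point.
    """
    commands = []
    for drive in sorted(Path(mount_root).iterdir()):
        if drive.is_dir():
            # Same flags as the manual command: -n <challenges> -g <drive path>
            commands.append(
                [chia_exe, "plots", "check", "-n", str(challenges), "-g", str(drive)]
            )
    return commands


def run_checks(mount_root, chia_exe="chia", challenges=5):
    """Run the check for each drive in turn; output goes to the console."""
    for cmd in build_check_commands(mount_root, chia_exe, challenges):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=False)  # keep going even if one drive fails


# Example (hypothetical paths, matching the layout described above):
# run_checks(r"C:\mounts", chia_exe=r".\chia")
```

Running the drives one after another keeps the output readable and makes it obvious which mount point a failure belongs to.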
Now fast forward to today, and I’m seeing bad plots get reported in the Chia GUI. Huh. That’s … odd? I scanned every drive a few days ago and made sure all the plots were good!
But here’s the Chia client telling me plots are bad and sure enough… when I double check… they are bad!
ERROR Failed to open file C:\mounts\hd23\plot-k32-2021-05-12-13-45-f0e26426307bedc4d43f53502d453fb52800d274e1c55f744ace2f67927a0a22.plot. Invalid plot header magic Traceback (most recent call last):
File "chia\plotting\plot_tools.py", line 189, in process_file
ValueError: Invalid plot header magic
2021-05-17T16:15:06.682 chia.plotting.plot_tools : ERROR Failed to open file C:\mounts\hd23\plot-k32-2021-05-12-13-50-117c991c91af976a63f7bd466a0cec19ac043deff4fb0f70fde6811d62df8158.plot. Invalid plot header magic Traceback (most recent call last):
File "chia\plotting\plot_tools.py", line 189, in process_file
ValueError: Invalid plot header magic
2021-05-17T16:15:06.682 chia.plotting.plot_tools : ERROR Failed to open file C:\mounts\hd23\plot-k32-2021-05-12-13-57-cba497cd80a9c14b835a49107d1323c9159397ff209014a3ef70aa540113d6ae.plot. Invalid plot header magic Traceback (most recent call last):
File "chia\plotting\plot_tools.py", line 189, in process_file
So this makes me wonder… can plot files go bad over time? This is kind of boggling my mind. A few possibilities:
- Maybe -n 5 isn’t enough of a check?
- Maybe the drives are actually bad?
- Maybe there’s a communication error between the brain 1U and the 4U JBOD? But if so I’d expect that to show up randomly. Here it is repeatable: it’s just these specific files that show up bad.
The error is always Failed to open file {plot}. Invalid plot header magic.
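Since the complaint is always about the header, one quick thing to try is reading the first bytes of a suspect file directly and comparing them against the expected magic. The sketch below assumes the magic is the 19-byte ASCII string "Proof of Space Plot" at offset 0, which is my understanding of the chiapos plot format and not something confirmed by the log above. The function names are made up for illustration.

```python
# Assumption: a plot file starts with this 19-byte ASCII magic string.
PLOT_MAGIC = b"Proof of Space Plot"


def has_plot_magic(path):
    """Return True if the file begins with the assumed plot magic bytes."""
    with open(path, "rb") as f:
        return f.read(len(PLOT_MAGIC)) == PLOT_MAGIC


def first_bytes_hex(path, count=64):
    """Return the first `count` bytes as hex, to eyeball what replaced the header."""
    with open(path, "rb") as f:
        return f.read(count).hex()
```

If has_plot_magic returns False, the hex dump is worth a look: a run of zeros, or data that looks like it belongs to something else, would suggest different causes than a few flipped bits.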
The plot farming process does not write to the disks at all, so I’m kind of wondering how plot files that were previously tested good via plots check could turn into bad plots, with no disk write activity? Nothing is writing to these disks; the only thing running on the machine is the Chia farming GUI.
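One way to pin down whether a file really changes between checks is to record a small fingerprint of each plot right after it passes, then compare later. This is a rough sketch, not anything Chia provides: it stores the file size, the modification time, and a SHA-256 of only the first 1 MiB, because hashing a whole plot of roughly 100 GB on every drive would take a long time. The function names and the 1 MiB figure are arbitrary choices for illustration.

```python
import hashlib
import os


def fingerprint(path, head_bytes=1024 * 1024):
    """Record size, mtime, and a hash of the first head_bytes of the file."""
    st = os.stat(path)
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        digest.update(f.read(head_bytes))
    return {
        "size": st.st_size,
        "mtime_ns": st.st_mtime_ns,
        "head_sha256": digest.hexdigest(),
    }


def changed_fields(before, after):
    """Return the names of fingerprint fields that differ between two snapshots."""
    return sorted(k for k in before if before[k] != after.get(k))
```

If head_sha256 changes while mtime_ns stays the same, that would hint at corruption happening below the filesystem (drive, cabling, controller) and not at some process writing to the file. If mtime_ns changes too, something did write to it after all. Note that this only covers the start of the file, so damage further in would go unnoticed.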
I know the brain system itself, the 1U driving the JBOD, is definitely stable, it’s a Xeon, it’s got ECC memory, it passed memtest, it passed prime95/mprime overnight…