Node loses connection arbitrarily

Lsherring · September 18, 2022, 5:20pm

Hi Team:

My node has been acting very bizarrely over the last month. It loses connection to the wallet, or the node goes out of sync, for no apparent reason. The logs say it lost connection to the wallet. Nothing has been changed on the node in over a month. I’m running 1.5.1 and I have tried wired and wireless with the same results. I have shut everything down, made sure all processes were closed, and started up again. I have also done the same with a hard reset, with the same results. The logs just complain about the lost connection to the wallet. Any thoughts on this would be helpful. Jacek, you know my setup, any thoughts? TY as always for reading this. P.S. The node will run anywhere from an hour to a couple of days, and everything in between, before it ends up in this state.

drhicom · September 18, 2022, 5:48pm

Just asking, are your farmer, harvester, and wallet all on the same machine? I’m running Windows 10, and my other harvester is on Windows 10 as well, started from the CLI.

seymour.krelborn · September 18, 2022, 5:55pm

Which OS?

What is it that conveys you are not connected to your wallet?
Is it a message on your GUI? What is the message, and on which tab? Or does your wallet number disappear from the top of the GUI?

Does your XCH balance change (go to zero)? Is that how you are catching that your wallet is disconnecting?

When that happens, do you still have internet connectivity?
Do you get resynced automatically, or only after you restart services or restart your computer?

Do you mean pressing your computer’s reset button while the OS is running, or holding the power button until your computer powers off?

Jacek · September 18, 2022, 7:00pm

All your modules are on the same box, so for internal communication (e.g., node to wallet) it really doesn’t matter whether you use Ethernet or WiFi. There is no easily identifiable reason why such a connection would go down, other than bad code.

This may be the primary reason, and the wallet, etc. problems may be just secondary. So, I would start here.

Assuming that this is not a v1.5.1 issue, the first thing to check would be the drive that your blockchain (bc) and wallet dbs are sitting on. Maybe that drive has some issues? I don’t recall whether that is a separate NVMe or your OS drive.

That said, a few people have reported problems with v1.5.1. Maybe this is also the case here.

Lsherring · September 18, 2022, 8:36pm

Yes, everything is on the same rig with Win 10.

drhicom · September 18, 2022, 8:41pm

Enough space on your C:\ drive?

Lsherring · September 18, 2022, 8:41pm

Internet is up and running when this comes up, and I only get resynced by stopping and restarting. I did a reboot of the system again today. When you look at the logs, it says it lost connection to the wallet’s address. The balance never changes or goes away. After stopping Chia, I rebooted (sorry) and did not hold the button. Currently, it is up, but I don’t know for how long. LOL

Lsherring · September 18, 2022, 8:42pm

Yes, I have the OS on a 2TB Seagate enterprise HDD and my database on the NVMe. I have lots of space on my C drive, as I do nothing with this rig but farm away.

drhicom · September 18, 2022, 8:43pm

Maybe scan both of these disks just to check their health? How often does the NVMe get trimmed?

Lsherring · September 18, 2022, 8:45pm

Yes, I will be doing that next if it acts up, which I’m sure it will. ugggg… TY all for your thoughts, as it is a frustrating issue and the logs don’t help at all.

Jacek · September 18, 2022, 8:54pm

Maybe you can run “findstr ERROR debug.log” and it will show some other errors (or “findstr ERROR %userprofile%\.chia\mainnet\log\debug.log” if in doubt)?
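For anyone who would rather script this check, here is a minimal Python sketch of the same filter. It assumes the log sits at the default location mentioned above (.chia\mainnet\log\debug.log under the user profile); the function names are made up for illustration, and nothing here depends on the exact format of Chia's log lines. Like findstr without /I, the match is case-sensitive.

```python
import os
from pathlib import Path


def default_log_path():
    """Assumed default location: <user profile>/.chia/mainnet/log/debug.log."""
    return Path(os.path.expanduser("~")) / ".chia" / "mainnet" / "log" / "debug.log"


def find_error_lines(log_path, keyword="ERROR"):
    """Return every line containing `keyword` (case-sensitive substring match)."""
    matches = []
    # errors="replace" keeps the scan going even if the log has odd bytes in it.
    with open(log_path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if keyword in line:
                matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    path = default_log_path()
    if path.exists():
        for line in find_error_lines(path):
            print(line)
    else:
        print(f"No log found at {path}")
```

Passing a different keyword, for example find_error_lines(path, "WARNING"), would show whether anything else is logged around the time the wallet connection drops.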

Fuzeguy · September 18, 2022, 9:14pm

That would normally be OK for a very low-end PC, but is it possible that the Chia app just gets out of sync running between a super slow HD and a fast NVMe? Why not put in, at a minimum, a 500GB SATA SSD, just to be sure the two drives are not so far apart, performance-wise, that they screw up? It should take only a few minutes to clone your boot drive onto the SSD and see if the issue remains.

Jacek · September 18, 2022, 9:33pm

Not really. Most of what is running from the boot drive runs from memory. The only thing that could really hit the OS drive hard is the chia logs (either a dust storm, or yet another bug spamming the INFO logs). Still, the concern about the bc drive is that both bc and wallet will start hitting the NVMe with small chunks, choking the main start_farmer process, and chia usually goes out of sync when that happens (remember, a single physical core, and/or sqlite getting hammered).

Still, HD problems may cause problems with those log files being written, thus killing chia. So, chkdsk /f on both drives on the next reboot would be helpful.

By the way, @Lsherring check your Event Viewer for errors. If any drive is the culprit, there should be some errors about it there.

seymour.krelborn · September 18, 2022, 9:47pm

@Lsherring
I am running version 1.5.0 on Windows Home.
I periodically find myself not synced, perhaps once or twice every few days. It usually lasts for 10 or 20 minutes, and it takes care of itself. And all the while, my internet connection is good.

If it lags any longer, I will either:
1:
Stop the GUI.
Verify, via taskmgr.exe, that all chia processes have closed.
Start the GUI.

– or –

2:
Run:
chia start -r all
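
The "verify that all chia processes have closed" step in option 1 could also be scripted instead of checked by eye in Task Manager. This is only a rough Python sketch: it relies on the Windows `tasklist /FO CSV /NH` command, and the "chia" name fragment is an assumption, since the actual process names on a given install may differ (start_farmer, for example). The helper names are hypothetical.

```python
import csv
import io
import subprocess


def matching_processes(tasklist_csv, fragments=("chia",)):
    """Pick (image name, PID) pairs out of `tasklist /FO CSV /NH` output.

    `fragments` are case-insensitive substrings of the image name. The default
    is an assumption; adjust it to whatever names show up on your machine.
    """
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) < 2:
            continue  # skip blank or non-CSV lines
        image, pid = row[0], row[1]
        if any(frag.lower() in image.lower() for frag in fragments):
            found.append((image, pid))
    return found


def running_processes(fragments=("chia",)):
    """Run tasklist (Windows only) and return the matching processes."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return matching_processes(result.stdout, fragments)
```

If running_processes() comes back empty after the GUI is stopped, it should be safe to start the GUI again; otherwise the listed PIDs are the ones still hanging around.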

The next time it happens, I will see if my logs complain about my wallet. If my log contains the same as yours, then it might be a version 1.5.x issue.

drhicom · September 18, 2022, 9:50pm

Samsung 870 EVO 500GB Sata SSD

Fuzeguy · September 18, 2022, 11:15pm

It’s called troubleshooting. This is an easy step to take, and saying that it is not the issue doesn’t mean it is not the issue. Any PC in 2022, IMO, that is running a rust OS drive is nuts. Especially one belonging to a knowledgeable, PC-literate person such as @Lsherring. It should be on an SSD regardless.

drhicom · September 18, 2022, 11:33pm

Come on, maybe he wants to string a bunch of Seagate ST506 drives together.

Lsherring · September 19, 2022, 12:40am

I just checked and it is still up and running. As an FYI, I have three SSDs (2 NVMe) and all of my HDDs are Seagate Enterprise. All quality in this rig.

Lsherring · September 19, 2022, 12:45am

FYI, in 1980 the first 1 GB hard drive was developed by IBM, weighing 550 pounds and costing $40,000. We have come a long way in 42 years. This is when I was a field engineer and you actually had to know your engineering and follow flow maps that were the size of a desk and put in a massive binder.

Fuzeguy · September 19, 2022, 1:24am

Still, enterprise or not - rust-disks are not for OSs I don’t think, not any more, or at least they shouldn’t be!

These were like my 1st drives, thinking it was 5MB, lol, too long ago… memory fades … but yeah >