Sync getting stuck 10 minutes after starting

Still experiencing some odd sync issues. Using Windows GUI v1.1.5, all is good when starting the node, but it gets out of sync after ~10 minutes and never recovers. I have to shut it down and reopen it to get back in sync.
Something is not right, and I can’t find anything in the log that might be causing this behavior.
Any ideas?

Thanks.

Hey, do you have more than 10 peers connected?

Yes, I have more than 50 peers connected. Ports are open; it was working before the 1.1.5 upgrade.

UPDATE 1
So node sync is going on and off now.
Finding these errors in the log:

Don't have rc hash ......... caching signage point 42.
Signage point 42 not added, CC challenge
......
2021-05-10T01:41:49.519 full_node chia.full_node.full_node_store: INFO     Don't have challenge hash ......., caching EOS
2021-05-10T01:41:49.526 full_node chia.full_node.full_node: INFO     End of slot not added CC challenge .........

UPDATE 2
Adding more odd log messages:

2021-05-10T01:49:51.173 full_node chia.full_node.full_node: INFO      Updated peak to height 256335, weight 11227135, hh e5c8b8ffb713d2fcf3a5fa370d7d4def3b39179bd5b6cf3bd7e60d7c191bdfb5, forked at 256334, rh: 4611927f82f5873aab7c9aca1eb2a08586a3ed034d096a84e73c05dac778d4e0, total iters: 832356216128, overflow: False, deficit: 13, difficulty: 215, sub slot iters: 110624768, Generator size: No tx, Generator ref list size: No tx
2021-05-10T01:49:51.180 full_node chia.full_node.mempool_manager: INFO     add_spendbundle took 0.0 seconds
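
One way to see exactly when the node stops advancing is to scan the log for gaps between "Updated peak to height" lines. This is only a rough sketch based on the log format quoted above: the function names and the 5-minute threshold are made up here and are not part of Chia.

```python
import re
from datetime import datetime, timedelta

# Matches lines shaped like the "Updated peak to height ..." entry quoted above.
PEAK_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})\s"
    r".*Updated peak to height (?P<height>\d+)"
)


def peak_updates(lines):
    """Yield (timestamp, height) for every peak-update line."""
    for line in lines:
        m = PEAK_RE.match(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f")
            yield ts, int(m["height"])


def find_stalls(lines, max_gap=timedelta(minutes=5)):
    """Return (previous, next, gap) for each pause longer than max_gap.

    max_gap is an arbitrary threshold chosen for illustration.
    """
    stalls = []
    prev = None
    for ts, _height in peak_updates(lines):
        if prev is not None and ts - prev > max_gap:
            stalls.append((prev, ts, ts - prev))
        prev = ts
    return stalls
```

Feeding it the lines of the node's log file (for example `find_stalls(open(path))`, with `path` pointing at your own log) lists every stretch where no new peak was recorded, which helps line up the stall with whatever else was logged at that moment.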

Should I do a clean install of v1.1.5?

EDIT: Consolidating all posts in one.

Tried a clean install of v1.1.5 but I'm having the same issue.
I open the Windows GUI and everything syncs fine. After 10-15 minutes, it slows down and falls behind other peers. Eventually it goes out of sync and never recovers.
I shut it down, reopen the Windows GUI, and it gets synced again.
Something is definitely clogging the node, and I don’t know what it is beyond what I posted above.

You said you set up port forwarding for 8444, correct? Did you also disable UPnP in the config file?

Yes, the port is open and UPnP is disabled. I have more than 30 peers connected. Like I said, it works fine for the first 10-15 minutes but then gets stuck.
It seems like all the connected peers are behind on block height.
Maybe there’s a limit on the number of connected peers, and once I reach that limit, the node stops connecting to new ones. So I end up with peers that are not up to date.

Hello,

When do we disable UPnP? If we are running harvesters, do we then disable UPnP on the full node? Or when running multiple nodes on the same network?

Ok, just checking. What could be happening after 10-15 minutes… are there any power settings that might be interfering?

There are two ways to give a full node the correct connectivity: UPnP, or forwarding port 8444. If you decide to forward port 8444, you must disable UPnP in the config for the full node.
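
Whichever of the two you pick, port 8444 has to accept incoming connections. A plain TCP connect is a quick way to check that. This is a minimal sketch using only Python's standard library; `port_is_reachable` is a name made up here, not a Chia tool, and the host you pass in is your own public address.

```python
import socket


def port_is_reachable(host, port=8444, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (replace the placeholder with your public IP):
# port_is_reachable("<your public IP>")
```

Run it from a machine outside your network. Testing your public IP from inside the same LAN can fail on routers that don't support NAT loopback, even when the forward is set up correctly.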

Not sure I understand your question about power settings. The CPU is doing fine, and nothing has changed on the system.
Maybe this is some fallout from yesterday’s issue.
What I’ve been doing is closing the Windows GUI and reopening it to get back to farming.

I’m just trying to understand: even if I am running only one full node and I am using port forwarding, I should disable UPnP, right?

Right, you still need to disable it. Otherwise the node will waste cycles trying to use UPnP.

Windows has power plans that you can pick from. Sometimes the power plan reduces power to hard drives, or to networking if it’s a USB device.

Oh, the power management settings in Windows. Yeah, that’s all good. No sleeps are allowed.

Looks like I finally got this issue sorted thanks to @sogente711 from keybase chat.
I changed the config.yaml param target_outbound_peer_count to 50 and so far the node is working fine and not getting stuck.
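
For anyone who wants to script that change rather than edit config.yaml by hand, here is a rough sketch. It treats the file as plain text and assumes the key sits on a single `key: value` line; `set_scalar` is a helper invented for this post, not something Chia ships, and any trailing comment on the matched line is dropped.

```python
import re


def set_scalar(config_text, key, value):
    """Replace the value on every `key: <value>` line, keeping indentation."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}:[ \t]*)\S.*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", config_text)
    if count == 0:
        raise KeyError(key)
    return new_text


# Illustration on a one-line fragment, not a full config.yaml.
fragment = "  target_outbound_peer_count: 8\n"
updated = set_scalar(fragment, "target_outbound_peer_count", 50)
```

Back up config.yaml before writing the result to disk, and restart the node afterwards so it rereads the file.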

Ohhhh so it was too many peers. This is good information. Thanks!

Yeah, something in the peer management is clogging the system, and it became more apparent as the netspace caught up.
Something the devs should take a look at.

Well that was good while it lasted… back to same issue again.
So false alarm :sob:

I have the same problem…

Mine is at target_outbound_peer_count: 8
Should I leave it like that or change it to 50?

Could you figure out what this log means?
chia.full_node.mempool_manager: INFO add_spendbundle