Farming stops and keeps "Reconnecting to peer"

I’m running Ubuntu 20.04 and chia 1.2.11. About every 2 days the farming stops with this log. There are no changes being made on that computer that would clash with 8444 port.

2021-11-18T11:54:24.009 harvester chia.harvester.harvester: INFO     refresh_batch: event batch_processed, loaded 0, removed 0, processed 300, remaining 476, duration: 0.01 seconds
2021-11-18T11:54:24.018 harvester chia.harvester.harvester: INFO     refresh_batch: event batch_processed, loaded 0, removed 0, processed 300, remaining 176, duration: 0.01 seconds
2021-11-18T11:54:24.026 harvester chia.harvester.harvester: INFO     refresh_batch: event batch_processed, loaded 0, removed 0, processed 176, remaining 0, duration: 0.01 seconds
2021-11-18T11:54:24.027 harvester chia.harvester.harvester: INFO     refresh_batch: event done, loaded 0, removed 0, processed 11876, remaining 0, duration: 0.43 seconds
2021-11-18T11:54:24.817 wallet wallet                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:54:36.151 farmer farmer                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:54:58.826 wallet wallet                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:55:09.812 farmer farmer                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:55:32.812 wallet wallet                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:55:43.810 farmer farmer                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:56:02.835 daemon __main__                   : INFO     Websocket exception. Closing websocket with chia_full_node code = 1006 (connection closed abnormally [internal]), no reason Traceback (most recent call last):
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 827, in transfer_data
    message = await self.read_message()
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 895, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 971, in read_data_frame
    frame = await self.read_frame(max_size)
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 1047, in read_frame
    frame = await Frame.read(
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/framing.py", line 105, in read
    data = await reader(2)
  File "/usr/lib/python3.8/asyncio/streams.py", line 723, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/lib/python3.8/asyncio/streams.py", line 517, in _wait_for_data
    await self._waiter
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../chia/daemon/server.py", line 195, in safe_handle
    async for message in websocket:
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 439, in __aiter__
    yield await self.recv()
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 509, in recv
    await self.ensure_open()
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/websockets/protocol.py", line 812, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: code = 1006 (connection closed abnormally [internal]), no reason

2021-11-18T11:56:06.818 wallet wallet                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:56:17.813 farmer farmer                     : INFO     Reconnecting to peer {'host': '127.0.0.1', 'port': 8444}
2021-11-18T11:56:31.423 harvester chia.plotting.util      : WARNING  Directory: /home/thomas/exports_antony/merged_antony does not exist.
2021-11-18T11:56:31.427 harvester chia.harvester.harvester: INFO     refresh_batch: event started, loaded 0, removed 0, processed 0, remaining 11876, duration: 0.00 seconds
2021-11-18T11:56:31.698 harvester chia.harvester.harvester: INFO     refresh_batch: event batch_processed, loaded 0, removed 0, processed 300, remaining 11576, duration: 0.01 seconds
2021-11-18T11:56:31.719 harvester chia.harvester.harvester: INFO     refresh_batch: event batch_processed, loaded 0, removed 0, processed 300, remaining 11276, duration: 0.02 seconds
2021-11-18T11:56:31.735 harvester chia.harvester.harvester: INFO     refresh_batch: event batch_processed, loaded 0, removed 0, processed 300, remaining 10976, duration: 0.01 seconds

If I run chia start farmer -r I get this:

(venv) thomas@ben:~/chia-blockchain$ chia start farmer -r
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@             WARNING: UNPROTECTED SSL FILE!              @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0664 for '/home/thomas/chia-blockchain/mozilla-ca/cacert.pem' are too open. Expected 0644
One or more SSL files were found with permission issues.
Run `chia init --fix-ssl-permissions` to fix issues.
chia_harvester: stderr: /home/thomas/chia-blockchain/chia/harvester/harvester.py:101: RuntimeWarning: coroutine 'HarvesterAPI.new_signage_point_harvester.<locals>.lookup_challenge' was never awaited
  {
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
stopped
chia_harvester: started
chia_farmer: stopped
chia_farmer: started
chia_full_node: started
chia_wallet: stopped
chia_wallet: started
(venv) thomas@ben:~/chia-blockchain$ stderr: Traceback (most recent call last):
  File "/home/thomas/chia-blockchain/venv/bin/chia_full_node", line 33, in <module>
    sys.exit(load_entry_point('chia-blockchain', 'console_scripts', 'chia_full_node')())
  File "/home/thomas/chia-blockchain/chia/server/start_full_node.py", line 60, in main
    return run_service(**kwargs)
  File "/home/thomas/chia-blockchain/chia/server/start_service.py", line 255, in run_service
    return asyncio.run(async_run_service(*args, **kwargs))
  File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/thomas/chia-blockchain/chia/server/start_service.py", line 249, in async_run_service
    return await service.run()
  File "/home/thomas/chia-blockchain/chia/server/start_service.py", line 171, in run
    await self.start()
  File "/home/thomas/chia-blockchain/chia/server/start_service.py", line 143, in start
    await self._server.start_server(self._on_connect_callback)
  File "/home/thomas/chia-blockchain/chia/server/server.py", line 242, in start_server
    await self.site.start()
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/usr/lib/python3.8/asyncio/base_events.py", line 1463, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 98] error while attempting to bind on address ('::', 8444, 0, 0): address already in use
stderr: Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/aiosqlite/core.py", line 109, in run
    get_loop(future).call_soon_threadsafe(set_result, future, result)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 764, in call_soon_threadsafe
    self._check_closed()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/thomas/chia-blockchain/venv/lib/python3.8/site-packages/aiosqlite/core.py", line 117, in run
    get_loop(future).call_soon_threadsafe(set_exception, future, e)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 764, in call_soon_threadsafe
    self._check_closed()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

Do you have other example logs where this issue came up? Would be good to see if it always follows a plot refresh event (harvester chia.harvester.harvester: INFO refresh_batch: event done ....).

I occasionally see something similar, but on a Windows 10 box, Chia v1.2.11 also.

Console output:

concurrent\futures\thread.py:176: RuntimeWarning: coroutine 'HarvesterAPI.new_signage_point_harvester.<locals>.lookup_challenge' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

From the log (I assume this is correlated, it may not be but theirs no other errors around the time but no direct correlation with the message on the console):

2021-11-23T13:41:32.573 harvester chia.harvester.harvester: INFO     refresh_batch: event started, loaded 0, removed 0, processed 0, remaining 701, duration: 0.00 seconds
2021-11-23T13:41:43.187 harvester harvester_server        : ERROR    Exception: dictionary changed size during iteration, {'host': 'IP ADDRESS REMOVED', 'port': 8447}. Traceback (most recent call last):
  File "chia\server\server.py", line 566, in wrapped_coroutine
  File "chia\harvester\harvester_api.py", line 185, in new_signage_point_harvester
RuntimeError: dictionary changed size during iteration

2021-11-23T13:41:43.188 harvester harvester_server        : ERROR    Exception: dictionary changed size during iteration <class 'RuntimeError'>, closing connection {'host': 'IP ADDRESS REMOVED', 'port': 8447}. Traceback (most recent call last):
  File "chia\server\server.py", line 576, in api_call
  File "asyncio\tasks.py", line 481, in wait_for
  File "chia\server\server.py", line 573, in wrapped_coroutine
  File "chia\server\server.py", line 566, in wrapped_coroutine
  File "chia\harvester\harvester_api.py", line 185, in new_signage_point_harvester
RuntimeError: dictionary changed size during iteration

2021-11-23T13:41:43.196 harvester harvester_server        : INFO     Connection closed: IP ADDRESS REMOVED, node id: a..............c
2021-11-23T13:41:43.197 harvester harvester_server        : WARNING  Banning IP ADDRESS REMOVED for 10 seconds
2021-11-23T13:41:43.197 harvester chia.harvester.harvester: INFO     peer disconnected {'host': 'IP ADDRESS REMOVED', 'port': 8447}

I’m seeing it only on one machine out of three at present.

Notably the machine I see it on has plots that are being created (copied from another box) into a folder that’s also farmed on that machine, which might explain the “dictionary changed size during iteration” error.

My guess in this case would be the dictionary holding the list of plots is updated on another thread because of a new file, but it was being iterated over to look for a plot, and hence broke.

This is a known python Exception usually own found while bug hunting new code. I could find no mention of any other user having this as a Chia ERROR.

It looks like it was caused by you dropping your wallet peer but I am not sure. Probably some combination of dropping the peer and the plotting you mentioned.

Do you have farmr or another dashboard installed? With this tool you could check to see if you are just having a hiccup or a real problem.

I don’t have farmr or any dashboard installed, it’s very basic.

I suspect it’s somewhat of an unusual race condition (hence why probably not many other people reporting it), where the plot file list is updated because of a new plot on one thread whilst another thread iterates over that list to check the plots (that would certainly align with the errors in the log and the timing also shown in the log).

Ahh the issue that never goes away. It’s been like that since I joined in March. Happens from time to time.