Why does one queue suddenly stop?

Keermalec · June 7, 2021, 8:40pm

I launched 6 queues, and queue no. 2 just stopped about an hour after starting, so it never even completed phase 1:

/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q1.log:Time for phase 1 = 5655.434 seconds. CPU (230.220%) Mon Jun  7 16:47:50 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q1.log:Time for phase 2 = 4875.892 seconds. CPU (95.040%) Mon Jun  7 18:09:06 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q3.log:Time for phase 1 = 6557.568 seconds. CPU (223.790%) Mon Jun  7 18:07:38 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q3.log:Time for phase 2 = 4760.643 seconds. CPU (95.310%) Mon Jun  7 19:26:58 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q4.log:Time for phase 1 = 6767.380 seconds. CPU (222.640%) Mon Jun  7 18:43:32 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q4.log:Time for phase 2 = 4583.049 seconds. CPU (96.040%) Mon Jun  7 19:59:55 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q5.log:Time for phase 1 = 6542.023 seconds. CPU (225.660%) Mon Jun  7 19:12:12 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q5.log:Time for phase 2 = 4422.479 seconds. CPU (96.790%) Mon Jun  7 20:25:54 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q5.log:Time for phase 3 = 7113.610 seconds. CPU (99.150%) Mon Jun  7 22:24:28 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q5.log:Time for phase 4 = 470.096 seconds. CPU (99.950%) Mon Jun  7 22:32:18 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q6.log:Time for phase 1 = 6491.231 seconds. CPU (224.440%) Mon Jun  7 19:43:45 2021
/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q6.log:Time for phase 2 = 4226.215 seconds. CPU (98.060%) Mon Jun  7 20:54:11 2021

Is there any way to re-enable it? Or do I have to erase its temporary files and start a new queue?
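A check like the one above can be scripted. Here is a minimal Python sketch that parses "Time for phase" lines in the same grep-style format and lists the queues that have not logged all four phases. The function names `completed_phases` and `stalled_queues`, and the assumption that log files are named Q1.log to Q6.log, are illustrative only and not part of Chia.

```python
import re
from collections import defaultdict

# Matches grep-style lines such as:
#   /path/Q1.log:Time for phase 1 = 5655.434 seconds. CPU (230.220%) ...
PHASE_RE = re.compile(
    r"(?P<queue>Q\d+)\.log:Time for phase (?P<phase>\d) = (?P<secs>[\d.]+) seconds"
)


def completed_phases(lines):
    """Return {queue: {phase: seconds}} for every phase-completion line."""
    result = defaultdict(dict)
    for line in lines:
        m = PHASE_RE.search(line)
        if m:
            result[m["queue"]][int(m["phase"])] = float(m["secs"])
    return dict(result)


def stalled_queues(phases, expected_queues, total_phases=4):
    """List queues that have logged fewer than total_phases phases (or none)."""
    return sorted(
        q for q in expected_queues if len(phases.get(q, {})) < total_phases
    )


if __name__ == "__main__":
    # Three lines copied from the output above, as sample input.
    sample = [
        "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q1.log:Time for phase 1 = 5655.434 seconds. CPU (230.220%) Mon Jun  7 16:47:50 2021",
        "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q5.log:Time for phase 3 = 7113.610 seconds. CPU (99.150%) Mon Jun  7 22:24:28 2021",
        "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/Q5.log:Time for phase 4 = 470.096 seconds. CPU (99.950%) Mon Jun  7 22:32:18 2021",
    ]
    phases = completed_phases(sample)
    print(phases)
    print(stalled_queues(phases, [f"Q{i}" for i in range(1, 7)]))
```

A queue that never finished phase 1 produces no matching lines at all, so it only shows up if you pass the full list of expected queue names.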

Voodoo · June 7, 2021, 9:12pm

Don’t know why it happened, but yes, you have to delete the temp files and start again… there is no way to resume, unfortunately.

Pretty good times for 6 plots together. What CPU are you running?

Keermalec · June 9, 2021, 7:43pm

Ryzen 9 3900X

But those times are not really good: only queue 5 has gone beyond phase 2. The others simply stopped…

Keermalec · June 10, 2021, 2:08am

So I ran the queues again and looked closely at the output messages.

Queue 3 stopped during phase 1 with the following message:

Starting plotting progress into temporary dirs: /mnt/3cb719c9-7e1f-4d60-bbdb-4bf80fde1e55/Q5 and /mnt/3cb719c9-7e1f-4d60-bbdb-4bf80fde1e55/Q5
ID: 23839e32c42a2dd4ed64d9861013e3beebd37ac10c5e6e1525e75a0154fb65b8
Plot size is: 32
Buffer size is: 6780MiB
Using 128 buckets
Using 6 threads of stripe size 65536

Starting phase 1/4: Forward Propagation into tmp files... Thu Jun 10 01:37:44 2021
Computing table 1
F1 complete, time: 108.834 seconds. CPU (139.09%) Thu Jun 10 01:39:33 2021
Computing table 2
	Bucket 0 uniform sort. Ram: 6.548GiB, u_sort min: 1.125GiB, qs min: 0.281GiB.
...
	Bucket 77 uniform sort. Ram: 6.548GiB, u_sort min: 0.563GiB, qs min: 0.281GiB.
corrupted size vs. prev_size

Does anyone know what the error “corrupted size vs. prev_size” means?
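To see whether the same message shows up in other queue logs, a small search helper can be used. This is a minimal sketch: `find_failures` and `FAILURE_SIGNATURES` are hypothetical names, and the signature list contains only the string quoted above. As far as I know, that string is printed by glibc's memory allocator when one of its heap consistency checks fails, but nothing in the plotter output itself confirms the cause.

```python
# Strings to look for; extend the tuple with other messages as needed.
FAILURE_SIGNATURES = (
    "corrupted size vs. prev_size",
)


def find_failures(log_text, signatures=FAILURE_SIGNATURES):
    """Return (line_number, line) pairs for lines containing any signature."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(sig in line for sig in signatures):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "Computing table 2\n"
        "\tBucket 77 uniform sort. Ram: 6.548GiB, u_sort min: 0.563GiB, qs min: 0.281GiB.\n"
        "corrupted size vs. prev_size\n"
    )
    for number, line in find_failures(sample):
        print(f"line {number}: {line}")
```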

Queue 6 has now also stopped, with the message “RuntimeError: std::exception”:

Forward propagation table time: 1014.883 seconds. CPU (223.480%) Thu Jun 10 08:02:39 2021
Computing table 4
	Bucket 0 uniform sort. Ram: 6.548GiB, u_sort min: 1.625GiB, qs min: 0.812GiB.
...
	Bucket 127 uniform sort. Ram: 6.548GiB, u_sort min: 3.250GiB, qs min: 0.813GiB.
	Total matches: 4294779938
Caught plotting error: Matches do not match with number of write entries 4294779938 4294779800
Traceback (most recent call last):
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/bin/chia", line 33, in <module>
    sys.exit(load_entry_point('chia-blockchain', 'console_scripts', 'chia')())
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/chia/cmds/chia.py", line 77, in main
    cli()  # pylint: disable=no-value-for-parameter
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/venv/lib/python3.9/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/chia/cmds/plots.py", line 135, in create_cmd
    create_plots(Params(), ctx.obj["root_path"])
  File "/media/marco/12TB_TOSHIBA/CHIA_BLOCKCHAIN/chia-blockchain/chia/plotting/create_plots.py", line 164, in create_plots
    plotter.create_plot_disk(
RuntimeError: std::exception
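The two numbers in the "Matches do not match" line differ only slightly. A minimal sketch for pulling them out of a log line and computing the gap follows. It assumes the first number is the match count (it equals the "Total matches" value printed just before) and the second is the number of entries written; `match_discrepancy` is a hypothetical helper name.

```python
import re

MISMATCH_RE = re.compile(
    r"Matches do not match with number of write entries (\d+) (\d+)"
)


def match_discrepancy(line):
    """Return matches minus written entries, or None if the line is unrelated."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    matches, written = int(m.group(1)), int(m.group(2))
    return matches - written


if __name__ == "__main__":
    line = (
        "Caught plotting error: Matches do not match with number of "
        "write entries 4294779938 4294779800"
    )
    # 4294779938 - 4294779800 = 138
    print(match_discrepancy(line))
```

So under that assumption, 138 of roughly 4.29 billion entries went missing between matching and writing.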

Two other queues are simply frozen, at:

Total compress table time: 1105.533 seconds. CPU (96.500%) Thu Jun 10 03:00:23 2021
Compressing tables 2 and 3
	Bucket 0 uniform sort. Ram: 3.274GiB, u_sort min: 1.250GiB, qs min: 0.313GiB.
...
	Bucket 13 uniform sort. Ram: 3.274GiB, u_sort min: 0.500GiB, qs min: 0.250GiB.

and

Total compress table time: 1098.155 seconds. CPU (96.200%) Thu Jun 10 02:21:53 2021
Compressing tables 2 and 3
	Bucket 0 uniform sort. Ram: 3.274GiB, u_sort min: 1.250GiB, qs min: 0.313GiB.
...
	Bucket 62 uniform sort. Ram: 3.274GiB, u_sort min: 0.500GiB, qs min: 0.250GiB.
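A frozen queue prints no error, so one way to spot it is to check how long its log file has gone without being written to. Below is a minimal sketch, assuming each queue writes to its own log file. `stale_logs` is a hypothetical helper, and the 30-minute threshold is an arbitrary choice; the table times above (about 1100 seconds for 128 buckets) suggest a healthy queue logs a new bucket line every few seconds.

```python
import os
import time


def stale_logs(paths, max_idle_seconds=1800, now=None):
    """Return (path, idle_seconds) for logs not modified within the threshold."""
    now = time.time() if now is None else now
    stale = []
    for path in paths:
        idle = now - os.path.getmtime(path)
        if idle > max_idle_seconds:
            stale.append((path, idle))
    return stale


if __name__ == "__main__":
    # Hypothetical log names; replace with the real paths.
    logs = [f"Q{i}.log" for i in range(1, 7)]
    existing = [p for p in logs if os.path.exists(p)]
    for path, idle in stale_logs(existing):
        print(f"{path}: no output for {idle / 60:.0f} minutes")
```

This only tells you a log has gone quiet. It cannot distinguish a hung process from one that has already exited, so it is worth checking the process list as well.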

Many thanks