Bladebit Diskplot cache and bucket advice please

Plotting on an old machine using the Chia Blockchain 2.1.3 GUI.
2200G, 32GB 2666MHz
SN750 1TB M.2 ssd, 3000MB/s
Seagate Exos 16, 250 MB/s

Currently churning at 1.65 hours per plot, single queue.

Any opportunity for optimization?

Thanks in advance.

Yeah, up your system memory if possible. I’m running 256 GB and have no issues…

An EPYC would be nice. I moved my plotter over to a 5600G machine. Using 8 threads, it’s satisfactory.

A snip of the log of the first plot:

[Bladebit Disk Plotter]
Heap size : 3.37 GiB ( 3447.82 MiB )
Cache size : 0.00 GiB ( 0.00 MiB )
Bucket count : 256
Alternating I/O: true
F1 threads : 8
FP threads : 8
C threads : 8
P2 threads : 8
P3 threads : 8
I/O threads : 1
Temp1 block sz : 4096
Temp2 block sz : 4096
Temp1 path : /media/napTime/SN750
Temp2 path : /media/napTime/SN750
I/O metrices enabled.
Allocating memory

Generating plot 1 / 1: 28ea1394ee49776ebf4eb8c12d59cdac81f0028dc0e683905a2d00a3d2e192a5
Plot temporary file: /media/napTime/plots1/plot-k32-2023-12-29-08-55-28ea1394ee49776ebf4eb8c12d59cdac81f0028dc0e683905a2d00a3d2e192a5.plot.tmp

Started plot.
Running Phase 1
Table 1: F1 generation
Generating f1…
Finished f1 generation in 28.46 seconds.
Progress update: 0.01
Table 1 I/O wait time: 28.46 seconds.
Table 1 Disk Write Metrics:
Average write throughput 1195.98 MiB ( 1254.07 MB ) or 1.17 GiB ( 1.25 GB ).
Total size written: 33278.56 MiB ( 34895.10 MB ) or 32.50 GiB ( 34.90 GB ).
Total write commands: 131072.

Table 2
Sorting : Completed in 63.71 seconds.
Distribution : Completed in 16.98 seconds.
Matching : Completed in 93.58 seconds.
Fx : Completed in 78.57 seconds.
Completed table 2 in 266.09 seconds with 4294963523 entries.
Progress update: 0.06
Table 2 I/O wait time: 0.31 seconds.
Table 2 I/O Metrics:
Average read throughput 587.08 MiB ( 615.60 MB ) or 0.57 GiB ( 0.62 GB ).
Total size read: 33278.56 MiB ( 34895.10 MB ) or 32.50 GiB ( 34.90 GB ).
Total read commands: 131072.
Average write throughput 1189.51 MiB ( 1247.29 MB ) or 1.16 GiB ( 1.25 GB ).
Total size written: 100093.46 MiB ( 104955.60 MB ) or 97.75 GiB ( 104.96 GB ).
Total write commands: 197121.

Table 3
Sorting : Completed in 87.38 seconds.
Distribution : Completed in 20.79 seconds.
Matching : Completed in 86.67 seconds.
Fx : Completed in 81.70 seconds.
Completed table 3 in 314.14 seconds with 4294939318 entries.
Progress update: 0.12
Table 3 I/O wait time: 0.25 seconds.
Table 3 I/O Metrics:
Average read throughput 591.41 MiB ( 620.14 MB ) or 0.58 GiB ( 0.62 GB ).
Total size read: 66301.47 MiB ( 69522.13 MB ) or 64.75 GiB ( 69.52 GB ).
Total read commands: 196608.
Average write throughput 1332.31 MiB ( 1397.02 MB ) or 1.30 GiB ( 1.40 GB ).
Total size written: 145148.36 MiB ( 152199.08 MB ) or 141.75 GiB ( 152.20 GB ).
Total write commands: 262402.

Table 4
Sorting : Completed in 104.49 seconds.
Distribution : Completed in 25.25 seconds.
Matching : Completed in 81.01 seconds.
Fx : Completed in 88.90 seconds.
Completed table 4 in 354.12 seconds with 4294746318 entries.
Progress update: 0.2
Table 4 I/O wait time: 0.32 seconds.
Table 4 I/O Metrics:
Average read throughput 887.88 MiB ( 931.01 MB ) or 0.87 GiB ( 0.93 GB ).
Total size read: 99068.49 MiB ( 103880.84 MB ) or 96.75 GiB ( 103.88 GB ).
Total read commands: 196608.
Average write throughput 1347.54 MiB ( 1413.00 MB ) or 1.32 GiB ( 1.41 GB ).
Total size written: 145142.76 MiB ( 152193.21 MB ) or 141.74 GiB ( 152.19 GB ).
Total write commands: 262402.

Table 5
Sorting : Completed in 90.32 seconds.
Distribution : Completed in 21.52 seconds.
Matching : Completed in 87.69 seconds.
Fx : Completed in 84.51 seconds.
Completed table 5 in 323.95 seconds with 4294442886 entries.
Progress update: 0.28
Table 5 I/O wait time: 0.20 seconds.
Table 5 I/O Metrics:
Average read throughput 820.45 MiB ( 860.30 MB ) or 0.80 GiB ( 0.86 GB ).
Total size read: 99063.84 MiB ( 103875.96 MB ) or 96.74 GiB ( 103.88 GB ).
Total read commands: 196608.
Average write throughput 1401.42 MiB ( 1469.50 MB ) or 1.37 GiB ( 1.47 GB ).
Total size written: 145133.48 MiB ( 152183.48 MB ) or 141.73 GiB ( 152.18 GB ).
Total write commands: 262402.

Table 6
Sorting : Completed in 97.09 seconds.
Distribution : Completed in 21.11 seconds.
Matching : Completed in 83.18 seconds.
Fx : Completed in 91.54 seconds.
Completed table 6 in 342.21 seconds with 4293911893 entries.
Progress update: 0.36
Table 6 I/O wait time: 0.30 seconds.
Table 6 I/O Metrics:
Average read throughput 871.59 MiB ( 913.93 MB ) or 0.85 GiB ( 0.91 GB ).
Total size read: 99057.08 MiB ( 103868.87 MB ) or 96.74 GiB ( 103.87 GB ).
Total read commands: 196608.
Average write throughput 1187.09 MiB ( 1244.76 MB ) or 1.16 GiB ( 1.24 GB ).
Total size written: 112357.71 MiB ( 117815.60 MB ) or 109.72 GiB ( 117.82 GB ).
Total write commands: 262402.

Table 7
Sorting : Completed in 82.84 seconds.
Distribution : Completed in 9.99 seconds.
Matching : Completed in 89.36 seconds.
Fx : Completed in 80.61 seconds.
Completed table 7 in 301.96 seconds with 4292827249 entries.
Progress update: 0.42
Table 7 I/O wait time: 0.08 seconds.
Table 7 I/O Metrics:
Average read throughput 574.79 MiB ( 602.71 MB ) or 0.56 GiB ( 0.60 GB ).
Total size read: 66285.49 MiB ( 69505.38 MB ) or 64.73 GiB ( 69.51 GB ).
Total read commands: 196608.
Average write throughput 1090.82 MiB ( 1143.80 MB ) or 1.07 GiB ( 1.14 GB ).
Total size written: 79326.50 MiB ( 83179.86 MB ) or 77.47 GiB ( 83.18 GB ).
Total write commands: 196866.

etc.
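
For anyone wanting to total up a log like the one above, here is a minimal sketch in Python. It only looks at the timing lines quoted in this post; the function name `phase1_summary` is made up for illustration, and the regular expressions assume the log wording stays exactly as shown here.

```python
import re

# Phase 1 timing lines copied from the log above (F1 plus tables 2 to 7).
LOG = """\
Finished f1 generation in 28.46 seconds.
Completed table 2 in 266.09 seconds with 4294963523 entries.
Table 2 I/O wait time: 0.31 seconds.
Completed table 3 in 314.14 seconds with 4294939318 entries.
Table 3 I/O wait time: 0.25 seconds.
Completed table 4 in 354.12 seconds with 4294746318 entries.
Table 4 I/O wait time: 0.32 seconds.
Completed table 5 in 323.95 seconds with 4294442886 entries.
Table 5 I/O wait time: 0.20 seconds.
Completed table 6 in 342.21 seconds with 4293911893 entries.
Table 6 I/O wait time: 0.30 seconds.
Completed table 7 in 301.96 seconds with 4292827249 entries.
Table 7 I/O wait time: 0.08 seconds.
"""


def phase1_summary(log):
    """Return (total_seconds, io_wait_seconds) for the timing lines in `log`."""
    f1 = re.findall(r"Finished f1 generation in ([\d.]+) seconds", log)
    tables = re.findall(r"Completed table \d+ in ([\d.]+) seconds", log)
    waits = re.findall(r"Table \d+ I/O wait time: ([\d.]+) seconds", log)
    total = sum(float(s) for s in f1) + sum(float(s) for s in tables)
    io_wait = sum(float(s) for s in waits)
    return total, io_wait


total, io_wait = phase1_summary(LOG)
print(f"Phase 1 so far: {total:.2f} s ({total / 60:.1f} min)")
print(f"I/O wait, tables 2-7: {io_wait:.2f} s ({io_wait / total:.2%} of the total)")
```

On the figures quoted, that is roughly 32 minutes for Phase 1 with under 2 seconds of I/O wait across tables 2 to 7, which hints that this stage is limited more by the CPU than by the SSD.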

It’s using 6.3 GB of memory so far and 1/3 of the SSD at around 300GiB.
It’s using 12 worker threads of the 8 I allocated. System outside the chia app is responsive.
So… solved. 1.1 hours for a plot.

Do you have a GPU?

Even 66 minutes is seriously slow these days.

75 watt cpu?
Comparing cpu mark stats: 7.7 times slower
5600G 19,905
vs.
AMD Ryzen Threadripper PRO 7995WX 153,096

About 1/7.7 (13%) of the performance at 3% of the CPU cost. As a metric of raw CPU per dollar, this is OK.

Ref: PassMark Intel vs AMD CPU Benchmarks - High End

Not sure if the numbers jibe with plotting performance, but they correlate approximately.

The cost number… got it off Google Shopping with reputable vendors. Varies.
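
The CPU Mark comparison above can be checked with a few lines of Python. This is only a sketch: the two scores are the ones quoted in the post, the 3% cost figure is the poster's own estimate (the actual prices are not given), and `relative_performance` is a made-up helper name.

```python
# PassMark CPU Mark scores as quoted in the post above.
SCORE_5600G = 19_905
SCORE_7995WX = 153_096

# The poster's estimate of the 5600G price as a fraction of the 7995WX price.
# Assumption: taken at face value, since the actual prices are not in the thread.
COST_FRACTION = 0.03


def relative_performance(slow, fast):
    """Return (times_slower, fraction_of_fast) for two benchmark scores."""
    return fast / slow, slow / fast


times_slower, fraction = relative_performance(SCORE_5600G, SCORE_7995WX)
value_ratio = fraction / COST_FRACTION  # CPU Mark per dollar, relative to the 7995WX

print(f"{times_slower:.1f}x slower, {fraction:.0%} of the performance")
print(f"about {value_ratio:.1f}x the CPU Mark per dollar")
```

This says nothing about plotting speed directly; it only restates the benchmark ratio.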

I’m running an AMD GPU. I’ve got an old GTX 750 Ti on the shelf, 2GB @ 128 bit. Would there be an incremental boost?

Shading Units 640

TMUs 40

ROPs 16

SMM Count 5

L1 Cache 64 KB (per SMM)

L2 Cache 2 MB

@Ronski
Seeing a 25% improvement on Table 2 with the 750 Ti. I’m surprised. I thought the memory on that GPU would be too small. Thanks for motivating me in that direction. Heading out to the pub, Friday and all. Thanks.

Copying @drhicom

I’m surprised a 750Ti works at all, and I thought 8GB was the minimum requirement for plotting.

Did you use the Cuda plotter?

I’ll try the cuda plotter.

Heh heh. Well, memory size halted that layout. Back to disk plot. 11 threads didn’t work that well in the past. Trying 10 threads now. With 8 threads I could open a browser and surf. With 11 even the Chia GUI would slow down to a crawl. The diskplot app will keep on chugging if the GUI is exited,
requiring a reboot to shut it down for a clean restart, which happens often with these tryout runs.
I notice the diskplot threads running at priority 0 here on Linux Mint. Not sure if the I/O threads are at the same priority or whether they should be raised or not. Just a side thought, re using all threads at a step lower priority with less nice I/O…
I’m impressed with all this stuff. Kudos. I’ll get back after this plot is done.

59 minutes with 10 threads.

If you’ve got much plotting to do, get a GPU. I was doing 3 minute plots, maybe less, with a 3080, and my Tesla P4 was 7 minutes. Combined, IIRC, that was around 600 to 700 a day.

I wouldn’t recommend a Tesla P4 for BB, but for GH it works and is cheap. For BB you need a 20 series or higher. The 10 series did work but would crash at times, especially when farming, but plotting seemed to suffer more recently.

PS I think the fastest plot was 49 seconds.

That’s fast. I’ll see if I can dig one up. Just disabling my NIC got me another 7 minutes faster. I noticed the LED on the Ethernet connector flashing.

Just checked the times: GH C19 took 126 seconds on the 3080 and 350 seconds on the P4, which equates to around 932 plots or 56TiB a day. That is on an old T7910 with 512GB DDR4 RAM. I think in practice it actually worked out less than that, as times varied a bit.
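
The plots-per-day figure above is simple arithmetic, sketched here in Python. The per-plot times are the ones quoted in this thread; `plots_per_day` is a made-up helper, and the calculation assumes each device plots back to back with no idle time, which is why real-world totals come out lower.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def plots_per_day(seconds_per_plot):
    """Plots completed in 24 hours, assuming back-to-back plotting with no gaps."""
    return SECONDS_PER_DAY / seconds_per_plot


# GH C19 plot times quoted above, in seconds.
gpu_times = {"3080": 126, "P4": 350}
combined = sum(plots_per_day(t) for t in gpu_times.values())

# Disk plot times reported earlier in the thread, converted to seconds.
cpu_times = {
    "2200G, 1.65 h": 1.65 * 3600,
    "5600G, 8 threads, 1.1 h": 1.1 * 3600,
    "5600G, 10 threads, 59 min": 59 * 60,
}

for name, seconds in gpu_times.items():
    print(f"{name}: {plots_per_day(seconds):.0f} plots/day")
print(f"Both GPUs combined: {combined:.0f} plots/day")
for name, seconds in cpu_times.items():
    print(f"{name}: {plots_per_day(seconds):.1f} plots/day")
```

The GPU and CPU rows are different plot formats, so they are not a like-for-like comparison, only a rough sense of scale.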

Even with a GPU you will be restricted by only having 32GB of RAM, but it should be quite a bit quicker.

Heh heh. Whew. I’m going to let it run for now and get some shut eye. Thanks.

There are some 8GB GTX 1070s well priced on Kijiji. I tried two local guys but no reply, probably gone. So finished plotting. Done. Fini. Used dpkg to clean up. Back to the farm.

Don’t forget to TRIM your temp NVMe periodically or performance will suffer over time. On Linux use “sudo fstrim ” or better yet run it on a schedule in a separate terminal window/tab with “sudo watch -n 900 fstrim -v ”. Play with -n if needed (value in seconds) and -v is just more verbose output.

Disabling the NIC was likely a placebo. In your case a difference of a few minutes could be the result of launching an app while a plot is being created. Without purchasing any additional hardware, I would recommend checking how the temp volume is formatted and making sure it does not have atime enabled (it records the access date/time on every read), which will cause significant delay.
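
One way to check the atime setting on Linux is to look at the mount options in /proc/mounts. Below is a minimal Python sketch. The sample line is hypothetical (the device name is made up, and the mount point is the temp path from the log earlier in the thread), and the helper names `mount_options` and `atime_mode` are invented for illustration.

```python
# Hypothetical /proc/mounts line; the device name is invented for this example.
SAMPLE = "/dev/nvme0n1p1 /media/napTime/SN750 ext4 rw,relatime 0 0\n"


def mount_options(mounts_text, mount_point):
    """Return the option list for `mount_point` from /proc/mounts-style text, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return fields[3].split(",")
    return None


def atime_mode(options):
    """Classify access-time behaviour from a list of mount options."""
    if "noatime" in options:
        return "noatime"   # access times are never updated
    if "relatime" in options:
        return "relatime"  # access times updated only occasionally
    # Assumption: with neither flag listed, every access updates the atime.
    return "strictatime"


# To check a real system, read the file instead of using SAMPLE:
#     with open("/proc/mounts") as f:
#         text = f.read()
opts = mount_options(SAMPLE, "/media/napTime/SN750")
print(atime_mode(opts) if opts is not None else "mount point not found")
```

If the result is anything other than noatime, adding noatime to the mount options for the temp drive is the usual change to try.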

The above assumes you are plotting on Linux. If you are using Windows, or WSL on Windows, then switching to Linux will yield a significant performance increase.

Thanks dctech. That 900 number sounds good.
Well… I turned off automatic updates. Disabled the NIC. I guess I left chrony(NTP) running. With the NIC disabled there might have been some churn checking for connectivity.
Not sure if periodic TRIM is enabled by default in Debian/Ubuntu. I think it is, and didn’t dig into the periodic parameters.

Also, XFS or F2FS for the NVMe drives are in the running for the faster filesystem (ref: Phoronix, XFS / EXT4 / Btrfs / F2FS / NILFS2 Performance On Linux 5.8).
I used ext4. I periodically delete my OS partition and reinstall with minimal customizations. So tuning the system has gone out the window. I can spend weeks on that and then bork it. An Nvidia GPU with 8GB+ was my hurdle. Cheapskate? Yep. A little patience was involved.