Gigahorse Compressed GPU Plot Client Released! Start Plotting Now!

you’re dodging again. You should become a politician :wink:.
So again very directly.

Why use your GPU plotter and not Chia's official GPU plotter?
Yes, you are the first, but Chia's will be out soon too.

Basically all of the above. I was right, you are trolling, nice work :wink:

Thanks @madMAx43v3r!

I was pondering waiting for Chia’s version… but their version is probably a month out or more… and then there will be bugs and a period before it’s all 100% working…

I think once Flexpool.io gives the all clear that their Flexfarmer supports your compressed plots, I’m going with yours…

Leaning that direction

A month, but what year is the question???

2 Likes

Hi guys, what mistake am I making here?
Someone needs to write a note on how to install and configure this. It’s hard when you don’t know how it all works. I’m a beginner on Ubuntu and it’s hard.

How do I link it with my chia?

@madMAx43v3r how much RAM do you need to just farm compressed plots with a GPU?

Lazy man, all this information is available on GitHub with nice spreadsheets… But maybe you mean RAM+VRAM. I think the node itself uses 4 to 8 GB depending on the C compression level when GPU farming.

1 Like

The Chia Gigahorse Node / Farmer for windows is still not available, right?

@smokeTime
Try with /
./chia…

Something else: PCIe x16 to x4 risers, like in Ethereum mining. Is that possible? Is it about computation, or about bandwidth?

https://static3.caseking.de/media/image/thumbnail/zurc-008_zurc_008_1g_800x800.jpg

@madMAx43v3r I have a 1070 and a 2080ti, and they have the same plot time (8.5 minutes) on a PCIe 4.0 machine with 512 GB 2933 MHz RAM. There’s something wrong for sure. What should I do to check it?

@SmokoTime

Hope these videos help you two.

1 Like

Depends on K size and C level, in any case with GPU it’s usually small enough not to worry about it.

Just checked: my chia harvester with K32 C7 uses 4 GB RAM. However, this can be reduced by lowering export CHIAPOS_MAX_CORES=X to 8 or 4 (depending on how fast your CPU is); the default is 16.
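For reference, this is just an environment variable set in the shell before starting the harvester; a minimal sketch using the value 8 from above:

```shell
# Cap the number of CPU cores the harvester may use for decompressing
# compressed plots (default is 16); lower values reduce RAM usage at
# the cost of slower proof lookups on weaker CPUs.
export CHIAPOS_MAX_CORES=8
```

Put the export in the same shell (or startup script) that launches the harvester, so the process inherits it.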

Yeah that’s bad. You are measuring the second plot, right? Can you post the terminal output of the second plot?

I suspect either a slow -t, reduced PCIe lanes, or a bad RAM / NUMA config. Also, is this on Windows or Linux?

I’ll share my experience; maybe it is helpful for someone.
I put together a GPU plotter from my already existing consumer hardware; I only had to buy a graphics card, because I didn’t own one :smiley:
key points:

  • Ryzen 5950X @3200
  • 4 x 32GB 4000 @3200 ( IFAB @1600 )
  • 3 x 2TB M2 AORUS PCIe4 (2 x Raid0, 1 x System)
  • X570 motherboard
  • INNO3D 4070 TI (currently the only 4070 TI that is just 2 slots wide)
  • 24-port SATA card, PCIe2 x16 (6 SATA controllers with 4 ports each + 1 SATA switch)

First the easy part: the Gigahorse plotter from Max ran immediately without problems :slight_smile:
The real problem was arriving at a reasonable hardware configuration.

The biggest problem was distributing the bandwidth among the individual components.
With the 5950X I only have 24 PCIe 4.0 lanes available, at a maximum of 2 GB/s per lane. 16 of them go to 1 x PCIe4 x16 or 2 x PCIe4 x8; since huge amounts of data have to be transported to the GPU and back, it only makes sense to use a single PCIe4 x16 for an optimal result and leave the second slot free.
Another PCIe4 x4 is used for an M.2 slot, so only 4 lanes are left to connect the X570 chipset of the motherboard. This is the real bottleneck, because these x4 lanes can only transport a maximum of 8 GB/s.
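The lane budget above can be sketched with quick shell arithmetic (the ~2 GB/s per PCIe 4.0 lane figure is the rough value used in this post):

```shell
# Rough PCIe 4.0 bandwidth budget for a 5950X, assuming ~2 GB/s per lane
gbps_per_lane=2
echo "GPU x16 link:        $((16 * gbps_per_lane)) GB/s"
echo "CPU-attached M.2 x4: $(( 4 * gbps_per_lane)) GB/s"
echo "X570 chipset x4:     $(( 4 * gbps_per_lane)) GB/s, shared by everything behind the chipset"
```

The last line is the point: onboard SATA, extra M.2 slots, USB and add-in cards all share that single 8 GB/s chipset uplink.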

The problem: for a GPU plot with only 128 GB of RAM we need an SSD as swap. In my case, a RAID0 of 2 SSDs turned out to be about 70% faster than a single SSD. In addition, the finished plots must be copied off at the same time; with RAID0 this cost me only about 10%-12% in performance, while with a single SSD it was 20%-30%, and on top of that the SATA transfer rate broke down from over 270 MB/s to 180-210 MB/s.

So only RAID0 makes sense. The problem: only one M.2 port is directly connected to the CPU with x4 lanes; the rest are all routed to the CPU via the X570 chipset (the x4 bottleneck). So I use the one M.2 port directly connected to the CPU plus a second one via the X570. This is not nice, but still better than using two M.2 drives both on the X570, because in the end they would both have to squeeze through the bottleneck to the CPU.
And let’s not forget that all other components (onboard SATA, PCIe cards, USB Ethernet) also have to go through the PCIe x4 bottleneck between CPU and X570 chip.
After I had optimized this, it quickly became apparent that the high sustained load on the SSDs led to a performance drop due to heating.
In the end I lowered the clock rates somewhat, since on my board the chipset and the M.2 drives all share the same cooling element, and I attached an additional 120mm fan to cool it.
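For anyone wanting to reproduce the RAID0 plot buffer on Linux, a minimal sketch using mdadm (the device names /dev/nvme0n1 and /dev/nvme1n1 and the /raid mount point are assumptions; check your drives with lsblk first — this wipes them):

```shell
# Stripe two NVMe drives into one RAID0 array for the plot buffer
# (device names below are assumptions; verify with lsblk before running)
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
sudo mkfs.xfs /dev/md0        # XFS copes well with large sequential writes
sudo mkdir -p /raid
sudo mount /dev/md0 /raid     # then pass -t /raid/ -2 /raid/ to the plotter
```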

./cuda_plot_k32 -C 8 -k 32 -n 500 -d /destination1/ -d /destination2/ -d /destination3/ -d /destination4/ -t /raid/ -2 /raid/ -c xxx -f xxx

So, last but not least, some plot times:

  • K32 C8, 128GB@3200, 1x2TB PCIe4 SSD, 4070TI no OC, cold (2-3 plots) without plot copy:
    290s - 310s

  • K32 C8, 128GB@3200, 1x2TB PCIe4 SSD, 4070TI no OC, warm (5-7 plots) without plot copy:
    350s - 370s

  • K32 C8, 128GB@3200, 1x2TB PCIe4 SSD, 4070TI no OC, warm (5-7 plots) with simultaneous plot copy (4xHDD):
    410s - 520s

  • K32 C8, 128GB@3200, 2x2TB PCIe4 SSD RAID0, 4070TI no OC, cold (2-3 plots) without plot copy:
    177s - 182s

  • K32 C8, 128GB@3200, 2x2TB PCIe4 SSD RAID0, 4070TI no OC, warm (5-7 plots) without plot copy:
    218s - 224s

  • K32 C8, 128GB@3200, 2x2TB PCIe4 SSD RAID0, 4070TI no OC, warm (5-7 plots) with simultaneous plot copy (4xHDD):
    238s - 245s

2 Likes

Yeah partial RAM mode is all about your SSDs, especially sustained writes, MLC or Optane are the best. You’d be fine with 8 lanes to the GPU as well here, with PCIe 4.0. The full 16 lanes are only needed for full RAM mode.

Thx for this info. If I have time next weekend I will test RAID0 over the 2nd PCIe slot, where both SSDs connect directly to the CPU :slight_smile:

Hi everybody,
I tried Max’s CUDA plotter on Windows and it works great: 162 seconds for a k32 c8, full RAM, with an RTX 3060 non-Ti.
However, I encountered a strange behaviour when plotting to multiple HDDs: even if the NVMe buffer (-t) has enough space, when the second plot is in the middle of P3 it pauses and waits for the first plot to be completely transferred to mechanical hard drive 1, even though a second hard drive sits idle and -t holds only one plot (the one being transferred). Only when the transfer of the first plot is complete does it finish the second one and start copying it to the second hard drive, wiping out the advantage of the fast plotting. I’m using cuda_plot_k32 -C 8 -M 128. If I set -d to the SSD I don’t have the problem, because it flushes the plot well before P3.
I think it is Windows-related, because another guy tried it on Windows and got the same behaviour. Any ideas?
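For what it’s worth, the non-blocking behaviour you would expect (each HDD taking a copy concurrently instead of the transfers serializing) can be illustrated in plain shell with background jobs; this is a generic demo with made-up /tmp paths, not what cuda_plot does internally:

```shell
# Demo: two transfers running concurrently instead of back-to-back
# (all paths below are hypothetical demo files under /tmp)
mkdir -p /tmp/hdd1 /tmp/hdd2
dd if=/dev/zero of=/tmp/plot1.bin bs=1M count=4 2>/dev/null
dd if=/dev/zero of=/tmp/plot2.bin bs=1M count=4 2>/dev/null
cp /tmp/plot1.bin /tmp/hdd1/ &   # copy to the first drive in the background
cp /tmp/plot2.bin /tmp/hdd2/ &   # second drive starts immediately, not after the first
wait                             # block only once both copies must be complete
```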

Thank you!

2 Likes

Great summary. You may also want to add your command line for those who want to follow what you did.

I assume you are on Linux; if so, there are new binaries for Linux (~4 hours ago) that may fix the degradation that happens (on my box) after ~5 hours or so.

One thing I see is that CPU usage dropped by 25+% or so. Also, I am about 1 hour in, and by this point the old version was already degrading HD writes from over 250 MB/s down to 200 MB/s or less (after those ~5 hours it would fall to 60-70 MB/s across 4 drives; it was not individual HD speed that was capped, but the total, regardless of the HD configuration: a mix of different SATA controllers, or some HDs on USB, it really didn’t matter, the total speed kept draining). So far it looks really good, but it’s too early to tell. By the way, I am on PCIe 3.0 with 256 GB RAM.

By the way, if this new release fixes those issues, you may want to check plot-sink, as it offers more flexibility (e.g., disk hot-swap during a single run, adding/removing HDs on the fly, potentially pushing finished plots both to local HDs and over the wire to other boxes).

By the way, by this point the previous version was already spread over 3-4 HDs; so far it looks like 2 HDs are more than enough. Still rolling over the set of 4 drives. Super cool.

This is very strange, can you post the terminal output?

Also latest version fixes some issues in regards to copy, but not something like this.

The plotter waits at the beginning of P3 if the previous plot has not finished flushing to -t (flushing not copying to -d), that’s normal behavior.

It doesn’t wait at the start of P3, but right after P3 Table 5 LPSK. GPU activity goes to zero until the copy to the hard drive finishes. During this time, allocated system memory (as seen in Task Manager) is 179/256 GB.

It seems like it still has a copy of the previous plot in RAM, even though it has completely flushed it to -t. I suspect this because even with the last plot of a bunch of -n, allocated memory stays at around 179/256 until the copy finishes, even though the plotting process itself is done (because it’s the last plot).

Here is the terminal output

Thank you!

Chia k32 next-gen CUDA plotter - 15133ec
Plot Format: mmx-v2.4
Network Port: 8444 [chia]
No. GPUs: 1
No. Streams: 4
Final Destination: E:\
Final Destination: G:\
Final Destination: F:\
Shared Memory limit: 119.7 GiB
Number of Plots: 3
Initialization took 0.267 sec
Crafting plot 1 out of 3 (2023/02/12 22:49:23)
Process ID: 8632
Pool Puzzle Hash:
Farmer Public Key:
Working Directory:   D:\
Working Directory 2: @RAM
Compression Level: C8 (xbits = 8, final table = 4)
Plot Name: plot-k32-c8-2023-02-12-22-49-7819be3b5d8220b3dfb6e2db2626d969ed0fc898c517e691f2469af7644ed4d5
[P1] Setup took 0.821 sec
[P1] Table 1 took 18.623 sec, 4294967296 entries, 16790622 max, 66759 tmp, 0 GB/s up, 1.82571 GB/s down
[P1] Table 2 took 16.885 sec, 4294889084 entries, 16786600 max, 66524 tmp, 1.89517 GB/s up, 3.02045 GB/s down
[P1] Table 3 took 25.978 sec, 4294701041 entries, 16787172 max, 66587 tmp, 1.84768 GB/s up, 3.27201 GB/s down
[P1] Table 4 took 27.379 sec, 4294300817 entries, 16786493 max, 66604 tmp, 2.92177 GB/s up, 4.34641 GB/s down
[P1] Table 5 took 15.741 sec, 4293662006 entries, 16782009 max, 66610 tmp, 5.08148 GB/s up, 6.47991 GB/s down
[P1] Table 6 took 14.249 sec, 4292251478 entries, 16780057 max, 66522 tmp, 4.49018 GB/s up, 5.96535 GB/s down
[P1] Table 7 took 10.567 sec, 4289428681 entries, 16765475 max, 66436 tmp, 4.53957 GB/s up, 4.42417 GB/s down
Phase 1 took 130.58 sec
[P2] Setup took 0.23 sec
[P2] Table 7 took 2.821 sec, 11.3289 GB/s up, 0.18832 GB/s down
[P2] Table 6 took 4.172 sec, 7.66533 GB/s up, 0.127337 GB/s down
[P2] Table 5 took 4.182 sec, 7.64952 GB/s up, 0.127033 GB/s down
Phase 2 took 11.559 sec
[P3] Setup took 0.643 sec
[P3] Table 4 LPSK took 5.238 sec, 3465082979 entries, 15586255 max, 61650 tmp, 6.20968 GB/s up, 9.73659 GB/s down
[P3] Table 4 NSK took 6.365 sec, 3465082979 entries, 13548026 max, 61650 tmp, 6.0841 GB/s up, 9.35352 GB/s down
[P3] Table 5 PDSK took 5.111 sec, 3531234451 entries, 13817053 max, 54881 tmp, 6.36305 GB/s up, 9.14699 GB/s down
[P3] Table 5 LPSK took 6.199 sec, 3531234451 entries, 14248833 max, 57064 tmp, 10.0004 GB/s up, 8.22717 GB/s down
[P3] Table 5 NSK took 6.201 sec, 3531234451 entries, 13807069 max, 56683 tmp, 6.36423 GB/s up, 9.6009 GB/s down
[P3] Table 6 PDSK took 5.381 sec, 3710590665 entries, 14515593 max, 57700 tmp, 6.04182 GB/s up, 8.68802 GB/s down
[P3] Table 6 LPSK took 7.854 sec, 3710590665 entries, 15085752 max, 60464 tmp, 8.18985 GB/s up, 6.49354 GB/s down
[P3] Table 6 NSK took 6.169 sec, 3710590665 entries, 14509770 max, 59974 tmp, 6.72217 GB/s up, 9.6507 GB/s down
[P3] Table 7 PDSK took 5.51 sec, 4289428681 entries, 16777746 max, 66436 tmp, 7.97518 GB/s up, 8.48462 GB/s down
[P3] Table 7 LPSK took 7.013 sec, 4289428681 entries, 17197948 max, 68801 tmp, 10.2081 GB/s up, 7.27224 GB/s down
[P3] Table 7 NSK took 6.63 sec, 4289428681 entries, 16765475 max, 68226 tmp, 7.23048 GB/s up, 8.97966 GB/s down
Phase 3 took 68.603 sec
[P4] Setup took 0.228 sec
[P4] total_p7_parks = 2094448
[P4] total_c3_parks = 428942, 2385 / 2460 ANS bytes
Phase 4 took 4.94 sec, 6.46938 GB/s up, 3.7011 GB/s down
Total plot creation time was 215.767 sec (3.59612 min)
Crafting plot 2 out of 3 (2023/02/12 22:52:58)
Process ID: 8632
Pool Puzzle Hash:
Farmer Public Key:
Working Directory:   D:\
Working Directory 2: @RAM
Compression Level: C8 (xbits = 8, final table = 4)
Plot Name: plot-k32-c8-2023-02-12-22-52-c849f01c723e703d8c29e83ae574794cf6d801a7308f412cffd108ca0d4da9bf
[P1] Setup took 0.968 sec
[P1] Table 1 took 3.36 sec, 4294967296 entries, 16788271 max, 66628 tmp, 0 GB/s up, 10.1191 GB/s down
Flushing to disk took 11.193 sec
Started copy to E:\plot-k32-c8-2023-02-12-22-49-7819be3b5d8220b3dfb6e2db2626d969ed0fc898c517e691f2469af7644ed4d5.plot
[P1] Table 2 took 10.291 sec, 4294984588 entries, 16788837 max, 66609 tmp, 3.10951 GB/s up, 4.95581 GB/s down
[P1] Table 3 took 12.435 sec, 4294805658 entries, 16789558 max, 66601 tmp, 3.86009 GB/s up, 6.83556 GB/s down
[P1] Table 4 took 13.736 sec, 4294567925 entries, 16785250 max, 66700 tmp, 5.82389 GB/s up, 8.66338 GB/s down
[P1] Table 5 took 14.057 sec, 4294051017 entries, 16786510 max, 66628 tmp, 5.69059 GB/s up, 7.25619 GB/s down
[P1] Table 6 took 13.654 sec, 4292979576 entries, 16783165 max, 66753 tmp, 4.68627 GB/s up, 6.2253 GB/s down
[P1] Table 7 took 10.563 sec, 4290926244 entries, 16771286 max, 66714 tmp, 4.54206 GB/s up, 4.42585 GB/s down
Phase 1 took 79.421 sec
[P2] Setup took 0.241 sec
[P2] Table 7 took 2.807 sec, 11.3893 GB/s up, 0.189259 GB/s down
[P2] Table 6 took 3.296 sec, 9.70424 GB/s up, 0.16118 GB/s down
[P2] Table 5 took 4.078 sec, 7.84531 GB/s up, 0.130272 GB/s down
Phase 2 took 10.562 sec
[P3] Setup took 0.69 sec
[P3] Table 4 LPSK took 5.196 sec, 3465401279 entries, 15592609 max, 61561 tmp, 6.26025 GB/s up, 9.81529 GB/s down
[P3] Table 4 NSK took 6.385 sec, 3465401279 entries, 13547663 max, 61561 tmp, 6.0656 GB/s up, 9.32422 GB/s down
[P3] Table 5 PDSK took 4.932 sec, 3531738446 entries, 13815623 max, 54887 tmp, 6.59457 GB/s up, 9.47896 GB/s down
[P3] Table 5 LPSK took 6.644 sec, 3531738446 entries, 14246387 max, 57136 tmp, 9.33178 GB/s up, 7.67614 GB/s down
Renamed final plot to E:\plot-k32-c8-2023-02-12-22-49-7819be3b5d8220b3dfb6e2db2626d969ed0fc898c517e691f2469af7644ed4d5.plot
[P3] Table 5 NSK took 264.543 sec, 3531738446 entries, 13812349 max, 56580 tmp, 0.149202 GB/s up, 0.225049 GB/s down
[P3] Table 6 PDSK took 5.599 sec, 3711429192 entries, 14524158 max, 57791 tmp, 5.80754 GB/s up, 8.34975 GB/s down
[P3] Table 6 LPSK took 6.433 sec, 3711429192 entries, 15086614 max, 60447 tmp, 10.0008 GB/s up, 7.92791 GB/s down
[P3] Table 6 NSK took 6.193 sec, 3711429192 entries, 14513084 max, 59886 tmp, 6.69763 GB/s up, 9.6133 GB/s down
[P3] Table 7 PDSK took 5.556 sec, 4290926244 entries, 16789118 max, 66714 tmp, 7.91192 GB/s up, 8.41437 GB/s down
[P3] Table 7 LPSK took 7.033 sec, 4290926244 entries, 17196519 max, 68956 tmp, 10.1821 GB/s up, 7.25156 GB/s down
[P3] Table 7 NSK took 6.702 sec, 4290926244 entries, 16771286 max, 68314 tmp, 7.1553 GB/s up, 8.88319 GB/s down
Phase 3 took 326.208 sec
[P4] Setup took 0.248 sec
[P4] total_p7_parks = 2095179
[P4] total_c3_parks = 429092, 2384 / 2464 ANS bytes
Phase 4 took 4.929 sec, 6.48608 GB/s up, 3.70936 GB/s down
Total plot creation time was 421.215 sec (7.02026 min)
Crafting plot 3 out of 3 (2023/02/12 23:00:00)
Process ID: 8632
Pool Puzzle Hash:
Farmer Public Key:
Working Directory:   D:\
Working Directory 2: @RAM
Compression Level: C8 (xbits = 8, final table = 4)
Plot Name: plot-k32-c8-2023-02-12-23-00-db0382245f2d7160b2cba4f259d5ade2acf97b080211afce55be4da76308c898
[P1] Setup took 1.03 sec
[P1] Table 1 took 3.366 sec, 4294967296 entries, 16790115 max, 66553 tmp, 0 GB/s up, 10.1011 GB/s down
Flushing to disk took 10.838 sec
Started copy to G:\plot-k32-c8-2023-02-12-22-52-c849f01c723e703d8c29e83ae574794cf6d801a7308f412cffd108ca0d4da9bf.plot
[P1] Table 2 took 10.272 sec, 4294784120 entries, 16792635 max, 66724 tmp, 3.11526 GB/s up, 4.96498 GB/s down
[P1] Table 3 took 12.362 sec, 4294510934 entries, 16787890 max, 66624 tmp, 3.8827 GB/s up, 6.87593 GB/s down
[P1] Table 4 took 12.991 sec, 4293918503 entries, 16784589 max, 66665 tmp, 6.15746 GB/s up, 9.16021 GB/s down
[P1] Table 5 took 14.364 sec, 4292840005 entries, 16780199 max, 66557 tmp, 5.56812 GB/s up, 7.1011 GB/s down
[P1] Table 6 took 13.741 sec, 4290613649 entries, 16773549 max, 66637 tmp, 4.65529 GB/s up, 6.18588 GB/s down
[P1] Table 7 took 10.587 sec, 4286205438 entries, 16754066 max, 66724 tmp, 4.52927 GB/s up, 4.41582 GB/s down
Phase 1 took 79.102 sec
[P2] Setup took 0.212 sec
[P2] Table 7 took 2.818 sec, 11.3324 GB/s up, 0.18852 GB/s down
[P2] Table 6 took 2.828 sec, 11.3039 GB/s up, 0.187854 GB/s down
[P2] Table 5 took 3.35 sec, 9.54751 GB/s up, 0.158582 GB/s down
Phase 2 took 9.346 sec
[P3] Setup took 0.715 sec
[P3] Table 4 LPSK took 5.178 sec, 3464487354 entries, 15581542 max, 61601 tmp, 6.28108 GB/s up, 9.84941 GB/s down
[P3] Table 4 NSK took 6.244 sec, 3464487354 entries, 13545274 max, 61601 tmp, 6.20094 GB/s up, 9.53478 GB/s down
[P3] Table 5 PDSK took 4.986 sec, 3530137588 entries, 13806991 max, 54762 tmp, 6.52134 GB/s up, 9.3763 GB/s down
[P3] Table 5 LPSK took 6.512 sec, 3530137588 entries, 14248238 max, 57195 tmp, 9.51737 GB/s up, 7.83173 GB/s down
[P3] Table 5 NSK took 257.581 sec, 3530137588 entries, 13805607 max, 56766 tmp, 0.153165 GB/s up, 0.231132 GB/s down
Renamed final plot to G:\plot-k32-c8-2023-02-12-22-52-c849f01c723e703d8c29e83ae574794cf6d801a7308f412cffd108ca0d4da9bf.plot
[P3] Table 6 PDSK took 5.389 sec, 3708727604 entries, 14511346 max, 57602 tmp, 6.03058 GB/s up, 8.67512 GB/s down
[P3] Table 6 LPSK took 6.409 sec, 3708727604 entries, 15088603 max, 60415 tmp, 10.0321 GB/s up, 7.9576 GB/s down
[P3] Table 6 NSK took 6.154 sec, 3708727604 entries, 14500291 max, 60061 tmp, 6.73517 GB/s up, 9.67422 GB/s down
[P3] Table 7 PDSK took 5.567 sec, 4286205438 entries, 16771626 max, 66724 tmp, 7.88759 GB/s up, 8.39774 GB/s down
[P3] Table 7 LPSK took 7.035 sec, 4286205438 entries, 17200968 max, 68896 tmp, 10.1695 GB/s up, 7.2495 GB/s down
[P3] Table 7 NSK took 6.615 sec, 4286205438 entries, 16754066 max, 68155 tmp, 7.24143 GB/s up, 9.00002 GB/s down
Phase 3 took 318.673 sec
[P4] Setup took 0.249 sec
[P4] total_p7_parks = 2092874
[P4] total_c3_parks = 428620, 2386 / 2458 ANS bytes
Phase 4 took 5.066 sec, 6.30373 GB/s up, 3.60905 GB/s down
Total plot creation time was 412.259 sec (6.87098 min)
Flushing to disk took 11.481 sec
Started copy to F:\plot-k32-c8-2023-02-12-23-00-db0382245f2d7160b2cba4f259d5ade2acf97b080211afce55be4da76308c898.plot
Renamed final plot to F:\plot-k32-c8-2023-02-12-23-00-db0382245f2d7160b2cba4f259d5ade2acf97b080211afce55be4da76308c898.plot