Chia recompute proxy

winreboot · April 12, 2024, 3:51pm

Hi, can anyone help me with the command syntax to run Chia across two servers? Server A has NetApp storage attached that maxes out its GPU, and I need to delete some plots as I’m starting to get stale partials. Server B has a GPU and free room on a separate NetApp.
server A (192.168.3.2) GPU 4070 Super,
server B (192.168.3.3) GPU 3060 12GB, almost empty netapp

I hoped I could just run chia_recompute_proxy -n 192.168.3.2 -n 192.168.3.3 on both servers and it would load balance itself, so that any new plots on server B would be processed on the GPU in server B and not on server A. But I get errors:

Node 192.168.3.2 failed with: socket() failed with: Too many open files (24)
accept() failed with: Too many open files (24)
accept() failed with: Too many open files (24)
accept() failed with: Too many open files (24)

If there is another way or if I’m using the wrong command, please let me know. Any help is greatly appreciated. Thanks!
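
For anyone landing here with the same error: error 24 on Linux is EMFILE, meaning the process has hit its per-process limit on open file descriptors, and every socket counts against that limit. A quick way to see where things stand, as a sketch assuming a Linux host with /proc and that the proxy shows up in the process list under the name used in the command above (the count_open_fds helper is made up for illustration):

```shell
# Soft and hard limits on open file descriptors for this shell
# (processes started from it inherit them).
soft_limit=$(ulimit -S -n)
hard_limit=$(ulimit -H -n)
echo "open files: soft=$soft_limit hard=$hard_limit"

# Count the descriptors a process currently holds, via /proc (Linux only).
# Prints 0 if the process does not exist.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Find the oldest process whose command line mentions the proxy.
# -f matches the full command line, since the name is longer than 15 characters.
proxy_pid=$(pgrep -o -f chia_recompute_proxy 2>/dev/null || true)
if [ -n "$proxy_pid" ]; then
    echo "proxy pid $proxy_pid holds $(count_open_fds "$proxy_pid") descriptors"
fi
```

If the count keeps climbing toward the soft limit, raising the limit with ulimit -n will probably only delay the error. One unconfirmed guess is that a proxy listing its own address with -n ends up forwarding requests to itself, which would open connections without end.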

Jacek · April 12, 2024, 5:43pm

You can use recompute either directly or via the proxy (I have never used it via the proxy). Here is the doc: GitHub - madMAx43v3r/chia-gigahorse. Take a look at the 4th picture (load balanced across multiple machines). This may be an easier way to set it up. The only thing you need to do is export CHIAPOS_RECOMPUTE_HOST to the environment on both harvesters:

CHIAPOS_RECOMPUTE_HOST=192.168.3.2,192.168.3.3

and start recompute without any params (it will pick it up from that env variable).

CryptoVibe · April 14, 2024, 1:20am

This is how I got it working. I have three physical servers, each running Proxmox. Each server has a VM running Linux Desktop (FARMER-01, HARVESTER-02, HARVESTER-03). The Farmer VM is also running a harvester, as you may know. On each VM, I started the recompute server FIRST, like this (I will add this as a service that starts with the VM now that I know this is working):

If any Chia services were running, I stopped them: ./chia.bin stop all -d

I run this in a terminal window on each VM (it stays open in the terminal & keeps listening for remote compute requests, so you can’t enter another command in that terminal window while it is running):

./chia_recompute_server
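
Since the server holds on to the terminal, one stopgap until it is set up as a proper service is to start it in the background with nohup. A minimal sketch, assuming the binary sits in the current directory as in the command above; the function name and the log file name are made up for illustration:

```shell
# Start chia_recompute_server in the background so the terminal stays usable.
# Usage: start_recompute_server [directory containing the binary]
start_recompute_server() {
    dir=${1:-.}
    if [ ! -x "$dir/chia_recompute_server" ]; then
        echo "chia_recompute_server not found in $dir" >&2
        return 1
    fi
    # nohup keeps the server alive after the terminal or SSH session closes;
    # its output goes to a log file (the name is arbitrary).
    (cd "$dir" && nohup ./chia_recompute_server >recompute_server.log 2>&1 &)
    echo "started chia_recompute_server, logging to $dir/recompute_server.log"
}
```

Calling start_recompute_server with no argument launches it from the current directory, and tail -f recompute_server.log shows what it would otherwise have printed to the terminal.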

Then I SSH into each VM from my desktop & run this command on each of them:
export CHIAPOS_RECOMPUTE_HOST=10.1.1.1,10.1.1.2,10.1.1.3

For you, this is what you’d run:
export CHIAPOS_RECOMPUTE_HOST=192.168.3.2,192.168.3.3

Then I run either of these commands, depending on which machine I am starting Chia up on. I have my main remote farmer & two remote harvesters:

./chia.bin start farmer
./chia.bin start harvester
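
The order matters here: the variable has to be exported in the same shell before Chia is started, so that the harvester inherits it. A sketch of the last steps as one function, assuming it is run from the directory containing chia.bin and that the recompute servers are already listening; the function name and its role argument are hypothetical, and the addresses are the ones from the original question:

```shell
# Comma-separated list of every machine running chia_recompute_server.
RECOMPUTE_NODES="192.168.3.2,192.168.3.3"

# Usage: start_chia_role farmer|harvester
start_chia_role() {
    case "$1" in
        farmer|harvester) ;;
        *) echo "usage: start_chia_role farmer|harvester" >&2; return 2 ;;
    esac
    if [ ! -x ./chia.bin ]; then
        echo "chia.bin not found in $(pwd)" >&2
        return 1
    fi
    # Stop anything that is still running.
    ./chia.bin stop all -d
    # Export before starting, so the processes Chia launches inherit it.
    export CHIAPOS_RECOMPUTE_HOST="$RECOMPUTE_NODES"
    ./chia.bin start "$1"
}
```

On the farmer VM this would be called as start_chia_role farmer, and on the remote harvesters as start_chia_role harvester.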

CryptoVibe · April 14, 2024, 2:33am

I was having the same issue. I have three Dell R720XD servers all running Proxmox. There is the main FARMER-01 & two remote harvesters, HARVESTER-02 & HARVESTER-03. FARMER-01 has 12 20TB HDD’s & a NetApp DS4246 attached to it with 24 20TB HDD’s & an RTX 3070 Blower GPU, and it was getting latency between 11 - 20 seconds. The latency kept getting worse as I plotted more C30 plots to it. HARVESTER-02 & HARVESTER-03 only have 12 20TB HDD’s in them & were getting 2 - 4 second latency.

MadMax suggested Remote Compute & I got it working (like I shared above). Once I got Remote Compute working, FARMER-01 latency went down to 4 - 7 seconds & the harvesters were below a second to 2 seconds most of the time. So EVERYTHING improved in terms of latency once I got Remote Compute working. I did not have to delete any plots. FARMER-01 is maxed at 100% usage / 15,500 C30 plots.

I am planning to add a NetApp DS4246 with 24 20TB HDD’s to both of the harvesters & see how Remote Compute works. Each R720XD has an RTX 3070 Blower GPU in it. If the latency goes above 8 seconds once these other NetApps are added, I have a 4th R720XD that I will add to the Proxmox cluster & will only add 12 20TB HDD’s without a NetApp. BUT, there is an RTX 4070 blower GPU that is supposed to come out soon. Once that happens, I will upgrade to that & see if I can have 4 NetApps with 24 20TB HDD’s connected to all of these R720XD’s. I also have a NetApp DS4486 that I’m curious to see if Remote Compute could handle with these RTX 3070’s.

I hope what I shared about my Remote Compute setup helps you & anyone else who sees this post.