Tips for Chia harvesting using S3-compatible APIs

I’m the CTO of an S3-compatible cloud storage company and we have had several customers reach out to us about using our service for chia harvesting. I’ve compiled some tips for them and felt like I’d share them here in case anyone else finds them useful.

  1. Use mounting software with a small overfetch (read-ahead) setting. Some tools fetch 5 or 10 MB at a time, and Chia doesn’t need anywhere near that much for the random seeks it does when solving challenges. Goofys seems to do pretty well with its defaults; others may be fine once configured.
  2. Don’t plot directly to cloud storage. This one will probably seem obvious to most people, but we have seen it. Plot locally.
  3. Beware when using ‘-d, --final_dir’. Chia writes the final file to the destination with a ‘.tmp’ suffix, then renames it to the final plot file name. Because S3 doesn’t support renames, this causes your plot to be uploaded twice. It will also rack up early-delete fees.
  4. Until it is released, cherry-pick in the parallel reads patch: Add parallel reads to GetFullProof by marcoabreu · Pull Request #239 · Chia-Network/chiapos · GitHub. This can yield a 10x speedup in solving challenges; in our testing, challenges complete in under 2 seconds with this patch.
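The kind of speedup the parallel-reads patch in tip 4 delivers is easy to reproduce outside chiapos: fetching a full proof involves a batch of random reads that are independent of one another, so they can be issued concurrently instead of one at a time. A minimal sketch of the idea (the 64-read count and the 30 ms per-read latency are illustrative assumptions, not chiapos internals):

```python
import time
from concurrent.futures import ThreadPoolExecutor

READ_LATENCY_S = 0.03   # assumed round-trip for one ranged GET against the mount
NUM_READS = 64          # illustrative number of independent reads per full proof

def ranged_read(offset):
    """Stand-in for one random read against a cloud-mounted plot."""
    time.sleep(READ_LATENCY_S)
    return offset

def full_proof_sequential():
    # One read at a time: total time is roughly NUM_READS * READ_LATENCY_S.
    return [ranged_read(i) for i in range(NUM_READS)]

def full_proof_parallel(workers=32):
    # Issue the reads concurrently: total time is roughly
    # ceil(NUM_READS / workers) * READ_LATENCY_S.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ranged_read, range(NUM_READS)))

if __name__ == "__main__":
    t0 = time.perf_counter(); full_proof_sequential(); seq = time.perf_counter() - t0
    t0 = time.perf_counter(); full_proof_parallel(); par = time.perf_counter() - t0
    print(f"sequential: {seq:.2f}s  parallel: {par:.2f}s")
```

With these numbers, the sequential version takes close to two seconds while the parallel one finishes in well under a tenth of that, which is the same order of improvement reported in this thread.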

We wrote up some of these tips in a blog post: Set Up Chia Storage for $4/TB - CrowdStorage. Let me know if anyone is interested in a step-by-step on getting the parallel-reads PR cherry-picked. We are also considering a patch to chiapos that adds a flag to help with tip 3, so you can plot to your tmp dirs and then have it stream the finished plot directly to cloud storage, without needing the extra disk space to finalize the plot locally.


Our plotting service supports CrowdStorage. We hit the same issue as in tip 3 for the first day (I was really surprised that “hot storage” S3 providers can’t do a copy-on-write rename, since the file doesn’t change at all and needs NO re-upload, nor does its checksum/hash change), but we fixed it.

You can now order plots from us directly, and we upload them to your CrowdStorage bucket as soon as they’re done, with no manual steps required at all.


Brilliant. This would help accelerate decentralized storage solutions and is so much more cost-effective than what datacenters can provide today. It’s a win-win: for example, I might want to offer my storage to others in the future, but I wouldn’t want to pay for the redundancy measures that centralized storage requires.

#4 is where I was looking to go with my thread here

I have this open PR for generating more detailed timing metrics in the plot check, which might have helped you: Add timing metrics to plot check by aarcro · Pull Request #5109 · Chia-Network/chia-blockchain · GitHub

Looks like this is merged into chiapos master, but they haven’t had a release since 1.0.2 two weeks ago. I hope it’s safe to run master :slight_smile: I’ll be trying that ASAP.

Yes, this looks useful. My understanding was that running time chia plots check -n 5 and dividing the elapsed time by 5 gives a good approximation of a full challenge’s solution time.

I found that most of the quality-check time drops to zero after the first few checks, I assume because everything lands in the disk cache and the right-hand branches through the tables have all been touched already. The same goes for the proof checks themselves. Since the proofs done by check are deterministic, the second pass of check -n5 runs at RAM speed, not network speed. You probably saw this in your testing as well.
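The caching effect described above is easy to model: once a block has been fetched, repeat lookups are served from cache and cost nothing over the network, which is why a second pass of the same deterministic check says little about cold-cache challenge latency. A toy sketch (the dict here stands in for the OS page cache; offsets and counts are illustrative):

```python
network_reads = 0
page_cache = {}

def read_block(offset):
    """Fetch a plot block, counting only cold (network) reads."""
    global network_reads
    if offset not in page_cache:
        network_reads += 1            # cold read: would go over the network
        page_cache[offset] = b"..."   # stand-in for the actual block data
    return page_cache[offset]

# The check's proofs are deterministic, so every pass touches the same offsets.
challenge_offsets = [0, 8, 16, 24]

for _ in range(2):                    # two passes of the same check
    for off in challenge_offsets:
        read_block(off)

print(network_reads)   # 4: the second pass is served entirely from cache
```

This is why clearing the filesystem caches between benchmark runs matters if you want numbers that reflect real cold-cache challenge latency.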

I wondered about getting some gains by prefetching a few quality checks when a new plot is discovered, to speed up the first time it gets hit by a real challenge.

Yes, you need to make sure you clear the fs caches between benchmarks. With the parallel PR there seemed to be ample headroom to get the challenges done in time (~2-3 seconds).


This is amazing. It just dropped my proof checks over the internet from a 26-second worst case to 4 seconds. It might be possible to harvest cloud plots from home with this change.


No need to patch anymore; 1.0.3 includes the relevant change, so just upgrade your version of chiapos.

So the patches suggested in this thread are all incorporated in chiapos 1.0.3? (And the next chia release will pick up the latency reduction on checking plots, I assume. Edit: I see chiapos has a verifier as well.)

What kind of hardware could be used to work with plots in a bucket? Could someone share a configuration, and how many plots one harvester can handle per bucket?

I’m going to start with 8 CPUs and 16 GB RAM for one harvester, on DO. Would that be enough for a bucket with 10 TB of plots?

I’ve been running 1.0.3 for a couple of days. Had to tweak chia-blockchain to allow the newer version. Working a treat though.


What command are you guys using to move files from local storage to polycloud? I tried cp, pp, and rsync, and they all fail partway through the transfer. It seems like the file is buffering in RAM, and the transfer crashes once RAM is full.

I tried rclone and goofys as in the blog, but they don’t work; there is an error at the end when trying to copy (cp, rsync, dd).

With the aws cli there’s no error, and it seems to work.
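That matches the failure mode described above: copying into a FUSE mount can end up buffering the whole file before the gateway flushes it, while the aws cli streams the file as fixed-size multipart chunks, so memory use stays flat regardless of file size. A stdlib-only sketch of the chunked-streaming idea (the upload_part helper here just hashes the chunk; a real client would PUT each part to the bucket, and the 8 MiB part size is an assumption):

```python
import hashlib
import io

CHUNK_SIZE = 8 * 1024 * 1024   # 8 MiB parts; real tools use a similar default

def upload_part(chunk):
    """Stand-in for uploading one part; a real client would PUT this chunk."""
    return hashlib.md5(chunk).hexdigest()

def stream_upload(fileobj, chunk_size=CHUNK_SIZE):
    """Read and ship the file one part at a time.

    Only one chunk is held in memory at any moment, so a 100 GB plot
    never needs more than ~chunk_size of RAM to upload.
    """
    part_etags = []
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        part_etags.append(upload_part(chunk))
    return part_etags

if __name__ == "__main__":
    fake_plot = io.BytesIO(b"x" * (20 * 1024 * 1024))   # 20 MiB stand-in "plot"
    parts = stream_upload(fake_plot)
    print(len(parts))   # 3 parts: 8 + 8 + 4 MiB
```

Tools like cp and rsync have no idea the destination is object storage, so they cannot do this; a multipart-aware client can.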

I tried the Set Up Chia Storage for $4/TB - CrowdStorage guide, but when I run the plot check:

2021-05-31T08:19:08.730 chia.plotting.plot_tools : ERROR Failed to open file /home/xxx/polycloud/plot-k32-2021-05-25-21-18-9ec3a1be3ff576d3e85cb737dff34239b5f9fc431570a8da0d8144790467773d.plot. badbit or failbit after reading size 104 at position 0
Traceback (most recent call last):
  File "/home/xxx/chia-blockchain/chia/plotting/plot_tools.py", line 189, in process_file
    prover = DiskProver(str(filename))
RuntimeError: badbit or failbit after reading size 104 at position 0

@jacobcs what could cause this problem?

You can try using the AWS CLI to do the uploads, then use goofys to mount the bucket as the plot directory for harvesting.

@Chida Check the size of the file: did the upload complete all the way?
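Beyond size, a checksum comparison catches truncated or corrupted uploads. For multipart uploads, S3-compatible stores typically report an ETag that is the MD5 of the concatenated per-part MD5 digests, with a part-count suffix, so you can recompute it locally and compare. A hedged sketch (the part size must match whatever the uploading tool used; 8 MiB is an assumption, and some providers may compute ETags differently):

```python
import hashlib

def multipart_etag(path, part_size=8 * 1024 * 1024):
    """Recompute the S3-style multipart ETag: md5(md5(part1)+md5(part2)+...)-N."""
    part_digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            part_digests.append(hashlib.md5(chunk).digest())
    if len(part_digests) == 1:
        # Single-part uploads usually get a plain MD5 with no suffix.
        return part_digests[0].hex()
    return hashlib.md5(b"".join(part_digests)).hexdigest() + f"-{len(part_digests)}"
```

Comparing this value against the ETag the provider reports for the object tells you whether the bytes that landed in the bucket match the local plot, which a matching byte count alone does not.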

Did any of you try S3 with the hpool miner?
Scan time seems to explode after about 20 minutes to always more than 15 seconds, until the hpool miner is restarted.

Would appreciate some tips.

@jacobcs the size is the same at the origin and on polycloud (screenshots omitted). They were copied with the aws cli. Same size, but one works and one doesn’t.

Yes, I tried transferring the plot using rsync/cp/mv and even set up an HTTP server so I could wget it into the wasabi bucket; it doesn’t work at all and fails at around 30%.

It’s weird: if I download a plot (using wget) from a third-party plotter into wasabi via a DO droplet, it works. But even when I limit the speed of my local-to-wasabi transfer to 15 MB/s, it still fails.