I’m the CTO of an S3-compatible cloud storage company, and several customers have reached out to us about using our service for Chia harvesting. I’ve compiled some tips for them and felt I’d share them here in case anyone else finds them useful.
Use software to mount the cloud storage that has a small overfetch (read-ahead) setting. Some tools fetch 5 or 10 MB at a time, and Chia doesn’t need anywhere near that much for the random seeks it does when solving challenges. Goofys seems to do pretty well with its defaults; others might be fine once configured.
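To see why the read-ahead size matters, here is a rough back-of-the-envelope sketch. The seek count and useful-bytes figures below are illustrative assumptions, not measured Chia internals; the point is that a mount fetching megabytes per seek downloads far more than a quality check actually uses.

```python
# Rough estimate of bytes pulled from the bucket per quality check for
# different read-ahead (overfetch) sizes. The constants are illustrative
# assumptions, not measured Chia internals.
SEEKS_PER_QUALITY_CHECK = 7        # assumption: one small read per plot table
USEFUL_BYTES_PER_SEEK = 16 * 1024  # assumption: ~16 KiB actually needed

def bytes_fetched(readahead_bytes: int, seeks: int = SEEKS_PER_QUALITY_CHECK) -> int:
    """Total bytes downloaded for one quality check at a given read-ahead."""
    per_seek = max(readahead_bytes, USEFUL_BYTES_PER_SEEK)
    return per_seek * seeks

for label, ra in [("128 KiB", 128 * 1024), ("1 MiB", 1 << 20), ("10 MiB", 10 << 20)]:
    total = bytes_fetched(ra)
    waste = total / bytes_fetched(USEFUL_BYTES_PER_SEEK)
    print(f"{label:>8} read-ahead: {total / 2**20:.1f} MiB fetched "
          f"({waste:.0f}x the useful data)")
```

With a 10 MB read-ahead, every challenge pulls tens of megabytes over the wire for a few kilobytes of useful data, which is why the small-overfetch setting matters so much for latency.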
Don’t plot directly to cloud storage. This will probably seem obvious to most people, but we have seen it done. Plot locally.
Beware when using ‘-d, --final_dir’. Chia writes out the final file with a ‘.tmp’ suffix in the destination, then renames it to the final plot filename. Because S3 doesn’t support renames, your plot gets uploaded twice. It will also rack up early-delete fees.
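Since S3 has no rename, the ‘.tmp’ to final-name move becomes either a full second upload or, at best, a server-side copy plus a delete of the ‘.tmp’ object, which can still trigger a minimum-storage-duration charge. A hypothetical cost sketch (the fee numbers are placeholders for illustration, not any provider’s actual pricing):

```python
# Sketch of the extra cost when chia writes plot.tmp to the bucket and then
# "renames" it on an S3-style store with no real rename. All prices here are
# placeholder assumptions, not any provider's actual rates.
PLOT_SIZE_GB = 101.4         # approximate size of a k=32 plot
EARLY_DELETE_PER_GB = 0.005  # placeholder minimum-storage-duration fee

def rename_penalty_gb(plot_gb: float, server_side_copy: bool) -> float:
    """Extra GB transferred because of the .tmp -> final rename.

    With a server-side copy the data never leaves the provider; without
    one, the whole plot crosses the wire a second time."""
    return 0.0 if server_side_copy else plot_gb

extra_upload = rename_penalty_gb(PLOT_SIZE_GB, server_side_copy=False)
early_delete_fee = PLOT_SIZE_GB * EARLY_DELETE_PER_GB  # fee on the deleted .tmp
print(f"extra upload: {extra_upload:.1f} GB, "
      f"early-delete fee on .tmp: ${early_delete_fee:.2f}")
```

Either way you pay something for the rename; the safest route is to finish the plot locally and upload it once under its final name.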
We wrote up some of these tips in a blog post: Set Up Chia Storage for $4/TB - CrowdStorage. Let me know if anyone is interested in a step-by-step for getting the parallel PR cherry-picked. We are also considering a patch to chiapos adding a flag to help with tip 3, so you can plot to your tmpdirs and then have it stream the finished plot directly to cloud storage, without needing the extra disk space to finish the plot locally.
Our plotting service does support CrowdStorage. We had the same issue as in tip 3 for the first day (I was really surprised that “hot storage” S3 providers can’t do a copy-on-write rename, since the file doesn’t change at all and needs no re-upload, nor does the checksum/hash change), but we fixed it.
You can now order plots from us directly and we upload them to your CrowdStorage bucket as soon as they’re done, with no manual steps required at all.
Brilliant. This would help accelerate decentralized storage solutions and is so much more cost-effective than what datacenters can provide today. It’s a win-win: I, for example, might want to offer my storage to others in the future, but I wouldn’t want to pay for the redundancy measures that centralized storage requires.
Looks like this is merged to chiapos master, but they haven’t had a release since 1.0.2 two weeks ago. I hope it’s safe to run master; I’ll be trying that ASAP.
Yes, this looks useful. My understanding was that running time chia plots check -n 5 and dividing the time by 5 gives a good approximation of a full challenge’s solution time.
I found that most of the quality-check time drops to zero after the first few checks, I assume because everything lands in the disk cache and the right-hand branches through the tables have all been touched already. The same goes for the full proof checks. Since the proofs done by check are deterministic, the second pass of check -n5 runs at RAM speed, not network speed. You probably saw this in your testing as well.
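Because of that warm-up effect, dividing the total by n underestimates the real (cold) challenge latency. A small sketch of a fairer way to read the numbers, using hypothetical per-check timings rather than anything measured:

```python
# Estimate per-challenge latency from n plot checks, accounting for the
# cache warm-up effect: the first check pays the network cost, later
# ones mostly hit the page cache and run at RAM speed.
def split_cold_warm(check_times_s: list[float]) -> tuple[float, float]:
    """Return (cold_time, mean_warm_time) from per-check timings."""
    cold, *warm = check_times_s
    mean_warm = sum(warm) / len(warm) if warm else 0.0
    return cold, mean_warm

# Hypothetical timings (seconds) for n=5 checks over a cloud mount:
times = [12.0, 0.4, 0.3, 0.3, 0.3]
cold, warm = split_cold_warm(times)
naive = sum(times) / len(times)
print(f"naive avg: {naive:.2f}s, cold: {cold:.1f}s, warm avg: {warm:.2f}s")
```

The naive average looks comfortably under the deadline, while the cold number is the one a real first-time challenge will actually see, so it pays to clear the fs caches (or only trust the first check) when benchmarking.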
I wondered about getting some gains by prefetching a few quality checks when a new plot is discovered, to accelerate the first time it gets hit by a real challenge.
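A generic sketch of that warm-up idea, assuming nothing about chiapos internals: when a new plot appears, touch a few small regions of it so the pages a first real check would need are already in the cache. The offsets here are arbitrary placeholders, not the actual table positions a challenge would read.

```python
import os
import random

def prefetch_plot(path: str, reads: int = 16, read_size: int = 16 * 1024) -> int:
    """Warm the page cache for a newly discovered plot by reading a few
    small regions at pseudo-random offsets. The offsets are placeholders,
    not the table positions a real challenge would hit. Returns the total
    number of bytes read."""
    size = os.path.getsize(path)
    rng = random.Random(0)  # deterministic offsets for repeatability
    warmed = 0
    with open(path, "rb") as f:
        for _ in range(reads):
            f.seek(rng.randrange(max(size - read_size, 1)))
            warmed += len(f.read(read_size))
    return warmed
```

A real version would issue the actual quality checks (so exactly the right branches get touched), but even blind reads like this move the network cost off the critical path of the first challenge.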
Yes, you need to make sure you clear the fs caches between benchmarks. With the parallel PR there seemed to be ample headroom to get the challenges done in time (~2-3 seconds).
This is amazing. It just dropped my worst-case proof checks over the internet from 26 seconds to 4 seconds. It might be possible to harvest cloud plots from home with this change.
So the patches suggested in this thread are all incorporated in chiapos 1.0.3? (And in the next release of chia for the latency reduction when checking plots, I assume. Edit: I see chiapos has a verifier as well.)
What kind of hardware could be used to work with plots in a bucket? Could someone share a configuration, and how many plots can one harvester handle in one bucket?
I’m going to start with 8 CPUs and 16 GB RAM for one harvester on DO. Would that be enough for a bucket with 10 TB of plots?
What command are you all using to move files from local storage to polycloud? I tried cp, pp, and rsync, and they all fail partway through the transfer. It seems the file is buffered in RAM, and the transfer crashes once RAM is full.
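That failure pattern (RAM fills up, then the transfer dies) usually means the tool buffers the whole object before uploading. The fix is chunked/multipart streaming so only one chunk is in memory at a time. A minimal stdlib sketch of the idea; a real uploader would hand each chunk to an S3 multipart-upload API, and the local destination file here is just a stand-in for that:

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB in memory at a time, never the whole plot

def stream_copy(src: str, dst: str, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst one chunk at a time. With a real S3 client each
    chunk would become a multipart-upload part instead of a local write.
    Returns the number of bytes copied."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)  # stand-in for uploading one part
            copied += len(chunk)
    return copied
```

Tools that do multipart uploads natively (most dedicated S3 clients) behave like this and keep memory flat regardless of plot size, which is why they survive a 100 GB transfer where a naive buffering approach dies.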
2021-05-31T08:19:08.730 chia.plotting.plot_tools : ERROR Failed to open file /home/xxx/polycloud/plot-k32-2021-05-25-21-18-9ec3a1be3ff576d3e85cb737dff34239b5f9fc431570a8da0d8144790467773d.plot. badbit or failbit after reading size 104 at position 0
Traceback (most recent call last):
  File "/home/xxx/chia-blockchain/chia/plotting/plot_tools.py", line 189, in process_file
    prover = DiskProver(str(filename))
RuntimeError: badbit or failbit after reading size 104 at position 0
Did any of you try S3 with the hpool miner?
Scan time seems to explode after about 20 minutes, staying above 15 seconds until the hpool miner is restarted.
Yes, I tried to transfer the plot using rsync/cp/mv, and even set up an HTTP server to wget it into the Wasabi bucket; it doesn’t work at all and fails at around 30%.
The weird thing is that if I download the plot (using wget) from a third-party plotter into Wasabi via a DO droplet, it works. I even limited the speed of my transfer from local to Wasabi to 15 MB/s, and it still fails.