The Journey to One Petabyte

Hey everybody, it’s been a while since I posted an update for my DIY JBOD setup. I last mentioned in that thread that I had come across a great deal on some Dell SC200 SAS enclosures and drives, so I was working on tearing down my DIY setup to move everything over. I wanted to detail my experiences here in a new thread in the hopes that it helps other farmers out there. Questions and feedback welcome!

My DIY setup was up to over 400tb and was made up of about 130 refurbished server SATA 4tb drives that I got for around $10/tb. It was still running great and I think I could continue scaling it, but the deal I found on the SAS enclosures changed my perspective and journey quite a bit.

I purchased 9 Dell SC200 enclosures, each full of twelve 4tb SAS drives, for a little over $3200 shipped. This basically doubled my entire capacity for around $7.50/tb. I had never played with SAS drives before, but I already had a Dell PowerEdge server that had external SAS ports, so I plugged it all up and they “just worked!” I had to flash my SAS HBA to IT mode so it would just pass all the drives through to the operating system, but other than that it was just like connecting a bunch of SATA drives.

Once I learned how easy and clean SAS is compared to my DIY SATA setup, there was no looking back. I still had over 130 4tb SATA drives, so I purchased 6 more Dell SC200s on eBay (you can get them empty for about $150 each, but don’t forget caddies). These enclosures work great for SAS or SATA and you can mix/match in the same enclosure. They also daisy chain to each other, so you have a single SAS cable coming out of the server going to the first enclosure. Very clean!

Now I had 15 of these 2U enclosures, so I needed a server rack! I found a nice one on Facebook Marketplace about an hour away for a couple hundred bucks and it even came with several useful rails.

My long-term goal is to replace the smaller 4tb drives with larger drives as the cost per TB comes down, of course. My next unexpected deal came soon after I had these all set up - I paid $12,000 total for 49 brand new 16tb Seagate drives along with 12 brand new 18tb Seagate drives. This turned out to be exactly 1,000 tb at a cost of $12/tb for brand new high-capacity drives. I couldn’t refuse! This deal was an in-person Bitcoin transaction - by far my largest to date. It went absolutely flawlessly, cost only a few bucks in fees and cemented crypto as the future of payment in my mind!
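For anyone who wants to check the cost-per-TB math on these two deals, here is a quick Python sketch. The drive counts and prices come from the posts above; the $3200 figure is rounded down from "a little over $3200", and `deal_summary` is just an illustrative helper name.

```python
def deal_summary(drive_groups, total_usd):
    """Return (total_tb, usd_per_tb) for a list of (count, size_tb) drive groups."""
    total_tb = sum(count * size_tb for count, size_tb in drive_groups)
    return total_tb, total_usd / total_tb


# 9 SC200 enclosures, each with twelve 4tb SAS drives (price is approximate)
sas_tb, sas_cost = deal_summary([(9 * 12, 4)], 3200)

# 49 x 16tb plus 12 x 18tb brand new Seagate drives
new_tb, new_cost = deal_summary([(49, 16), (12, 18)], 12000)

print(f"SC200 deal: {sas_tb} tb at ${sas_cost:.2f}/tb")
print(f"New drives: {new_tb} tb at ${new_cost:.2f}/tb")
```

The second deal works out to exactly 1,000 tb, which is where the $12/tb figure comes from.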

I began the slow process of replacing the 4tb drives with their larger brothers. It takes literally weeks to copy over 400tb of data one plot at a time, but because of the multi-channel features of SAS I was able to cut this down to a few days by running twelve file-copies at once. The server continued farming even during these massive file copies and suffered no negative performance impact - another testament to the performance of SAS!
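The tool used for these copies isn't stated, so the following is only a minimal Python sketch of the idea of running twelve copies at once. It assumes each job reads from one old drive and writes to one new drive so the copies don't fight over the same disk. The function names and the `.tmp` suffix are illustrative choices, not something from the original setup.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_plot(src: Path, dst_dir: Path) -> Path:
    """Copy one plot into dst_dir under a temp name, then rename it into place."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    tmp = dst_dir / (src.name + ".tmp")
    shutil.copyfile(src, tmp)
    final = dst_dir / src.name
    tmp.replace(final)  # the finished file only appears once the copy is complete
    return final


def copy_in_parallel(jobs, max_workers=12):
    """Run (src, dst_dir) copy jobs concurrently and return the new paths in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: copy_plot(*job), jobs))


# Example call with made-up paths, one job per old/new drive pair:
# copy_in_parallel([
#     (Path("D:/Disks/C00-00/plot-a.plot"), Path("D:/Disks/C09-00")),
#     (Path("D:/Disks/C00-01/plot-b.plot"), Path("D:/Disks/C09-01")),
# ])
```

Threads are enough here because the work is disk-bound rather than CPU-bound. Copying to a temporary name first means a harvester scanning the destination never sees a half-written plot.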

With all high-capacity drives installed along with 4tb drives filling out the rest of the enclosures, I now have about 1.5pb of space total! I suddenly realized that I had no plan on how to actually fill all this space - I’ve been plotting since April and I’ve only managed to fill a little over 400tb. At current speeds, it would take another year to fill all this space! I needed more plotting power…back to deal search!

I found an auction for what looked like an HP c7000 BladeSystem chassis. It was mislabeled and only had a single picture of the back of the chassis, so I assumed it was just the empty chassis. Still these can cost $1000 used for just the chassis, not including any actual blade servers. These chassis can hold 16 HP blade servers and they are compatible with several generations of HP blade servers, including Gen 9 and 10, the two latest generations. They have 10gb internal network connections and have all kinds of fun connectivity options using the 8 “interconnect” slots on the back that can be mapped to the blade servers in the front. These are basically “data centers in a box” - just slide your blade in and it’s connected and ready to go.

I won the auction for a little over $200. The freight actually cost more than the auction - this thing weighs in at almost 400 pounds empty. But I thought it’d be a great base from which to build my plotting empire.

I received the chassis and could not believe my luck - slotted into the front were SEVEN Gen 9 blade servers!!! Not only that, these things were almost maxed out. All of them have dual E5-2695 v3 14-core Xeons - that’s 28 cores each and 56 threads. Four of them had 384gb RAM and the other 3 had 256gb RAM. This setup originally cost over $100k easily just a few years ago…what an incredible deal! I thought seriously about just giving up on Chia and cashing out right then!! :laughing:

I swapped memory around so that 4 of the blades now have 448gb of RAM - perfect for Bladebit. These monsters pump out a plot every 15 minutes. I literally can’t write the plots to the disks fast enough over the network - they spend a lot of time waiting. While they wait, the CPUs mine Raptoreum - more on that later.

Two of the other blades are currently plotting Chives plots using Madmax and ramdisks. They can pump out the smaller plots in 10-30 minutes as well. Once I’m happy with my Chives plots I’ll put them back on Chia plotting.

The last blade is my farmer, currently running almost 20 forks. The big disk array is not currently directly connected to the blade chassis while I figure out SAS connectivity. My old PowerEdge is still running the 15 SC200’s as a pure harvester. Once I figure out SAS connectivity, the farmer will directly connect to the array.

Whew! Ok, time for the picture:

From bottom up:

  • The HP c7000 chassis with 7 BL460C G9 blade servers
  • PowerEdge R610 (1U) running the array above
  • 15 Dell SC200s (bottom few still need caddies)
  • Bonus shot of Eth mining rig in right background along with two laptops busy plotting as well

I’ve moved from the laundry room out to the garage after outgrowing the space and the electric outlet. I installed two 240v outlets below my breaker box, each on their own breaker. Total electric usage of just farming is a little under 1kw. Total with all blades maxed out plotting is closer to 3kw. I leave the door between garage and house open and let it heat the house during the cold right now - we haven’t turned on the heater yet! :slight_smile:

Any questions, just ask! Also looking for improvement suggestions or any experts on HP blade server hardware/software. I’ve been stumbling my way through it but still have lots of questions!!

14 Likes

Very great report…

To this I can only say, you are a real chia freak. :slight_smile: :wink:

1 Like

Current status: I’m trying to figure out how to get a blade server connected to the Dell enclosures using SAS.

What I’ve learned so far:

  • I purchased two HPE 6Gb SAS Switches and installed them into the interconnect bays 5 and 6 in the back. They are recognized by the Onboard Admin software but when I try to manage them through the UI I get “Not Authorized.” However, logging in and managing through SSH works fine, so I don’t think it is a permissions issue. I think the browser is not passing authentication correctly. Any ideas?
  • In my naivety, I assumed the built-in HBA controllers in the blade servers would then connect to the SAS switches in the back. This is not how it works - you actually need to purchase another HBA controller card that is physically installed into the blade server’s “mezzanine” slot. I have a used P741m on the way from ebay. Hopefully this will allow the farmer blade to connect directly to the Dell SAS enclosures.
  • This chassis came with two HPE Virtual Connect Flex-10 10Gb Ethernet Modules installed in the back. These are very cool - they let you map 10gb external ports to the blade servers. Each blade server has 2 internal 10gb connections that can be mapped. You can do all kinds of fun things with VNETs and more advanced stuff if you want. HOWEVER: this module is very old and is now officially unsupported. This wouldn’t be a huge deal cause it still works, but unfortunately the last update that HP released for it still requires Flash for the Web Management. It is very hard to get Flash nowadays - I had to install Windows in a VM and turn off auto-update so it wouldn’t kill Flash with a Windows update. The newer version of this module is the HPE Virtual Connect FlexFabric 10Gb/24-port Module. With this module, you can update to the latest firmware that doesn’t require Flash anymore. I’ve ordered a couple of these from ebay as well. Man I hate Flash - HP hates it so much they just said “nope we don’t even support that hardware any more, burn it with fire and buy the new version.” :rofl:
  • This little guy is a life-saver when you have no idea what you are doing. Plug it into the front of the blade server and you get KVM at least. Turning this whole thing on for the first time is confusing - like, where do I plug in the monitor and keyboard?? Now you know - you can also plug a USB stick into this and boot from it.
  • HP’s “iLO” (integrated lights-out) system is really very well done. It allows you to do remote KVM through a browser. I rarely have to actually go visit the server - I can do pretty much everything from my PC. I believe it is similar to Dell’s “iDRAC” system but I never actually got any of that working on my Dell servers. HP’s is very easy to use and setup.
2 Likes

Insane deals.
Would you mind sharing auction name and also where you could get that fantastic deal on hdds?

2 Likes

I won’t give specifics cause I am still on the hunt myself but I will give you hints! Look for public auctions in your local area or anywhere you are willing to drive. You’ll eventually find yourself on lots of email notification lists from different sites and every once in a while something really great comes along that nobody else notices! Ideally you’ll want to go and physically inspect stuff first cause you’ll rarely get good info from the site itself - it’s mostly just simple descriptions and grainy photos from people who don’t know what they are selling. Usually you can call ahead of the auction and ask to come by and take a look. From there you can start to branch out to more remote auctions when you are willing to take some more “sight unseen” risks, but watch for freight charges and shipping times, which are insane right now, at least in the US.

As for the drives, I actually got them from a fine fellow in this very forum!! Don’t underestimate the Buy/Sell/Trade forum and private message! :slight_smile:

3 Likes

Damn you got lucky! Quick question! I was looking at buying some of those Dell arrays. Do you have one of them filled with 14TB or larger drives yet? They don’t have any issues handling larger disks, do they?

2 Likes

No problems at all with larger disks. In fact I have one filled with 18tb SATA drives, others filled with 16tb SATA drives and I even have one that has a mix of 10/8/4tb SAS/SATA drives!

1 Like

Great! I wish you lots’o chia-luck.

I’m on a similar infrastructure journey. My rack is on wheels, and I haven’t decided between the garage or laundry room yet… :smirk:

I’m all in on Dell 12th-gen servers and recently added MD1200 PowerVaults. My farm is slightly north of 200 TiB. Running on Debian 11 Linux, with forks on VMs on ESXi.

I understand you flashed your controllers to HBA, so how did/do you organize your disks? :question:

I’m an old UNIX/FreeBSD fan; and very comfortable with ZFS and NFS. So currently I’ve pooled disks in groups of 3-4 with OpenZFS on Linux, just JBOD - no RAIDx, and sharing plots to forks with sharenfs through autofs (/net/ip.addr) on a dedicated 1Gb subnet w. MTU 1500. It works fine, my AVG-CHECK on forks is ~700ms.

I plot w. MadMax and distribute to pools with autofs on a 10Gb subnet w. MTU 9000.

I’m very impressed - and surprised - that you can farm on that stack of enclosures at 1kW(?) I’ve managed to get my MD1200s down to 60W each (without disks), and record ~10W per SAS disk. Currently I’m drawing ~700W when farming 210 TiB (no plotting).

Best - Kim Bjoern

1 Like

Nice! From my research, it looks like the MD1200 and the SC200 are basically the same hardware, just branded differently. They both work great as stand-alone enclosures but the SC200 is normally marketed along with Dell’s special storage array server, which is really just a PowerEdge with an HBA in it. See this thread for more info: Dell MD1200 vs SC200 : homelab (reddit.com)

My background is MS and Windows, so I’m running Windows Server 2019. I would love to learn more Linux but I ran into some roadblocks trying to get Ubuntu installed on the HP blade chassis - my Linux-fu is not good enough to understand how to diagnose driver issues and other weirdness. Windows Server “just worked” so I’m sticking with that for now. I just mount each drive as a folder so I end up with a “Disks” directory like this:

I name each disk as CXX-YY where XX is the number of the enclosure (00-15) and YY is the number of the disk bay (00-11). I’ve been running over a 1gb network, but I should be getting some 10gb equipment today so I’m excited to see if I can stay ahead of Bladebit with faster connectivity.
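The CXX-YY scheme is easy to generate in code. A minimal Python sketch, assuming zero-based enclosure and bay numbers as described; `disk_label` and `enclosure_labels` are hypothetical helper names, not part of any tool mentioned in this thread.

```python
def disk_label(enclosure, bay):
    """Build a CXX-YY label from an enclosure number and a disk bay number."""
    return f"C{enclosure:02d}-{bay:02d}"


def enclosure_labels(enclosure, bays=12):
    """Labels for every bay in one 12-bay enclosure, in bay order."""
    return [disk_label(enclosure, bay) for bay in range(bays)]


print(disk_label(3, 7))
print(enclosure_labels(0)[:3])
```

Zero-padding both numbers keeps the folders sorted in physical order, which makes it quick to walk from a folder name to the right bay in the rack.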

Oops, I should have specified that the whole stack of enclosures is not currently turned on! :laughing: The 1kW number is actually just 4 of the SC200s turned on and farming along with the HP blade chassis and blades at idle/farming - I don’t have the others turned on yet cause they are just empty disks and there is no need to power them until I’m filling/farming them. I expect I’ll be closer to 2kW with the whole stack up and running.

Good luck with your setup too! Sounds like we are definitely on similar paths…

Aaah! Now your power consumption makes sense :money_mouth_face: !

I decided to go with a single controller in each enclosure; it saves a bit of power. With your “massive” chain of enclosures, you may prioritize the redundancy of powering the second controller as well.

Be careful not to put your hopes too high on the 10Gb network; if you are transferring plots to single disks, you could be bottlenecked by single-disk write performance (maybe plotman is your friend?). This is why I’ve pooled disks in 2’s, 3’s or 4’s - accepting the risk of a single-disk failure taking out the pool. But hey! - it’s a write-once / read-many operation.

1 Like

Yes, I learned this early! I’ve created a little script with Power Automate Desktop that has some intelligence so that it only writes a single plot to a single disk at once. All plotters can then saturate the network cause they are all writing to different disks on the farmer.
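The actual Power Automate Desktop flow isn't shown here, so this is only a rough Python sketch of the same "one plot per disk at a time" idea. It assumes a lock file on each disk marks it as busy; the `.plot-copy.lock` name, the `claim_disk` and `release_disk` helpers, and the 109 GB plot size are all assumptions for illustration.

```python
import os
import shutil
from pathlib import Path

PLOT_BYTES = 109 * 10**9  # assumed size of one k32 plot, rounded up


def claim_disk(disks, needed=PLOT_BYTES):
    """Return (disk, lock_path) for the first idle disk with room, else None."""
    for disk in disks:
        disk = Path(disk)
        if shutil.disk_usage(disk).free < needed:
            continue  # not enough room for another plot
        lock = disk / ".plot-copy.lock"
        try:
            # O_EXCL makes creation fail if the lock file already exists
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # another plotter is already writing to this disk
        os.close(fd)
        return disk, lock
    return None


def release_disk(lock):
    """Remove the lock once the plot has been fully written."""
    Path(lock).unlink()
```

Each plotter would call `claim_disk` before a transfer and `release_disk` afterwards. One caveat: exclusive file creation may not be reliably atomic on every network share, so a single coordinator process on the farmer could be a safer place to run this logic.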

Today I’m getting a 10gbs SFP+ card for the harvester which should directly connect to the back of the HP blade chassis in the 10gbs SFP+ switch. Ideally, though, the SAS controller I’m getting next week should allow me to connect the blades directly to the SC200 array, allowing me to use the internal 10gbs ports between all the blades to communicate. Each blade actually has 2 internal 10gbs ports, so the blade servers have 20gbs connectivity between them built-in! :exploding_head: That’s why I want to get the farmer blade directly connected to the SC200 array. The plotter blades would then have 20gbs connectivity to the farmer blade and it would be pretty tough to saturate that - then I’ll be back to raw disk write speeds I think (~150-200 MB/s). I think that at that speed, disk writes should stay ahead of 15 minute Bladebit plots but we’ll see!
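A quick sanity check on that last guess, as a Python sketch. The plot size is an assumption (a k32 plot is about 101.4 GiB, or roughly 108.8 GB); the disk speeds and the 20gbs link are the numbers from the post above.

```python
PLOT_GB = 108.8  # assumed k32 plot size in decimal GB (about 101.4 GiB)


def minutes_to_write(plot_gb, mb_per_s):
    """Minutes needed to move one plot at a sustained rate in MB/s."""
    return plot_gb * 1000 / mb_per_s / 60


for speed in (150, 200):
    print(f"disk at {speed} MB/s: {minutes_to_write(PLOT_GB, speed):.1f} min per plot")

# 20 Gb/s of blade-to-blade bandwidth is 2500 MB/s before protocol overhead
print(f"20gbs link: {minutes_to_write(PLOT_GB, 2500) * 60:.0f} s per plot")
```

Under these assumptions a single disk takes roughly 9 to 12 minutes per plot, so one disk can just about keep up with one plotter finishing every 15 minutes. With four plotters running, each one needs its own destination disk, and the network itself is nowhere near the limit.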

1 Like

Awesome stuff as usual, thanks for sharing !

From your own estimate and chiapower.org numbers, you are at (2kW/1.5PB)/(0.88W/TB) = 1.51x the network’s average power per TB for your setup - without the plotters. Do you have any plans to try and bring this number down? Or what do you think could be done to bring it down?
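The same comparison as a small Python sketch, in case anyone wants to plug in their own numbers. The inputs are the estimates quoted above (2 kW, 1.5 PB, 0.88 W per TB); the variable names are only for illustration.

```python
farm_watts = 2000         # estimated draw for the whole stack, without plotters
farm_tb = 1500            # about 1.5 PB of raw space
network_w_per_tb = 0.88   # network average as quoted from chiapower.org

farm_w_per_tb = farm_watts / farm_tb
ratio = farm_w_per_tb / network_w_per_tb
target_watts = network_w_per_tb * farm_tb  # draw that would match the average

print(f"farm: {farm_w_per_tb:.2f} W/TB, ratio to network average: {ratio:.3f}")
print(f"matching the average would mean about {target_watts:.0f} W")
```

So on these estimates the farm sits about 1.5x above the average, and getting to parity would mean trimming the total to roughly 1.3 kW.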

I can’t seem to find such great deals in my own country. Do you think such good deals can be made outside of the US or is it US specific?

1 Like

Nice site, I hadn’t seen that before. I’m going to bookmark it and go back and read through it in more detail later. And yes! I definitely will be tuning the power usage. First, the 2kW is just an estimate. I haven’t powered on the whole stack of arrays yet, but maybe I’ll play with that this weekend. I have some nice 240v APC managed power strips that I need to get installed in the back of the rack so I can get all the real-time power info. I got two of these for $60 each at the local electronics recycling center. They had just come in as I was walking through and I snatched them up immediately!

I can definitely reduce power usage by turning off the redundant power supply/controller module in each disk enclosure as @kim.bjoern mentioned above. This makes the fans very angry and loud though, so it might be a wash unless I can figure out how to get them to calm down. I’ll report more as I find it, but that paper you linked gives me a good target for sure. Thanks again!

Well first of all I must acknowledge that I’ve been very lucky in finding these deals. I don’t know about other countries, but most large datacenters replace their equipment after 3-5 years for a few reasons: the biggest is warranty expiration of course, but the second is insurance. They can’t easily/affordably insure old, out-of-warranty equipment so they are taking a huge risk by keeping equipment running past its prime. So you need to find these datacenters and make contacts. Sometimes they’ll even pay YOU to come haul stuff off. The bigger ones will use auctions - as I said above, get on those mailing lists for auctions in your area so you’ll get notified. Also try to make contacts with local recycling centers, specifically electronics recycling. I know one guy that pays by the pound for pallets of discarded electronics, goes through and gets the good stuff, then sells the rest by the pound to the next recycler who is probably going to break down the chips and get the gold. Get friendly with these people and go visit often - bring them your old stuff and they’ll return the favor with great deals! You can’t get lucky without putting yourself in a position to get lucky. :sunglasses:

2 Likes

I wonder if it’s possible to rig a rack door with fans and simply turn off any internal fans. With good static-pressure optimized fans, it should be more than enough to cool the servers down. From my experiments, you can cool about 200 watts of asic miner power with one 120mm fan.

LOL please stop

Luck is the same for everyone! A more fair measure is counting the time spent and multiplying by your hourly rate. Then you can add a surcharge and I’ll buy from you LOL

1 Like

Ooo interesting idea, I hadn’t even thought of that. I do have an enclosed rack with doors that have vent holes. Gonna think on this one…

:laughing: If it makes you feel better, one of them has a few dead pixels on its display.

The problem is my time is priceless but my hourly rate is too low and both are being destroyed by runaway inflation anyway. I’ll take the luck when I can get it! :grinning_face_with_smiling_eyes:

1 Like

Great tip , I have one in my city, I must reach out to them.

1 Like

Got it installed and running - this is four Bladebit plotters pushing plots at once over the network to the harvester/disk array. As expected, the disk write speed is now the limiting factor! I’m getting about 12 plots an hour from them all now instead of 5-6! The other four gigabit adapters are also busy receiving 3 incoming plots from 3 other plotters.

1 Like

Quick update: ran plot production all night on the new 10Gbs network and woke up with ~120 more plots. I’m old enough to remember when 120 plots took me weeks. :exploding_head:

So some napkin math: let’s say 30tb a day and I have almost 1pb of free space left. Even at this rate, I still have a month left of solid plotting to fill all that space.
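The napkin math can be redone in a few lines of Python. The 12 plots an hour figure is from the earlier post about the 10gbs network; the plot size (about 108.8 GB for a k32 plot) is an assumption, and "almost 1pb" is rounded to 1000 TB.

```python
PLOT_TB = 0.1088        # assumed k32 plot size in TB (about 101.4 GiB)
plots_per_hour = 12     # combined rate reported for the four Bladebit plotters
free_tb = 1000          # "almost 1pb" of free space, rounded

tb_per_day = plots_per_hour * 24 * PLOT_TB
days_to_fill = free_tb / tb_per_day

print(f"{tb_per_day:.1f} TB/day, about {days_to_fill:.0f} days to fill {free_tb} TB")
```

That lands a little above 30 TB a day and a bit over a month of non-stop plotting, which lines up with the estimate above.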

I have 9 empty bays left in the blade chassis…hmmmmm… :smiling_imp:

2 Likes

Another metric that could be useful is how many Wh it currently takes to produce one plot.
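A very rough way to estimate that from numbers already in this thread, as a Python sketch. It assumes the plotting share of the load is the gap between "all blades maxed out" (about 3 kW) and "just farming" (a little under 1 kW), and that the rate is the 12 plots an hour reported above. That 3 kW also covers the blades doing Chives plotting and mining, so treat the result as an upper-end guess rather than a measurement.

```python
def wh_per_plot(plotting_watts, plots_per_hour):
    """Energy per plot in Wh: plotting power in watts divided by plots per hour."""
    return plotting_watts / plots_per_hour


# Assumed inputs taken from earlier posts, not measured per plotter
estimate = wh_per_plot(3000 - 1000, 12)
print(f"roughly {estimate:.0f} Wh per plot")
```

Real per-outlet readings from a managed power strip would give a much better number than this.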

1PB gives you about 8 chia / month, so that is an easy calculation.

I would check whether folks would be interested in your plotting service before going that route :slight_smile:

1 Like

You’ve had some amazing deals.
I work in and out of datacentres with this kit in the UK (HP c3000/c7000). Sorry you had to find out about the mezzanine card the hard way. And those HP network flex adapters are tricky little buggers to work out as well. Kudos.

I’ve got a c3k at home but after starting it up in my office my wife was like “nope” as the fan noise is mental, so I’m back to laptop/nuc/ryzen plotting as I don’t have a garage to hide this stuff away in.

I’m only sitting on 200tb SATA right now but also eyeing up the enterprise gear as a lot of the stuff at my work will be going EOL soon so I hope to buy some off them at a decent price.

How are your financials looking after your total spend? Are you hitting estimated chia wins or are you pooling for regular payments? I haven’t even started doing nft plots yet as I’m winning roughly once every 1-2 months solo.

2 Likes