Efficient ways of handling a multi-petabyte farm of external drives

External drives appear to be the cheapest way of adding storage, but it is difficult to find a computer that can handle enough drives to make it worthwhile. I even added a PCIe USB card to one computer, and connecting USB hubs to it did not appear to raise my USB hard drive limit (some computers seem to be good for around 32 drives, others for around 64). Some serious insight appears to be needed to accomplish this. Maybe only certain specialized PCIe USB cards work.

Plan “B” (if you cannot nail down the technical USB limits) is what I am doing.

I have a second computer, and I have a mapped network drive pointing to the second computer’s file system. Mounted to that file system (similar to Linux-style mount points) are dozens of USB drives.

So computer #1 simply sees computer #2 as a drive letter, and the dozens of USB drives on #2 just sit there as mount points (as directories).
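To illustrate the layout described above, here is a minimal Python sketch that walks a root directory whose subdirectories are the drive mount points and reports how many plot files each one holds. The function name `summarize_mounts`, the `.plot` suffix check, and the example drive letter are my own assumptions for illustration, not something from the original setup.

```python
import os
import shutil

def summarize_mounts(root):
    """Return one record per subdirectory of `root`.

    Assumption: every subdirectory of `root` is the mount point of one
    USB drive, as in the mapped-drive layout described above.
    """
    report = []
    with os.scandir(root) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if not entry.is_dir():
                continue
            # Count files that look like plots (assumed ".plot" suffix).
            plots = [n for n in os.listdir(entry.path) if n.endswith(".plot")]
            # Note: over a network share this figure may describe the host
            # volume rather than the individual USB drive.
            free_gb = shutil.disk_usage(entry.path).free / 1e9
            report.append(
                {"mount": entry.name, "plots": len(plots), "free_gb": free_gb}
            )
    return report

# Hypothetical usage, with Z: standing in for the mapped drive letter:
#   for row in summarize_mounts("Z:\\"):
#       print(row["mount"], row["plots"], round(row["free_gb"], 1))
```

A mount point that has silently dropped off would most likely show up here as a directory with zero plots, which is easier to spot than a missing drive letter.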

When the time comes that I need yet more USB ports, I can add an inexpensive computer #3, etc.

By the way, I have had some USB headaches, too.
When one of my computers was down to its last available USB port, I added a Sabrent 16-port hub. It is a good design. But I frequently have issues copying complete plots to drives that are connected to the Sabrent hub. And then other USB drives, connected elsewhere, start misbehaving.

I think that the Sabrent hub is a bit buggy.

My work-around is to connect new drives directly to the computer, fill the drives, and only once they are full connect them to the Sabrent hub. Apparently, the hub works fine for the limited amount of data flow that Chia farming / harvesting performs. But when copying 100GB+ files to drives connected to that hub, I have headaches.
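For anyone who still has to copy large files through a flaky hub, one defensive pattern is to copy under a temporary name, check the result, and only then rename it. The sketch below is a minimal version of that idea; `copy_plot` is a hypothetical helper name, and the size comparison is only a basic sanity check that would not catch silent corruption.

```python
import os
import shutil

def copy_plot(src, dest_dir):
    """Copy `src` into `dest_dir` under a .tmp name, then rename it.

    The final file name only appears once the copy has finished and the
    byte counts match, so a harvester never sees a half-written plot.
    """
    final_path = os.path.join(dest_dir, os.path.basename(src))
    temp_path = final_path + ".tmp"
    try:
        shutil.copyfile(src, temp_path)
        if os.path.getsize(temp_path) != os.path.getsize(src):
            raise OSError("size mismatch after copying " + src)
    except OSError:
        # Clean up the partial copy before reporting the failure.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    os.replace(temp_path, final_path)
    return final_path
```

If the copy fails partway, the partial `.tmp` file is removed and the error is re-raised, so the drive is left as it was.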

And then there is the issue of where to plug in dozens of power bricks for all of the external drives.

These babies will double your available outlets in your power strips:
https://www.homedepot.com/p/0-5-ft-16-3-Extension-Cord-HDC201/303467490

And for outlets that are too close together for the power bricks (resulting in you having to skip an outlet), the above item will make every outlet usable (more than doubling your available outlets).

I went off topic a bit, because this was related to USB capacity, and my comments might help others.

I hope that you nail down your USB connection / capacity problem.

Short answer: shucking them into SATA drives.

Alternative answer: using cheap Pi + USB hubs.

I have seen users fighting with their Pi setups.

Others have gone with refurbished SFF office machines. Usually cheaper and a much more functional base station than Pi.

I’m not shilling for anyone, but the refurbished Dell Optiplex line is widely available, with many SFF (Small Form Factor) machines to be found cheaply at Amazon and elsewhere.

You are correct, you can also use NUC or cheap SFF machines.

Pi + USB is the way to go. Works like a charm. Lowest power consumption.

Yeah, if you only need like 10-12 drives total.

Not at all true. Here is an example with 32 HDDs on one RPi: https://github.com/Chia-Network/chia-blockchain/wiki/Reference-Farming-Hardware And there are ways to do even more.

I appreciate your link, but some wording seems a bit awry on that as one part claims 1 pi, then another sentence says 2 are needed for 32 hard drives. I already tried a relatively new Pi and I got about 17 drives on it before it wouldn’t read anymore. But, even then, I’m talking about multi-petabyte farms.

Right now, additional USB PCIe cards don’t seem to add more capacity to a standard computer (for reasons I don’t know, as it seems they should). Internal drives are often more expensive than external ones, so building a full SAS cabinet of sorts may defeat the purpose on value. I’ve got an old X79 motherboard handling 60+ drives, and considering the price and complication of using 4 Raspberry Pis to reach that, I think my current option is better.

If “there are ways to do even more” then please tell us, that’s the point of this thread.

You can use USB2 hubs on the Pi 3 and Pi 4 for a high drive count. You can also use USB-C hubs on the Pi 4 for more than 32 disks. You can use lots of Pis (they are cheap). You can buy the low-RAM versions, as harvesting doesn’t require much.

The point of external drives is to increase resale value, not to save cost. The SATA route is a lot more appropriate for scaling. Some enclosures allow for shucking without destroying the enclosure, so that gets you the best of both!

How can you have over 32 drives on a Pi (even with a USB-C hub) when the chip on the Pi literally can’t address that many devices?
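The addressing question can be made a bit more concrete with some arithmetic. Every hub is itself a USB device, so hubs eat into whatever device limit the host controller has. The sketch below assumes a single tier of hubs and takes the limit as a parameter: 127 is the per-bus ceiling from the USB specification, while the 32 and 64 figures are only what people in this thread have observed. The function names and the `slots_per_hub` parameter are hypothetical; setting it to 2 models a USB 3 hub, which enumerates as both a USB 2.0 hub and a SuperSpeed hub.

```python
import math

def slots_used(drives, ports_per_hub, slots_per_hub=1):
    """Device slots consumed by `drives` drives on single-tier hubs.

    Simplification: hubs are not cascaded, and the root ports they plug
    into are assumed to be available.
    """
    hubs = math.ceil(drives / ports_per_hub)
    return drives + hubs * slots_per_hub

def max_drives(slot_limit, ports_per_hub, slots_per_hub=1):
    """Largest drive count whose total slot usage fits in `slot_limit`."""
    drives = 0
    while slots_used(drives + 1, ports_per_hub, slots_per_hub) <= slot_limit:
        drives += 1
    return drives
```

Under these assumptions, a 32-device limit with 4-port hubs leaves room for 25 drives if each hub costs one slot and 20 drives if each hub costs two. Large hubs are often built from several smaller hub chips internally, which would lower the number further.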

You can with USB-C, I shared a link in the rpi discussion thread, go and have a look :slight_smile:

I wasn’t able to find the link you are referring to that confirms that.

All I gotta say on that is “good luck”. So much easier to go with the SFF PC. I did everything I could, including buying a 2nd Pi just to make sure my 1st wasn’t defective (different hubs, no hubs, USB 2 vs USB 3, etc.). I scrapped the Pi route and my SFF PC just works. Draws 35W if it’s running prime95. Good enough for me. :slight_smile:

also this

I agree that SFF/NUC are a lot easier

I’ve been having tons of issues adding hard drives to a single computer and maintaining them, but I think I’ve noticed 3 big reasons why this is the case:

  1. USB hub connectors are shorting out and frying the hubs.

  2. Hard drives moved from one hub to another might be given a different name by Linux and not show up in the Chia app until you re-add them manually (as they may be called “Elements 30” even though it was previously called “Elements 1”).

  3. It is frequently necessary to restart the whole chia app if it doesn’t see all the plots (assuming it knows all paths of the drives that may have been moved). Not acknowledging this can result in actions which cause both 1 and 2.
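For item 2, one possible approach (an assumption on my part, not something tested in this thread) is to identify drives by filesystem UUID instead of by the label-based name the desktop assigns when it automounts them. On most Linux systems udev exposes these as symlinks under `/dev/disk/by-uuid`. The sketch below lists them and builds `/etc/fstab`-style lines with fixed mount points; the function names and the `/mnt/plots/...` paths are hypothetical.

```python
import os

def stable_ids(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each filesystem UUID to the device node it currently points at.

    The device node (for example /dev/sdc1) can change when a drive moves
    to another hub; the UUID stays the same.
    """
    mapping = {}
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        mapping[name] = os.path.realpath(link)
    return mapping

def fstab_line(uuid, mount_point, fstype):
    """Build one fstab entry that mounts a drive by UUID.

    `nofail` lets the machine boot even if this drive is unplugged.
    The filesystem type must match how the drive is actually formatted.
    """
    return f"UUID={uuid} {mount_point} {fstype} defaults,nofail 0 2"

# Hypothetical usage:
#   for i, uuid in enumerate(stable_ids(), start=1):
#       print(fstab_line(uuid, f"/mnt/plots/drive{i:02d}", "ext4"))
```

With fixed mount points like these, the plot directories registered in the Chia app would stay valid no matter which hub or port a drive ends up on.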

When I’ve had various problems, I noticed that when I’d pull a USB hub’s connector out of one of the ports on the computer and reconnect it (usually to a different port), sometimes I’d get shocked by it and other times I’d see it arc against the case. I’m guessing that’s from the higher amperage used to transmit all the data from all the drives connected to a given hub. I think this resulted in the hubs eventually frying themselves, as they appear to have no built-in protections against it.

This may mean I’ve basically answered my own question: a few USB PCIe cards may work just fine.

From the Linus Tech Tips troubleshooting thread “How many USB 3.0 PCIe cards are too many?” I also found the “4 Port PCIe USB 3.0 Card w/ 4 Channels” card, which is four USB controllers in one. I haven’t had any problems with it, though it’s only been about a week since I discovered both of those issues and worked around them.

Decent hubs should have protection for that.

Can’t you mount them like folder mounts in Windows?
That would stop that issue, I’d think.

I’ve had a couple of what I thought were “decent” hubs that appeared to stop working properly. I may have misdiagnosed some of them due to the drives swapping names and me not realizing it but other times I could confirm drives didn’t show up when connected to certain ports. Or maybe I did it so many times that I overrode the protections.

I don’t know about mounting them like folders in Windows. I thought they were already properly mounted by themselves but, apparently not. I mean, some were literally “Drive 4359” and got changed to “Drive 43591”, arbitrarily by Linux, even though no drive with the original name was still connected to the computer.

Even then, the reason I had these issues in the first place was that the app kept not acknowledging all the freaking plots. I thought it was a hardware issue, which led me to disconnect the hubs, fry them, and have to connect the drives to new hubs.

I can only speak for Windows, where drives will mount themselves, but if you make each one a folder mount and delete its drive letter, they stay as fixed mount points, so the paths wouldn’t keep changing and causing issues.
