Recommendation for Linux filesystem on destination

So I have a mix of filesystems on my farm: NTFS, Btrfs, ext4.

Given that Chia is hardly a challenge in terms of performance (farming, not plotting), and space usage is high on the list, which is the “best” filesystem for the destination?

Out of the box, ext4 reserves about 5% of the disk as extra provisioning, which can be lowered to 0%. Btrfs uses more than ext4 at 0%, but NTFS uses a lot less than either ext4 or Btrfs - so much less, in fact, that my 2x3TB disks will fit an additional plot if I RAID two together, since each has 58GB spare with 27 plots.
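
For reference, dropping ext4's reserved blocks to zero is a one-liner (the device name here is a placeholder):

```
# lower the reserved-block percentage from the default 5% to 0%
sudo tune2fs -m 0 /dev/sdX1

# confirm the change
sudo tune2fs -l /dev/sdX1 | grep -i 'reserved block count'
```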

Does anyone have any other recommendations before I pair up my drives, copy the contents out, software RAID them, and format in NTFS before copying back in, so I can hold 55 plots per drive pair?

Is there any particular problem with plotting to NTFS over software RAID in Linux? I will at least get better file transfer in the last plotting phase. In fact, I always handle my drives in 4s, so a RAID of 4 will be even faster - SAS at 160MB/s each, so roughly 640MB/s - better than SSD as it is sustainable.
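
For context, the pairing I'm describing would look something like this - a rough sketch, with placeholder device names and mount point:

```
# stripe two 3TB disks and format the result as NTFS
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
sudo mkfs.ntfs -Q -L plots01 /dev/md0   # -Q = quick format
sudo mount /dev/md0 /mnt/plots01
```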

I understand that RAID 0 will threaten the whole set of disks should one fail - which is why I haven't done it already. I generally use mergerfs to spread the plots over groups of disks, but it will not split plots over a number of disks.

0% ext4, no-brainer.

As far as copy speed goes, a single drive takes about 15 min, a 4-drive RAID 0 about 4 min. But why not just round-robin the output across these 4 drives… Since your total throughput to the destination is fixed, RAID 0 gives you no advantage other than fewer filesystem mount points to deal with.
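
As a naive illustration of round-robin output (paths are made up):

```
# cycle finished plots over four mount points
dests=(/mnt/d1 /mnt/d2 /mnt/d3 /mnt/d4)
i=0
for plot in /tmp/plots/*.plot; do
  rsync --remove-source-files "$plot" "${dests[i % 4]}/" && i=$((i + 1))
done
```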

Or, if you have extra SAS/SATA ports, just RAID 0 some small HDD/SSD drives as a buffer and rsync from them to the farming drives.

I am using the default mergerfs epmfs policy, which sort of round-robins when the plotting is well staggered, as long as I start with an empty set of disks. But I find it frustrating that I waste 50GB of each 2.8TB - yes, I know it's not a lot, but it is still annoying. When I wasn't staggering accurately there were bottlenecks at disk writing, but not so much anymore, unless something has gone wrong and I have a backlog on SSD (much more often than I would like - using SWAR).
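
For anyone following along, a mergerfs pool with the epmfs create policy looks roughly like this (mount points are examples):

```
# pool four disks; epmfs = "existing path, most free space"
sudo mergerfs -o defaults,allow_other,category.create=epmfs \
     /mnt/disk1:/mnt/disk2:/mnt/disk3:/mnt/disk4 /mnt/farm
```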

My network is only gigabit and I don't have any slots free on my plotters, so it's not trivial to just add another connection. I tend to keep a cache of drives on each plotter which I drip-copy (rsync) to the farm with managed bandwidth use, and when they eventually fill (gigabit can't keep up) I use sneakernet to transfer the disks and put a new set of disks in the plotter (about every 3-4 days).
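
The managed-bandwidth drip copy is just rsync's built-in limiter; something like this (host, paths, and the cap are placeholders):

```
# --bwlimit is in KiB/s, so this caps the copy at roughly 80 MB/s
rsync -av --bwlimit=80000 --remove-source-files \
      /mnt/cache/*.plot farmer:/mnt/farm/disk12/
```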

RAID 0 with NTFS might give me one more plot per drive pair - ext4 at 0%, not quite.

Just fill the drives, then use a file-based filesystem: if you gather the ~50G of leftover space from multiple non-RAID 0 disks, you can have a filesystem mounted through loop devices which may hold a 101G plot or two. If one drive fails you just lose that drive plus part of a 101G plot, not the whole array.

Why bother with NTFS in Linux…

I mean: dd out the empty space in each ext4 volume into a file, then RAID 0 a filesystem based on those files spanned over several disks.
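
A minimal sketch of that idea, assuming two disks with ~50G of slack each (sizes, paths, and device names are made up):

```
# create an image file in the leftover space of each drive
dd if=/dev/zero of=/mnt/disk1/spare.img bs=1M count=51200
dd if=/dev/zero of=/mnt/disk2/spare.img bs=1M count=51200

# attach the image files to loop devices
sudo losetup /dev/loop1 /mnt/disk1/spare.img
sudo losetup /dev/loop2 /mnt/disk2/spare.img

# stripe the loops and put a filesystem on top for one more plot
sudo mdadm --create /dev/md9 --level=0 --raid-devices=2 /dev/loop1 /dev/loop2
sudo mkfs.ext4 -m 0 /dev/md9
sudo mount /dev/md9 /mnt/spareplot
```

If one of the underlying drives dies, you lose that drive's plots plus the striped plot, but the other drives' plots survive.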

Learned new word: sneakernet

So I forget about my 10Gbit net and sneakernet all the way between plotter and farmer. I have hot-swap SATA/SAS on both, so unlimited bandwidth, hah.

Since we’re storing only a few large files, you can also save a little space by reducing the inode count.
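
One way to do that at format time, sketched with a placeholder device - mke2fs ships a preset for exactly this case:

```
# -T largefile4 allocates one inode per 4 MiB instead of the default
# one per 16 KiB; -m 0 drops the reserved blocks while we're at it
sudo mkfs.ext4 -T largefile4 -m 0 /dev/sdX1
```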

You could use ZFS. Supposedly it helps drives survive bit errors long before they would otherwise fail. It can also improve write speeds with enough disks. But if you have 128 terabytes, you lose about 36 TB - probably 130-some fewer plots in total. You can also pool the drives together and move them to another machine; I think the order of plugging them back in might matter. You'll need spare HDDs to really benefit.
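
To put rough numbers on the parity cost, a raidz2 vdev gives up two disks' worth of capacity; a sketch with placeholder disk ids:

```
# 8-wide raidz2: any two disks can fail; 2/8 of raw capacity goes to parity
sudo zpool create farm raidz2 \
     /dev/disk/by-id/disk1 /dev/disk/by-id/disk2 \
     /dev/disk/by-id/disk3 /dev/disk/by-id/disk4 \
     /dev/disk/by-id/disk5 /dev/disk/by-id/disk6 \
     /dev/disk/by-id/disk7 /dev/disk/by-id/disk8
```

As for moving a pool between machines: zpool export on one box and zpool import on the other identifies member disks by their on-disk labels, so cabling order shouldn't actually matter.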

Redundancy for plot storage makes no sense.

Each TB of redundancy could be farming up until it fails.

With ZFS you can keep disks going even with bad sectors in the array. It's about keeping uptime while you replace the failed drive. You'll tend to lose about three disks around the same time. Though you'd have to keep the 8 disks down while resilvering, I believe. Probably only matters if you have petabytes on the farm.

I would rather have a filesystem that degrades, and does not collapse, when it succumbs to enough failures (hence I am avoiding many RAID setups). So something that prevents corruption of the metadata on the disk is useful, but not too many redundant disks. I MIGHT consider something simple like a single parity disk across a large number of data disks, calculated offline (say, SnapRAID), but committing too many “spares” to something that can be regenerated easily does not make sense.
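
To illustrate the offline-parity idea: SnapRAID only needs a config listing the disks (all paths here are examples), plus an occasional sync:

```
# /etc/snapraid.conf - one parity disk protecting many data disks
parity /mnt/parity1/snapraid.parity
content /var/snapraid.content
content /mnt/disk1/snapraid.content
data d1 /mnt/disk1
data d2 /mnt/disk2
data d3 /mnt/disk3
```

Then snapraid sync after adding plots, and snapraid scrub now and then to detect silent errors before a disk actually fails.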

I am more concerned with using my space optimally, since a plot is a significant fraction of each of my disks.

It is important to note that I have a larger number of smaller disks (3TB), so individual disk losses can be rebuilt easily - I just have to set something up to detect and replace them easily rather than destabilising the farm.

I think you did not get what we meant here. Just a single vdev, a single dataset, a single pool on one disk. If that disk fails, just pull it out… no redundancy needed.
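
In other words, one throwaway pool per disk - roughly this, with a placeholder disk id and pool name:

```
# one pool on one disk: checksumming without redundancy
sudo zpool create -o ashift=12 farm01 /dev/disk/by-id/ata-EXAMPLE
sudo zfs set atime=off farm01
```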

@XiMMiX summarized this really well and I thumbed it up.

Yes - that is what I meant - only I did not say it: at MOST a single offline parity, just to detect problems ahead of failure.

Then again, monitoring the eligibility timings performs that function… and more… so what the hell…

Well, no worries, it is a lottery ticket. Maybe the bit flip wins you the lottery.

As for read times: better to just modify the chia source and let it check in parallel. Or run many, many harvesters in Docker so each only takes care of one ZFS volume.

They are set up for low capital outlay (except power, of course) - right now stacked on a desk in 4s. To further the idea of low capital outlay, I think I'll knock up a box as a DIY enclosure - 32 disks a block, each block on a SAS expander - all e-waste except the cheap cables. I noticed one of the youtubers printing one, but I don't have a printer, so I guess it's plywood and glue… lol

Once I have crammed my disks full, that is…

You can order a 3D print online for cheap. The shipping may cost more than the print itself.

But plywood also sounds fun.

It is as much about entertainment as money.

After playing around with many filesystems in the past few days, I now use and recommend NTFS even if you're on Linux.

Here are the filesystems that I have tried:

  • XFS, because I heard it specializes in large files. However, it consumed about 0.7%, or 42 GB, of my 6 TB drive without any files on it.
  • Btrfs, because it's the favorite child in the Linux world, and I can use GNOME Disks to format my drive to it, which is not possible with ZFS. At first glance it consumed the least on my empty drive, only 3.8 MB without any files on it. However, this amount increased over time as I put more plots on it: my 108.8 GB plot takes 109.3 GB on a Btrfs drive.
  • Ext4, because why not? It's the default Linux filesystem, after all. Of course, the default reserved space can be recovered with sudo tune2fs -m 0 /dev/your_drive. After removing the reserved space, it consumed only 92 MB on my 6 TB drive. But still, my 6 TB drive can only fit 54 plots - and that's with the default inode setting.

In the end, I use NTFS, considering that I can just format it using GNOME Disks, put in 55 plots, and still have around 15 GB left on my 6 TB drive. I think it's too much of a hassle to use Ext4, as I would need to use the command line to format my drives in order to change the default inode setting; otherwise, I won't be able to put 55 plots on them.

In fact, NTFS is the filesystem my drives come with, so I don't even have to format them. NTFS also has better compatibility across OSes. I put my plots on it and can just forget it. Fewer problems is good.

I am farming on my Pi with 20 NTFS drives. My average lookup time is around 0.82 s, 1 s max.

Yes and no. If you initially format the drive on a Windows machine, it will set aside a small reserved partition alongside the data partition. If you then transfer that drive to an Ubuntu machine, it will complain about the metadata that Microsoft adds, but you can run a command that will then make it work. And if you then transfer that disk back to a Windows machine, it will complain that the metadata is missing, but again it can fix itself. So yes: if you format the drives on Windows, and can handle getting errors that are fixed without too much difficulty when moving them back and forth between Ubuntu and Windows, then it is indeed the most compatible solution between OSes.
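
I don't remember the exact command offhand; my guess is it's ntfsfix from the ntfs-3g package, which clears the volume's dirty state so Linux will mount it:

```
# assumption: ntfsfix is the unnamed fix-up command referred to above
# -d clears the dirty flag so ntfs-3g agrees to mount the volume
sudo ntfsfix -d /dev/sdX1   # placeholder device
```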

If you format them as NTFS on a Linux machine, though, that reserved partition never gets created, and there's really no way a Windows machine can read the drive (at least not without blowing away the existing plots/data). At least, I wasn't able to do it.

As to the initial question, I am personally using btrfs for my farm drives. I was initially using ext4 with 0% reserve, but found that I get just as much storage with btrfs without having to disable the reserve, datacow, or any of btrfs's other data-integrity features.
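
For reference, a plain btrfs farming disk with the integrity features left on can be set up like this (device, label, and mount point are placeholders):

```
# the defaults keep data checksumming (and typically DUP metadata) enabled
sudo mkfs.btrfs -L farm07 /dev/sdX1
sudo mount -o noatime /dev/sdX1 /mnt/farm07

# periodically verify checksums in the background
sudo btrfs scrub start /mnt/farm07
```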

I have neither of the problems you mentioned with NTFS. The only thing I noticed is that when I plugged the drive into Windows 10, it created an MS folder (I can't remember the name).

Nonetheless, I can swap my drives between Windows and Ubuntu without any errors.

Interesting. Hmm. Well, I'd test it again if I had any drives to spare that I could format, but I'm past that stage now - all plotting done and not buying any more drives atm.

At any rate, assuming for whatever reason one wants to stick with a Linux filesystem on their Linux machines, I still recommend btrfs over ext4, mainly because the ext4 driver you can install on a Windows machine hasn't been maintained in years, whereas the btrfs driver for Windows seems to work well (at least for reading, which is all you need for farming) and is currently maintained.

When I was debugging my USB drives farming on the Pi, I swapped the drives many times between my main machine (Windows) and the Pi (Ubuntu). I have never seen any of the issues you mentioned, even when the drives were formatted on Ubuntu. Windows can read the drives just fine, but it will create a folder that you can delete in Ubuntu later.

For me, I see no reason to use Btrfs or Ext4 for farming drives, as they have no benefit at all here. In my case, I would lose 32 plots across my 32 HDDs, plus another 15.6 GB per drive that I can instead use as storage for Chia's database, considering that I only have a 32 GB card in the Pi.

Of course, I lose some performance by using FUSE on Linux. However, I still get less than 1 s of lookup time, so the performance gain isn't worth the space I would lose using Btrfs or Ext4. And that space will only become more valuable over time.