How to fit as many plots as possible on a 16TB hdd?

Thomas_hunter · October 10, 2021, 9:54am

I’m using 16TB hard drives on Linux formatted to ext4 and each drive fits just 138 plots.

Is there a way to tweak the formatting of the drive or remove some overhead so more plots would fit on the hard drive? There’s always some competition for the fastest plotting time. Do people compete to fit more plots on a drive too?

I’d prefer to stay with k32. I think the MadMax plotter can’t even make other k sizes.

g0rbe · October 10, 2021, 9:59am

Check this: https://plot-plan.chia.foxypool.io/

slippers · October 10, 2021, 10:31am

Digital Spaceport (I think I’ve seen him on this forum) has a web page where he appears to get extra plots if the drive is formatted with the XFS filesystem and a partition format of GPT 4KiB-aligned. That would be for Linux users, of course.

Thomas_hunter · October 10, 2021, 7:44pm

Yes, it looks like the solution. XFS seems to have less overhead and leaves more usable space for files. Thanks.

But if I umount the drive and run
sudo mkfs.xfs -f /dev/sdap
I get mkfs.xfs: cannot open /dev/sdap: Device or resource busy

I can’t find a way to make it “not busy”, and I don’t want to restart the machine because I’ve got some VMs running on it.
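
One way to narrow down a “Device or resource busy” error is to check whether the disk, or one of its partitions, is still listed as mounted. The Python sketch below is only an illustration: the helper name `mounts_using` and the sample text are made up, it assumes the usual `/proc/mounts` layout (source, mount point, filesystem type, ...), and it does not detect other holders such as LVM, mdraid or swap.

```python
def mounts_using(device, mounts_text):
    """Return mount points whose source is `device` or one of its numbered partitions.

    `mounts_text` is expected in /proc/mounts format: "source mountpoint fstype ...".
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        source, mount_point = fields[0], fields[1]
        if not source.startswith(device):
            continue
        suffix = source[len(device):]
        # "" means the whole disk; digits mean a partition such as /dev/sdap1.
        if suffix == "" or suffix.isdigit():
            found.append(mount_point)
    return found


# Hypothetical sample; on a real system, read the text from /proc/mounts instead.
sample = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "/dev/sdap1 /mnt/plots16 ext4 rw 0 0\n"
)
print(mounts_using("/dev/sdap", sample))
```

If this returns anything for the disk being formatted, a partition on it is probably still mounted even though the disk itself looks free.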

juppin · October 10, 2021, 9:31pm

I use mkfs.ext4 -m 0 -T largefile4 /dev/sdX1 to get the most space usable for large chia plots…

slippers · October 10, 2021, 10:01pm

Me too, but if you look at that web page from Digital Spaceport, he seems to be able to fit 74 plots on 8TB instead of 73. I haven’t tried it myself, though.

Sorry, I don’t know the reason for the “busy” error, unless the drive wasn’t unmounted properly. Hopefully you’re not trying to format a drive that already has plots on it, though.

frandt · October 11, 2021, 12:16am

You must have another major issue. I am using 14TB HDDs on Windows and get 128 k32 plots per drive, and each drive only has 12.7TB free after formatting.

marshalleq · October 11, 2021, 12:29am

What a great question! I have always assumed that I would get less than most people as I use ZFS and assumed there would be more overhead.

So as a comparison, using ZFS on a 16TB drive with 1M block size I get 141 plots. I’d be interested in others reporting what they get on XFS, ReiserFS, NTFS, APFS etc if they wouldn’t mind.

marshalleq · October 11, 2021, 12:41am

Oh by the way, I just remembered - EXT4 has a reserved space option that defaults to 5%.

You can change it with a setting somewhere (I don’t recall it), but I suspect that’s what is eating up your space. Back when I ran EXT4 it was something I’d do regularly.

The reserved space was originally set as a percentage, back before drives got so big. Unless it’s been changed since, the percentage stayed the same, which leaves a ridiculous amount of unusable space on modern drives. In your case that would account for 800GB of space you can’t use, if I’m calculating correctly.
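
The arithmetic behind that figure can be checked with a few lines of Python. This is just a sketch: the function name `reserved_bytes` is made up, and it assumes the nominal decimal capacity of 16 TB, whereas the real reservation is taken from the formatted filesystem size, which is slightly smaller.

```python
def reserved_bytes(capacity_bytes, reserved_percent):
    """Bytes set aside by a percentage-based reservation (integer arithmetic)."""
    return capacity_bytes * reserved_percent // 100


capacity = 16 * 10**12  # assumption: nominal 16 TB in decimal units
reserved = reserved_bytes(capacity, 5)
print(reserved // 10**9, "GB reserved at 5%")  # 800 GB reserved at 5%
```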

I’d say that for non-changing data like plots you would probably be safe setting it to 0%.

A quick google…
https://docs.cloudera.com/cloudera-manager/7.4.2/managing-clusters/topics/cm-decrease-reserved-space.html

Hope that helps!

Marshalleq

Thomas_hunter · October 11, 2021, 2:09am

Yes, this seems to be the most efficient so far. On the 16TB drive there is only 1.93GiB used when the drive is empty. This fits 147 plots on a 16TB drive.

Ext4 with default settings only fits 138 plots.

If I format it to XFS with default settings there is 1.99GiB used. This fits 146 plots on a 16TB drive.
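
Because the plot count is a floor division of usable space by plot size, a small change in overhead can move the result by a whole plot. The Python sketch below illustrates this under loud assumptions: the function names are made up, the usable space is taken as a nominal 16 TB with no filesystem overhead, and the plot size of 108.8 GB is only a rough k32 figure (real plots vary). It is not meant to reproduce the exact counts reported above.

```python
def plots_that_fit(usable_bytes, plot_bytes):
    """Whole plots that fit into the usable space."""
    return usable_bytes // plot_bytes


def leftover_bytes(usable_bytes, plot_bytes):
    """Space left over after storing as many whole plots as possible."""
    return usable_bytes % plot_bytes


usable = 16 * 10**12         # assumption: nominal 16 TB, no filesystem overhead
plot_size = 108_800_000_000  # assumption: rough k32 plot size in bytes

print(plots_that_fit(usable, plot_size))           # 147
print(leftover_bytes(usable, plot_size) / 10**9)   # 6.4 (GB of headroom)
```

With these assumed numbers only a few GB of headroom remain after the last plot, so any extra overhead larger than that costs one plot.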

But Digital Spaceport seems to be able to fit 148 plots with “XFS filesystem and a Partition format of GPT 4KiB-aligned”.

How can I do an XFS partition format of GPT 4KiB-aligned?

DigitalSpaceport · October 11, 2021, 6:35am

As mentioned in the comments, this comes down to manufacturers not having a 100% common drive size, which is part of the 512e/4Kn thing, i.e. how each drive and manufacturer handles Advanced Format.

That number may change based on the manufacturer of the drive. What is the model number of the drive? It should be +/-1 from my figure, in an annoyingly close fashion, for 8 and 16TB models, but not 2 off.

Here is a nice writeup on the exact details of partition alignment, with various examples.

Check this excellent article for a deeper dive into the nuts and bolts of it. XFS also does a very good job out of the gate at self-aligning as a filesystem.

https://developer.ibm.com/tutorials/l-4kb-sector-disks/

Thomas_hunter · October 11, 2021, 7:26am

I’m using mostly Seagate ST16000NM001G-2K (SB30). Thanks will dive into those articles more.

gmit · October 11, 2021, 12:07pm

On 16TB I’m able to store 141 plots if formatted as btrfs or 147 plots if formatted as exfat.

DigitalSpaceport · October 11, 2021, 4:03pm

Have you tried btrfs filesystem resize max /MNTNAME to grow the btrfs filesystem to the full extent of the partition? Also, you should check the disk with fdisk and see whether there is unallocated space the partition could be resized into.

DigitalSpaceport · October 11, 2021, 4:08pm

In the Thomas-Krenn article, read the example that starts:

Proper Alignment Example using fdisk Versions 2.17.1 or later

Proper alignment can be achieved by deactivating DOS compatibility mode and setting the sector unit (the partition will start at the LBA Address 2,048. In the case of an SSD with a page size of four kilobytes, there will be 256 empty pages at the beginning of the disk. The partition will begin precisely at the start of Page 257).
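
The numbers in that quoted example can be verified directly. The Python sketch below assumes 512-byte logical sectors and 4096-byte physical blocks (pages); the function name `is_aligned` is made up for illustration, and sector 63 is included only as an example of an unaligned start.

```python
def is_aligned(start_lba, logical_sector_bytes=512, physical_block_bytes=4096):
    """True if a partition starting at `start_lba` begins on a physical block boundary."""
    return (start_lba * logical_sector_bytes) % physical_block_bytes == 0


start_lba = 2048
offset = start_lba * 512   # byte offset of the partition start

print(offset)              # 1048576 bytes, i.e. exactly 1 MiB
print(offset // 4096)      # 256 four-kilobyte pages before the partition
print(is_aligned(2048))    # True
print(is_aligned(63))      # False: 63 * 512 is not a multiple of 4096
```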

In general I use the command below (warning: risk of data loss, do not use it on drives with active data):

parted -a optimal /dev/sdXXX --script mklabel gpt

That one-liner is what I use to set up the disk with -a optimal (alignment) before formatting it.

gmit · October 11, 2021, 4:19pm

I’ve used my Synology to host the drives that need to be filled with plots. Partitions were max size and I disabled the btrfs redundancy data, although I suspect it wasn’t really disabled.

In the meantime, once I discovered that exfat can fit 6 more plots, I switched completely to exfat.

DigitalSpaceport · October 11, 2021, 4:22pm

Yeah, BTRFS is a dedicated topic in itself given all it can do. It uses a lot of metadata types and spans that are complex to understand and not really well documented to start with. Glad to hear you reclaimed that space!

Fuzeguy · October 11, 2021, 5:31pm

Win10 Pro w/NTFS, 16TB (nominal): 147 plots.

seymour.krelborn · October 11, 2021, 5:51pm

***** Please disregard.
I discovered that JBOD will trash 100% of your data, if a single drive fails.

==========

I believe that setting up a JBOD (just a bunch of disks) RAID would eke out every byte that can be squeezed onto a hard drive.
(I write “believe” because I never actually used JBOD)

The reason for my belief is that, if my understanding is correct, JBOD simply fills one disk, and when there is no more room, it continues to the next disk, etc. And the advantage of JBOD is that when a disk has, for example, only 10GB remaining, the plot file will write out to that available 10GB and finish writing the remainder of the plot on the next drive.

Of course, you could accomplish this with a RAID 0. But then if any drive fails, you lose 100% of your plots across all of the drives in that array.

If a JBOD array works the way I think it works, then if one of the drives fails, you will lose only the plots that are on that one drive, plus any partial plots that span that drive and an adjacent drive (one more plot if your first or last drive failed, and two more plots if any of the other drives failed). So you cut your losses significantly compared to a RAID 0.

RAID 5 is essentially RAID 0 striping plus one drive’s worth of parity (allowing the loss of any 1 drive with no data loss). If you lose a drive in a RAID 5, it effectively becomes a RAID 0 until you swap out the failed drive; your RAID controller will then automatically rebuild the array back to a RAID 5 (back to being able to absorb a single drive failure).

The problem with RAID 5, as it pertains to Chia, is that you basically lose one drive’s worth of data storage.

So if your RAID 5 has 10 drives in it, then you get the total storage capacity of 9 drives (the 10th drive’s worth of capacity goes to parity, which is your parachute when a drive fails). So you are paying a bit more per TB of storage space.

If you have 50 drives in a RAID 5, then the cost per TB is, as a percentage, not a big difference. But if you have 5 drives in a RAID 5, then you are paying a big hike per TB to run that RAID 5.
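
The capacity trade-off described above can be put into numbers. The Python sketch below assumes equal-sized drives and the standard RAID 5 rule that one drive’s worth of capacity is used for parity; the function name `raid5_usable_fraction` is made up for illustration.

```python
def raid5_usable_fraction(n_drives):
    """Fraction of raw capacity available for data in a RAID 5 of equal-sized drives."""
    if n_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (n_drives - 1) / n_drives


for n in (5, 10, 50):
    print(n, "drives:", f"{raid5_usable_fraction(n):.0%}", "usable")
# 5 drives: 80% usable
# 10 drives: 90% usable
# 50 drives: 98% usable
```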

This is why I believe that a JBOD RAID (if my understanding is correct) will give you the maximum available space, and the minimum amount of risk (in the event of a drive failure).

RAID 0 and RAID 5 are faster than JBOD (RAID 0 being the fastest). But for Chia, that speed difference should not matter.

If anyone can confirm what the impact of a lost disk, in a JBOD array is, I would appreciate hearing from them.

==========

***** Please disregard.
I discovered that JBOD will trash 100% of your data, if a single drive fails.

juppin · October 11, 2021, 6:06pm

JBOD = “Just a bunch of disks” means that every HDD in your JBOD shows up as a single, separate drive in your OS… Nothing more and nothing less.