Modern hard drive write speeds -- they slow down as they fill up?

codinghorror · May 2, 2021, 10:09pm

I’ll be honest – I haven’t looked at 3.5" spinny rust hard drives in a DECADE. So grabbing a bunch of 18tb drives in the name of Chia (more density = far better long term value in terms of power and hosting costs) and experimenting with them has been interesting. How fast can they write to fill them with plots? How fast can you read off them to transfer the files off to a different location?

One limit I’ve noticed is the maximum write speed I have seen anyone get when writing to an 18tb drive (any brand; I think I’ve tried WD and Seagate) is approximately 235mb/sec?

… is anyone doing better than 235mb/sec for write speed on any 3.5" spinny rust drives? And then, the kicker…

Write speeds may slow down as the drive fills! I’ve definitely observed that as the drive fills, the write speeds decline. The above examples are with the drives more or less empty. Once it gets full, I remember writes dipping down from 200mb/sec all the way to 130mb/sec.

This is kind of annoying, because per the copy time calculator applied to a single K32 101GB plot…

| copy speed | time |
|---|---|
| `235 MB/sec` | `7:20` |
| `200 MB/sec` | `8:37` |
| `175 MB/sec` | `9:50` |
| `160 MB/sec` | `10:46` |
| `150 MB/sec` | `11:29` |
| `140 MB/sec` | `12:18` |
| `130 MB/sec` | `13:15` |
| `120 MB/sec` | `14:21` |
| **`111 MB/sec`** | `15:31` |
| `100 MB/sec` | `17:14` |

(I bolded 111 because that’s max 1 gigabit ethernet copy speed – at that point you might as well do it over the network, there’s no more advantage to directly connecting the drive!)

As you can see, your copy time for plots can double as the drive fills – going from a breezy 7 minutes (440s) … to 15 minutes (931s)!
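The copy times in the table can be reproduced with a few lines of Python. This is only a sketch under two assumptions that are not stated in the post: that "101GB" means 101 GiB (101 × 1024 MiB), and that the calculator truncates to whole seconds, which is what matches the listed times. `copy_seconds` and `copy_time` are made-up helper names.

```python
PLOT_SIZE_MIB = 101 * 1024  # assumption: "101GB" means 101 GiB


def copy_seconds(speed_mb_per_sec: float, size_mib: float = PLOT_SIZE_MIB) -> int:
    """Whole seconds needed to copy size_mib at a sustained speed (truncated)."""
    return int(size_mib / speed_mb_per_sec)


def copy_time(speed_mb_per_sec: float) -> str:
    """Format the copy time as m:ss."""
    minutes, seconds = divmod(copy_seconds(speed_mb_per_sec), 60)
    return f"{minutes}:{seconds:02d}"


if __name__ == "__main__":
    for speed in (235, 200, 175, 160, 150, 140, 130, 120, 111, 100):
        print(f"{speed} MB/sec -> {copy_time(speed)}")
```

Plugging in your own sustained write speed for a nearly full drive gives a quick estimate of how long the final copy will block a plotting slot.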

But read speeds for 3.5" spinny rust hard drives shouldn’t change with available capacity, I don’t think?! Only write times?

eresende · May 2, 2021, 10:33pm

From my limited experience around spinny rust, the write speed slows as the drive fills because it starts from the edge of the platter and moves towards the center, and the edge passes under the head much faster than the center of the platter does.
I know that some applications use only 40% (from the edge) of the drives to maintain write speeds above a certain required threshold.
That is what I know from professional experience, but I will try to find more info on that.

Regarding copy of plots, I started with 1318s for a 208GB plot and last one took 1655s same size with 4TB of 8TB full.

EDIT: Found an old video from LTT that explains what I said above:

alphatio · May 2, 2021, 10:36pm

I can confirm the write time increases

from
average(586,567,572,589,592,585,577,579,594,575,563,589)
to
average(798,823,595,586,849,587,847,791,610)

That is for a K32 plot size. However, there is some variance due to CPU/RAM usage at the exact time of copying, which differs slightly in each parallel plotting sequence. It may need to be redone on a test bench with a single plot at a time.
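For reference, the two averages work out as follows. This is just arithmetic on the numbers posted; the unit is not stated and is assumed here to be seconds per plot copy, and the variable names are made up.

```python
from statistics import mean

# Copy times as posted; the unit is assumed to be seconds.
emptier = [586, 567, 572, 589, 592, 585, 577, 579, 594, 575, 563, 589]
fuller = [798, 823, 595, 586, 849, 587, 847, 791, 610]

avg_emptier = mean(emptier)
avg_fuller = mean(fuller)
slowdown = avg_fuller / avg_emptier - 1

print(f"emptier: {avg_emptier:.1f}  fuller: {avg_fuller:.1f}  slowdown: {slowdown:.0%}")
```

The second list mixes values near 590 with values near 800 or more, so the average hides quite a bit of spread, which fits the caveat about parallel plotting load.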

codinghorror · May 2, 2021, 10:53pm

Check this out. Simultaneous copy from very fast USB mounted SSDs, to two different identical model, identically USB mounted 18tb hard drives:

Check out how full each drive is:

So, the copy to drive G @ 170mb/sec, which is maybe 50% full, is faster than the copy to drive I @ 150mb/sec, which is closer to 75% full.

This is consistent with “the fuller the drive, the slower it gets”, and I find that as you get down to the last few hundred gigabytes you’re lucky to get 120mb/sec…

alphatio · May 2, 2021, 10:56pm

I can see the benefit of this for planning the plotting process.
Let’s say we have 100 empty HDDs. It may be best to plot bit by bit in all HDDs first before filling them up 100%. The plotted one can be mined earlier. Though not that significant in small scale.

eresende · May 2, 2021, 11:00pm

Or you can partition the drives into fast/slow sections and have the finalized plots copy to the fast partition and then a script that moves them to the slow partition in the background.
Not sure if that would help much tbh.
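A background mover like that could be sketched roughly as below. Everything here is an assumption for illustration: the function name, the idea of picking the oldest file first, and the `*.plot` filename pattern. The two directories would be whatever mount points the fast and slow partitions end up with.

```python
import shutil
from pathlib import Path
from typing import Optional


def move_oldest_plot(fast_dir: Path, slow_dir: Path) -> Optional[Path]:
    """Move the oldest *.plot file from fast_dir to slow_dir.

    Returns the new path, or None if there was nothing to move.
    """
    plots = sorted(fast_dir.glob("*.plot"), key=lambda p: p.stat().st_mtime)
    if not plots:
        return None
    target = slow_dir / plots[0].name
    # Across partitions this is a full copy followed by a delete.
    shutil.move(str(plots[0]), str(target))
    return target
```

Called in a loop or from a scheduled job, this would drain the fast partition one plot at a time. Since both partitions share the same heads, the background move competes with incoming writes, which may be why it would not help much.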

codinghorror · May 2, 2021, 11:10pm

It doesn’t matter that much to me, but you definitely can’t count on a 3.5" HDD to maintain 200+mb/sec when filling it with writes. So that’s something to factor into your calculations.

Just USB-connected a fresh 18tb drive, copying first set of 9 plots to it… and sure enough… 230mb/sec!

Moshensky · May 2, 2021, 11:34pm

Theoretically an HDD should speed up closer to the end, because it starts writing data from the outside to the inside of the platter. Not sure that Windows provides accurate data.

codinghorror · May 2, 2021, 11:38pm

It is definitely accurate. So I think it is writing in the other direction, starting on the outer edges. This makes sense, as you would want to generate the best possible first impression “wow look how fast it is when I install it and copy my first files on the drive!”

eresende · May 3, 2021, 12:11am

The disk spins at a constant speed, like 5400 RPM. On a platter, the tracks at the outer edge are longer than those near the center, so more bits pass under the head per revolution at the edge.
Check the video above.

casualChia · May 3, 2021, 12:37am

I would expect reads to be affected even more so. Write performance can be sheltered a little by the onboard disk cache (assuming write caching is enabled). For chia plotting reads I don’t believe the disk caching is useful as I believe the nature of the reads keeps the cache flushed. Otherwise, all the factors that make writing the inner tracks on the platter slower should equally apply to reads from the inner tracks.

Does anyone take advantage of setting the temp 2 destination dir equal to the final plot dir? If they are the same the final copy is skipped and it is simply a file rename. In this case building the final file occurs in the destination plot dir.

codinghorror · May 3, 2021, 12:43am

Doesn’t it detect this automatically, that a single path is in use and a rename can happen, or do you really have to duplicate the path commands at the command line?

casualChia · May 3, 2021, 2:19am

I thought if the tmp dir and dest dir were on the same drive/filesystem it would be smart enough to do the rename, but it doesn’t. It does the copy anyway. Then I found in the docs that if the tmp 2 dir is the same as the final dest, it does do the rename. This does work as advertised. Definitely a win if plotting on disk. I suspect it is still a win if the tmp dir is on SSD and the dest dir is on disk, but I haven’t done any benchmarking.
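As a sketch of what that looks like on the command line, here is a small helper that builds the argument list with the second temp dir pointed at the final dir. The helper name and the paths are hypothetical; `-k`, `-t`, `-2` and `-d` are, as far as I know, the `chia plots create` options for plot size, temp dir, temp dir 2 and final dir.

```python
from typing import List


def plot_command(tmp_dir: str, final_dir: str, k: int = 32) -> List[str]:
    """Build a `chia plots create` argv with -2 pointed at the final dir."""
    return [
        "chia", "plots", "create",
        "-k", str(k),
        "-t", tmp_dir,    # temp dir 1
        "-2", final_dir,  # temp dir 2 set to the final dir, so the last step is a rename
        "-d", final_dir,  # final plot dir
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(plot_command("/mnt/ssd/tmp", "/mnt/hdd1/plots")))
```

The trade-off is that the final file gets built on the destination drive itself, so that phase runs at HDD speed instead of SSD speed.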

codinghorror · May 3, 2021, 2:40am

Well dang, that’s … not great. I’ll add the params to all my plot commands!

JustinLloyd · May 3, 2021, 2:53am

All modern HDDs write from the outside to the inside, which naturally slows them down due to the areal density which is not uniform across the surface of the platter, nor across platters depending on where they are in the stack. Also, if you have fragmented files, they’ll get slower because the write heads have to move around for the varying fragments.

There is a way to defragment your drives on Linux (there might be software on Windows too) that can move specific files to the outer or inner edges of drives for either faster overall linear read/write speeds or lower latency between random blocks. I suspect some of us right now are reliving the halcyon days of watching Norton Defrag reshuffle files on their drives.

HDDs also keep a small amount of flash available that needs to be updated with allocations of blocks and bad blocks - some parts of that information will be stored on the magnetic media itself.

SMR (Shingled Magnetic Recording) drives are worse for sustained writes than CMR (Conventional Magnetic Recording) drives. CMR today means non-shingled PMR (Perpendicular Magnetic Recording), which replaced the older LMR (longitudinal magnetic recording). PMR achieves higher density than LMR, but at the cost of increased cross-talk between recorded data and deterioration of the BER (bit error rate), which is why you want to pay attention to whether a drive has MR (magnetoresistive) heads or GMR (giant magnetoresistive) heads, especially when it comes to drives pushing at the UHD-PMR boundaries above 18TB on glass platters.

SSDs also slow down as they fill up, but for different reasons.

If you want to speed up your HDDs and don’t want to fret about exotic solutions, there are simple measures you can take. If you are running Windows, PrimoCache and a chunk of RAM and a small SSD can significantly speed up HDD access for either reads or writes or both reads & writes. You can do the same with btrfs or fscache or bcache on Linux if you want simple solutions whereby you can turn off the tiered caching layers after you are done.

For my configuration: I have my plotter configured to plot to small, separate SSDs for the -t, a single larger SSD for the -2, and then a 10TB HDD with a 200GB L1 RAM cache and a 1TB L2 SSD cache that sits in front of the HDD for the -d. All eight of the -t SSDs have separate, dedicated 16GB write caches in front of them to smooth out those little blips we see every now and again. The -2 has its own dedicated 256GB write cache. On the farmer, this evening, I am configuring the 4TB DC SATA SSD to have a dedicated 1TB partition to act as an L2 cache that sits in front of the farm HDDs now that the farmer has been upgraded to 10Gbit.

I think the -t/-2/-d area in the Chia plotter codebase could use some smarts that wouldn’t be particularly difficult to implement.

codinghorror · May 4, 2021, 2:02am

Of course I say that and then today, observe even faster write speeds… this disk started empty, and I’m getting a consistent 265MB/sec write! This is with the Seagate Exos 18tb though, so perhaps those are a bit faster than the WD drives!

ryan · May 4, 2021, 2:19am

Wow. This is why I love the internet sometimes. I never really thought about that, but it totally makes sense. It’s also not shocking to see the manufacturers cherry picking the best case scenario for the spec sheet. Lol.

From Tom’s Hardware:

On the outside of a 3.5" platter, the track length is approximately ten inches, as opposed to 2.5" close to the spindle motor. At 7,200 RPM this results in an absolute velocity of ~67 MPH on the outside versus ~17 MPH on the inside of a platter. It is obvious why data transfer rates on the outside of a rotating disk are far higher than on the inside.

Assuming that’s accurate, I’m kind of surprised it’s not worse than double the write time once the disk is full.
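The quoted figures are easy to check, and the resulting 4:1 ratio can be set against what was actually measured earlier in the thread (about 235mb/sec empty, about 130mb/sec nearly full). This is a back-of-the-envelope sketch: the function name is made up, and it takes the quote's 10" and 2.5" track lengths at face value.

```python
INCHES_PER_MILE = 63_360


def track_speed_mph(track_length_in: float, rpm: float) -> float:
    """Linear speed of a track of the given length (circumference) under the head."""
    inches_per_hour = track_length_in * rpm * 60
    return inches_per_hour / INCHES_PER_MILE


outer = track_speed_mph(10.0, 7200)  # outermost track, ~10 in
inner = track_speed_mph(2.5, 7200)   # track close to the spindle, ~2.5 in
observed = 235 / 130                 # empty vs. nearly full write speeds reported above

print(f"outer {outer:.1f} MPH, inner {inner:.1f} MPH")
print(f"geometric ratio {outer / inner:.1f}x, observed ratio {observed:.1f}x")
```

The outer figure comes out at about 68 MPH, close to the quoted ~67. The observed slowdown of roughly 1.8x is well short of 4x; one possible reading is that the innermost data tracks on these drives are not as short as the 2.5" in the quote, but that is a guess rather than something the thread establishes.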

Harris · May 7, 2021, 10:38am

Are those WD drives originally externals? If so then the performance is intentionally crippled to ‘5400 class’ (as WD describes it) despite spinning at 7200 RPM.

The performance is limited by the firmware/electronics for market segmentation purposes and is one reason why they can sell externals for cheaper than the internal equivalents.

Source:

login-taken · May 7, 2021, 5:28pm

Not only. These often really are worse quality drives.

My WD Reds run way hotter than 7200rpm WD Golds, both helium filled. Vibrate like crazy too, not really suitable for larger disk shelves.

WD doesn’t even give RPM specs for anything Red label or lower for this reason: it’s one production model, later sorted into bins. I remember that a little over a decade ago there was still just a single model for each capacity/RPM combination. As the aughts came to an end, sub-par quality disks started to show up “for archiving”. Now it’s all a lower-quality process, followed by aggressive selection.

codinghorror · May 11, 2021, 8:39pm

Yeah I can say for sure these Seagate 18tbs are definitely faster than the Western Digital 18tb drives, by about 30mb/sec. 235 vs 265mb/sec.

I don’t know if they are faster when they are full, but they are definitely faster when empty…