Has anyone here personally killed an NVMe? (NVMe Endurance)

Chia effect? Did you buy the two 980 pros at different times?

All within a couple weeks of each other.

The drive health percentage is a “black box” value calculated by the SSD itself, using internal telemetry data that is not necessarily exposed by SMART, and it is not exclusively derived from the “Total Host Writes”.

The “Total Host Writes” presented by utility apps (Samsung Magician, CrystalDiskInfo) is derived from the “Data Units Written” SMART value, which counts the data the host has written to the drive, or more specifically, to the controller. The NVMe spec reports this counter in thousands of 512-byte units, so one data unit is 512,000 bytes.
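To make the conversion concrete, here is a minimal sketch that turns a raw “Data Units Written” counter into terabytes. It relies on the NVMe definition of one data unit as 1,000 blocks of 512 bytes; the function name and the example counter value are made up for illustration.

```python
# One NVMe "data unit" is 1000 blocks of 512 bytes (SMART / Health log definition).
BYTES_PER_DATA_UNIT = 512 * 1000


def data_units_to_tb(data_units: int) -> float:
    """Convert a raw Data Units Written count to decimal terabytes (10**12 bytes)."""
    return data_units * BYTES_PER_DATA_UNIT / 1e12


# Hypothetical raw counter value, chosen only for illustration.
example_units = 2_246_093_750
print(f"{data_units_to_tb(example_units):.1f} TB written")  # prints "1150.0 TB written"
```

Utilities may display TB or TiB, so small differences between tools are expected.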

Due to Write Amplification, the actual amount of data physically written to NAND flash (“NAND writes”) is usually more than the “Data Units Written” by the host. This is why the drive health percentage is not exclusively attributable to the “Data Units Written” value, even though they are related. Overprovisioning flash is one way to mitigate the effects of Write Amplification, and this is probably another metric used to calculate the drive health percentage value.

AFAIK, Samsung NVMe consumer SSDs don’t expose the NAND writes value, but Intel DC drives do (use nvme-cli to get the nand_bytes_written). Divide this by the host writes to get the Write Amplification Factor (WAF); on my Intels it’s 1.2.
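The division described above can be sketched as follows. This assumes you have already converted both counters to the same unit (bytes here); the function name and the example numbers are hypothetical and are not output from any real drive.

```python
def write_amplification_factor(nand_bytes_written: int, host_bytes_written: int) -> float:
    """WAF = bytes physically written to NAND / bytes written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


# Hypothetical counters, both in bytes: 600 TB to NAND for 500 TB from the host.
host = 500 * 10**12
nand = 600 * 10**12
print(f"WAF = {write_amplification_factor(nand, host):.2f}")  # prints "WAF = 1.20"
```

A WAF of 1.0 would mean the drive wrote exactly what the host sent; anything above that is extra wear from garbage collection and similar internal work.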

Keeping the WAF low is crucial to ensure SSD longevity, and this can be achieved with the usual best practices such as ensuring TRIM/discard is operating correctly, and keeping lots of free space available on the drive (Mad Max’s “serial” plotting is preferable as it keeps SSDs less full compared to the official plotter doing multiple plots in parallel).

In addition to the above, factors such as operating temperature may also contribute towards the drive health percentage value, but as it’s a “black box”, we can only make guesses here.

In your case, the apparent inconsistency in SMART values may be due to different workloads, one drive being more full than the other for more of the time, disabled or inefficient TRIM, or maybe even different operating temperatures.

Further reading/watching:

  1. Write amplification - Wikipedia
  2. Endurance for Chia plotting - How SSDs work, endurance, WAF, and TRIM - YouTube

Intel Optane P4800X 375GB

Having one 980 Pro here. TBW according to datasheet: 600TBW.

Funnily enough, writes according to Magician are at 1150TB, but CrystalDiskInfo shows 19% life remaining. All other checks (SMART self-test, Magician test) also show that all is good.

Screenshots:
image


image


This will take a while, I guess:

image


What drive is that? 375GB, that’s an odd size.

Intel® Optane™ SSD DC D4800X (375 GB, 2,5 Inch U.2 NVME 3D XPoint™)

Really cheap, bought it for 349€ here in Germany

https://asaboshisystems.de/product/hpe-375gb-nvme-pcie-ssd-p02559-001/

It’s not only super durable but also really fast too.

But that isn’t really what we are trying to track here. That is a traditional SFF SSD with an NVME interface. It is capable of 2400 MB/s compared to the 7000 MB/s of something like a Samsung 980 Pro. It has high TBW because it is a standard SSD. This is what an NVME drive looks like (shown below, if you are unfamiliar). These are very high speed but do have a shorter life span. I just haven’t seen one die yet.


Why is a U.2 NVMe not interesting in this thread?
These U.2 NVMe devices are, on the hardware side, exactly the same as a PCIe or M.2 NVMe SSD!
They all talk NVMe over PCIe; only the physical connector differs!

:-1:


Because the discussion is supposed to be more about the drive, not the interface. It is the drive that dies, not the interface.

I guess I should have called the thread V-NAND SSD or something along that line. But I assumed we would all be on the same page.

Dude, you have no idea. Google a little bit. Then you might understand what the Intel Optane 4800 and 5800 are. And after you have done that, you will know that you wrote nonsense. These SSDs are the best you can get for Chia plotting, in both performance and durability.
If you just want to compare consumer M.2 SSDs, OK, but what’s the point if you can get something better that is even affordable?

And what did I write that was nonsense? The specs I quoted came right from the spec sheets. I know about the Optane drives. But just because some say they are the best, that doesn’t make the specs magically change. It is still a Small Form Factor SSD. It is still 2400 MB/s. Yes, it can write forever, but so can all other SSDs of that form factor.

So, what was nonsense?

Oh yeah, you are right. Good day.

Just say “low TBW consumer ssd”

I think that was your intention right? You were messing yourself up a bit there with the nvme/u.2/sff discussion as that is just the form factor or connection type and not really relevant to the TBW rating or internal components.


We all know enterprise SSDs last a long time and some are super fast as well; we also know that they cost an arm and a leg. The question here is whether anyone has actually killed their consumer SSD by plotting Chia.
Why the question:
They are supposed to be a “bad choice” for plotting because of their low TBW rating. But they are much cheaper, so many people still opt for them over enterprise SSDs. If the consumer models actually survive much longer than their TBW rating, the equation changes, which is why it is interesting to know how long they survive in real-world plotting conditions.

My two 1TB WD Black SN750s are at 0% and 1% health atm, but I’m done plotting now so I don’t think I will kill them. I will check the writes on them later; I guess they have written about 1000 plots each by now. The rated TBW is 600.


My Samsung 980 Plus 2TB has only reached about 25% of its warranted TBW and is generally working perfectly.
I have had two issues.
One, I have twice interrupted a plot and the files were left on the NVMe. Even with all Chia processes turned off and a reboot, I was not able to delete the files. Only formatting the drive solved the problem.
Two, more recently, and probably solved but I am not sure yet: two plots in a row choked partway through phase three (at a different point each time). The error said it failed in a write to the plotting NVMe and then failed many retries. The drive had lots of space available. I did all of the full checks using Samsung Magician and found no bad sectors or other errors.
We had no AC for three days when my recent failures occurred, and the drive was running closer to 62°C than its usual 52°C, so I am hoping this was the problem.
AC back on and I am plotting again. I’ll post if it fails again in frantic need of help, lol! :grinning:

I have two 1 TB 980 Pros, one at 165% and one at 145%, and they are still plotting as fast as on day one! I love them!
Temperature has been hovering around 60 for sensor 1 and 80 for sensor 2.


Since it’s at 0% health already, I’ve decided to try and kill it by plotting my last 28 TB on this drive (I don’t expect it will die, though).

I’m not completely clear on the exact meaning of the values and how they relate to write amplification and TBW, but as far as I can see this 600 TBW drive has now written over twice that amount and is still happily plotting away.
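For anyone wanting to compare their own numbers, here is a rough sketch of how host writes relate to the rated TBW. It uses the 600 TBW rating and the 1150 TB figure reported earlier in the thread purely as example inputs; the function names are made up, and the “naive” health value is only a linear guess, not what any drive firmware actually computes.

```python
def endurance_used_pct(host_tb_written: float, rated_tbw: float) -> float:
    """Share of the rated TBW consumed by host writes, in percent (can exceed 100)."""
    if rated_tbw <= 0:
        raise ValueError("rated_tbw must be positive")
    return 100.0 * host_tb_written / rated_tbw


def naive_health_pct(host_tb_written: float, rated_tbw: float) -> float:
    """Linear guess at remaining health from host writes alone (an assumption)."""
    return max(0.0, 100.0 - endurance_used_pct(host_tb_written, rated_tbw))


used = endurance_used_pct(1150, 600)
print(f"rated endurance used: {used:.1f}%")                        # prints 191.7%
print(f"naive health guess:   {naive_health_pct(1150, 600):.1f}%")  # prints 0.0%
```

If a tool still shows health above the naive guess, that fits the earlier point that the drive computes its health value from more than host writes alone. The NVMe “Percentage Used” field is also allowed to go above 100, which is how values like 145% or 165% can show up.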

So, from the lack of people actually coming forward, are we to conclude that really nobody here on this forum has actually managed to kill off an SSD?


Since the beginning of this thread the collected info certainly seems to indicate that most NVMEs will last far longer than their rated TBWs.

I think that some executives had a fear reaction when they heard about Chia plotting. Others then saw the marketing possibilities of selling their “high TBW” drives and pushed the fear further.

Sure, if you try to plot on some old, small SSD, it probably won’t last long, but a quality NVMe will usually last longer than its rated TBW and will only end up costing you pennies per plot.
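A back-of-the-envelope way to check the “pennies per plot” idea is sketched below. Every input is a placeholder: the drive price, the total TB the drive survives, and the TB written per plot are all assumptions you should replace with your own numbers, and the function name is made up.

```python
def cost_per_plot(drive_price: float, tb_before_failure: float, tb_written_per_plot: float) -> float:
    """Drive cost spread over the number of plots it can write before wearing out."""
    if tb_before_failure <= 0 or tb_written_per_plot <= 0:
        raise ValueError("tb_before_failure and tb_written_per_plot must be positive")
    plots = tb_before_failure / tb_written_per_plot
    return drive_price / plots


# Placeholder inputs: a 150 (any currency) drive that survives 1200 TB of writes,
# with an assumed 1.5 TB of temporary writes per plot.
print(f"cost per plot: {cost_per_plot(150, 1200, 1.5):.2f}")
```

The result is only as good as the inputs, and since nobody in the thread has seen a drive actually die, the true “TB before failure” is still unknown.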