Hacker News

Wow. It's possible that you have nailed this.

Edit: here's why I like this theory. I don't believe that the two disks had similar levels of wear, because the primary server would get more writes than the standby, and we switched between the two so rarely. The idea that they would have failed within hours of each other because of wear doesn't seem plausible.

But the two servers were set up at the same time, and it's possible that the two SSDs had been manufactured around the same time (same make and model). The idea that they hit the 40,000 hour mark within a few hours of each other seems entirely plausible.
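To put rough numbers on that, here is a minimal sketch. It assumes both drives have been powered on continuously since first boot and that the fault triggers at exactly 40,000 power-on hours. The power-on timestamps are made up for illustration, not the real dates.

```python
from datetime import datetime, timedelta

THRESHOLD_HOURS = 40_000  # power-on-hours figure discussed in this thread


def threshold_time(first_power_on: datetime) -> datetime:
    """Moment a continuously powered drive reaches the threshold."""
    return first_power_on + timedelta(hours=THRESHOLD_HOURS)


# Hypothetical first power-on times, six hours apart (illustrative only).
primary_on = datetime(2018, 1, 1, 9, 0)
standby_on = datetime(2018, 1, 1, 15, 0)

# The gap between the two threshold moments equals the gap between power-ons,
# no matter how many writes each drive has absorbed in between.
gap = threshold_time(standby_on) - threshold_time(primary_on)

# 40,000 hours expressed in years of continuous uptime (about 4.56).
years = THRESHOLD_HOURS / (24 * 365.25)

print(f"Threshold reached after about {years:.2f} years of uptime")
print(f"Gap between the two drives hitting it: {gap}")
```

An hour counter only cares about uptime, so two drives first booted the same afternoon stay in lockstep regardless of how unevenly they were written to.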

Mike of M5 (mikiem in this thread) told us today that it "smelled like a timing issue" to him, and that is squarely in this territory.



This morning I googled for issues with the firmware and the model of SSD and got nothing. But now I am searching for "40000 hours SSD" and getting a million relevant results. Of course, why would I have searched for 40000 hours?

This thread is making me feel a lot less crazy.


I'm hoping that deep in your spam folder is a critical firmware update notice from Dell/EMC/HP/SanDisk from 2 years ago :).


There are times I don't miss dealing with random hardware mystery bullshit.

This one is just ... maddening.


This kind of thing is why I love Hacker News. Someone runs into a strange technical situation, and someone else happens to share their own obscure, related anecdote, which just happens to precisely solve the mystery. Really cool to see it benefit HN itself this time.


It's also an example of the dharma of /newest – the rising and falling away of stories that get no attention:

HPE releases urgent fix to stop enterprise SSDs conking out at 40K hours - https://news.ycombinator.com/item?id=22706968 - March 2020 (0 comments)

HPE SSD flaw will brick hardware after 40k hours - https://news.ycombinator.com/item?id=22697758 - March 2020 (0 comments)

Some HP Enterprise SSD will brick after 40000 hours without update - https://news.ycombinator.com/item?id=22697001 - March 2020 (1 comment)

HPE Warns of New Firmware Flaw That Bricks SSDs After 40k Hours of Use - https://news.ycombinator.com/item?id=22692611 - March 2020 (0 comments)

HPE Warns of New Bug That Kills SSD Drives After 40k Hours - https://news.ycombinator.com/item?id=22680420 - March 2020 (0 comments)

(there's also https://news.ycombinator.com/item?id=32035934, but that was submitted today)


Easy to imagine why this didn’t capture people’s attention in late March 2020…


Yes, an enterprisey firmware update - all very boring until BLAM!


Was HN an indirect casualty of Covid?


Interesting how something that is so specifically and unexpectedly devastating, yet known for such a long time without any serious public awareness from companies involved, is referred to as a "bug".

It makes you lose data and need to purchase new hardware. Where I come from, that's usually referred to as "planned" or "convenient" obsolescence.


The difference between planned and convenient seems to be intent. And in this context that difference very much matters. I wouldn’t conflate the two.


Depends on who exactly we are talking about as having the intent...

Both planned and convenient obsolescence are beneficial to device manufacturers. Without proper accountability for that, it only becomes a normal practice.


> Depends on who exactly we are talking about as having the intent...

The manufacturer, obviously. Who else would it be?

Could be an innocent mistake or a deliberate decision. Further action should be predicated on the root cause, which includes intent.


Popularity is a very poor relevance / truth heuristic.


I wanted to upvote this comment but that just feels wrong.


You're a good man, Charlie Brown.



