Reinstall RAID to degraded


  • #16
    Originally posted by Marcel Birgelen
    Hard drives can get remarkably old depending on usage pattern; since they're hermetically sealed, they don't collect any dust or gunk on their vital moving parts like fans do. That being said, a drive that has been in constant use for 10 years is a pretty big liability...

    [Image: Hazard rate pattern for hard disk drives as a function of operation time]

    I think this graph captures the failure rate over time of your average hard drive pretty well, although the infant mortality shown here is somewhat exaggerated. Obviously, you need to compress the graph a bit for drives that are under heavy stress.
    That's not what I have seen at all in the SA/SX series GDC servers I put in. I have had many go over 7 years, and there are still quite a few original drives in service at 10+ years. Hitachi or HGST are the longest-lived of what they used. During the couple of years that GDC had to use Seagate, I have seen the most failures of those RAID drives, starting at about 5 years. Since GDC RAID drives fail very gently, I just wait until one fails to change the set out.



    • #17
      As you can see, there are no numbers on the Y axis. The graph is an approximation of the service life of your average hard disk, and I think it matches my experience over the years pretty well: there is definitely a spike in infant mortality for new disks. Those are the disks that probably suffered from some small hardware defect that plays out within a year or so. Disks that make it past that first year are almost never problematic for the next 3 to 4 years. After that, you clearly see a larger percentage of disks dying due to simple wear.

      Still, it doesn't mean a failure is guaranteed to happen in 10 years, but stuff starts to add up over the years. Mechanical parts start to wear out, grease starts to gunk up and seals start to get porous. Your mileage may vary between hard drive models.

      I've seen some hard drives in datacenters last far longer than 10 years in some odd boxes, but those drives have operated in optimal conditions: a room with almost constant temperature and optimal humidity, almost no dust, and the disks have probably never been powered down. Still, another disk in that same datacenter, being part of some heavily used RAID array, may not see its third year in service, simply due to the constant heavy-duty I/O on the disk.
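
      For readers without the original image, the idealized bathtub curve being described can be sketched as the sum of three parts: a decreasing infant-mortality hazard, a roughly constant random-failure rate, and an increasing wear-out hazard. The minimal Python sketch below only illustrates the shape; the exponents and coefficients are made-up values for illustration, not measured drive data.

      ```python
      import numpy as np
      import matplotlib.pyplot as plt

      # Bathtub hazard rate as the sum of three components.
      # All parameter values are illustrative guesses, not vendor data.
      years = np.linspace(0.01, 12, 500)

      infant_mortality = 0.08 * years ** (-0.7)    # decreasing, Weibull-style hazard
      random_failures = np.full_like(years, 0.02)  # roughly constant background rate
      wear_out = 0.002 * years ** 2.5              # rises and dominates late in life

      hazard = infant_mortality + random_failures + wear_out

      plt.plot(years, hazard)
      plt.xlabel("Years in service")
      plt.ylabel("Hazard rate (arbitrary units)")
      plt.title("Idealized bathtub curve for hard drive failures")
      plt.show()
      ```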



      • #18
        Marcel - that means, if he buys a new drive set, his chances for a drive failing are the same as with keeping the old set. ;-)



        • #19
          Carsten, I noticed long ago that most failures of this stuff happen in the first 90 days after installation, if it's going to fail at all. If it makes it past three months, then it's probably going to make it through the item's full warranty period or beyond.



          • #20
            Originally posted by Carsten Kurz
            Marcel - that means, if he buys a new drive set, his chances for a drive failing are the same as with keeping the old set. ;-)
            Yep... the horror and paradox of modern-day RAID sets... you start out with fresh new drives. Some of those major storage vendors out there actually mix their drives from different vendors and different batches to avoid ending up with a whole array of duds...
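
            A rough way to see why the big vendors bother mixing batches: if one dud batch can raise the failure probability of every drive from that batch at once, the chance of losing two drives in the same array climbs quickly. The toy Monte Carlo below compares a set built from a single batch against one built from mixed batches; every probability in it is an assumption chosen for illustration, not field data.

            ```python
            import random

            TRIALS = 100_000
            P_GOOD = 0.03       # assumed yearly failure chance of a normal drive
            P_BAD = 0.30        # assumed failure chance of a drive from a dud batch
            P_BATCH_BAD = 0.05  # assumed chance that any given batch is a dud

            def double_failure_rate(same_batch: bool) -> float:
                """Estimate P(at least 2 of 4 drives fail in a year) by simulation."""
                hits = 0
                for _ in range(TRIALS):
                    if same_batch:
                        # All four drives share the fate of a single batch.
                        p = P_BAD if random.random() < P_BATCH_BAD else P_GOOD
                        probs = [p] * 4
                    else:
                        # Each drive comes from an independently chosen batch.
                        probs = [P_BAD if random.random() < P_BATCH_BAD else P_GOOD
                                 for _ in range(4)]
                    if sum(random.random() < p for p in probs) >= 2:
                        hits += 1
                return hits / TRIALS

            print("single batch:", double_failure_rate(True))
            print("mixed batch :", double_failure_rate(False))
            ```

            With these made-up numbers, the single-batch set loses two or more drives roughly twice as often as the mixed one, which is exactly the correlation the mixing is meant to break up.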



            • #21
              So, what's the benefit for Shai if he buys a new drive set, compared to continuing to use his old set? ;-)
              Last edited by Carsten Kurz; 05-30-2021, 12:30 PM.



              • #22
                Peace of mind, and enhanced reliability. Sure, he could replace just the one that has actually failed, and thereafter keep a close eye on the server and replace the others as they go. However, given that all four drives were presumably installed new at the same time, and one has already failed, what are the odds of two more failing at about the same time? Higher than I would like to risk, and if that happens, the screen is down. In fact, this happened to a customer of mine just last week; though thankfully, on their TMS rather than one of the screen servers. The content drive simply disappeared, and when I looked remotely, I saw that the RAID controller was reporting that two out of the four drives had gone bad.

                Not sure about prices in Israel, but here, 4 x 2TB enterprise drives can be had on Amazon for around $350. That is a very small price to pay to insure against a show stopping halfway through, and having to refund 200 customers.
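
                The "odds of two more failing" can be roughed out with a back-of-the-envelope calculation. In a single-parity (RAID 5 style) set, losing just one more drive while the array is degraded is enough to stop the show, so the interesting number is the chance that any of the remaining drives dies before the replacement has been rebuilt. The failure rate and rebuild window below are assumptions for illustration, and treating the drives as independent is optimistic for disks of the same age and batch:

                ```python
                # Back-of-the-envelope risk while the array runs degraded.
                # All numbers below are assumptions, not measured values.
                ANNUAL_FAILURE_PROB = 0.05  # assumed yearly failure chance of an aged drive
                WINDOW_DAYS = 14            # assumed days until the replacement is rebuilt
                REMAINING_DRIVES = 3

                # Pro-rata chance that one drive fails within the window.
                p_window = ANNUAL_FAILURE_PROB * WINDOW_DAYS / 365

                # Chance that at least one of the remaining drives fails in that window.
                p_any = 1 - (1 - p_window) ** REMAINING_DRIVES

                print(f"Per-drive chance in {WINDOW_DAYS} days: {p_window:.3%}")
                print(f"Chance of losing another drive while degraded: {p_any:.3%}")
                ```

                With these numbers it works out to roughly half a percent per incident: small, but not negligible when a miss means a dark screen and refunds, and the real figure will be higher once correlated aging is taken into account.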



                • #23
                  How would peace of mind be justified if the risk for his new drives failing is the same as for his old drives?

                  (just playing devil's advocate here)

                  What I would personally do is either upgrade the RAID to 4*4TB, or just buy another 2TB replacement drive and put it on the shelf for now. So far this drive has not exhibited any particular issue. It would be different if the issue had shown up during regular operation, but in this case it seems that something just went wrong when he reinitialized the RAID.
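
                  For the 4*4TB idea, the usable space depends on the RAID level, which isn't stated in this thread; the quick sketch below just does the arithmetic for a few common layouts, with RAID 5 taken as the example:

                  ```python
                  def usable_tb(drives: int, size_tb: float, level: str) -> float:
                      """Approximate usable capacity, ignoring formatting overhead."""
                      if level == "RAID0":
                          return drives * size_tb        # striping, no redundancy
                      if level == "RAID5":
                          return (drives - 1) * size_tb  # one drive's worth of parity
                      if level == "RAID6":
                          return (drives - 2) * size_tb  # two drives' worth of parity
                      if level == "RAID10":
                          return drives * size_tb / 2    # mirrored stripes
                      raise ValueError(f"unsupported RAID level: {level}")

                  print(usable_tb(4, 2, "RAID5"))  # 6 TB usable from the current 4 x 2TB set
                  print(usable_tb(4, 4, "RAID5"))  # 12 TB usable from a 4 x 4TB set
                  ```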



                  • #24
                    Like I indicated before, the graph is a bit overstated, at least in my experience. The chances of an old drive failing are higher than those of a new one. Also, the only way to get "over the hill" is by starting to climb it, else you'll be stuck in limbo forever. And new drives do come with a warranty, which old ones don't.

                    And yeah, some enterprise storage vendors pro-actively mix their batches of disks to get out of the so-called bathtub curve...

