Dr. Server RAID Failure - All Data Lost

Anyone else get this?

`Dear customers, We have suffered a 2-drive failure on the hypervisor hosting your VPS. This has resulted in the RAID array being completely lost, thus destroying all data on it. We do extremely apologize for the inconvenience caused, but there is nothing we can really do at this stage. We can offer a full refund of your last monthly payment (pro-rated for yearly payments) or a clean VPS with 1 month compensation. How to proceed?

Please open a ticket at your respective billing system: https://portal.drserver.net or https://vikinglayer.com/clients to inform us of your decision.

Once again, we do extremely apologize for the inconvenience. Such events do happen. They are rare, but sadly they exist and we have been hit by one.

Thank you for your understanding, drServer.net/VikingLayer.com`

Comments

  • Waldo19Waldo19 Member

    Ouch!

  • shit happen.

  • williewillie Member, Moderator

    I wonder what the array size was, what size and brand of drives, whether a recovery was in progress when the 2nd drive failed, etc.

    Thanked by 1Janevski
  • @budi1413 said: shit happen.

    Yea, just sucks as I was doing a slow migration of my sites over to there and now I gotta start over.

  • @willie said: I wonder what the array size was, what size and brand of drives, whether a recovery was in progress when the 2nd drive failed, etc.

    Not sure. I did open a ticket about random IO speeds: one hour it'll crawl at 10 MB/s, and in another hour it'll be over 150 MB/s.

  • LeeLee Member

    A minor inconvenience, everyone had their own backups, right?

    It's better to keep your mouth closed, and let everyone wonder if you're stupid; than to open it and remove all doubt.

  • deankdeank Member

    Back up is for pussies. Real men don't keep back up. Real men live dangerously.

    Morningwoodhosting. Somebody get it now.

  • winnervpswinnervps Member, Provider
    edited August 9

    @deank said: Back up is for pussies. Real men don't keep back up. Real men live dangerously.

    The end is nigh, right?

    WINNERvps | LA/NYC/UK/CA/SG/ID Windows Xen Forex VPS, Asia Server, SG Colocation and ID Rack Services

  • NomadNomad Member

    End is nigh for pussies. Real men don't end.

    NameSilo.com coupons: CheapDoms or Discounted
    - Send Everything My Way

  • winnervpswinnervps Member, Provider

    @Nomad said: End is nigh for pussies. Real men don't end.

    Real mean eat pussies

    WINNERvps | LA/NYC/UK/CA/SG/ID Windows Xen Forex VPS, Asia Server, SG Colocation and ID Rack Services

  • lemonlemon Member

    @winnervps said:

    @Nomad said: End is nigh for pussies. Real men don't end.

    Real mean eat pussies

    real man eat ass

    There are lots of Linux users who don't care how the kernel works, but only want to use it. That is a tribute to how good Linux is.

  • desfiredesfire Member

    Are they really using BoxBilling as client area?

  • deankdeank Member

    What I've learned from this thread.

    Real man eat something.

    Morningwoodhosting. Somebody get it now.

  • jvnadrjvnadr Member

    deank said: What I've learned from this thread. Real man eat something.

    And won't keep backup, because they should live dangerously.

    Wow, so much learning! This is not LET, it is Wikipedia!

    On a serious note now: Woops! Shit happens, but it is always frustrating for the clients. Backups or not, it is downtime + effort to restore... Even with really good providers like Andrej @DrServer and @Radi ...

    • If a program actually fits in memory and has enough disk space, it is guaranteed to crash.
    • If such a program has not crashed yet, it is waiting for a critical moment before it crashes.

  • jsgjsg Member

    @Lee said: A minor inconvenience, everyone had their own backups, right?

    I guess not. Sounds like a case of "backup not needed. We have Raid drives".

  • jvnadrjvnadr Member

    jsg said: I guess not. Sounds like a case of "backup not needed. We have Raid drives".

    Raid is not backup. Anyone who thinks that raid is a replacement for own backups, really likes vivere pericolosamente...

    • If a program actually fits in memory and has enough disk space, it is guaranteed to crash.
    • If such a program has not crashed yet, it is waiting for a critical moment before it crashes.

    Thanked by 1WebProject
  • deankdeank Member

    Anyone who knows about computers a little knows raid is never a backup solution.

    But then even Linus thought raid was a backup solution before his raid got corrupted.

    Morningwoodhosting. Somebody get it now.

  • EHRAEHRA Member

    Such a situation is terrible, I hope I never have to go through it. I hope all those affected have up-to-date backups ...

  • williewillie Member, Moderator

    2-drive failures aren't that common but they happen. I wonder what the raid level was of this system (forgot to ask earlier) and what kinds of services it ran.

    Absolutism about backups or anything else isn't helpful. If someone is injured in a car accident we don't say they should have chosen to drive a tank instead of a car. Rather, risk assessment is a thing and the best we can do is make informed decisions when possible.

    Thanked by 2angstrom zed
  • WebProjectWebProject Member, Provider

    @Lee said: A minor inconvenience, everyone had their own backups, right?

    Haha, I bet not everyone did, as some people rely on RAID.

    VPS Price Match Guarantee on: All our range of DDOS protected XEN-HVM VPS Plans
    Are you looking for best price for self-managed VPS? See WebProVPS website for more details.
  • ChuckChuck Member

    @deank said: Back up is for pussies. Real men don't keep back up. Real men live dangerously.

    Do you always use pull out method?

    Thanked by 2agonyzt PrestigeWS
  • deankdeank Member

    If must use raid, real man use Raid-0. Why zero? Cuz zero sounds cool AF.

    Morningwoodhosting. Somebody get it now.

    Thanked by 1WebProject
  • edited August 9

    Hate to be a Monday-morning quarterback, but I have seen this happen before. RAID is useless if you don't set up automatic notifications of a drive failure. Usually the drives have similar hours on them, so they fail around the same time. If one drive fails the other has to work harder and then it fails not too long after.
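To make the notification point concrete, here is a minimal sketch that spots a degraded array. It assumes Linux software RAID (md), which is not necessarily what this host ran; `degraded_arrays` and `SAMPLE` are hypothetical names, and the sample text only imitates the layout of `/proc/mdstat`.

```python
import re

# Member-status field at the end of an mdstat status line,
# e.g. "[2/2] [UU]" (healthy) or "[2/1] [U_]" (one member missing).
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active, flags = status.groups()
            if int(active) < int(expected) or "_" in flags:
                degraded.append(current)
            current = None
    return degraded


# Hypothetical sample in the layout of /proc/mdstat (not real output).
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630464 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0]
      1048512 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a real box the text would come from reading `/proc/mdstat` on a schedule, with any non-empty result triggering an alert. Hardware controllers need their vendor's tooling instead.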

  • AnthonySmithAnthonySmith Administrator, Top Provider

    Sometimes it's not so much a physical drive failure. I have had 2 x HP 410i's throw out 3 drives from a RAID 10. RAID is good for redundancy, but with HW RAID you still have that single point of potential failure.

    The 1st time I put it down to a freak accident; the 2nd time I was gobsmacked. I managed to recover the data on both occasions, although it took so long I wished I had just declared a data loss and insisted on a restore from backups.

    These days with most things moving to SSD dealing with large volumes is much less painful.

  • raindog308raindog308 Moderator

    deank said: If must use raid, real man use Raid-0.

    YOLORAID!

    lemon said: real man eat ass

    jvnadr said: On a serious note now: Woops! Shit happens,

    My Advice: VPS Advice

    For LET support, please click here.

  • williewillie Member, Moderator
    edited August 9

    LosPollosHermanos said: If one drive fails the other has to work harder and then it fails not too long after.

    It also might be preferable to take the server offline during the rebuild, rather than trying to have it serve user traffic at the same time as rebuilding. That speeds up the rebuild a lot and keeps the i/o sequential, decreasing risk of 2nd failure. Finally if the drives were near EOL when the first one failed, perhaps both should be replaced. I don't know that 2-drive HDD RAID-1 is still a thing for normal VPS host nodes any more though.
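A rough back-of-the-envelope sketch of why rebuild speed matters: the time to copy one whole drive at a sustained rate. The 1 TB size and the two rates are assumptions for illustration (the rates simply reuse the IO figures mentioned earlier in the thread and are not measured rebuild speeds); `rebuild_hours` is a hypothetical helper.

```python
def rebuild_hours(capacity_tb, throughput_mb_s):
    """Hours to copy a whole drive once at a sustained rate (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return capacity_mb / throughput_mb_s / 3600


# Assumed 1 TB drive; 150 MB/s and 10 MB/s are illustrative rates only.
for rate in (150, 10):
    print(f"{rate:>3} MB/s -> {rebuild_hours(1, rate):.1f} h")
```

Under these assumptions the rebuild window is about 1.9 hours at the fast rate and about 27.8 hours at the slow one, so the surviving drive stays exposed roughly fifteen times longer when the rebuild competes with user traffic.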

  • jsgjsg Member

    @LosPollosHermanos said: If one drive fails the other has to work harder and then it fails not too long after.

    ?? please elaborate.

  • deankdeank Member

    Well, in a proper raid 5 or 10 or something of that sort, a single HDD failure doesn't bring down the entire raid and operation can go as normal.

    It also means other drives must work harder due to absence of the dead drive. The extra sustained pressure sometimes brings down another HDD that was dying already.

    Raid drives tend to come from the same batch, so when something fails, the others tend to fail at a similar moment.

    Morningwoodhosting. Somebody get it now.

  • williewillie Member, Moderator

    jsg said: ?? please elaborate.

    In a 2 drive RAID system the read load is split across both drives, so if one drive fails the other has to serve all the reads => higher load. Very high load also results from a rebuild in progress, but that's a separate situation from a failure not being noticed soon enough.
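The arithmetic behind that claim, as a small sketch. It assumes the array balances reads evenly across healthy members, which depends on the implementation; the workload figure and `reads_per_drive` are hypothetical. Writes are left out because in a mirror every healthy member takes every write either way.

```python
def reads_per_drive(total_read_iops, healthy_drives):
    """Reads each drive serves if the array balances reads evenly."""
    if healthy_drives < 1:
        raise ValueError("array has no readable drive left")
    return total_read_iops / healthy_drives


TOTAL_READS = 1000  # hypothetical read requests per second for the whole node

print(reads_per_drive(TOTAL_READS, 2))  # both mirror members healthy
print(reads_per_drive(TOTAL_READS, 1))  # one member failed
```

With even balancing, the surviving drive's read load doubles once its partner drops out. If an implementation reads from only one member to begin with, nothing changes for that member.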

  • jsgjsg Member

    @willie said:

    jsg said: ?? please elaborate.

    In a 2 drive RAID system the read load is split across both drives, so if one drive fails the other has to serve all the reads => higher load. Very high load also results from a rebuild in progress, but that's a separate situation from a failure not being noticed soon enough.

    That's only the case when striping and even then the details depend on the implementation.

    When mirroring the remaining drive does NOT have to work harder.

  • deankdeank Member

    Does not have to, but it will.

    Morningwoodhosting. Somebody get it now.

  • msg7086msg7086 Member
    edited August 9

    @jsg said:

    When mirroring the remaining drive does NOT have to work harder.

    If we assume the first drive that died was the one working hard, then the remaining one will have to work harder.

  • jsgjsg Member

    @msg7086 said:

    @jsg said:

    When mirroring the remaining drive does NOT have to work harder.

    If we assume the first drive that died was the one working hard, then the remaining one will have to work harder.

    No. Mirroring ("having one's data doubled") simply writes out the buffer twice. The Raid overhead is minimal and the remaining disk is working exactly as hard as if the other disk were still online.

  • msg7086msg7086 Member

    @jsg said:

    No. Mirroring ("having one's data doubled") simply writes out the buffer twice. The Raid overhead is minimal and the remaining disk is working exactly as hard as if the other disk were still online.

    Don't you ... read from disks?

  • jsgjsg Member

    @msg7086 said:

    @jsg said:

    No. Mirroring ("having one's data doubled") simply writes out the buffer twice. The Raid overhead is minimal and the remaining disk is working exactly as hard as if the other disk were still online.

    Don't you ... read from disks?

    Mirroring (Raid 1) means that any data is written to both disks - if available. If one disk isn't available it's taken out of the Raid and data is read from and written to just the remaining disk.

    Note that Raid 1 (unlike e.g. Raid 5 or 6) does no striping and no other magic except for minimal housekeeping.

  • deankdeank Member
    edited August 9

    I've rarely seen a raid survive a disk crash, in a server environment at least.

    "When it rains, it pours" applies perfectly to raid incidents. When a drive dies, another or even a 3rd one soon follows and a major disaster occurs.

    It is partially the fault of raid cards, from what I've observed. They begin to act up once a drive goes down and sometimes corrupt themselves.

    Morningwoodhosting. Somebody get it now.

    Thanked by 1MrH
  • FHRFHR Member, Provider

    jsg said: Mirroring (Raid 1) means that any data is written to both disks - if available. If one disk isn't available it's taken out of the Raid and data is read from and written to just the remaining disk.

    When you read with mirroring, data is read from all disks in the array.

    Affordable Semi-Dedicated VPS - Enjoy the performance to the fullest extent. | 40% OFF promo

  • MikeAMikeA Member, Provider

    I think I'll stick with software raid.

    ExtraVM DDoS Protected VPS

  • jsgjsg Member

    @deank said: I've rarely seen a raid survive a disk crash, in a server environment at least.

    "When it rains, it pours" applies perfectly to raid incidents. When a drive dies, another or even a 3rd one soon follows and a major disaster occurs.

    It is partially the fault of raid cards, from what I've observed. They begin to act up once a drive goes down and sometimes corrupt themselves.

    I have in fact seen extremely few disasters with Raid, no matter whether hardware or software Raid. One exception: the 410 adapters @AnthonySmith mentioned; I have learned not to trust them.

    In fact I know of just one single disaster. In all other cases the Raid volumes could be rebuilt without any damage remaining. I do agree though that a system should be (otherwise) inactive during a rebuild. With Raid 1 it's probably less critical than with the striped varieties, but I personally always recommend not taking the risk.

  • jsgjsg Member

    @FHR said: When you read with mirroring, data is read from all disks in the array.

    Depends on the implementation and the situation. Unless a read request is very large it doesn't make sense anyway nowadays.

  • dahartigandahartigan Member without signature

    Yikes. Lucky I have backups.

  • williewillie Member, Moderator

    jsg said: Unless a read request is very large it doesn't make sense anyway nowadays.

    It's a VPS node, there's lots of concurrent requests and they get split across the drives.

  • HarambeHarambe Member

    @dahartigan said: Yikes. Lucky I have backups.

    Was this on one of the new offer machines? That's just shitty if it's the case. Can't control hardware just deciding to eat it though.

    Professional Shoeminer

  • dahartigandahartigan Member without signature
    edited August 9

    @Harambe said:

    @dahartigan said: Yikes. Lucky I have backups.

    Was this on one of the new offer machines? That's just shitty if it's the case. Can't control hardware just deciding to eat it though.

    My VPS wasn't affected by this, which is one of those deals you're referring to. I just checked.. phew. I do have a dedicated server that has had shit uptime for the past few days though.

    Edit: but seriously.. backups guys. Damn.

  • jsgjsg Member

    @willie said:

    jsg said: Unless a read request is very large it doesn't make sense anyway nowadays.

    It's a VPS node, there's lots of concurrent requests and they get split across the drives.

    (a) reads aren't the main stress factor for a drive, writes are. (b) The major factor in a node's load is the OS caches and read/write ordering. (c) You will note that with LARGE requests even a hw Raid controller cache doesn't speed things up considerably. (d) Who says that nodes have more concurrent requests than, say, a company server?

    @All

    I'm not interested in belief systems and even less in wars based on them. I wrote what I know and what I have experienced. If some here WANT to believe that the remaining drive in a Raid 1 works oh so much harder (that it often soon dies too), just ignore me and accept my apologies.

  • HarambeHarambe Member

    @jsg said: (a) reads aren't the main stress factor for a drive, writes are

    Yeah, but reads are the main stress put on remaining drives in a rebuild.. and other drives die during rebuilds all the time.

    That's the main point that everyone else is making.

    Professional Shoeminer

  • dahartigandahartigan Member without signature

    Kudos though to @radi for his offer to compensate those affected by refunding them. Not many providers would take ownership like that.

  • jsgjsg Member

    @Harambe said:

    @jsg said: (a) reads aren't the main stress factor for a drive, writes are

    Yeah, but reads are the main stress put on remaining drives in a rebuild.. and other drives die during rebuilds all the time.

    That's the main point that everyone else is making.

    Maybe the real stress is to have a system running during a rebuild as if nothing happened...

  • deankdeank Member
    edited August 9

    I have seen some. Made brave refunds and then went down a few months later.

    Morningwoodhosting. Somebody get it now.

  • drserverdrserver Member, Host Rep

    willie said: I wonder what the array size was, what size and brand of drives, whether a recovery was in progress when the 2nd drive failed, etc.

    The array was 6x1TB (Samsung 850 Pro), less than six months in production, with an Intel onboard RAID controller. It was a 4-node FatTwin from Supermicro. We lost both drives of the same pair.

    The Brand New drServer.net ||| 4 Cores - 4GB RAM - 90GB R10 SSD for only 7 USD with promo code LET-it-GO

    Thanked by 2willie Aidan
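For a sense of how unlucky losing both drives of one pair is, here is a small sketch. It assumes the 6-drive array was RAID 10 built from three 2-way mirror pairs (the post does not state the RAID level) and that every surviving drive is equally likely to be the next to fail; `p_second_failure_fatal` is a hypothetical helper.

```python
from fractions import Fraction


def p_second_failure_fatal(drives):
    """Chance a second random drive failure hits the first one's mirror partner.

    Assumes RAID 10 built from 2-way mirror pairs and that each surviving
    drive is equally likely to fail next.
    """
    if drives < 2 or drives % 2:
        raise ValueError("need an even number of drives, at least 2")
    return Fraction(1, drives - 1)


p = p_second_failure_fatal(6)
print(p, float(p))
```

Under those assumptions, one in five second failures takes out the whole array. The equal-likelihood assumption is probably generous: drives from the same batch and extra load on the degraded pair, both raised earlier in the thread, would push the real odds higher.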