
Anyone had to rebuild a raid 5?



Morning. Much to my dismay, it appears one of the disks in our shiny new one-week-old server has died. Using Adaptec's config utility, it *appears* that the spare disk is rebuilding the array.

 

According to the company that built the server, the rebuild could take the rest of the weekend, and the server will be out of action until then.

 

Does this sound right, both the length of time and the fact that it is unusable until rebuilt?

 

I always thought you could pretty much rip one of the disks out of the 4-disk array and the server wouldn't so much as bat an eyelid. Can anyone confirm one way or the other?
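For anyone wondering why a single missing disk should be survivable: RAID 5 stores one parity block per stripe, and parity is just the XOR of the data blocks, so any one missing block can be recomputed from the rest. Here is a minimal sketch of that idea in Python. The block contents and the `xor_blocks` helper are made up for illustration, and a real controller also rotates parity across the disks, which this ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-disk array: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# Simulate losing disk 1: rebuild its block from the three survivors.
survivors = [block for i, block in enumerate(stripe) if i != 1]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

The same arithmetic shows the limit: with two blocks of a stripe missing there is only one parity equation left, so the data cannot be recovered. RAID 5 covers a single failed member, not several at once.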

 

Also, is there any way to confirm it's actually rebuilding? Apart from the status of the spare disk saying 'rebuilding', there are no lights on any of the drives and I can't find a way to get a display of data transferred or anything.



Ick. My sympathies go out to you! The last time I had to personally do this was some time ago, on a server with 9GB SCSI drives. The process took quite a while - I seem to remember it being hours - so you could in fact be looking at a weekend operation if you are using really large drives. But again, it's been a while since I've done this... anyone else have more recent experience?


Hi Nicholas,

 

The thing I'm worried about is the lack of any physical signs that any of the disks are doing anything: no lights flickering, no moving status bar, no indication of data being moved. Was it obvious that it was rebuilding when you had this problem? I don't fancy waiting a whole weekend only to find it was just sat there doing nothing.


Was it obvious that it was rebuilding when you had this problem? I don't fancy waiting a whole weekend then find it was just sat there doing nothing.

 

Honestly, I can't recall what happened during the rebuild. I've done a bit of reading and everything points to at least a day for a rebuild of a 300GB drive, unless you have hot swap capabilities. Do you have the option of grabbing data from a backup and working temporarily from that?


I am working from backup (albeit 24 hours old). According to our supplier it should have hot-swapped, but it seems the drive failure was accompanied by a crash which took it out of Windows, and now it won't reboot.

 

I guess I have to wait and see while preparing for the worst, i.e. my RAID is completely screwed! The drives are 400GB each so I might know by close of play tomorrow. I swear it knows I have a deadline.
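As a rough sanity check on the "day or more" estimates: a rebuild has to write the whole replacement drive, so the time is roughly drive capacity divided by the sustained rebuild rate. The sketch below runs that arithmetic for a 400GB drive. The rates are illustrative guesses only, not figures for this controller, and real rebuilds slow down further if the array is in use at the same time.

```python
def rebuild_hours(drive_gb, rate_mb_per_s):
    """Hours needed to write one full replacement drive at a sustained rate."""
    drive_mb = drive_gb * 1000  # decimal GB, as drive vendors quote capacity
    return drive_mb / rate_mb_per_s / 3600


# Assumed sustained rebuild rates in MB/s (hypothetical values).
for rate in (5, 20, 50):
    print(f"{rate:>3} MB/s -> {rebuild_hours(400, rate):.1f} hours")
```

At the slow end of those guesses a 400GB drive takes most of a day, which is in line with the estimates quoted in this thread.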


Grrrr, turns out it wasn't actually rebuilding at all. Somehow the whole array failed. That isn't meant to happen. The whole point of RAID 5 is its redundancy in case of drive failure, so I'm pretty annoyed, especially given the premium you pay for setting up your disks this way.

 

Just bad luck or has anyone else had the same problem?



I would be ready to throttle my vendor if it happened to me. You're right, the whole point of a RAID setup is to avoid exactly what you've experienced. Ask your vendor if they did a proper burn-in of the server; it could be that your RAID card was faulty and a good burn-in probably would have detected that.

 

FYI, I've had a low end Dell server here for a year with a RAID 5 setup and haven't had a single problem. There is something to be said for dealing with an enterprise vendor, not that it helps you at this point :(



The vendor's here now so I'll go and ask him. (I'll avoid the advice about throttling him for the time being as he's lending us another renderfarm machine to help get back up to speed.)


(I'll avoid the advice about throttling him for the time being as he's lending us another renderfarm machine to help get back up to speed.)

 

Aww, where's the fun in that? ;)

 

I hope you get things sorted out quickly. It's really frustrating having to deal with hardware problems; we've already got too many hats to wear as it is.


  • 2 weeks later...
So, did you ever resolve things with your server/vendor/sanity?

 

The vendor removed the disks and the controller and has been testing them for over a week. So far he hasn't found anything wrong, which is rather annoying. I am leaning towards it being a problem with the controller, as this seems more likely than 3 disks failing simultaneously, but without any proof it's rather hit my confidence in RAID!

 

No problems with the replacement, however, and it has proved the worth of having a decent backup: we only lost 24 hours' work. Also, we were able to hook up our old server and carry on working, which makes me think it's going to be a good idea to set this up as a mirror rather than chucking it out as originally planned.

 

..My sanity I'm still working on!


I have had a RAID 5 server for over 2 years now, built it myself, and it runs like a top.

 

The problem with building IDE or SATA RAID is the drives themselves. They simply aren't the same quality as SCSI; they die. Be prepared. In the two years my server has been running I've blown 3 drives. I have a spare on hand to swap in if it happens and yes, it does take about 2 days to rebuild.

 

It seems to me that the largest concern on my server is heat. Keep it cool and it will keep you safe. I put it in a closet once to mask the noise and blew a drive in 12 hours. I now have drive coolers on each drive and front and rear 120mm fans. Not a hiccup in 9 months.

 

For those others looking to do the same on the cheap: this can all be accomplished with Linux and standard $20 SATA cards. If you have the know-how to install Linux, you will likely be able to build yourself a safety net out of an old render node or workstation.
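On the earlier question of how to tell whether a rebuild is really progressing: with Linux software RAID (md), the kernel reports it in `/proc/mdstat`, including a percentage and an estimated finish time. Below is a small sketch that pulls those numbers out. The `SAMPLE` text is a made-up example in the usual mdstat layout, not output from any server in this thread, and `rebuild_progress` is just a helper name chosen for illustration.

```python
import re

# Illustrative /proc/mdstat excerpt for a 4-disk RAID 5 with one member rebuilding.
SAMPLE = """\
md0 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0]
      1171780608 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [==>..................]  recovery = 12.6% (49214784/390593536) finish=127.5min speed=44540K/sec
"""

PROGRESS = re.compile(r"(recovery|resync)\s*=\s*([\d.]+)%.*?finish=([\d.]+)min")


def rebuild_progress(mdstat_text):
    """Return (operation, percent_done, minutes_left), or None if nothing is rebuilding."""
    match = PROGRESS.search(mdstat_text)
    if match is None:
        return None
    return match.group(1), float(match.group(2)), float(match.group(3))


print(rebuild_progress(SAMPLE))
```

On a live machine you would pass in the contents of `/proc/mdstat` instead of `SAMPLE`. If the percentage is not moving between two reads a few minutes apart, the array is not actually rebuilding.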


No problems with the replacement however and it has proved the worth of having a decent backup - we only lost 24 hours work. Also we were able to hook up our old server and carry on working which makes me think it's going to be a good idea to set this up as a mirror rather than chucking it out as originally planned.

 

Yes, backup is good ;) So is having a mirrored backup server, something that is not always affordable or practical. Glad to hear that things are moving forward.

 

The problem with building IDE or SATA RAID is the drives themselves. They simply aren't the same quality as SCSI; they die. Be prepared.

 

I respectfully disagree, but I'm no hardware guy. Yes, drives can fail and one should be prepared, but I think the real problem with building SATA RAID systems is the system itself, not the drives.

 

For those others looking to do the same on the cheap: this can all be accomplished with Linux and standard $20 SATA cards. If you have the know-how to install Linux, you will likely be able to build yourself a safety net out of an old render node or workstation.

 

IMO, there is no such thing as "on the cheap" with servers; eventually, all costs involving servers converge at a similar point, whether as upfront hardware costs or as maintenance and recovery costs later on. Knowing how to install Linux is just the tip of the iceberg, and recovering from a disaster in a non-Windows environment can be a huge challenge that can end up costing way more than expected.

 

Anyway, Dan's up and running and on the way to server bliss, so that's the most important thing!

