Hey guys,
I have a server in production with a failed HDD. It has actually been failed for over 6 months. Luckily the array is RAID 6, so it has been running degraded, with effectively RAID 5-level redundancy, ever since the HDD failed.
I've verified that the HDD is actually dead by pulling it and trying to bring it up in a server in the lab. It doesn't come up there either, so I know it's a faulty HDD.
So I put a new HDD into the production server. It flashes a little blue light for a second, spins up, and then does nothing.
We use Adaptec controller cards, and the maxView software doesn't show the drive at all. I've already told the client to reboot the server, but he's afraid it will make things worse. He then said that in two weeks they will have no need for the data on that RAID array, so he asked me if he should wait until the data is negligible (after a TV show is completed and delivered).
I told him it was his choice. Then I get a call from his boss and they are so angry. It's a political nightmare.
Anyway, they want this server fixed, but they don't want to pay for a backup and they don't want to reboot it.
So I was thinking of running arcconf rescan to see if it could find the new drive that way. There is talk of it being a bad backplane, because some of the locate LEDs don't work: when you highlight an HDD in the software to identify its physical slot on the backplane, some of those indicators don't light up. But I think the reason the HDD isn't automatically recognized is that the slot has been offline for over six months.
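For reference, this is roughly what I'd run, as a sketch only and not tested on this box. It assumes the Adaptec card is controller 1 and that arcconf is on the PATH; the function names are made up for illustration. It lists the physical devices, rescans, then lists them again so the before and after can be compared by eye.

```python
import shutil
import subprocess

CONTROLLER = 1  # assumption: the Adaptec card is controller number 1


def arcconf_commands(controller: int = CONTROLLER) -> list[list[str]]:
    """Return the arcconf invocations to run, in order (hypothetical helper)."""
    ctrl = str(controller)
    return [
        ["arcconf", "getconfig", ctrl, "pd"],  # physical devices before the rescan
        ["arcconf", "rescan", ctrl],           # ask the controller to rescan its bus
        ["arcconf", "getconfig", ctrl, "pd"],  # physical devices after the rescan
    ]


def run_all(controller: int = CONTROLLER) -> list[str]:
    """Run each command and collect stdout; do nothing if arcconf is missing."""
    if shutil.which("arcconf") is None:
        print("arcconf not found on PATH; nothing was run")
        return []
    outputs = []
    for cmd in arcconf_commands(controller):
        # check=False: keep going and keep the output even if a command fails
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        outputs.append(result.stdout)
    return outputs


if __name__ == "__main__":
    for out in run_all():
        print(out)
```

If the new drive still isn't listed after the rescan, that at least gives me before and after output to show the client.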
What do you guys think?
Oh, also: drive 0 now shows 16,000 aborted commands, and I'm afraid it won't last another two weeks.
It's so difficult to troubleshoot because the client (their tech guy) is so sure it's a backplane failure, but I'm 90% positive it's just an HDD issue. So if I tell the client to do something and it fails, they judge me as incompetent.