[labnetwork] HDD Best Practice

N Shane Patrick patricns at uw.edu
Thu Feb 9 12:17:45 EST 2023


RAID mirrors are incredibly useful tools when the expectations and management are properly handled, but they are not a backup and don’t replace the need for a separate backup.

RAID mirrors are a downtime mitigation tool. They allow the computer to continue operating if a single disk fails (assuming a two-disk RAID 1; mirrors with more than two disks exist as well), on the premise that it is statistically unlikely for all disks in the array to fail simultaneously. This gives the maintainer time to swap out the bad disk and let the array resilver (rebuild), in most cases while keeping the system online. It isn’t uncommon to treat a RAID array as a “built-in backup”, but that’s not its true purpose.

For instance, a RAID mirror will fail you if there are no policies requiring regular checks on the status of the array, or no automated monitoring and alerting to notify the maintainer of a disk problem. It will also fail you if nothing is done about signs of failure or alerts. And it will not save you if the controller (software or hardware) malfunctions and starts writing corrupted data to both mirrors, or if something in the operating system or on the data bus goes wrong and garbles data before or during transmission. Bad data into the array, bad array. You still need a full backup solution, or at least a backup of critical data, such as an automated incremental backup or a regular disk clone like the one you mention in your post.
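
As an illustration of the automated status check mentioned above, here is a minimal sketch in Python. It assumes a Linux machine using md software RAID, where the kernel reports array state in /proc/mdstat. A Windows PC or a hardware RAID controller would need the vendor's own tool instead. The function names are made up for this example.

```python
import re

# Array header line in /proc/mdstat, e.g. "md0 : active raid1 sdb1[1] sda1[0]"
ARRAY_RE = re.compile(r"^(md\S*)\s*:")
# Member status field, e.g. "[2/2] [UU]" (healthy) or "[2/1] [U_]" (one disk missing)
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that have fewer active members than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


def check_mdstat(path="/proc/mdstat"):
    """Read the kernel's md status file and report degraded arrays (Linux only)."""
    with open(path) as f:
        return degraded_arrays(f.read())
```

Run from a scheduled task, a non-empty result from check_mdstat() would be the trigger for an email or other alert. The check only helps if someone is assigned to act on that alert.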

So. RAID mirror - minimize downtime if managed properly, not a backup.
Backups allow for recovery when, not if, a failure and downtime occur. Backups should be segregated copies of the data, ideally held somewhere other than on the system being backed up.
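
To make the incremental backup idea concrete, here is a rough sketch in Python of a one-way copy that only transfers files that are new or have changed since the last run. It is a sketch under assumptions, not a replacement for a real backup product: the function name and both paths are hypothetical, it compares only file size and modification time, and it keeps just the latest copy of each file rather than a version history.

```python
import os
import shutil


def incremental_backup(src_root, dst_root):
    """Copy files from src_root to dst_root when they are missing or changed.

    A file counts as changed if its size differs or the source is newer.
    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            src_stat = os.stat(src)
            if os.path.exists(dst):
                dst_stat = os.stat(dst)
                same_size = dst_stat.st_size == src_stat.st_size
                not_older = int(dst_stat.st_mtime) >= int(src_stat.st_mtime)
                if same_size and not_older:
                    continue  # unchanged since the last run, skip it
            # copy2 also preserves the modification time used in the check above
            shutil.copy2(src, dst)
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

The destination should be a separate disk or a network share, not another folder on the same array. Because this sketch overwrites older copies, a file corrupted on the source would replace the good copy on the next run, which is one reason dedicated backup tools keep multiple versions.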



N. Shane Patrick
Manager, Lab Operations and Safety
Electron Beam Lithography
Washington Nanofabrication Facility (WNF) 
National Nanotechnology Coordinated Infrastructure (NNCI)
University of Washington - NanoES
Fluke Hall 129, Box 352143
(206) 221-1045
patricns at uw.edu
http://www.wnf.washington.edu/

> On Feb 9, 2023, at 7:14 AM, Chang, Long <lvchang at Central.UH.EDU> wrote:
> 
> Hi All,
> 
> Yesterday we experienced an HDD failure on our AFM. This event should have been prevented by the RAID mirror with two HDDs. Our setup where we just have a single working drive and a spare backup clone has been more successful. We are planning to move away from RAID mirrors. Does anyone have expertise or experience here that they can share?
> 
> Thanks,
> Long Chang
> Technical Director
> UH Nanofabrication Facility
> Houston, TX
> lvchang at central.uh.edu
