RAID Documentation

== Redistributing users across the RAIDs ==
To move a user from one RAID to another:
# Make sure the user is not logged in anywhere.
# Move the user's files from one RAID to the other, into the appropriate directory, for example:
#* mv /mnt/raid0/home/m/mdeepwel /mnt/raid1/home/m
# Update LDAP with the new root location.
#* In ou=AutoFS,ou=home.users,cn=username, find automountInformation and update it to the new location (using phpldapadmin is fine).
# Restart '''autofs''' on any computer or service the user could be using; otherwise they will be able to log in, but won't have a home directory.
#* If they use the cluster, you must restart autofs on each node, or restart the whole cluster.
#* Restart teleport, so they can use ssh/ftp to get their files from off site.
# Update Amanda to make sure the new user directory is backed up.
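
The first two steps above can be sketched as a small dry-run helper. This is only an illustration: the function name plan_user_move is made up, it assumes homes live under /mnt/<raid>/home/<first letter>/<user> as in the mdeepwel example, and the login check only covers the host it is run on. It prints commands for review rather than running them.

```sh
# Dry-run sketch: print the commands for moving one user's home directory.
# Nothing is executed. The body runs in a subshell, so variables stay local.
# Assumption: layout is /mnt/<raid>/home/<first letter>/<user>.
plan_user_move() (
    user="$1"; src_raid="$2"; dst_raid="$3"
    letter=$(printf '%s' "$user" | cut -c1)   # e.g. "m" for mdeepwel

    # Step 1: no output from this check means no sessions on this host.
    echo "who | grep -w ${user}"
    # Step 2: move the home directory into the matching letter directory.
    echo "mv /mnt/${src_raid}/home/${letter}/${user} /mnt/${dst_raid}/home/${letter}"
    # Remaining steps are manual, so only print a reminder.
    echo "# then: update automountInformation in LDAP, restart autofs, update Amanda"
)
```

For example, plan_user_move mdeepwel raid0 raid1 prints the same mv line as in step 2 above.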

Revision as of 17:39, 20 June 2012

Raid Usage

In the Early Days <c> we used Arena EX3 external SCSI-attached RAID arrays. These external units featured dual-redundant power-supplies, and held six IDE drives. Later, we found we could use SATA drives, with a low-profile SATA-to-PATA (IDE) interface, and a little bit of connector trimming. This boosted the capacity for these external SCSI devices, but only to a point - the limitations of the internal 32-bit controller seemed to cap the capacity at 2TB.

So, we phased out these Arena EX3 RAID arrays, in favour of big Chenbro rack-mounted chassis, each capable of holding 16 drives. A 3Ware controller with 16 ports addressed these drives, which are hot-pluggable in the Chenbro chassis + drive-bay-backplane.

Our Chenbro-chassis in-house-built RAID arrays are found on Spitfire and Hurricane. Here is how we use them now:

Spitfire

Controller /c1 contains two RAID arrays:

  • /u0 for /home/users
    • /c1/u0 shows up as a single partition /dev/sda1, mounted at /home/users
    • formed from 14x 500GB drives, occupying physical slots p2-p15. RAID-5 capacity is n-1, for a nominal 6.5TB (6TiB - the reported capacity)
    • uses XFS filesystem


  • /u1 for the Gentoo GNU/Linux operating system
    • /c1/u1 shows up as three partitions, following our classical layout:
      • /dev/sdb1 is mountable at /boot
      • /dev/sdb2 is swap
      • /dev/sdb3 is /, type ext3
    • formed from 2x 150GB drives, occupying physical slots p0-p1. RAID-1 (mirror) capacity is nominal 150GB (140GiB reported)
spitfire ~ # tw_cli /c1 show 

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     6053.47   ON     OFF    
u1    RAID-1    OK             -       -       -       139.688   ON     OFF    

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p0    OK             u1   139.73 GB SATA  0   -            WDC WD1500ADFD-00NL
p1    OK             u1   139.73 GB SATA  1   -            WDC WD1500ADFD-00NL
p2    OK             u0   465.76 GB SATA  2   -            WDC WD5000ABYS-01TN
p3    OK             u0   465.76 GB SATA  3   -            ST3500320NS
p4    OK             u0   465.76 GB SATA  4   -            ST3500320NS
p5    OK             u0   465.76 GB SATA  5   -            ST3500320NS
p6    OK             u0   465.76 GB SATA  6   -            ST500NM0011
p7    OK             u0   465.76 GB SATA  7   -            ST3500320NS         
p8    OK             u0   465.76 GB SATA  8   -            ST500NM0011
p9    OK             u0   465.76 GB SATA  9   -            ST3500320NS
p10   OK             u0   465.76 GB SATA  10  -            ST3500320NS
p11   OK             u0   465.76 GB SATA  11  -            ST3500320NS
p12   OK             u0   465.76 GB SATA  12  -            ST3500320NS
p13   OK             u0   465.76 GB SATA  13  -            ST500NM0011
p14   OK             u0   465.76 GB SATA  14  -            ST3500320NS
p15   OK             u0   465.76 GB SATA  15  -            ST3500320NS

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       229    06-Nov-2011
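
The nominal and reported capacities quoted above can be cross-checked with a little arithmetic. The helper below is a rough sketch (raid_usable_gib is a made-up name) that assumes vendor gigabytes of 10^9 bytes and reports binary units of 2^30 bytes, which is what the Size(GB) column appears to show. It covers only the two RAID levels used here.

```sh
# raid_usable_gib LEVEL DRIVES DRIVE_GB
# Print the nominal usable capacity in GiB (2^30 bytes).
# Assumption: DRIVE_GB is the vendor size in units of 10^9 bytes.
raid_usable_gib() (
    awk -v level="$1" -v n="$2" -v gb="$3" 'BEGIN {
        if (level == 5)      data = n - 1   # RAID-5: one drive of parity
        else if (level == 1) data = 1       # RAID-1: mirror of one drive
        else exit 1                         # other levels not handled here
        printf "%.2f\n", data * gb * 1e9 / (1024 ^ 3)
    }'
)

raid_usable_gib 5 14 500   # u0: prints 6053.60
raid_usable_gib 1 2 150    # u1: prints 139.70
```

Both results are close to the 6053.47 and 139.688 that tw_cli reports for u0 and u1.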


Hurricane

Controller /c1 contains two RAID arrays:

  • /u0 for projects, which includes SVN, CVS, software-deployments, Amanda-holding-disk, projects/infrastructure (containing eBooks, docs, web_content, scripts and some backups)
    • /c1/u0 shows up as a single partition /dev/sda1, mounted at /mnt/raid
    • formed from 14x 500GB drives, occupying physical slots p2-p15. RAID-5 capacity is n-1, for a nominal 6.5TB (6TiB - the reported capacity)
    • uses XFS filesystem
    • Usage, and breakdown as of June 2012:
hurricane raid # cd /mnt/raid/ ; du -h --max-depth=1
15G	./svn
4.0K	./holding
3.5G	./cvs
214G	./projects
763G	./software
995G	.


  • /u1 for the Gentoo GNU/Linux operating system
    • /c1/u1 shows up as three partitions, following our classical layout:
      • /dev/sdb1 is mountable at /boot
      • /dev/sdb2 is swap
      • /dev/sdb3 is /, type ext3
    • formed from 2x 150GB drives, occupying physical slots p0-p1. RAID-1 (mirror) capacity is nominal 150GB (140GiB reported)


Musashi

/mnt/raid4
  • Gentoo GNU/Linux Mirror

RAID Maintenance

We have drive-failures periodically - reported both through Nagios and via daily logwatch-emails. Failed drives must be replaced promptly, to avoid data-loss! The Chenbro chassis supports drive-hot-swap, and it used to be that the 3Ware controller would then automatically commence a RAID-rebuild. Lately, however, the replacement drive is showing up as a new Unit, u?. Not helpful :-( but here's what to do:

spitfire ~ # tw_cli
//spitfire> /c1 rescan     will find the replaced drive, assign it to a new Unit u2
//spitfire> /c1/u2 del     do not use remove; del will keep the drive, but un-assign it
//spitfire> maint rebuild c1 u0 p15     example, to add replaced drive p15 into Unit u0 on Controller c1
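
As a sketch of how the procedure above might be scripted, the two helpers below spot ports that are not OK and print the three tw_cli commands with the right names filled in. The function names are made up, and the parsing assumes the column layout of the tw_cli /c1 show output shown earlier on this page. Nothing is sent to the controller; the commands are only printed for review.

```sh
# Read `tw_cli /c1 show` output on stdin; print port, status and unit
# for every port (p0, p1, ...) whose status is anything other than OK.
failed_ports() {
    awk '$1 ~ /^p[0-9]+$/ && $2 != "OK" { print $1, $2, $3 }'
}

# rebuild_commands CONTROLLER STRAY_UNIT TARGET_UNIT PORT
# Print (do not run) the commands from the procedure above.
rebuild_commands() (
    ctl="$1"; stray_unit="$2"; target_unit="$3"; port="$4"
    echo "/${ctl} rescan"                                # find the new drive
    echo "/${ctl}/${stray_unit} del"                     # un-assign it; not remove
    echo "maint rebuild ${ctl} ${target_unit} ${port}"   # add it to the array
)
```

Typical use would be tw_cli /c1 show | failed_ports to find the port, then rebuild_commands c1 u2 u0 p15 to get the lines to type at the //spitfire> prompt.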