Ok call me crazy - but I’ve had bad luck with hardware based RAID (especially SCSI based). This problem is probably made worse because I end up sitting on a server for a long time. Which means when there is a problem the same equipment can be hard to find. All of that goes away when you use software RAID under Linux.
To be honest, it probably has less to do with my bad experiences with hardware RAID than with my good experiences with software RAID. I’ve been running software RAID for a long time and it has always served me well - even when things looked really, really bad. At some point I got the bright idea to go for the real deal, and having just survived a close call with data death, I’m going back to my roots.
So now I have a new server up in a datacenter somewhere. It’s running Debian Sarge. It is currently booting off of the primary IDE drive. Now the test will be getting it to boot off of a software RAID setup without touching the server physically. Once that is working I’m going to migrate my old machine over to this new server and call it a day. Last time I did this I think I had to go to the dc at some point, so here goes.
Full disclosure - I started down this path and it turned out badly. Eventually I had to have the server completely re-installed to try again. The directions below try to capture how I actually got it working.
Getting Started
First things first - make sure that your Debian box is up to date. There is no point in starting without the latest everything. You will also need to install the mdadm package (the tool that handles all the RAID stuff). In the process of updating, a new kernel got installed, so I’m rebooting before I go on or create any RAID partitions.
After a quick reboot…
Kernel
It turns out my server is a dual proc box, but the installer didn’t install an SMP kernel. I updated this. Then I used the instructions here to download the current version of the kernel (2.6.16-14 at the time of this article). (Also he has a nice article on how to use distcc to compile the kernel. Using it cut my compile time in half - which is pretty impressive considering that the server I’m compiling on is pretty fast to begin with.)
I ended up having to install
apt-get install kernel-package fakeroot libncurses5-dev
Then I just followed the instructions in the article. In the end, I got a custom kernel which has RAID and ext3 statically compiled in. The first time I tried to do all of this I didn’t do that - so I’m hoping this is the key to solving my problem.
Great - that was easy - now to get some work done. As per the normal recommendation, I have two 250GB IDE drives in my server. Each one is on a different IDE channel (to maximize performance), so hda/hdc in Linux terms.
Another quick reboot to make sure lilo works
Drive Prep
Before you do this use fdisk to make sure that the drives are actually the same size in terms of bytes. I’ve had drives from the same manufacturer that weren’t and it messes things up later. In this case both my drives are 8225280.
/dev/hda is the main drive
/dev/hdc is the secondary drive (where the RAID will be initially installed)
I’m going to be making 4 partitions. Approximately:
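If you would rather not eyeball the fdisk output, the size comparison can be scripted. This is only a rough sketch: same_size is a helper name I made up, and the example in the comment assumes blockdev (from util-linux) is installed and that you are root.

```shell
# Compare two sizes given in bytes; succeed only when they are identical.
# same_size is a hypothetical helper, not part of any package.
same_size() {
    if [ "$1" = "$2" ]; then
        echo "match: $1 bytes"
    else
        echo "MISMATCH: $1 vs $2 bytes" >&2
        return 1
    fi
}

# Example (as root), assuming blockdev is available:
#   same_size "$(blockdev --getsize64 /dev/hda)" "$(blockdev --getsize64 /dev/hdc)"
```

A mismatch returns a non-zero exit status, so you can bail out of a setup script before partitioning anything.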
hdc1 50M /boot
hdc2 1GB swap
hdc3 48GB /
hdc4 185GB /data (basically storage for /var /home)
I could have gone with a single / but on my home file server I’ve had odd problems when the partitions got large.
Also, the partition type for all of the partitions is fd (Linux raid autodetect).
So the actual output looks like this
/dev/hdc1 1 7 56196 fd Linux raid autodetect
/dev/hdc2 8 138 1052257+ fd Linux raid autodetect
/dev/hdc3 * 139 6364 50010345 fd Linux raid autodetect
/dev/hdc4 6365 30401 193077202+ fd Linux raid autodetect
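A quick way to double check the partition types is to filter the fdisk listing. This is a sketch that assumes the column layout shown above (newer fdisk versions print extra columns, so the field numbers would need adjusting); non_raid_partitions is a name I invented for the example.

```shell
# Read `fdisk -l` partition lines on stdin and print any partition
# whose type Id is not fd (Linux raid autodetect).
# Assumes the old column layout: Device [Boot] Start End Blocks Id System
non_raid_partitions() {
    awk '$1 ~ /^\/dev\// {
        # A "*" in column 2 is the boot flag and shifts the Id one column right.
        id = ($2 == "*") ? $6 : $5
        if (id != "fd") print $1, id
    }'
}

# Example (as root): fdisk -l /dev/hdc | non_raid_partitions
```

No output means every partition on the drive is marked fd.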
RAID it!
Now that the first drive is partitioned, we need to set up the actual RAID arrays. Since we are currently booting off of hda, we will be setting up the arrays on hdc using a missing drive config and then adding in the hda partitions later.
In the event that the drives you are using have ever been used in a RAID array - use the following commands to wipe out the old info
mdadm --zero-superblock /dev/hda
mdadm --zero-superblock /dev/hdc
Here are the commands to setup this array
mdadm --create /dev/md0 --verbose --level 1 --raid-devices=2 /dev/hdc1 missing
mdadm --create /dev/md1 --verbose --level 1 --raid-devices=2 /dev/hdc2 missing
mdadm --create /dev/md2 --verbose --level 1 --raid-devices=2 /dev/hdc3 missing
mdadm --create /dev/md3 --verbose --level 1 --raid-devices=2 /dev/hdc4 missing
Putting missing second is very important! There is a bug in lilo that prevents it from installing if the first drive in the array is broken. This causes other complaints from lilo because it doesn’t like that hdc is not the first drive, but at least it works.
You can confirm that everything is working by
cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 hdc4[1]
193077120 blocks [2/1] [U_]
md2 : active raid1 hdc3[1]
50010240 blocks [2/1] [U_]
md1 : active raid1 hdc2[1]
1052160 blocks [2/1] [U_]
md0 : active raid1 hdc1[1]
56128 blocks [2/1] [U_]
unused devices: <none>
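At this stage every array should show up as degraded ([U_]), since the second half of each mirror is still missing. If you want to check that from a script rather than by eye, something like the following works on the mdstat format shown above. degraded_arrays is just a name I picked.

```shell
# Print the name of every array in mdstat-formatted text (stdin)
# whose status brackets contain "_" (a missing or failed member).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the current array
        /\[[U_]+\]/ && name != "" {
            status = $NF                   # e.g. [U_] or [UU]
            if (index(status, "_") > 0) print name
            name = ""
        }'
}

# Example: degraded_arrays < /proc/mdstat
```

Once both drives are in and synced, the same command should print nothing.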
Now format the drives (from here on out you just deal with the RAID devices)
mkfs.ext3 /dev/md0
mkswap /dev/md1
mkfs.ext3 /dev/md2
mkfs.ext3 /dev/md3
It is not a bad idea to generate a config file to store all the info about the arrays that have been created. Later you can update the file when you add new devices.
Create /etc/mdadm/mdadm.conf - make sure it has the following line at the top
DEVICE partitions
Then update it with the following
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
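One thing to watch: the >> appends, so running that command twice leaves duplicate ARRAY lines in the file. Here is a small sanity check, assuming the scan output writes one line starting with ARRAY per array. check_mdadm_conf is a made-up helper name.

```shell
# Succeed if the config file ($1) has a DEVICE line and exactly $2 ARRAY lines.
check_mdadm_conf() {
    conf=$1
    expected=$2
    grep -q '^DEVICE' "$conf" || { echo "no DEVICE line in $conf" >&2; return 1; }
    # grep -c exits non-zero on zero matches but still prints 0
    found=$(grep -c '^ARRAY' "$conf" || true)
    [ "$found" -eq "$expected" ] || {
        echo "expected $expected ARRAY lines, found $found" >&2
        return 1
    }
    echo "ok: DEVICE line plus $found ARRAY lines"
}

# Example: check_mdadm_conf /etc/mdadm/mdadm.conf 4
```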
Mount & Copy
Now you need to mount the arrays to copy over all the data.
cd /mnt
mkdir md
mount /dev/md2 md
cd md
mkdir boot data
mount /dev/md0 boot
mount /dev/md3 data
mkdir data/var data/home
ln -s data/var var
ln -s data/home home
Now you have the RAID devices mounted. Time to copy. Normally I use rsync but I saw this and couldn’t resist (Found it here in another article about BOOT RAID):
cd /mnt/md
tar -C / -clspf - . | tar -xlspvf -
cd /mnt/md/boot
tar -C /boot -clspf - . | tar -xlspvf -
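A word of caution on the flags: as far as I know, the l option meant "stay on one filesystem" in the tar that shipped back then, while newer GNU tar uses -l for --check-links. If you are on a newer tar, a version with the long option spelled out is safer. This is a sketch of the same idea, not the exact command from the article I borrowed it from; copy_tree is my own name for it.

```shell
# Copy one filesystem tree into another directory, preserving permissions.
# $1 = source root, $2 = destination directory (must already exist).
copy_tree() {
    src=$1
    dst=$2
    # --one-file-system keeps tar from descending into other mounts,
    # such as /proc or the freshly mounted /mnt/md itself.
    tar -C "$src" --one-file-system -cpf - . | tar -C "$dst" -xpf -
}

# Example (as root): copy_tree / /mnt/md && copy_tree /boot /mnt/md/boot
```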
Update the /mnt/md/etc/fstab
/dev/md2 / ext3 rw 0 1
/dev/md3 /data ext3 rw 0 1
/dev/md1 none swap sw 0 0
none /proc proc defaults 0 0
/dev/md0 /boot ext3 rw 0 2
none /proc/bus/usb usbdevfs defaults
#/dev/fd0 /floppy auto users,noauto 0 0
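It is easy to leave an old hda entry behind in the copied fstab, and that is exactly the kind of thing that bites you on a remote reboot. A quick check, assuming IDE device names of the /dev/hdXN form used here (raw_ide_mounts is a hypothetical helper):

```shell
# Print fstab entries (file given as $1) whose device is still a raw
# IDE partition such as /dev/hda3 instead of an md array.
raw_ide_mounts() {
    awk '$1 ~ /^\/dev\/hd[a-z][0-9]+$/ { print $1, $2 }' "$1"
}

# Example: raw_ide_mounts /mnt/md/etc/fstab
```

No output means every mounted filesystem in the file points at something other than a bare IDE partition.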
Now modify /mnt/md/etc/lilo.conf (the other two entries are actually fallbacks in the event that RAID doesn’t work)
lba32
prompt
boot=/dev/md0
install=/boot/boot.b
raid-extra-boot=mbr-only
map=/boot/map
default=linux-raid
append="console=tty0 console=ttyS0,9600"
serial="0,9600n8"
timeout=50
delay=20
image=/vmlinuz
label=linux-raid
initrd=/initrd.img
read-only
root=/dev/md2
image=/vmlinuz
label=linux-bak
initrd=/initrd.img
read-only
root=/dev/hda3
image=/vmlinuz.old
label=linux.old
initrd=/initrd.img.old
read-only
optional
root=/dev/hda3
mount -t proc /proc /mnt/md/proc
lilo -r /mnt/md
Now make sure it gets put on hda
Back up the original lilo.conf first
cp /etc/lilo.conf /etc/lilo.conf.original
lilo
cp /mnt/md/etc/lilo.conf /etc/
lilo
Now in a perfect world you would reboot, partition hda to match hdc, and use
mdadm --add /dev/md0 /dev/hda1
mdadm --add /dev/md1 /dev/hda2
mdadm --add /dev/md2 /dev/hda3
mdadm --add /dev/md3 /dev/hda4
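Once the hda partitions are added, the kernel rebuilds each mirror in the background and /proc/mdstat shows a progress line while that happens. A tiny filter for that, assuming the progress lines contain the word recovery or resync as they do on my boxes (rebuild_status is a made-up name):

```shell
# Print the rebuild progress lines from mdstat-formatted text on stdin,
# or a short message when nothing is rebuilding.
rebuild_status() {
    grep -E 'recovery|resync' || echo "no rebuild in progress"
}

# Example: rebuild_status < /proc/mdstat
```

I would hold off on any further reboots until this reports that nothing is rebuilding.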
There are two more parts that need to happen. This part is a little fuzzy.
Basically use fdisk to mark hda3/hdc3 as bootable.
Based on some instructions from here, you build a special initrd.img and use that to boot off of.
The reason I say fuzzy - is basically I didn’t do these last two steps the first time around. The system ended up not being able to boot and things got very complicated. I got special access and was able to fix the machine. Normally I would have started all over again and made sure everything worked, but after spending a weekend of working thru this stuff - I figured it would be a bad idea to trash a server that was finally working. So if you go thru this and figure out a better way let me know for next time.