Debian Lenny Grub install fails when /boot is on a RAID1 device

Jeff Spencer

I noticed something surprising on my Debian Lenny box, which runs software RAID1. At one point, I had a drive (sdb) fail. When I was testing sdb with sda unattached, sdb wouldn't boot because its MBR had no boot record. That was odd, because Debian had installed fine with /dev/md0 as /boot, grub had installed without failure, and tail /proc/mdstat showed both drives as UU. I thought that I was fine.
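In hindsight, a check of each disk's boot sector would have caught this before the drive failed. Assuming the array is built from partitions, /proc/mdstat only reports on those member partitions, and the MBR sits outside them, so UU says nothing about it. Here is a minimal sketch, which assumes that GRUB's first-stage boot code carries the literal string GRUB in the first sector. The function name has_grub_mbr is my own invention, not a standard tool.

```shell
#!/bin/sh
# has_grub_mbr is a hypothetical helper, not part of grub or mdadm.
# It reads the first 512-byte sector of a disk (or disk image) and
# succeeds only if the string "GRUB" appears somewhere in that sector.
has_grub_mbr() {
    head -c 512 "$1" | LC_ALL=C grep -aq 'GRUB'
}

# Example (needs root to read the raw devices):
#   for d in /dev/sda /dev/sdb; do
#       if has_grub_mbr "$d"; then
#           echo "$d: GRUB found in MBR"
#       else
#           echo "$d: no GRUB in MBR"
#       fi
#   done
```

This is only a rough presence test. It does not prove the drive will boot, so pulling one drive at a time is still the real check.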

Oh... what was I thinking.

I found this 2004 post on the subject on a Debian hack list. (My comments continue after the quote.)

Apparently, grub is trying to get 'smarter' about devices, but in the
process is breaking functionality that used to work. With /boot
installed on a raid1 device (md0), grub-install and update-grub both
fail dismally, reporting that /dev/md0 doesn't exist as a BIOS device
(like that's a news flash!).
The stupid hack to get this to work is:

  • Install grub from the debian installer menu (it will fail)
  • Switch to a console and edit /boot/grub/device.map, replacing the
    (hd0) /dev/??? entry with (hd0) /dev/md0
  • Re-run install grub from the debian installer menu (it will still
    fail, but get farther this time)
  • Run update-grub from a console to create /boot/grub/menu.lst
  • Run grub and install grub to both HDD's:

    root (hd0,0)
    setup (hd0)
    setup (hd1)

  • Continue w/o boot loader from the debian installer menu (you manually installed grub)

It's worth noting that grub from stable has no problems with /boot on a raid device (once installed). I can see grub-install perhaps failing with /boot on a md device, but why in the world does update-grub even care where boot is?!? All it needs to do is edit menu.lst.

Source of the hack: the "Problems/Workarounds" mail on the Debian bug list.
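The device.map edit in the quoted steps can be done by hand in an editor, but here is a rough sketch of the same change as a script. The helper name fix_device_map is hypothetical, and I am assuming the file has one "(hdN) /dev/..." entry per line.

```shell
#!/bin/sh
# fix_device_map is a hypothetical helper name. It rewrites the (hd0)
# line of a GRUB legacy device.map so that it points at /dev/md0 and
# leaves every other line alone. A backup copy is kept as FILE.bak.
fix_device_map() {
    map_file="$1"
    cp "$map_file" "$map_file.bak" &&
    sed 's|^(hd0)[[:space:]].*$|(hd0) /dev/md0|' "$map_file.bak" > "$map_file"
}

# Example, on the installer console (path as given in the quoted steps;
# inside the installer the target system may be mounted elsewhere):
#   fix_device_map /boot/grub/device.map
```

Keeping the .bak copy makes it easy to put the original mapping back if the installer complains.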

So, my takeaway is that you should first install the OS with /boot on /dev/md0 and load grub on sda (hd0,0), and it will work fine. (Do not make the mistake of trying to load grub to /dev/md0, because it will fail. This happens not only with Debian but also with Fedora.) After you install on drive 0, enter the console and do the same thing for drive 1. At that point, you can literally unhook one drive at a time and check that each one boots to grub on its own.
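If you would rather not type the grub shell commands by hand for each drive, something like the following sketch could feed them to grub in batch mode. It reuses the exact commands from the quoted hack. The helper name grub_setup_commands is made up for this example, and it assumes the GRUB legacy grub binary, which accepts --batch.

```shell
#!/bin/sh
# grub_setup_commands is a hypothetical helper. It prints the GRUB legacy
# shell commands from the quoted hack: one "root" line, then one "setup"
# line per BIOS drive number given as an argument, then "quit".
grub_setup_commands() {
    echo "root (hd0,0)"
    for n in "$@"; do
        echo "setup (hd$n)"
    done
    echo "quit"
}

# Example (as root, assumes the GRUB legacy "grub" binary is installed):
#   grub_setup_commands 0 1 | grub --batch
```

Printing the commands first, without piping them anywhere, lets you read exactly what grub will be told before it touches either MBR.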