Minty to the rescue, tales of LVM basics and recoveries

In this post I’ll document a few reproducible techniques that might help the less experienced manage and recover a faulty hard drive that uses LVM. Note: it may contain Windows and Virtual Machines.

Why Windows?

Due to my gaming and drawing habits, and even though I love the freedom of open source platforms, I find myself using Win7 most of the time. We could argue that in this day and age it’s not even necessary, and that I would be better off with a XenServer and a couple of passthroughs to run everything in parallel, and I would agree with you. But I’m also lazy, and why fix something that is not broken? In any case, all my headless-server expertise was of no use when I found myself having to deal with a faulty LVM root partition on a notebook hard drive. A jackpot, of a kind. While LVM undoubtedly has its advantages, I am more comfortable in the physical realm than in the logical one, so I wasn’t much of an expert in that regard, and the notebook wouldn’t boot properly, which made all the usual on-machine troubleshooting useless. As it was a Linux installation, I couldn’t even just plug the drive into my main PC and scan the extN filesystem, because my daily driver runs Win7. But then it dawned on me…

Why not virtual Zoidberg?

Given the monstrous specs of my PC, and the marvels of virtualization and passthrough technology, I thought I’d put them all to use and resurrect the dusty VMware Workstation I had had lying around for such a long time. By attaching the external hard drive to a USB3 port, I could simply pass it through to a *nix virtual machine, and while at it I’d try that neat Linux Mint distro I had wanted to try for so long (hence the name of the article). At this point it becomes a simple *nix recovery, which is for the best.

Dealing with LVMs

I armed myself with what documentation I could find, and started going at it:

What does this all mean? Let’s break it down for simplicity. When dealing with LVM rather than plain physical partitions, there are three different actors in play: physical volumes, volume groups, and logical volumes:

  • Physical Volumes: (pvscan, lvm pvs) are the classic partitions (or whole disks) initialised for use by LVM. They can be grouped into a single Volume Group, which pools their space so that it is handled as one virtual entity.
  • Volume Group: (lvm vgs, vgdisplay) can be considered a union of physical volumes. Much like logical volumes in RAID setups, a VG supports adding (and/or removing) drives, which makes it easy to expand the available space without repartitioning. Suppose, for example, that we want an additional hard drive in our PC: we install the new drive, initialise it as a physical volume and attach it to our current VG. From there we can simply “expand” the Logical Volume(s) we want the additional space to go to (together with the filesystem on it), and it’s done. No need to mount partitions in directories or similar: the new space is simply appended to the pool (allocation is linear by default, striping is optional).
  • Logical Volume: (lvm lvs, lvdisplay) the usable “virtual” partition. LVs can be mounted just like good ol’ partitions, and can be used as such. If a hard drive is added to the VG, we can simply expand the LV to make use of the new space. At the same time, there is no need to juggle several mount points, since it’s just one partition (possibly spanning several hard drives of different sizes).
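The "add a drive and expand" scenario from the list can be sketched in a few commands. This is a minimal sketch under assumptions, not a tested recipe for any particular machine: the function name grow_lv, the run helper and the example names /dev/sdb1, vg0 and root are all placeholders of mine, and the filesystem is assumed to be one that lvextend can resize (ext4, for instance). DRY_RUN defaults to 1, so the commands are only printed until you explicitly opt in.

```shell
#!/bin/sh
# Sketch: grow a logical volume after adding a new partition to its volume group.
# All device and volume names are placeholders.

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# grow_lv NEW_PARTITION VG_NAME LV_NAME
grow_lv() {
    new_part=$1
    vg=$2
    lv=$3
    # Initialise the new partition as an LVM physical volume.
    run pvcreate "$new_part" || return 1
    # Add the new physical volume to the existing volume group.
    run vgextend "$vg" "$new_part" || return 1
    # Give all free space to the logical volume; -r also resizes the filesystem.
    run lvextend -r -l +100%FREE "/dev/$vg/$lv"
}
```

Calling grow_lv /dev/sdb1 vg0 root prints the three commands for review; running it as root with DRY_RUN=0 executes them. pvcreate overwrites whatever was on the partition, so double-check the device name first.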

With this knowledge at hand, and the concepts of LVM understood, it was simply a matter of running mount /dev/<VGNAME>/root /mnt and recovering what salvageable data I had left. One question I had never found myself asking (for obvious reasons) still remained, though: “what if the damage to the hard drive was only fatal by coincidence, and could I just fix it and reuse it for less-than-critical duties?”.
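For reference, the whole discover-and-mount sequence might look like the sketch below. The names recover_mount, run, vg0 and /mnt/rescue are placeholders; the read-only mount is my own assumption (it keeps writes away from a failing disk) rather than something the plain mount command above does. DRY_RUN defaults to 1, so nothing is executed until you ask for it.

```shell
#!/bin/sh
# Sketch: find an LVM volume group on an attached disk and mount one of its LVs read-only.
# Volume group, logical volume and mount point names are placeholders.

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# recover_mount VG_NAME LV_NAME MOUNTPOINT
recover_mount() {
    vg=$1
    lv=$2
    mnt=$3
    run pvscan                  # list the physical volumes LVM can see
    run vgscan                  # list the volume groups built on them
    run vgchange -ay "$vg"      # activate the group so /dev/VG/LV nodes appear
    run lvs "$vg"               # show the logical volumes inside the group
    run mkdir -p "$mnt"
    # Read-only mount: assumption, to avoid writing to a disk that is already failing.
    run mount -o ro "/dev/$vg/$lv" "$mnt"
}
```

With DRY_RUN=0 and root privileges, recover_mount vg0 root /mnt/rescue runs the sequence for real; the actual group name is whatever vgscan reports for the rescued disk.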

Once upon a time, in a bad block far, far away…

While everybody agrees that a bad block is a great signal to “duck and cover”, I’ve always been more of an inquisitive type. Armed with a live Mint distro and some idiot-proof documentation, I proceeded to simply do the following:

While reinstalling a new OS from scratch, I thought it would help, in order to avoid the pesky “Unrecovered read error - auto reallocate failed”, to leave a GB or two of the disk as unallocated space. So far everything has worked fine. Let’s hope and pray that it continues to do so; given that the hard drive has been reassigned to non-aggressive duties, it probably will.
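Before trusting a drive like this again, a non-destructive health check is a reasonable precaution. The following is a minimal sketch and not necessarily the procedure followed above: it assumes smartmontools and e2fsprogs are installed, check_disk and run are names of my own, and /dev/sdX is a placeholder for the suspect disk. DRY_RUN defaults to 1, so the commands are only printed.

```shell
#!/bin/sh
# Sketch: non-destructive health check of a suspect disk.
# The device name is a placeholder; smartctl and badblocks must be installed.

# Print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# check_disk DEVICE
check_disk() {
    disk=$1
    # Overall SMART health self-assessment reported by the drive.
    run smartctl -H "$disk"
    # SMART attribute table; reallocated and pending sector counts are the ones to watch.
    run smartctl -A "$disk"
    # Surface scan; without -n or -w badblocks only reads, so data is left untouched.
    run badblocks -sv "$disk"
}
```

With DRY_RUN=0 and root privileges, check_disk /dev/sdX runs the checks for real. The badblocks pass reads the whole disk, so it can take hours on a large or struggling drive.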

XenServer fix script

In a previous article (Fixing XenServer error “Unable to find partition containing kernel”) I described how to fix a recurring problem after patching XenServer 6.2 installations. While the fix has been known for years, it has never been adopted, and different distros (such as Ubuntu 14.04 LTS) fail to boot properly when the (on dom0) gets reset to its default state.

Being the lazy person that I am, I decided to set up a script to do the work for me. After all, we’re admins, not monkeys.

This does just what I/we used to do manually: detects if has been reverted and, if not, patches it up. Supplementary tests added for paranoia 🙂