Wednesday, August 19, 2009

ZFS boot/root - backup and restore

Today we will look at backup and restore of a ZFS root pool using native ZFS tools.

The command “ufsdump” does not work on a ZFS file system. This should not surprise anybody.

“flashbackup” will supposedly work if you have all the right patches. I recommend avoiding it until Solaris 10 update 8 is released.

The command “zfs send” is used to create backup images.
The command “zfs receive” is used to restore from backup images.
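
In their simplest form the two commands pair up like this; the pool, dataset, and file names here are just placeholders:

# zfs send mypool/myfs@mysnap > /backup/myfs.zfs_snap

# zfs receive mypool/myfs_restored < /backup/myfs.zfs_snap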

Some of you may ask: “Do we need backup/restore if we have snapshots?” The answer is yes… if you are performing dangerous maintenance on a critical system, it may be wise to have more than one rollback plan.

Additionally, backup/restore may be useful when migrating an OS between servers, or shrinking the size of a root pool.

And some of you will ask: “Can’t we just restore the OS from Netbackup?” The answer is yes… but it will take much longer than using native ZFS tools.

I am not advocating that “zfs send” and “zfs receive” be a replacement for regularly scheduled Netbackup backups; instead I recommend that these commands be used when there is a high probability that a restore of a ZFS root pool will be required.

Backup a ZFS root pool

ZFS backups are done from snapshots. This ensures “single point in time” consistency.

It is simple to recursively create snapshots of all datasets in the root pool with a single command:

# SNAPNAME=`date +%Y%m%d`

# zfs snapshot -r rpool@$SNAPNAME


Backups can be saved to local disk, remote disk, tape, DVD, punch cards, etc. I recommend using an NFS server.

It is possible to back up all the snapshots with a single command, but I don’t recommend this unless you wish to have the contents of the swap and dump devices included in the backup. Backups of swap and dump can be avoided by splitting the backup into four separate commands.
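
For reference, the all-in-one version would look something like this (the filename is just an example); it works, but it drags the full contents of swap and dump along with it:

# zfs send -Rv rpool@$SNAPNAME > /net/NFS_SERVER/BACKUP_DIR/rpool_full.zfs_snap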


Start with a non-recursive backup of the top level dataset (rpool); then perform recursive “replication stream” backups of rpool/ROOT, rpool/home, and rpool/tools.


You might wish to embed the hostname and date in your filenames, but for this example I will use simple filenames.
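
If you did want to embed those details, the first command might look something like this (shown only as an illustration; the rest of this example sticks with the simple names):

# zfs send -v rpool@$SNAPNAME > /net/NFS_SERVER/BACKUP_DIR/`hostname`.rpool.$SNAPNAME.zfs_snap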


# zfs send -v rpool@$SNAPNAME > /net/NFS_SERVER/BACKUP_DIR/rpool.zfs_snap

# zfs send -Rv rpool/ROOT@$SNAPNAME > /net/NFS_SERVER/BACKUP_DIR/root.zfs_snap

# zfs send -Rv rpool/tools@$SNAPNAME > /net/NFS_SERVER/BACKUP_DIR/tools.zfs_snap

# zfs send -Rv rpool/home@$SNAPNAME > /net/NFS_SERVER/BACKUP_DIR/home.zfs_snap


Now list the contents of the backup directory… we should see four backup files.


# ls -l /net/NFS_SERVER/BACKUP_DIR/
total 4957206
-rw-r--r--  1 root  root  2369414848  Aug 12 15:45  root.zfs_snap
-rw-r--r--  1 root  root      690996  Aug 12 15:46  home.zfs_snap
-rw-r--r--  1 root  root       92960  Aug 12 15:43  rpool.zfs_snap
-rw-r--r--  1 root  root   166600992  Aug 12 15:46  tools.zfs_snap


The OS backup is now complete.


You may wish to view and record the size of the swap and dump datasets for future use.


# zfs get volsize rpool/dump rpool/swap
NAME        PROPERTY  VALUE  SOURCE
rpool/dump  volsize   1G     -
rpool/swap  volsize   8G     -


Also, if there are multiple boot environments it is a good idea to record which one is currently in use.


# df -k /
Filesystem         1024-blocks     Used  Available  Capacity  Mounted on
rpool/ROOT/blue       70189056  1476722   58790762        3%  /
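
If the root pool has a default boot environment set, the bootfs pool property is another quick way to record it:

# zpool get bootfs rpool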


#############################################################################

Pick a server and boot it from cdrom or network


Pick a server to restore to.


Depending on your requirements, you can restore to the original server or to a different server.

If you restore to a different server, the CPU architecture must match that of the original server.

e.g. don’t try to restore a backup from a sun4v server to a sun4u server.

In my testing I was able to recover from a 480R to a V210 without any problems.


Boot the server from the network or cdrom using a recent release of Solaris.


ok> boot net
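
Or, if you are booting from local media instead of the network:

ok> boot cdrom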


Wait for the server to boot.


Follow the prompts to exit the installation program (usually Esc-2, Esc-2, Esc-2, Esc-5, Esc-2).

##############################################


Prepare the disks

Pick a pair of boot disks. The disks do not need to be the same size as the disks on the original system.


If you are using the original disks on the original system you can skip this entire section and jump to the restore.


If you have an x86 system, you need to first make sure there is a valid fdisk partition with type “Solaris 2”; it needs to be marked as “active”.


Here is a sample fdisk layout:


 Partition   Status    Type          Start   End   Length    %
 =========   ======    ============  =====   ===   ======   ===
     1       Active    Solaris2          1  8923     8923   100
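
If the disk does not already have a Solaris partition, one quick way to create a single active Solaris partition covering the whole disk is fdisk with the -B flag (the device name is just an example):

# fdisk -B /dev/rdsk/c0t2d0p0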


Regardless of whether you are using a sparc or x86 system, the disks must be labelled with SMI labels and a valid VTOC.


Do not use EFI labels!


Keep in mind that:

When you create a VTOC on a sparc system it applies to the entire disk.

When you create a VTOC on an x86 system it applies to the first fdisk partition of type “Solaris 2”. i.e. in the x86 world a disk is not a disk.


Only one slice is required on each disk… generally it is recommended to use slice #0.


Usually it will make sense to use the entire disk, but if you don’t wish to have your root pool occupying the entire disk then size slice #0 to fit your needs. If you have an x86 system, you should avoid cylinder 0.


Here is a sample VTOC for a sparc server:

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 14086       68.35GB    (14087/0/0) 143349312
  1 unassigned    wu       0                0         (0/0/0)             0
  2     backup    wu       0 - 14086       68.35GB    (14087/0/0) 143349312
  3 unassigned    wm       0                0         (0/0/0)             0
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm       0                0         (0/0/0)             0
  6 unassigned    wm       0                0         (0/0/0)             0
  7 unassigned    wm       0                0         (0/0/0)             0


Here is a sample VTOC for an x86 server.

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       1 - 8920       68.33GB    (8920/0/0)  143299800
  1 unassigned    wm       0               0         (0/0/0)             0
  2     backup    wu       0 - 8920       68.34GB    (8921/0/0)  143315865
  3 unassigned    wm       0               0         (0/0/0)             0
  4 unassigned    wm       0               0         (0/0/0)             0
  5 unassigned    wm       0               0         (0/0/0)             0
  6 unassigned    wm       0               0         (0/0/0)             0
  7 unassigned    wm       0               0         (0/0/0)             0
  8       boot    wu       0 - 0           7.84MB    (1/0/0)         16065
  9 unassigned    wm       0               0         (0/0/0)             0

Install the bootblocks on both disks


For sparc systems run:


# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t2d0s0

# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t3d0s0


For x86 systems run:


# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t2d0s0

# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t3d0s0


Configure the server to use one of the new disks as the boot disk.


For sparc systems this can be done using luxadm(1M).


# luxadm set_boot_dev /dev/dsk/c0t2d0s0
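
Alternatively, the OBP boot-device variable can be set with eeprom; the device aliases below are only examples and will vary from system to system:

# eeprom boot-device="disk2 disk3"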


##############################################

Create a new pool and perform the restore


Mount the file system where the backup images are located:


e.g.

# mount -o ro NFS_SERVER:BACKUP_DIR /mnt

# ls -1 /mnt
home.zfs_snap
root.zfs_snap
rpool.zfs_snap
tools.zfs_snap


Create a new pool using the syntax shown here. Change only the disk names.

And make sure to include the slices in the names. e.g. c0t2d0s0 and not c0t2d0.

If you only have one disk, the “mirror” keyword must be dropped.


But don’t drop the “mirror” keyword if you specify more than one disk or you will hit problems later on.


# zpool create -f -o failmode=continue -R /a -m legacy -o cachefile=/etc/zfs/zpool.cache rpool mirror c0t2d0s0 c0t3d0s0


Start the restore; the datasets will be automatically created.

# zfs receive -Fd rpool < /mnt/rpool.zfs_snap

# zfs receive -Fd rpool < /mnt/root.zfs_snap

# zfs receive -Fd rpool < /mnt/home.zfs_snap

# zfs receive -Fd rpool < /mnt/tools.zfs_snap


Optionally, you may wish to view the recovered datasets:


# zfs list -t filesystem -o name,mountpoint,mounted
NAME                  MOUNTPOINT      MOUNTED
rpool                 legacy          no
rpool/ROOT            legacy          no
rpool/ROOT/blue       /a              yes
rpool/ROOT/blue/var   /a/var          yes
rpool/home            /a/home         yes
rpool/tools           none            no
rpool/tools/marimba   /a/opt/Marimba  yes
rpool/tools/openv     /a/usr/openv    yes
rpool/tools/bmc       /opt/bmc        yes


Create datasets for swap and dump.


# zfs create -V 8G -b 8k rpool/swap

# zfs create -V 1G rpool/dump


Set the default boot environment:

# zpool set bootfs=rpool/ROOT/blue rpool


Note: if you created the pool with multiple disk devices and forgot to specify the “mirror” keyword, this command will fail.


Note: if you have EFI labels on the boot disks this command will fail.


Note: if you followed the instructions above, then everything should work perfectly!


It is not necessary to rebuild the device entries prior to rebooting, even if you are migrating to completely different hardware!


In fact, the system will boot fine even if /dev/dsk and /dev/rdsk are empty directories.

If the restore is part of a migration, you may safely edit /a/etc/nodename, /a/etc/hostname*, /a/etc/inet/hosts, etc. Otherwise move on to the next step.


###########################################################

Cross fingers, reboot, and pray


# touch /a/reconfigure

# reboot


Wait for the server to reboot. The operating system should come up looking like it did before.
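
If crash dumps were configured on the original system, it is worth verifying the dump device after the reboot; it can be re-pointed at the new volume with dumpadm if needed (this assumes the dump volume is rpool/dump):

# dumpadm

# dumpadm -d /dev/zvol/dsk/rpool/dump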


If you have recovered to completely different hardware, you may need to modify the network interface files (/etc/hostname.*) to match the new network devices. If necessary, all the network devices can be temporarily plumbed by running "ifconfig -a plumb".
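
For example, if the old server used a ce interface and the new one uses bge (interface names here are purely illustrative):

# mv /etc/hostname.ce0 /etc/hostname.bge0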


Finally, you may notice that the snapshots still exist. They can be recursively removed at your convenience.


e.g.

# zfs list -t snapshot
NAME                            USED  AVAIL  REFER  MOUNTPOINT
rpool@20090813                   17K      -  35.5K  -
rpool/ROOT@20090813                0      -    18K  -
rpool/ROOT/blue@20090813       5.07M      -  1.41G  -
rpool/ROOT/blue/var@20090813    634K      -   312M  -
rpool/home@20090813                0      -   447K  -
rpool/tools@20090813               0      -    18K  -
rpool/tools/marimba@20090813    353K      -   125M  -
rpool/tools/openv@20090813         0      -  27.1M  -


# zfs destroy -r rpool@20090813


###########################################################

Reward yourself for pulling off an OS recovery in 15 minutes flat.

i.e. go have an ice cream.
