How to install FreeBSD on a ZFS system 'root' partition

Problem

ZFS, the breakthrough file system introduced in FreeBSD 7 (ported from Sun's Solaris 10 operating system), delivers virtually unlimited capacity, provable data integrity, and near-zero administration. However, FreeBSD's sysinstall(8) does not yet support installing the system onto anything more exotic than a commonly used UFS partition scheme, and FreeBSD's loader(8) cannot yet load the kernel and modules from ZFS.

Solution

Install FreeBSD onto a bootable USB flash drive together with a copy of the zfsboot script, then boot the target machine from that drive and run zfsboot. The usbinstall script below can be used to prepare such a bootable USB drive.

Show usbinstall source

#!/bin/sh
#
# HOWTO/script to create a USB mounted bootable "root" install for
# bootstrapping a root-on-ZFS and the like.
#
# -- Yarema
#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

program=`basename $0`

if [ -f "$0.rc" ]; then
	. "$0.rc"
fi

: ${RELEASE:="8.0-RELEASE"}
: ${DESTDIR:="/media"}

if [ $# -lt 2 ]; then
	echo "Usage: ${program} <dist> <vdev>"
	echo "  where <dist> is the path to the distribution"
	echo "  files, usually an .iso mounted on /dist"
	echo "  and <vdev> is usually da0"
	exit 1
fi

dist="$1"
vdev="$2"

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

ask()
{
	local question default answer
	question=$1
	default=$2
	read -p "${question} [${default}]? " answer
	if [ -z "${answer}" ]; then
		answer=${default}
	fi
	echo ${answer}
}

yesno()
{
	local question default answer
	question=$1
	default=$2
	while :; do
		answer=$(ask "${question}" "${default}")
		case "${answer}" in
		[Yy]*)	return 0;;
		[Nn]*)	return 1;;
		esac
		echo "Please answer yes or no."
	done
}

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

if yesno "Create one slice covering the entire ${vdev}" n; then
	dd if=/dev/zero of=/dev/${vdev} bs=512 count=79
	fdisk -BIv /dev/${vdev}
else
	exit 1
fi

if yesno "Initialize the BSD label" y; then
	bsdlabel -wB /dev/${vdev}s1
fi

bsdlabel /dev/${vdev}s1
while yesno "(Re)Edit the BSD label" n; do
	bsdlabel -e /dev/${vdev}s1
	bsdlabel /dev/${vdev}s1
done

newfs -U /dev/${vdev}s1a
[ -d ${DESTDIR} ] || mkdir -p ${DESTDIR}
mount /dev/${vdev}s1a ${DESTDIR}

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

: ${dists:="base games manpages info dict"}
for i in ${dists}; do
	echo "  Extracting the ${i} distribution into ${DESTDIR}..."
	cat ${dist}/${RELEASE}/${i}/${i}.?? | tar --unlink -xpzf - -C ${DESTDIR}
done

echo "  Extracting the GENERIC kernel into ${DESTDIR}/boot/"
cat ${dist}/${RELEASE}/kernels/generic.?? | tar --unlink -xpzf - -C ${DESTDIR}/boot \
    && ln -f ${DESTDIR}/boot/GENERIC/* ${DESTDIR}/boot/kernel/

echo "  Copying ${dist}/${RELEASE}/ to ${DESTDIR}..."
tar cf - ${dist}/${RELEASE}/* | tar --unlink -xpf - -C ${DESTDIR}

# Mounting /var on tmpfs will work for both tmp directories
# by creating a softlink from /tmp to /var/tmp
rm -rf ${DESTDIR}/tmp
ln -sf var/tmp ${DESTDIR}/tmp

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

if [ -f $0.local ]; then
	. $0.local
fi

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

echo "nullfs_load=\"YES\"         # Null filesystem" >> ${DESTDIR}/boot/loader.conf.local
echo "zfs_load=\"YES\"            # ZFS" >> ${DESTDIR}/boot/loader.conf.local
echo "geom_eli_load=\"YES\"       # Disk encryption driver (see geli(8))" >> ${DESTDIR}/boot/loader.conf.local
echo "geom_mirror_load=\"YES\"    # RAID1 disk driver (see gmirror(8))" >> ${DESTDIR}/boot/loader.conf.local
${EDITOR:-"vi"} ${DESTDIR}/boot/loader.conf.local

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

echo "# Device          Mountpoint      FStype   Options                 Dump  Pass#" >  ${DESTDIR}/etc/fstab
echo "/dev/${vdev}s1a   /               ufs      rw,noatime              0     0"     >> ${DESTDIR}/etc/fstab
echo "tmpfs             /var            tmpfs    rw,nosuid               0     0"     >> ${DESTDIR}/etc/fstab
echo "proc              /proc           procfs   rw                      0     0"     >> ${DESTDIR}/etc/fstab
echo "# Device          Mountpoint      FStype   Options                 Dump  Pass#" >> ${DESTDIR}/etc/fstab
echo "/dev/acd0         /cdrom          cd9660   ro,nodev,noauto,nosuid  0     0"     >> ${DESTDIR}/etc/fstab
echo "/dev/fd0          /media/fd       ufs      rw,nodev,noauto,nosuid  0     0"     >> ${DESTDIR}/etc/fstab
echo "/dev/fd0          /media/floppy   msdosfs  rw,nodev,noauto,nosuid  0     0"     >> ${DESTDIR}/etc/fstab
${EDITOR:-"vi"} ${DESTDIR}/etc/fstab

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

umount /dev/${vdev}s1a

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#
# EOF
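Both usbinstall and zfsboot drive their interactive prompts through the same two helpers, ask and yesno. A minimal standalone sketch of them, with a trivial demonstration appended, shows how an empty reply falls back to the default answer (run under a shell whose read supports -p, such as FreeBSD sh or bash):

```shell
#!/bin/sh
# The ask/yesno helpers as used by the scripts: ask prints a question
# with a default, yesno loops until it gets a yes/no answer and turns
# it into an exit status.
ask()
{
	local question default answer
	question=$1
	default=$2
	read -p "${question} [${default}]? " answer
	if [ -z "${answer}" ]; then
		answer=${default}
	fi
	echo ${answer}
}

yesno()
{
	local question default answer
	question=$1
	default=$2
	while :; do
		answer=$(ask "${question}" "${default}")
		case "${answer}" in
		[Yy]*)	return 0;;
		[Nn]*)	return 1;;
		esac
		echo "Please answer yes or no."
	done
}

# An empty reply (here piped in) accepts the default answer:
if echo '' | yesno "Proceed" y; then
	echo "accepted default: yes"
fi
```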
Posted late Tuesday evening, July 6th, 2010

Implementation

The zfsboot script presented here creates the regular bootable 'a' partition, but configures it as RAID-1 using gmirror(8). This makes it possible to boot directly from any of the drives simply by changing the BIOS boot order. It also works with a single drive, although gmirror(8) will then have nothing to mirror. See Ralf S. Engelschall's <rse@FreeBSD.org> excellent System Disk Mirroring HOWTO, and note that the implementation here applies the GEOM mirror to partitions of a slice on a disk for the boot partition.

Since we're forced to use a regular 'a' boot partition, there is some advantage in using a regular 'b' partition for swap. Due to the way ZFS is currently implemented, there is a known issue (in both FreeBSD and OpenSolaris) where ZFS needs to allocate memory in order to send an I/O request. When no memory is available, ZFS cannot allocate any, and thus cannot swap a process out to free some. Such low-memory conditions can cause the machine to effectively freeze, unable to swap out to disk right when it is most crucial to do so. See http://Lists.FreeBSD.org/pipermail/freebsd-current/2007-September/076831.html
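This constraint is why zfsboot sizes swap generously: it rounds installed RAM up to the next full gigabyte and doubles it. A standalone sketch of that same arithmetic, with hw.physmem hardcoded to 3 GiB (an illustrative assumption, instead of reading it via sysctl(8)) so it runs on any system:

```shell
#!/bin/sh
# Sketch of the swap-sizing arithmetic used by zfsboot.
physmem=3221225472	# pretend `sysctl -n hw.physmem` reported 3 GiB

# Round installed RAM up to the next full gigabyte, expressed in MB...
MAXRAM=$(expr $(expr ${physmem} / 1024 / 1024 / 1024 + 1) \* 1024)

# ...and size swap at double that.
swapsize=$(expr ${MAXRAM} \* 2)

echo "MAXRAM=${MAXRAM}M swapsize=${swapsize}M"	# MAXRAM=4096M swapsize=8192M
```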

The script assumes that the FreeBSD 8 disc1 ISO is mounted on /dist on another host, and that /root/.ssh/authorized_keys on that host allows the target machine, booted from the USB drive, to connect without a password.

Show zfsboot source

#!/bin/sh
#
# HOWTO/script to create a ZFS pool mounted as an almost bootable "root"
# partition.
#
# Adapted by Yarema from
# http://People.FreeBSD.org/~rse/mirror/
# http://Wiki.FreeBSD.org/ZFSOnRoot
# http://Wiki.FreeBSD.org/ZFS
# http://Wiki.FreeBSD.org/ZFSTuningGuide
#
# This script will create a 512M gmirror(8) boot partition.
# A swap 'b' partition on each drive if more than one is used.
# And a 'd' partition for the zpool.
# When using more than one drive the zpool can be 'mirror'.
# When using three or more drives the zpool type can be 'raidz'.
#
# The gmirror(8) boot partition will be mounted to the /strap mountpoint
# on the zpool.  /strap/boot will be nullfs mounted to the /boot mountpoint.
# /strap/rescue will be nullfs mounted to the /rescue mountpoint.
# /strap/bin and /strap/sbin will be soft-linked to /rescue.
#
# Normally even single user boot will still mount the zfs as root.
# Putting the statically linked /rescue utilities on the boot/'a'
# partition seems like a good idea in the unlikely case that the zpool
# is unmountable.  Albeit zpool(1M) and zfs(1M) are not (yet?) built or
# installed in /rescue.
#
# WARNING: Due to the way ZFS is currently implemented there is a known
# issue (in FreeBSD and OpenSolaris) where ZFS needs to allocate memory
# to send an I/O request.  When there is no memory, ZFS cannot allocate
# any and thus cannot swap a process out and free it.  Such low memory
# conditions can cause the machine to effectively freeze -- unable to
# swap out to disk right when it is most crucial to do so.
# http://Lists.FreeBSD.org/pipermail/freebsd-current/2007-September/076831.html
#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

program=`basename $0`

if [ $# -lt 1 ]; then
	echo "Usage: ${program} [mirror|raidz] <vdev> ..."
	exit 1
fi

case "$1" in
mirror|raidz|raidz[12])
	type="$1"; shift
	if [ "${type}" = 'mirror' -a $# -lt 2 ]; then
		echo "Usage: ${program} ${type} <vdev> <vdev> ..."
		exit 1
	elif [ "${type}" = 'raidz2' -a $# -lt 4 ]; then
		echo "Usage: ${program} ${type} <vdev> <vdev> <vdev> <vdev> ..."
		exit 1
	elif [ "${type}" != 'mirror' -a $# -lt 3 ]; then
		echo "Usage: ${program} ${type} <vdev> <vdev> <vdev> ..."
		exit 1
	fi
	;;
esac

vdev1="$1"
vdevs="$@"

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

if [ -f "$0.rc" ]; then
	. "$0.rc"
fi

# The recommended pool name is simply the letter 'z' which makes output
# from mount(8), zfs(8M) and zpool(8M) easy to read for us humans.
: ${POOL:='z'}
DESTDIR="/${POOL}"

# file describing the ZFS datasets to create
: ${DATASETS:="$0.fs"}

# FreeBSD release directory to install from
: ${RELEASE:="7.1-RELEASE"}

# Server hostname to net install from
: ${SERVER:="server"}

# Maximum installed Megabytes of RAM the system is likely to have.
# Used for calculating the size of the swap partition(s).
if [ -z "${MAXRAM}" ]; then
	MAXRAM=$(expr $(expr $(sysctl -n hw.physmem) / 1024 / 1024 / 1024 + 1) \* 1024)
fi

# Set swapsize to double the ${MAXRAM}
swapsize=$(expr ${MAXRAM} \* 2)

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

ask()
{
	local question default answer
	question=$1
	default=$2
	read -p "${question} [${default}]? " answer
	if [ -z "${answer}" ]; then
		answer=${default}
	fi
	echo ${answer}
}

yesno()
{
	local question default answer
	question=$1
	default=$2
	while :; do
		answer=$(ask "${question}" "${default}")
		case "${answer}" in
		[Yy]*)	return 0;;
		[Nn]*)	return 1;;
		esac
		echo "Please answer yes or no."
	done
}

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

if yesno "Create one slice covering the entire disk(s)" n; then
	{
		# Wipe out any prior zpool, gmirror, fdisk and bsdlabel meta data.
		umount ${DESTDIR}/strap/boot
		umount ${DESTDIR}/strap/rescue
		umount ${DESTDIR}/strap
		zpool destroy ${POOL}
		rm -f /boot/zfs/zpool.cache
		eval "gmirror remove -v boot `echo ${vdevs} | sed -E 's/[[:<:]][[:alnum:]]+[[:>:]]/&s1a/g'`"
		eval "gmirror remove -v swap `echo ${vdevs} | sed -E 's/[[:<:]][[:alnum:]]+[[:>:]]/&s1b/g'`"
	} > /dev/null 2>&1
	for d in ${vdevs}; do
		dd if=/dev/zero of=/dev/${d} bs=32m count=17
		fdisk -BIv /dev/${d}
	done
fi

if yesno "Initialize the BSD label" y; then
	for d in ${vdevs}; do
		bsdlabel -wB /dev/${d}s1
	done
fi

# Create a bsdlabel where the 'a' bootable partition is 512M plus one
# extra sector for gmirror metadata.  The default /boot directory with a
# FreeBSD 7.0 GENERIC kernel and all the modules takes up about 178MB.
# 256MB can be a little tight if you need to keep a backup kernel or two.
# 512MB is still a good size for the boot partition.
#
# Create a large enough 'b' swap partition to be used for a swap backed
# /tmp mdmfs, /usr/obj mdmfs, and/or any other partitions you do not care
# to keep between reboots which might benefit from the extra speed of mfs.
# According to zfs(1M), using a ZFS volume as a dump device is not supported.
# Setting 'dumpdev' to the swap partition requires that it be at least as
# big as the installed RAM.  When setting up a single drive, the swapsize
# will default to double the maximum installed RAM the system is likely to
# have plus what's left from rounding the 'd' partition down to the nearest
# Gigabyte.  With more than one drive swap will be striped across all drives
# and only needs to be as large on each drive as the installed RAM so that
# a coredump can fit.
#
# Allocate the 'd' partition to be used by the "root" ZFS pool rounded down
# to the nearest Gigabyte with the remainder going to the 'b' swap partition.
bsdlabel /dev/${vdev1}s1 | awk -v s=${swapsize} '
/^ +c: / {
	a = 512 * 1024 * 2 + 1
	b = s * 1024 * 2
	c = $2
	d = c - 16 - a - b
}
END {
	print  "8 partitions:"
	print  "#        size   offset    fstype"
	printf("  a: %9d       16    4.2BSD\n", a)
	print  "  b:         *        *      swap"
	print  "  c:         *        0    unused"
	printf("  d: %8dG        *    unused\n", d / 2 / 1024 / 1024)
}' - > /tmp/bsdlabel.txt
bsdlabel -R /dev/${vdev1}s1 /tmp/bsdlabel.txt

bsdlabel /dev/${vdev1}s1
while yesno "(Re)Edit the BSD label" n; do
	bsdlabel -e /dev/${vdev1}s1
	bsdlabel /dev/${vdev1}s1
done

bsdlabel /dev/${vdev1}s1 > /tmp/bsdlabel.txt
if yesno "Apply the same BSD label to all mirrored devices" y; then
	for d in ${vdevs}; do
		bsdlabel -R /dev/${d}s1 /tmp/bsdlabel.txt
	done
fi

# Create a gmirror of the 'a' and 'b' partitions across all the devices
eval "gmirror label -v -b load boot `echo ${vdevs} | sed -E 's/[[:<:]][[:alnum:]]+[[:>:]]/&s1a/g'`"
eval "gmirror label -v -b load swap `echo ${vdevs} | sed -E 's/[[:<:]][[:alnum:]]+[[:>:]]/&s1b/g'`"

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

if yesno "Create the ZFS pool on the 'd' partition(s)" y; then
	eval "zpool create -f ${POOL} ${type} `echo ${vdevs} | sed -E 's/[[:<:]][[:alnum:]]+[[:>:]]/&s1d/g'`"
	zfs set atime=off ${POOL}
	while read filesystem options; do
		case "${filesystem}" in
		/*)
			command="zfs create"
			for option in ${options}; do
				command="${command} -o ${option}"
			done
			eval `echo "${command} ${POOL}${filesystem}" | sed -E 's/-[[:<:]]o[[:>:]][[:space:]]+-[[:<:]]V[[:>:]][[:space:]]+-[[:<:]]o[[:>:]]/-V/g'`
			zfs get mountpoint,compression,exec,setuid ${POOL}${filesystem}
			# Only set mountpoint for top level filesystem
			# Let the child filesystem(s) inherit the mountpoint
			if echo "${filesystem}" | egrep '^/[^/]+$' > /dev/null 2>&1; then
				# Exclude volumes since they do not have the mountpoint property
				if echo "${command}" | egrep -v '[[:space:]]+-V[[:space:]]+' > /dev/null 2>&1; then
					mountpoints="${mountpoints} ${filesystem}"
				fi
			fi
			;;
		*)
			continue
			;;
		esac
	done < ${DATASETS}
fi

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

# Create bootstrap directory where the boot file system will be mounted
mkdir -p ${DESTDIR}/strap
newfs -U /dev/mirror/boot
mount /dev/mirror/boot ${DESTDIR}/strap
mkdir -p ${DESTDIR}/strap/dev ${DESTDIR}/strap/boot ${DESTDIR}/strap/rescue \
    ${DESTDIR}/boot ${DESTDIR}/rescue

# mount_nullfs here is the same as "ln -s strap/boot ${DESTDIR}/boot"
# but more resilient to install scripts unlinking files before untarring.
mount -t nullfs ${DESTDIR}/strap/boot ${DESTDIR}/boot
mount -t nullfs ${DESTDIR}/strap/rescue ${DESTDIR}/rescue
ln -s rescue ${DESTDIR}/strap/bin
ln -s rescue ${DESTDIR}/strap/sbin

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

dists="base games manpages info dict"
for i in ${dists}; do
	echo "  Extracting the ${i} distribution into ${DESTDIR}..."
	cat /dist/${RELEASE}/${i}/${i}.?? | tar --unlink -xpzf - -C ${DESTDIR}
done

echo "  Extracting the GENERIC kernel into ${DESTDIR}/boot/"
cat /dist/${RELEASE}/kernels/generic.?? | tar --unlink -xpzf - -C ${DESTDIR}/boot \
    && ln -f ${DESTDIR}/boot/GENERIC/* ${DESTDIR}/boot/kernel/

if yesno "Install /usr/src ?" n; then
	dists="base bin cddl contrib crypto etc games gnu include krb5 lib libexec release rescue sbin secure share sys tools ubin usbin"
	for i in ${dists}; do
		echo "  Extracting source component: ${i}"
		cat /dist/${RELEASE}/src/s${i}.?? | tar --unlink -xpzf - -C ${DESTDIR}/usr/src
	done
fi

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

if [ -f $0.local ]; then
	. $0.local
fi

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

echo "dumpdev=\"${vdev1}s1b\"	# Set device for crash dumps" >> ${DESTDIR}/boot/loader.conf.local
echo -n "vfs.root.mountfrom=\"zfs:${POOL}\"" >> ${DESTDIR}/boot/loader.conf.local
echo " # Specify root partition in a way the kernel understands" >> ${DESTDIR}/boot/loader.conf.local
echo "vfs.zfs.arc_max=\"192M\"" >> ${DESTDIR}/boot/loader.conf.local
echo "nullfs_load=\"YES\"         # Null filesystem" >> ${DESTDIR}/boot/loader.conf.local
echo "zfs_load=\"YES\"            # ZFS" >> ${DESTDIR}/boot/loader.conf.local
echo "geom_eli_load=\"YES\"       # Disk encryption driver (see geli(8))" >> ${DESTDIR}/boot/loader.conf.local
echo "geom_mirror_load=\"YES\"    # RAID1 disk driver (see gmirror(8))" >> ${DESTDIR}/boot/loader.conf.local

echo "# Device              Mountpoint      FStype   Options             Dump  Pass#" >  ${DESTDIR}/etc/fstab
echo "/dev/mirror/swap.eli  none            swap     sw                  0     0"     >> ${DESTDIR}/etc/fstab
echo "${POOL}               /               zfs      rw,noatime          0     0"     >> ${DESTDIR}/etc/fstab
echo "/dev/mirror/boot      /strap          ufs      rw,noatime,nosuid   1     1"     >> ${DESTDIR}/etc/fstab
echo "/strap/boot           /boot           nullfs   rw                  0     0"     >> ${DESTDIR}/etc/fstab
echo "/strap/rescue         /rescue         nullfs   rw                  0     0"     >> ${DESTDIR}/etc/fstab
echo "tmpfs                 /tmp            tmpfs    rw,nosuid           0     0"     >> ${DESTDIR}/etc/fstab
echo "proc                  /proc           procfs   rw                  0     0"     >> ${DESTDIR}/etc/fstab
echo "# Device              Mountpoint      FStype   Options             Dump  Pass#" >> ${DESTDIR}/etc/fstab
echo "/dev/acd0             /cdrom          cd9660   ro,noauto,nosuid    0     0"     >> ${DESTDIR}/etc/fstab
echo "/dev/fd0              /media/fd       ufs      rw,noauto,nosuid    0     0"     >> ${DESTDIR}/etc/fstab
echo "/dev/fd0              /media/floppy   msdosfs  rw,noauto,nosuid    0     0"     >> ${DESTDIR}/etc/fstab

# Ensure that zfs_enable="YES" is set for /etc/rc.d/zfs
mkdir -p ${DESTDIR}/etc/rc.conf.d
echo '# ZFS support' > ${DESTDIR}/etc/rc.conf.d/zfs
echo 'zfs_enable="YES"	# Set to YES to automatically mount ZFS file systems' >> ${DESTDIR}/etc/rc.conf.d/zfs

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

umount ${DESTDIR}/strap/boot
umount ${DESTDIR}/strap/rescue
umount ${DESTDIR}/strap

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

zfs set readonly=on ${POOL}/var/empty
zfs unshare -a
zfs unmount -a -f

# At the end, set mountpoint to 'legacy' so ZFS won't try to mount it automatically
zfs set mountpoint=legacy ${POOL}

# After install, set the real mountpoint(s)
for mp in ${mountpoints}; do
	zfs set mountpoint=${mp} ${POOL}${mp}
done

zfs list
zfs unmount -a -f
zfs volfini

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#

# Ensure /boot/zfs/zpool.cache is up to date on the boot filesystem
mount /dev/mirror/boot /media
mv /boot/zfs/zpool.cache /media/boot/zfs/
umount /dev/mirror/boot

#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#===#
# EOF
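A detail worth calling out: zfsboot builds the gmirror(8) and zpool(8) member lists by appending a partition suffix to every word of the device list with sed(1). The script itself uses the BSD-only [[:<:]]/[[:>:]] word-boundary classes from re_format(7); this simplified sketch drops them (every word here is a device name anyway) so it also runs under GNU sed:

```shell
#!/bin/sh
# Expand a whitespace-separated device list into per-partition device
# names, the same way zfsboot does before calling gmirror and zpool.
vdevs="ad4 ad6"
boot_parts=$(echo ${vdevs} | sed -E 's/[[:alnum:]]+/&s1a/g')	# 'a' partitions
pool_parts=$(echo ${vdevs} | sed -E 's/[[:alnum:]]+/&s1d/g')	# 'd' partitions

# Print (rather than run) the commands the script would build:
echo "gmirror label -v -b load boot ${boot_parts}"
echo "zpool create -f z mirror ${pool_parts}"
```

Running it prints `gmirror label -v -b load boot ad4s1a ad6s1a` and `zpool create -f z mirror ad4s1d ad6s1d`.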

The ZFS filesystem definition file should be a variation of the provided zfsboot.fs, modified to suit your needs.
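The zfsboot script reads this file one line at a time as a mountpoint path followed by whitespace-separated options, each of which it passes to zfs create as -o. A minimal illustration of that format (these particular datasets and option values are illustrative assumptions, not the shipped zfsboot.fs):

```
# dataset            options (each becomes "zfs create -o ...")
/usr
/usr/ports           compression=lzjb
/var
/var/empty           exec=off setuid=off
/var/tmp             exec=on setuid=off
```

Only top-level datasets (such as /usr and /var above) get an explicit mountpoint set at the end of the run; their children inherit it.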

Files

All of the zfsboot scripts and config files can be found here.

References

See the following links for further details: