NFSroot HOWTO
This HOWTO covers network booting in general, and how to set up a system to
run diskless with NFSroot and oneSIS. It will cover the basics for getting a
system to boot using DHCP, TFTP, and PXELINUX, and mount a root filesystem
over NFS. All of these packages must be
configured correctly before a diskless machine can boot and operate with an
NFS root. The configuration order presented here
is largely interchangeable. You could just as easily configure NFS before
DHCP. The end result is the same. After following this HOWTO you should be
able to boot a client node diskless with an NFS root filesystem. To run many
diskless nodes from the same root image or to configure some nodes to
behave differently from others, refer to the
oneSIS HOWTO.
A word about running diskless
First, 'diskless' doesn't mean you need
to open up the case and rip out all your hard drives. It just means that you're
booting up and running the system entirely off of the network. Local disks can
still be used for temp space, swap space, or (with oneSIS) for any portion
of the root filesystem you
want to reside on the local disk. In fact, there can be a fully configured local OS
installed and booting diskless will not affect it in any way.
This can be a great way to test new software environments
before rolling them out, or for switching environments quickly whenever
desired.
The boot process for a diskless oneSIS client booting off the network typically runs through the following steps:
1. The machine comes up and runs its BIOS. The BIOS initializes the network
   card, and a program such as PXE sends a DHCP request.
2. A DHCP server somewhere on the network offers an IP address to the node and
   the location of its bootfile. This HOWTO covers the use of one commonly
   used bootfile: the PXELINUX bootloader.
3. The node downloads the PXELINUX bootloader via TFTP and runs it.
4. PXELINUX reads its configuration, then downloads the specified kernel image
   (and optional initrd) via TFTP and begins running the kernel.
5. The kernel sends another DHCP request to receive (minimally) the node's
   IP address, hostname, and location of the root filesystem.
6. The kernel mounts the root filesystem read-only via NFS.
7. oneSIS creates the node's RAM disk and configures the node to behave as
   desired.
8. The client distribution's rc scripts execute as normal and start all desired daemons and applications.
Network booting is certainly desirable, but not absolutely necessary for running
diskless. The goal here is to get a kernel to your machine by any means necessary.
If you have a kernel residing in local flash or on disk, that will work just
fine. The important part is that you are able to boot. To network boot, the
machine must be capable of receiving a kernel from the network. This is accomplished
in different ways on different architectures. A common method for doing this
on x86 and x86_64 based architectures is to use the Preboot Execution Environment
(PXE) that now comes standard on most motherboards that have onboard NICs (Network
Interface Cards). Look for PXE in the BIOS. If your machine boots off the network
using LinuxBIOS, all the better -- you can skip straight to the 'Configuring DHCP' section.
To have the machine boot from the network, it is necessary to configure your
BIOS to boot from the onboard NIC. If you can't find the option to boot from
your network card in the BIOS, dig around a bit. Sometimes the onboard NIC must
be enabled and the machine rebooted before the network boot option becomes available.
If your BIOS does not support network booting, consider replacing it. If replacing
the BIOS does not fix it or is not an option, the EtherBoot package may be able
to help your x86-based machine boot from the network.
A DHCP server must be running
before any node can boot from the network. When
a machine first boots up, the first thing it does is broadcast a DHCP request
for an IP address and a bootable image to run. When using PXE, the image that is
sent to the node should be
the PXELINUX bootloader.
If you are using EtherBoot (for example, from LinuxBIOS), the
filename should instead reference a network-bootable kernel image,
which can be created with mkelf-linux or mkelfimage.
At this stage of the boot process DHCP sends the filename of
the image to be
downloaded. For this example, the actual file (pxelinux.0) must be
accessible on a TFTP server on the same node as the DHCP server. A typical configuration
for DHCP version 3 could look like this:

Example 1: Simple DHCP configuration for NFSroot

    # /etc/dhcpd.conf
    ddns-update-style none;
    use-host-decl-names on;
    subnet 10.0.0.0 netmask 255.255.0.0 {
        option routers 10.0.0.254;
        group {
            filename "/tftpboot/pxelinux.0";
            option root-path "10.0.0.254:/var/lib/oneSIS/image,v3,tcp,hard";
            host node1 {
                hardware ethernet 01:23:34:56:78:9A;
                fixed-address 10.0.1.1;
            }
            host node2 {
                hardware ethernet 01:23:34:56:78:9B;
                fixed-address 10.0.1.2;
            }
        }
    }

In this example, there are two nodes with known MAC addresses configured to
mount their root filesystem from
'/var/lib/oneSIS/image' on an NFS server at 10.0.0.254.
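With more than a handful of nodes, the host entries in Example 1 can be generated rather than typed by hand. The following is a minimal Python sketch and is not part of oneSIS; the names `host_stanza`, `render_hosts`, and `NODES` are hypothetical, and the MAC addresses are the placeholders from Example 1.

```python
def host_stanza(name, mac, ip, indent="    "):
    """Return one dhcpd.conf host block for a node with a fixed address."""
    return (
        f"{indent}host {name} {{\n"
        f"{indent}{indent}hardware ethernet {mac};\n"
        f"{indent}{indent}fixed-address {ip};\n"
        f"{indent}}}\n"
    )


# Hypothetical node list; replace with your own names, MACs, and addresses.
NODES = [
    ("node1", "01:23:34:56:78:9A", "10.0.1.1"),
    ("node2", "01:23:34:56:78:9B", "10.0.1.2"),
]


def render_hosts(nodes):
    """Concatenate host blocks for pasting into the group section."""
    return "".join(host_stanza(name, mac, ip) for name, mac, ip in nodes)
```

Calling `print(render_hosts(NODES))` emits the host blocks, which can then be pasted inside the `group { ... }` section of dhcpd.conf.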
The clients' root filesystem is basically a copy of the installed image of
a Linux distribution. One way to create this root image is with the
copy-rootfs command, which is described in the oneSIS documentation.
These two nodes are configured to receive the
'/tftpboot/pxelinux.0' file (the PXELINUX bootloader) at boot time.
The filename must be given relative to the directory that the tftp daemon uses as its
root directory. If in.tftpd is running with the '-s'
argument, such as '-s /tftpboot', then DHCP must reference all files relative to /tftpboot:

Example 2: Filename relative to TFTP root path: /tftpboot

    # /etc/dhcpd.conf
    [...]
    filename "/pxelinux.0";
    [...]

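The relationship between the file's location on disk and the value of the 'filename' option can be expressed as a small helper. This is only an illustrative Python sketch; the function name `dhcp_filename` and its `tftp_root` parameter are hypothetical, with `tftp_root` standing for the directory passed to in.tftpd with '-s'.

```python
import posixpath


def dhcp_filename(path_on_disk, tftp_root=None):
    """Return the value to use for the dhcpd.conf 'filename' option.

    tftp_root is the directory given to in.tftpd with -s, or None when
    the daemon is not restricted to a root directory.
    """
    if tftp_root is None:
        return path_on_disk
    rel = posixpath.relpath(path_on_disk, tftp_root)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{path_on_disk} is outside TFTP root {tftp_root}")
    return "/" + rel
```

For instance, `dhcp_filename("/tftpboot/pxelinux.0", "/tftpboot")` gives the value used in Example 2, while omitting `tftp_root` gives the value used in Example 1.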
At this point a node should be able to get a DHCP response from a server and download
the PXELINUX bootloader via TFTP. If the node cannot get a DHCP response, check network
connectivity and verify the DHCP server is running and listening on the right interface.
Use tcpdump to watch the interface for incoming DHCP requests from the client node.
Also check that the MAC address of the client node exists in the configuration, and that
you have restarted dhcpd since making any changes to dhcpd.conf.
If the node receives a DHCP response but cannot download the bootloader, check that the
correct filename is specified in dhcpd.conf, and that the file has
correct permissions for reading. You can verify TFTP functionality (and get better error
messages) by attempting to manually download the bootloader via tftp from the
command line on your TFTP server.
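If no tftp client is installed, the same check can be approximated with a few lines of Python. The sketch below implements only the read side of TFTP as described in RFC 1350 (no option negotiation, and no handling of block-number wraparound for very large files); the function names `build_rrq` and `tftp_get` are hypothetical.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ACK, OP_ERROR = 1, 3, 4, 5
BLOCK_SIZE = 512


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode 1, filename, NUL, mode, NUL."""
    return struct.pack("!H", OP_RRQ) + filename.encode() + b"\0" + mode.encode() + b"\0"


def tftp_get(server, filename, port=69, timeout=5.0):
    """Download a file via TFTP and return its bytes, or raise on error."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    data = b""
    expected = 1
    try:
        sock.sendto(build_rrq(filename), (server, port))
        while True:
            packet, peer = sock.recvfrom(4 + BLOCK_SIZE)
            if len(packet) < 4:
                continue  # too short to be a valid TFTP packet
            opcode, number = struct.unpack("!HH", packet[:4])
            if opcode == OP_ERROR:
                # Error packets carry a code and a NUL-terminated message.
                message = packet[4:].rstrip(b"\0").decode(errors="replace")
                raise OSError(f"TFTP error {number}: {message}")
            if opcode != OP_DATA or number > expected:
                continue  # ignore anything that is not the next data block
            # Acknowledge to the port the server answered from.
            sock.sendto(struct.pack("!HH", OP_ACK, number), peer)
            if number < expected:
                continue  # duplicate block, already stored
            data += packet[4:]
            if len(packet) - 4 < BLOCK_SIZE:
                return data  # a short block marks the end of the file
            expected += 1
    finally:
        sock.close()
```

Assuming the server from Example 1, `len(tftp_get("10.0.0.254", "/tftpboot/pxelinux.0"))` should return the size of the bootloader. An `OSError` carries the server's error message, and a `socket.timeout` suggests the TFTP daemon is not reachable.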
We want to use PXE to specify a Linux kernel (and possibly an initrd) for the machine
to boot. To this end, we can use the PXELINUX software that comes
with the venerable SYSLINUX package. PXELINUX
is a bootloader that can be configured to download any kernel image/initrd that is
accessible on your TFTP server.
By now, you should be at the point where PXELINUX is running on a node.
When PXELINUX runs, it downloads its configuration from a directory called
'pxelinux.cfg'. The default configuration is downloaded from /tftpboot/pxelinux.cfg/default. It is useful to have /tftpboot/pxelinux.cfg/default be
a symbolic link pointing to a more descriptive filename containing the actual
configuration (such as '2.6.32').
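The 'default' file is only the last name PXELINUX tries. As documented for recent PXELINUX releases, it first looks in pxelinux.cfg for a file named after the client's hardware address, then for files named after the client's IP address in hexadecimal, and finally for 'default'. The Python sketch below lists those names for one node; the function name `pxelinux_config_names` is hypothetical, and the optional client-UUID lookup that can precede these is left out.

```python
def pxelinux_config_names(mac, ip):
    """List config filenames in the order PXELINUX tries them.

    mac is the client's Ethernet address, ip its dotted-quad IPv4 address.
    The optional client-UUID lookup that precedes these is omitted.
    """
    # ARP hardware type 01 (Ethernet) plus the MAC, lowercase, dash-separated.
    names = ["01-" + mac.lower().replace(":", "-")]
    # The IP address as eight uppercase hex digits, then shorter prefixes.
    hex_ip = "".join(f"{int(octet):02X}" for octet in ip.split("."))
    names += [hex_ip[:n] for n in range(8, 0, -1)]
    names.append("default")
    return names
```

For node1 in the DHCP example (10.0.1.1), the hexadecimal name is 0A000101, so a file or symbolic link with that name in pxelinux.cfg would apply to that node only, while a file named 0A0001 would apply to every address that starts with 10.0.1.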
A typical PXELINUX configuration to boot an NFSroot node generally looks something like
this:

Example 3: PXELINUX configuration for NFS root

    # /tftpboot/pxelinux.cfg/default
    default nfsroot
    label nfsroot
        kernel /vmlinuz-2.6.32
        append root=/dev/nfs ro ip=dhcp console=ttyS0,115200
    label with_initrd
        kernel /vmlinuz-2.6.32
        append initrd=/initrd-2.6.32 console=ttyS0,115200

Notice that the paths to the kernel and initrd images are both relative to where the
pxelinux.0 file was downloaded from
(usually /tftpboot). The 'nfsroot' label above will boot the system
using the Linux kernel's built-in
NFSroot mechanism. The 'with_initrd' label will download and run the
specified initrd. With an initrd generated by oneSIS, this method will also boot
the system using NFSroot.
Both of the above labels will send console output to the serial port at a baud rate
of 115200 bits per second. If you have a monitor plugged into the node, don't be
alarmed if you boot, download a kernel, then see nothing but a blank screen. In that
case, you will probably also want to send console output to the
screen:

Example 4: Output to both the serial console and the screen

    # /tftpboot/pxelinux.cfg/default
    [...]
    append [...] console=tty0 console=ttyS0,115200
    [...]

Refer to 'Documentation/serial-console.txt' in the Linux source for more
information on how to configure your serial console.
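When several kernels or console setups are in use, the label entries can be generated instead of edited by hand. The following Python sketch renders one entry in the style of Example 3; the function name `pxelinux_label` and its parameters are hypothetical.

```python
def pxelinux_label(label, kernel, consoles, initrd=None):
    """Render one PXELINUX label entry for an NFSroot boot."""
    args = []
    if initrd:
        args.append(f"initrd={initrd}")
    else:
        # Without an initrd, the kernel's built-in NFSroot needs these.
        args += ["root=/dev/nfs", "ro", "ip=dhcp"]
    # One console= argument per output device, in the order given.
    args += [f"console={c}" for c in consoles]
    return f"label {label}\n    kernel {kernel}\n    append {' '.join(args)}\n"
```

For example, `pxelinux_label("nfsroot", "/vmlinuz-2.6.32", ["tty0", "ttyS0,115200"])` produces the 'nfsroot' entry with output sent to both the screen and the serial port.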
Certain features must be enabled in the kernel configuration to use Linux's built-in
NFSroot capability. First, the driver for the network card that you are using must be
compiled directly into the kernel (i.e., not as a module). Next, kernel-level DHCP support
must be compiled in, and finally NFS and NFSroot support. For a client node with the
e1000 gigabit ethernet card, the following options would be necessary:

Example 5: Kernel options necessary for NFSroot

    Configuration option | Location in menuconfig
    CONFIG_E1000=y       | Ethernet (1000 Mbit) ---> Intel(R) PRO/1000 Gigabit Ethernet support
    CONFIG_IP_PNP=y      | Networking options ---> IP: kernel level autoconfiguration
    CONFIG_IP_PNP_DHCP=y | Networking options ---> IP: DHCP support
    CONFIG_NFS_FS=y      | Network File Systems ---> NFS file system support
    CONFIG_ROOT_NFS=y    | Network File Systems ---> Root file system on NFS

Note: If you use an initrd created with
mk-initramfs-oneSIS,
these options are not necessary. However, the driver for your network
card(s) and NFS client support must each be compiled at least as a module.
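Before compiling, it can be worth checking a kernel .config against the options listed in Example 5. The Python sketch below does this for the built-in NFSroot case (no initrd); the names `REQUIRED`, `parse_config`, and `missing_builtin` are hypothetical, and CONFIG_E1000 should be replaced by the option for your own network driver.

```python
# Options from Example 5; swap CONFIG_E1000 for your own network driver.
REQUIRED = [
    "CONFIG_E1000",
    "CONFIG_IP_PNP",
    "CONFIG_IP_PNP_DHCP",
    "CONFIG_NFS_FS",
    "CONFIG_ROOT_NFS",
]


def parse_config(text):
    """Map option names to values from the text of a kernel .config file."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as comments ("# CONFIG_X is not set").
        if line and not line.startswith("#") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_builtin(text, required=REQUIRED):
    """Return the required options that are not built in (=y)."""
    options = parse_config(text)
    return [name for name in required if options.get(name) != "y"]
```

A typical call would be `missing_builtin(open(".config").read())`; an empty list means every listed option is compiled directly into the kernel, and an option set to 'm' is reported as missing.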
The use of a dynamic device filesystem for /dev, such as
udev or devfs, is required by oneSIS
due to the read-only nature of the root filesystem. Kernels 2.6 and newer use udev, which is much preferred over devfs
and doesn't require any extra kernel configuration.
For 2.4 kernels, however, devfs is necessary, which requires a few extra
options in the kernel configuration.
The following options will add devfs support to your kernel.

Example 6: Kernel options for devfs support

    Options preferred by oneSIS on 2.4 systems (or 2.6 or newer systems without udev)
    CONFIG_EXPERIMENTAL=y | Code Maturity level options ---> Prompt for development and/or incomplete code/drivers
    CONFIG_DEVFS_FS=y     | File systems ---> /dev file system support
    CONFIG_DEVFS_MOUNT=y  | File systems ---> Automatically mount at boot

Add any other desired options to the kernel. Compile the kernel and copy the bzImage
into your tftpboot directory with a descriptive name such as vmlinuz-2.6.32.
Don't forget to copy the corresponding /lib/modules/XXX directory to
the root filesystem image (/var/lib/oneSIS/image/lib/modules in our example).
Next, create a PXELINUX configuration to boot this kernel as described above.
At this point you should be able to send a kernel to a client node and watch
the kernel's boot progress on a serial console or on an attached monitor.
Finally, what is NFSroot with no NFS? Our sample DHCP configuration above listed
'/var/lib/oneSIS/image' as the root-path directory on the NFS server.
This is the location of the
root filesystem image to be used for all of the client nodes.
Note: The actual directory where you keep your root image can vary as long as
that directory (or a parent directory) is exported via NFS and matches the
root-path option used in dhcpd.conf.
The NFS server needs to
export the root directory to the client nodes so they can mount it as their
root filesystem. Although we could export the directory to each node
individually, let's export it to all of the machines on the local subnet.
Note: you may want to export your root image read-only for security
reasons -- if so, simply change 'rw' to 'ro' in the following
example.

Example 7: NFS Exporting the root filesystem

    # /etc/exports
    /var/lib/oneSIS/image 10.0.0.0/16(rw,sync,no_root_squash)

    -bash# exportfs -r

After editing /etc/exports, run 'exportfs -r' on the NFS server to
re-export the /var/lib/oneSIS/image directory to all nodes on the
10.0.0.0 / 255.255.0.0 network.
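A client whose address falls outside the exported network will be refused when it tries to mount its root filesystem, so it is worth checking the addresses in dhcpd.conf against the export. A minimal Python sketch, using the standard ipaddress module; the function name `outside_export` is hypothetical.

```python
import ipaddress


def outside_export(network, addresses):
    """Return the addresses that the exported network does not cover."""
    net = ipaddress.ip_network(network)
    return [a for a in addresses if ipaddress.ip_address(a) not in net]
```

With the values from the examples, `outside_export("10.0.0.0/16", ["10.0.1.1", "10.0.1.2"])` returns an empty list, meaning both nodes are covered by the export.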
In the DHCP example further above, we specified the NFS server, root-path, and the NFS
options to use when mounting the root filesystem (v3,tcp,hard). Other useful
options to add are rsize and wsize, which set the read and write transfer sizes and can
increase the speed of the NFS connection to the client nodes. Good values to start with
are rsize=8192 and wsize=8192, but these values can be increased if your NFS server and
network can sustain larger transfers.
Also, it is highly recommended that you increase the number of nfsd
server threads running on your NFS server to at least 8 or 16 threads. This
can be done by running rpc.nfsd with the desired number of threads (for example, 'rpc.nfsd 16').
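The root-path value is simply the server, the exported directory, and a comma-separated list of NFS mount options. The Python sketch below assembles it, including optional rsize and wsize; the function name `root_path` and its parameters are hypothetical.

```python
def root_path(server, directory, options=("v3", "tcp", "hard"), rsize=None, wsize=None):
    """Build the DHCP root-path value: server:directory,opt1,opt2,..."""
    opts = list(options)
    if rsize is not None:
        opts.append(f"rsize={rsize}")
    if wsize is not None:
        opts.append(f"wsize={wsize}")
    return ",".join([f"{server}:{directory}"] + opts)
```

Calling `root_path("10.0.0.254", "/var/lib/oneSIS/image", rsize=8192, wsize=8192)` yields a string that can be placed in the 'option root-path' line of dhcpd.conf.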
At this point a node should be able to boot a kernel over the network, receive its IP
address, hostname, and root-path from DHCP, and mount its root filesystem over NFS.
If all is not well and the node fails to mount the root filesystem, verify
that the kernel has been
configured correctly for NFSroot, and that the NFS server is correctly
exporting the right directory. Watch the client's console and the NFS server's
log file for any helpful error messages. From the NFS server, you can try
manually mounting the client's root filesystem onto a local directory. If
you booted using a oneSIS initrd you will be dropped to a shell where you can
troubleshoot the problem. If not, your kernel has probably panicked.
Determine the cause of the problem and try again.
If all has gone well you should see the familiar bootup of the client's Linux
distribution. Congratulations, you now have a client running diskless! Now,
to enable many clients to run from this same root filesystem, carry on with the
oneSIS HOWTO.