oneSIS HOWTO: A typical setup for a cluster image:
Large clusters typically don't simply consist of a large group of compute
nodes. There are almost always some peripheral nodes needed for supporting
infrastructure such as login/compile nodes, I/O nodes, administration nodes,
etc. This section of the HOWTO will show how to set up several classes
of nodes, each with different behavior, to handle these functional roles
in a typical cluster.
Note: The concepts introduced here are likely to be unfamiliar to you,
even if you are a skilled systems administrator. Once you grasp how it works,
though, the techniques will seem quite natural.
In 'the simplest setup' we configured a group
of nodes to all use the same root image and behave in exactly the same way.
In this section of the HOWTO we'll set up a configuration that will enable
several groups of nodes to all use the same root image, but still have
differences in their configuration files and system services.
We will set up this particular image to handle four classes of nodes:
compute nodes, admin nodes, I/O nodes, and login nodes.
For this exercise, let's assume that the following naming convention has been
designated for the nodes:
Nodes with hostnames matching the pattern ccn\d+ are compute nodes.
Nodes with hostnames matching the pattern cadmin\d+ are admin nodes.
Nodes with hostnames matching the pattern cio\d+ are io nodes.
Nodes with hostnames matching the pattern clusterC-\d+ are login nodes.
Let's set them up.
Example 1: oneSIS node class definitions
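In the image's oneSIS configuration file, the class definitions come down to one
line per node class, mapping a class name to a hostname pattern. The sketch below
shows the idea; the exact layout of the CLASS directive is an assumption here, so
check the sysimage.conf man page for the precise syntax.
    CLASS   compute   ccn\d+
    CLASS   admin     cadmin\d+
    CLASS   io        cio\d+
    CLASS   login     clusterC-\d+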
When the nodes come up with those hostnames, they will each assume the
role that has been assigned to them. Don't worry, there's nothing magic going
on. The directives for defining the behavior of each role are very clear.
Let's walk through a typical setup and define the differences that make
each of these node classes unique.
Now that we have defined our classes, we need to
explicitly define what makes each class different
from the rest. All nodes will use the same /etc/hosts file, so we
don't need to configure a difference for that file. However, suppose our
io nodes need to mount different filesystems than the rest of the nodes.
We'll need to configure /etc/fstab to be different for those nodes.
This can be done by defining a class-specific
linkback for /etc/fstab.
Example 2: Defining a CLASS-specific linkback
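A sketch of the linkback line follows; the precise directive syntax is an
assumption to be verified against the sysimage.conf man page.
    LINKBACK   CLASS   /etc/fstab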
The LINKBACK directive is the
single most powerful directive for defining the individual or group behavior
of your nodes.
The 'CLASS' linkback above specifies that we
want the /etc/fstab file to 'link back' to a
class-specific version of itself. The io
nodes will use /etc/fstab.io, the compute nodes will use
/etc/fstab.compute, the login nodes will use
/etc/fstab.login, etc.
However, we really only wanted the io nodes to be different. No
problem. If we
simply don't create the /etc/fstab.compute file, our compute
nodes will fall back to using the 'default' file: /etc/fstab.default.
The same holds true for every class of nodes:
if a class-specific file doesn't exist, the
'.default' file is used instead.
We do need to create the class-specific version of
/etc/fstab for our io nodes, though: so, create the
/etc/fstab.io file with the desired
configuration for the io nodes. Now, when an io node determines
its role at boot time, it will use the class-specific
version of /etc/fstab
we have configured for it. All other nodes will use the default.
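To create that io-specific file, assuming the image lives at
/var/lib/oneSIS/image (the same path used with mk-sysimage below), something
like this would do: start from the existing fstab and add the io-only mounts.
    # cp /var/lib/oneSIS/image/etc/fstab /var/lib/oneSIS/image/etc/fstab.io
    # vi /var/lib/oneSIS/image/etc/fstab.io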
This same technique can be used to create
class-specific versions of almost any file or directory in the root image.
Let's take this one step further. Suppose one of our admin nodes,
'cadmin2', has directly attached storage that should be mounted from
/etc/fstab. We don't want all admin nodes trying to mount
this disk, so we'll need to give cadmin2 its own node-specific version of
/etc/fstab.
Example 3: Defining a NODE-specific linkback
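As a sketch (with the same caveat that the exact syntax should be checked
against the sysimage.conf man page):
    LINKBACK   NODE   /etc/fstab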
This directive will override our earlier linkback for
/etc/fstab and configure all nodes to look for a
node-specific version of /etc/fstab.
After creating the /etc/fstab.cadmin2 file, 'cadmin2' will use that
configuration to mount its extra disk.
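Creating that file follows the same pattern as before (image path assumed, as
above): copy a suitable starting point and add the extra disk.
    # cp /var/lib/oneSIS/image/etc/fstab /var/lib/oneSIS/image/etc/fstab.cadmin2
    # vi /var/lib/oneSIS/image/etc/fstab.cadmin2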
You may have noticed that, in our zeal, we just configured
/etc/fstab to be node-specific on every node, which is not what we
want. What we want is for /etc/fstab to be node-specific only
on cadmin2, and class-specific on all other
nodes. Since later
directives override earlier ones, we need to declare the node-specific
directive last and have it only apply to the 'cadmin2' node.
Here is what we end up with:
Example 4: Limiting a directive to a single node
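Putting the two linkback lines together (directive syntax assumed, as above),
the configuration ends up looking something like this:
    LINKBACK   CLASS   /etc/fstab
    LINKBACK   -n cadmin2   NODE   /etc/fstab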
The '-n' flag above is limiting the directive to only have an effect
on the cadmin2 node. Almost every directive can be limited to apply
only to a given list of classes
(with the -c flag) or nodes (with -n). Now all nodes are
configured to use their own respective configurations for /etc/fstab.
Using linkbacks in this way is the primary method for configuring the
behavior of individual nodes or groups of nodes. The linkback technique
can be used for almost any file, directory, or link in your root image.
Be aware that oneSIS only manages the differences between nodes, not their
entire configuration. System files not in the oneSIS configuration are global
across every node.
Similar to linkbacks, oneSIS only manages system services that are different
between nodes. Global services can be managed using any utility provided
by the Linux distribution. Let's add some typical services that could be
handled by oneSIS in this environment.
Example 5: Setting up system services
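One plausible set of SERVICE lines, matching the mk-sysimage output shown below,
is sketched here. Which service ends up in which class, and the exact SERVICE
and '-c'/'-n' syntax, are assumptions; adjust them to your environment and check
the sysimage.conf man page.
    SERVICE   -c admin            dhcpd
    SERVICE   -c admin            xinetd
    SERVICE   -c admin            nfs
    SERVICE   -c compute,io       lustre
    SERVICE   -c compute          pbs_mom
    SERVICE   -n cadmin2          pbs_server
    SERVICE   -n cadmin2          maui
    SERVICE   -c admin,login,io   crond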
Here we use the SERVICE directive to configure several services to start on
admin nodes, other
services on compute nodes and io nodes, some services
that will only start on cadmin2, and one service that is configured
to start on all admin, login, and io nodes. Any service
can be enabled in this way on a per-class (with
'-c') or per-node (with '-n') basis.
Now, since we have added some LINK* directives and SERVICE
directives to the configuration, we'll need to run mk-sysimage on the image again:
# mk-sysimage /var/lib/oneSIS/image
oneSIS: Creating LINKBACK: /var/lib/oneSIS/image/etc/fstab
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/dhcpd (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/xinetd (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/nfs (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/lustre (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/pbs_mom (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/pbs_server (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/maui (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/crond (hidden)
Don't worry for now about why the output shows each SERVICE as
'hidden'. All of our node classes are now
behaving independently in the way that we have specified. With very little
configuration, we now have four functionally different kinds of nodes all
operating from a single shared root filesystem.