oneSIS HOWTO: A typical setup for a cluster image:
Multiple classes with different behaviors

Large clusters rarely consist of compute nodes alone. There are almost always some peripheral nodes needed for supporting infrastructure, such as login/compile nodes, I/O nodes, and administration nodes. This section of the HOWTO shows how to set up several classes of nodes to handle each of these functional roles in a typical cluster environment.

Note: The concepts introduced here are likely to be unfamiliar to you, even if you are a skilled systems administrator. Once you grasp how they work, though, the techniques will seem quite natural.

In 'the simplest setup' we configured a group of nodes to all use the same root image and behave in exactly the same way. In this section of the HOWTO we'll set up a configuration that will enable several groups of nodes to all use the same root image, but still have differences in their configuration files and system services. We will set up this particular image to handle classes of compute nodes, admin nodes, I/O nodes, and login nodes.

Define some node classes

For this exercise, let's assume that the following naming convention has been designated for the nodes:

Nodes with hostnames matching the pattern ccn\d+ are compute nodes.
Nodes with hostnames matching the pattern cadmin\d+ are admin nodes.
Nodes with hostnames matching the pattern cio\d+ are io nodes.
Nodes with hostnames matching the pattern clusterC-\d+ are login nodes.

Let's set them up.

Example 1: oneSIS node class definitions

# /etc/sysimage.conf

When the nodes come up with those hostnames, they will each assume the role that has been assigned to them. Don't worry, there is nothing magic going on. The directives for defining the behavior of each role are very clear. Let's walk through a typical setup and define the differences that make each of these node classes unique.
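To make the naming convention concrete, here is a minimal Python sketch of how a hostname could be mapped to a class using the patterns listed above. It is only an illustration and not oneSIS code: the NODE_CLASS_PATTERNS table and the node_class function are hypothetical names, and we assume each pattern must match the entire hostname.

```python
import re

# Hypothetical table mirroring the naming convention above:
# (hostname pattern, class name). Not read from any oneSIS file.
NODE_CLASS_PATTERNS = [
    (r"ccn\d+", "compute"),
    (r"cadmin\d+", "admin"),
    (r"cio\d+", "io"),
    (r"clusterC-\d+", "login"),
]


def node_class(hostname):
    """Return the class whose pattern matches the whole hostname, or None."""
    for pattern, cls in NODE_CLASS_PATTERNS:
        # Assumption: the pattern has to cover the full hostname.
        if re.fullmatch(pattern, hostname):
            return cls
    return None
```

With this table, node_class("cadmin2") yields "admin", and a hostname that matches none of the patterns yields None.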

Defining class-specific configuration files

Now that we have defined our classes, we need to explicitly define what makes each class different from the rest. All nodes will use the same /etc/hosts file, so we don't need to configure a difference for that file. However, suppose our io nodes need to mount different filesystems than the rest of the nodes. We'll need to configure /etc/fstab to be different for those nodes. This can be done by defining a class-specific linkback for /etc/fstab.

Example 2: Defining a CLASS-specific linkback

# /etc/sysimage.conf

The LINKBACK directive is the single most powerful directive for defining the individual or group behavior of your nodes. The 'CLASS' linkback above specifies that we want the /etc/fstab file to 'link back' to a class-specific version of itself. The io nodes will use /etc/, the compute nodes will use /etc/fstab.compute, the login nodes will use /etc/fstab.login, etc.

However, we really only wanted the io nodes to be different. No problem. If we simply don't create the /etc/fstab.compute file, our compute nodes will fall back to using the 'default' file: /etc/fstab.default. The same holds true for every class of nodes: if a class-specific file doesn't exist, the '.default' is used instead.

We do need to create the class-specific version of /etc/fstab for our io nodes, though, so create the /etc/ file with the desired configuration for the io nodes. Now, when an io node determines its role at boot time, it will use the class-specific version of /etc/fstab we have configured for it. All other nodes will use the default.

This same technique can be used to create class-specific versions of almost any file or directory.
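The fallback rule described above can be sketched in a few lines of Python. This is a model of the lookup order only, not how oneSIS implements linkbacks: resolve_class_file is a hypothetical helper, the set of existing files is passed in by hand, and we assume every class suffix follows the same pattern as the '.compute' and '.login' files mentioned earlier.

```python
def resolve_class_file(path, node_class, existing):
    """Pick the class-specific file if present, else the '.default' file.

    path       -- the linked-back file, e.g. "/etc/fstab"
    node_class -- class name of the booting node, e.g. "compute"
    existing   -- set of file paths present in the root image (supplied
                  by the caller; this sketch does not touch the disk)
    """
    candidate = f"{path}.{node_class}"
    if candidate in existing:
        return candidate
    # No class-specific version was created, so fall back to the default.
    return f"{path}.default"
```

For example, if the image contains only a default file and a login file, a login node resolves to /etc/fstab.login while a compute node falls back to /etc/fstab.default.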

Defining node-specific configuration files

Let's take this one step further. Suppose one of our admin nodes, 'cadmin2', has directly attached storage that should be mounted from /etc/fstab. We don't want all admin nodes trying to mount this disk, so we'll need to give cadmin2 its own node-specific version of /etc/fstab:

Example 3: Defining a NODE-specific linkback

# /etc/sysimage.conf

This directive will override our earlier linkback for /etc/fstab and configure all nodes to look for a node-specific version of /etc/fstab. After creating the /etc/fstab.cadmin2 file, 'cadmin2' will use that configuration to mount its extra disk.

Limiting a directive to a single node

You may have noticed that, in our zeal, we just configured /etc/fstab to be node-specific on every node, which is not what we want. What we want is for /etc/fstab to be node-specific only on cadmin2, and class-specific on all other nodes. Since later directives override earlier ones, we need to declare the node-specific directive last and have it apply only to the 'cadmin2' node. Here is what we end up with:

Example 4: Limiting a directive to a single node

# /etc/sysimage.conf
LINKBACK    /etc/fstab    NODE    -n cadmin2

The '-n' flag above limits the directive so that it only affects the cadmin2 node. Almost every directive can be limited to apply only to a given list of classes (with the -c flag) or nodes (with -n). Now all nodes are configured to use their own respective configurations for /etc/fstab.
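The interaction between "later directives override earlier ones" and the -n/-c limits can be modelled with a short Python sketch. Everything here is hypothetical scaffolding (the Linkback record, applies, and effective_kind are illustrative names, not part of oneSIS), and we assume a directive with no limit applies to every node.

```python
from dataclasses import dataclass


@dataclass
class Linkback:
    path: str
    kind: str            # "CLASS" or "NODE"
    nodes: tuple = ()    # hostnames given with -n; empty means no limit
    classes: tuple = ()  # class names given with -c; empty means no limit


def applies(directive, hostname, node_class):
    """Assumption: an unlimited directive applies to every node."""
    if directive.nodes and hostname not in directive.nodes:
        return False
    if directive.classes and node_class not in directive.classes:
        return False
    return True


def effective_kind(directives, path, hostname, node_class):
    """Walk the directives in file order; the last applicable one wins."""
    kind = None
    for d in directives:
        if d.path == path and applies(d, hostname, node_class):
            kind = d.kind
    return kind


# The two directives discussed above, in the order they are declared.
DIRECTIVES = [
    Linkback("/etc/fstab", "CLASS"),
    Linkback("/etc/fstab", "NODE", nodes=("cadmin2",)),
]
```

Under these assumptions, effective_kind(DIRECTIVES, "/etc/fstab", "cadmin2", "admin") gives "NODE", while any other node, such as cadmin1 or ccn1, still gets "CLASS".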

Using linkbacks in this way is the primary method for configuring the behavior of individual nodes or groups of nodes. The linkback technique can be used for almost any file, directory, or link in your root image. Be aware that oneSIS only manages the differences between nodes, not their actual configuration. System files not in the oneSIS configuration are global across every node.

Defining variant system services

As with linkbacks, oneSIS only manages system services that differ between nodes. Global services can be managed using any utility provided by the Linux distribution. Let's add some typical services that could be handled by oneSIS in this environment.

Example 5: Setting up system services

# /etc/sysimage.conf
SERVICE    dhcpd         -c admin
SERVICE    xinetd        -c admin
SERVICE    nfs           -c admin
SERVICE    lustre        -c io
SERVICE    pbs_mom       -c compute
SERVICE    pbs_server    -n cadmin2
SERVICE    maui          -n cadmin2
SERVICE    crond         -c admin,login,io

Here we use the SERVICE directive to configure several services to start on admin nodes, other services on compute nodes and io nodes, some services that will only start on cadmin2, and one service that is configured to start on all admin, login, and io nodes. Any service can be enabled in this way on a per-class (with '-c') or per-node (with '-n') basis.
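As a quick sanity check of the service list in Example 5, the following Python sketch works out which of those services a given node would start. It simply restates the -c and -n limits from the example as data: SERVICE_RULES and services_for are hypothetical names, and this is not how oneSIS itself evaluates the directives.

```python
# (service, classes named with -c, hostnames named with -n),
# transcribed from Example 5.
SERVICE_RULES = [
    ("dhcpd", {"admin"}, set()),
    ("xinetd", {"admin"}, set()),
    ("nfs", {"admin"}, set()),
    ("lustre", {"io"}, set()),
    ("pbs_mom", {"compute"}, set()),
    ("pbs_server", set(), {"cadmin2"}),
    ("maui", set(), {"cadmin2"}),
    ("crond", {"admin", "login", "io"}, set()),
]


def services_for(hostname, node_class):
    """List the Example 5 services enabled for this node, in file order."""
    started = []
    for name, classes, nodes in SERVICE_RULES:
        if node_class in classes or hostname in nodes:
            started.append(name)
    return started
```

For cadmin2 this returns dhcpd, xinetd, nfs, pbs_server, maui, and crond, whereas a compute node such as ccn5 gets only pbs_mom.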

Now, since we have added some LINK* directives and SERVICE directives to the configuration, we'll need to run mk-sysimage on the image.

# mk-sysimage /var/lib/oneSIS/image

oneSIS: Creating LINKBACK: /var/lib/oneSIS/image/etc/fstab
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/dhcpd (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/xinetd (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/nfs (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/lustre (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/pbs_mom (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/pbs_server (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/maui (hidden)
oneSIS: Creating SERVICE: /var/lib/oneSIS/image/etc/init.d/crond (hidden)

Don't worry for now about why the output shows each SERVICE as 'hidden'. All of our node classes are now behaving independently in the way that we have specified. With very little configuration, we now have four functionally different kinds of nodes all operating from a single shared root filesystem.




Copyright © 2004-2007 Sandia Corp. All rights reserved.
Last Modified: 03/28/05