
Diskless (and diskful) computing made easy

oneSIS is an open-source software package aimed at simplifying diskless cluster management. It is a simple and highly flexible method for deploying and managing a system image for diskless systems, and it can turn any supported Linux distribution into a master image capable of being used in a diskless environment. One image is sufficient for serving thousands of nodes. Functional groups of nodes are easy to define, and any single node or group of nodes can easily be configured to behave independently. Configuration is simple: all node differences are defined in a central configuration file, so an administrator can see in one place how each node differs from the rest. oneSIS can be used to manage diskless systems using NFS root, and potentially root over any other network filesystem or network storage system (such as iSCSI, iSER, SRP, or Fibre Channel). It can manage the root filesystem in any kind of diskless environment, from desktops to high-availability web servers to high-performance compute clusters.

Why use oneSIS?

Consider this common scenario in HPC and data centers today: even with a team of highly skilled administrators, it is hard to keep up with the work required to maintain every system. Admins are often strained to keep the many different types of computers they have functioning as they should. Their effort is often duplicated on tasks such as installing new systems, installing new system software, and handling configuration problems individually as they arise on each of the different systems. This is where oneSIS can help.

Consider an alternative scenario: in a oneSIS environment, all an admin needs to do is reboot the machines for them to enter any pre-configured operational environment. Any number of these 'operational environments', or images, can exist. Think of each one as a container for the system software, configuration, and behavior of whatever group of nodes the image was designed to run on. For instance, an image could be built that is tailored to the operational needs of a large multi-user cluster. It would contain the software, configuration, and behavior of compute nodes, login nodes, admin nodes, I/O nodes, and possibly more. Another image could be built around the latest Linux distribution that your sysadmin team is currently testing for a future deployment. You could have an image configured to handle web servers, application servers, database servers, render farms, even user desktops: it doesn't matter.

oneSIS enables an administrator to create system images that define the behavior of the entire computing infrastructure, or just a group of identical nodes. It doesn't insert an administration interface between you and your system: any Linux admin will still feel right at home. It does provide some helper utilities, but it mainly just allows an admin to precisely define the behavior of many kinds of machines all from the same root image. The image can be configured for any target use, and can be as simple or as complex as you want to make it. With a single image controlling the behavior of every machine, overall system complexity and system administration overhead are dramatically reduced. It also tends to lead to a very stable environment, since an administrator can focus on hardening one system instead of spreading attention thin across many different setups.

When booting diskless (NFSroot), any subset of nodes can be booted into any image you want. Changes to an image are seen immediately by all nodes using it. A node can be switched to a different image as quickly as it can reboot, and any image can be cloned with a simple copy. A single oneSIS image can contain the software, configuration, and behavior of many functionally different systems at once, and it can be used without any change on any number of such machines. An image can contain the software, configuration, and behavior of two or more clusters almost as easily as one. Any working modifications made to one system can be propagated to all other systems by performing a simple synchronization. You can be confident the behavior of each configured type of node will be the same because the image is always exactly the same no matter which machine is using it.

On local networks, the image can be cloned as many times as necessary, with each clone capable of serving the image to as many diskless clients as the machine and network can handle. NFS bottlenecks caused by large numbers of diskless clients can therefore be relieved by spreading the load across as many image servers as necessary. All nodes determine their role at boot time, and then operate normally using the configuration designed for that role. The configuration of any node can be updated at run-time (after the node has booted), and it is even possible to change the functional role of any node 'on the fly'.
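
As a rough illustration of how a node might determine its role at boot time, the sketch below maps a hostname to a role using an ordered table of patterns. This is not oneSIS code and does not use the oneSIS configuration format: the function name role_for, the hostname naming scheme (cn042, login1, and so on), and the role table are all hypothetical, chosen only to mirror the compute, login, admin, and I/O node types mentioned above.

```python
import re
import socket

# Hypothetical role table: ordered (pattern, role) pairs, first match wins.
# The hostname scheme and role names are illustrative assumptions,
# not oneSIS configuration syntax.
ROLE_PATTERNS = [
    (re.compile(r"^cn\d+$"), "compute"),
    (re.compile(r"^login\d+$"), "login"),
    (re.compile(r"^admin\d*$"), "admin"),
    (re.compile(r"^io\d+$"), "io"),
]

# Role used when no pattern matches (also an assumption).
DEFAULT_ROLE = "default"


def role_for(hostname: str) -> str:
    """Return the role of a node, based on its short hostname."""
    short_name = hostname.split(".", 1)[0]  # drop any domain suffix
    for pattern, role in ROLE_PATTERNS:
        if pattern.match(short_name):
            return role
    return DEFAULT_ROLE


if __name__ == "__main__":
    # At boot, a node would look up its own name and pick its role.
    print(role_for(socket.gethostname()))
```

Because every node evaluates the same table from the same shared image, two nodes with matching names always end up in the same role, which is the property the paragraph above relies on.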

oneSIS is a new way of building and maintaining compute systems of any size. It reduces the cost of cluster administration by creating an environment that is easier to maintain. It is lightweight, easy to configure, and flexible enough to adapt to any environment. It requires no processor overhead on client nodes, and no kernel modifications. It provides a simple, clean interface for centrally controlling any number of diverse clients from a single system image. Simply put, it is an easier way to manage a system.


Copyright © 2004-2007 Sandia Corp. All rights reserved.
Last Modified: 07/10/08