Thursday, April 17, 2008

Linux Clustering How-To


What is GNU/Linux?

GNU/Linux is a freely or inexpensively available operating system built around
the Linux kernel, which was developed by Linus Torvalds. GNU is a recursive
acronym that stands for "GNU's Not Unix". GNU is essentially a collection of
open source implementations of common Unix utilities and programs, written to
provide an alternative to expensive commercial Unix software. Both are open
source, meaning that the source code for the kernel and the GNU applications is
freely available to anyone. Since the source code is available, anyone who can
code can modify it. This has led to many implementations of GNU/Linux, of which
clustering is but one.

A cluster is a group of computers which work together toward a final goal. Some
would argue that a cluster must at least consist of a message passing interface
and a job scheduler. The message passing interface works to transmit data among
the computers (commonly called nodes or hosts) in the cluster. The job scheduler
is just what it sounds like: it takes job requests from user input or other
sources and schedules each one to run on as many nodes of the cluster as it
requires. It is possible to have a cluster without either of these components,
however. Consider a cluster built for a single purpose. There would be no need
for a job scheduler and data could be shared among the hosts with simple methods
like a CORBA interface.
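
To make the scheduler's role concrete, here is a minimal sketch in Python of a first-in, first-out scheduler that hands out nodes from a fixed pool. It is only an illustration of the idea described above, not the scheduler of any real cluster package; the class name ToyScheduler, the node names, and the job names are all hypothetical.

```python
from collections import deque


class ToyScheduler:
    """FIFO job scheduler over a fixed pool of node names (illustrative only)."""

    def __init__(self, nodes):
        self.free = list(nodes)   # nodes not currently running a job
        self.queue = deque()      # (job, nodes_needed) pairs waiting to start
        self.running = {}         # job name -> list of nodes assigned to it

    def submit(self, job, nodes_needed):
        """Queue a job that needs `nodes_needed` nodes, then try to start it."""
        total = len(self.free) + sum(len(n) for n in self.running.values())
        if nodes_needed > total:
            raise ValueError("job needs more nodes than the cluster has")
        self.queue.append((job, nodes_needed))
        self._dispatch()

    def finish(self, job):
        """Return a finished job's nodes to the pool and start waiting jobs."""
        self.free.extend(self.running.pop(job))
        self._dispatch()

    def _dispatch(self):
        # Strict FIFO: only the job at the head of the queue may start.
        while self.queue and self.queue[0][1] <= len(self.free):
            job, needed = self.queue.popleft()
            self.running[job] = [self.free.pop(0) for _ in range(needed)]


sched = ToyScheduler(["node1", "node2", "node3"])
sched.submit("jobA", 2)   # starts at once on node1 and node2
sched.submit("jobB", 2)   # waits: only node3 is free
sched.finish("jobA")      # frees node1 and node2, so jobB can start
```

A real scheduler would also have to launch the work on the chosen nodes and notice when it ends; the sketch only tracks which nodes are assigned to which job.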

By definition, however, a cluster must consist of at least two nodes, a master
and a slave. The master node is the computer that users are most likely to
interact with since it usually has the job scheduler running on it. The master
can also participate in computation like the slave nodes do, but it is not
required or even recommended in large clusters. The slave nodes are just that.
They respond to the requests of the master node and, in general, do most of the
computing.

  1. Boot from the CD-ROM.
    • With the computer powered up, insert the Red Hat Binary CD #1 and restart the computer.
    • If the computer will not boot from the CD-ROM, check the BIOS settings of the computer and make sure that the first boot device is the CD-ROM.
    • If the computer still will not boot from the CD-ROM, use the install boot diskettes that came with the version of Linux being used.
  2. Choose an install mode, then press ENTER.
    • This documentation assumes that the user will type "text" at the boot prompt. Choosing text mode is not necessary, but it is faster and more efficient.
    • Other choices include "text expert", or simply pressing ENTER to boot into the graphical installer. The graphical mode is best for users with little Linux experience.
  3. If you have a driver diskette for any special devices, like a monitor, sound card, etc., insert that diskette and press ENTER.
    Otherwise, say no and continue with the installation.
  4. Follow the directions on the screen. The process is fairly straightforward until partitioning.
  5. When installing Linux on the master node, it is recommended to use separate partitions. This is not necessary, but it can allow for easier administration in the long run. The master node of Mimosa, the cluster at the Mississippi Center for Supercomputing Research where this document is being written, is configured roughly like this:

    Device      Size    Mount Point
    /dev/sda1   9.6G    /
    /dev/sda2   53M     /boot
    /dev/sda5   16G     /home
    /dev/sdb6   6.1G    /usr/local

    Do not forget to add a swap partition when partitioning the drive. The swap partition should be at least 128 megabytes; the Red Hat Linux Reference Guide recommends twice the amount of RAM or 32 MB, whichever is larger.

    For the slave nodes, you need just the / partition and the swap partition, since all applications will be installed on the master node's /usr/local partition and all user files will be stored on the master node's /home partition. The /home and /usr/local directories of the slave nodes are mounted from the master node over NFS (Network File System), so their size on the slave node does not matter. If you type "cd /home" or "cd /usr/local" on a slave node, you are actually going to the directory on the master node.

  6. Since it is assumed that Linux will be the only operating system on the master node and the slave nodes, install LILO, the Linux bootloader, to the MBR (Master Boot Record) of the primary hard disk.
  7. Now choose the packages to install. At a minimum, make sure that NFS (Network File System) and RSH (Remote Shell) are installed. It is also a good idea to install SSH as a backup in case RSH fails, but this is not required. One way to ensure that these packages get installed is simply to choose everything. This is not a bad idea if disk space is not a concern, as you can always turn off unneeded services later. Otherwise, you must check the "Select Individual Packages" box on the "Package Group Selection" page.
  8. Red Hat will now ask about installing and configuring its firewall. If the cluster has no contact with the outside world or is behind a very good firewall, it is advisable not to install the firewall at all. If the firewall is installed, make sure that it allows connections via SSH and RSH.
  9. We need to add a step here about configuring X.
  10. Once the packages have installed, the user is presented with an option to make a boot disk. Making a boot disk is always a very good idea.
  11. One of the last steps in installing Linux is deciding whether or not the computer should boot up into graphical mode. Setting the default run level to 5, graphical mode, can cause some problems if it has not been fully tested. Generally, it is a better choice to have the system's default run level set for 3, multi-user with networking, until it is certain that the X window system will not cause any problems. Once it has been determined that the system functions properly in run level 5, it is possible to set the computer to boot into graphical mode by editing /etc/inittab and changing the line that looks like:

    id:3:initdefault:

    to

    id:5:initdefault:

  12. Now Linux is installed. Reboot the computer and continue to the next step.
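
Two of the steps above reduce to small mechanical rules: the swap size guideline from step 5 and the run level change from step 11. The following Python sketch shows both under stated assumptions. The function names are hypothetical, and the run level helper works on a string of inittab text rather than on /etc/inittab itself, so it can be tried safely before anything is changed by hand.

```python
import re


def recommended_swap_mb(ram_mb):
    """Rule quoted in step 5: twice the RAM or 32 MB, whichever is larger."""
    return max(2 * ram_mb, 32)


def set_default_runlevel(inittab_text, level):
    """Return inittab text with the initdefault line set to `level`."""
    # Match a whole line of the form id:N:initdefault: (N is a single digit).
    pattern = re.compile(r"^id:\d:initdefault:$", re.MULTILINE)
    new_text, count = pattern.subn(f"id:{level}:initdefault:", inittab_text)
    if count != 1:
        raise ValueError("expected exactly one initdefault line")
    return new_text


# Hypothetical two-line fragment standing in for a real /etc/inittab.
sample = "# sample inittab fragment\nid:3:initdefault:\n"
print(recommended_swap_mb(256))
print(set_default_runlevel(sample, 5))
```

Note that the rule implemented here is the Reference Guide's; the 128 megabyte minimum suggested in step 5 is a separate, more conservative floor.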