The Rembo Wizard 2.0
Automatic Operations
How To
by
Petri Mäkijärvi
December 9, 2002
Abstract:
The Rembo Wizard 2.0 is a free plug-in module for the Rembo Toolkit 2.0, a PXE-enabled, pre-OS platform for managing the system hard disk of Windows and Linux PCs.
This document explains how a system administrator can use The Rembo Wizard’s various automatic operations. For example, it explains how to configure a system so that, when a failed hard disk is replaced, the operating system is installed automatically when the system is switched back on.
AutoBoot – automatic system start-up
Unattended - once-only installation of the base image
Additional data partition and automated operations
AutoRepair - base image reinstallation if a new hard disk detected
AutoFix - compare and fix the system partition
AutoBackup - make a new base image at each reboot
Server side shared files expansion pitfall with AutoBackup
The Rembo Wizard is a free plug-in module for the Rembo Toolkit.
From the Main Menu, follow Start Admin operations -> The Rembo Wizard Configuration. The configuration dialog has a section Automated operations, which will allow you to select from the following automated operations that The Rembo Wizard can do for you:
Automatic System Start-Up
Typically always on, this automated operation starts the operating system automatically, with the possibility to activate The Rembo Wizard if needed.
Unattended Installation
Installing a server cluster? Want to re-install a client’s system without leaving your office? Unattended installation is the tool for you.
Automatic Installation After A Hard Disk Failure
Your industrial computer’s hard disk breaks down in the middle of the night. Luckily there is a spare disk and a technician is available to change it. But how do you restore the system on it? The Rembo Wizard can help.
Fixing A Public Access System To A Known State
Organizing a conference? Everybody wants to read their e-mail on your public access workstations, but each evening the machines are in a sorry state. Let The Rembo Wizard fix them for you automatically.
AutoBoot – automatic system start-up
You can set a Delay for automatic boot of the operating system. The delay is by default 10 seconds. It is worthwhile to note that this delay is used for all automatic operations. This gives you a “panic switch” in the case of an unattended installation, for example.
By enabling AutoBoot, you can use The Rembo Wizard as a simple boot loader that works for both Windows and Linux.
For Windows the boot operation is always from the hard disk, using the Master Boot Record (MBR).
For Linux, you can select between booting from the hard disk and booting a network kernel. The hard disk boot is both LILO and GRUB aware. The Master Boot Record is not used for the Linux automated boot. Instead, /etc/lilo.conf is analyzed first and the default boot parameters are extracted. The Rembo Wizard locates the defined kernel (and the initial ram disk, if one is defined) and launches the Linux kernel with the default root file system parameter. If there is no /etc/lilo.conf, then /boot/grub/menu.lst is analyzed in a similar manner.
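The lilo.conf analysis described above can be illustrated with a short Python sketch. This is not The Rembo Wizard's own code. It is a simplified illustration that assumes the common lilo.conf layout (global options first, then `image=` sections with indented per-image options). The function name `parse_lilo_default` and the sample file are hypothetical.

```python
def parse_lilo_default(conf_text):
    """Return kernel, initrd and root of the default Linux entry, or None.

    Simplified sketch: '#' always starts a comment, and only 'image='
    sections (Linux kernels) are considered as boot candidates.
    """
    global_opts = {}
    entries = []
    current = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"')
        if key in ("image", "other"):
            # A new boot entry starts here
            current = {"kind": key, "path": value}
            entries.append(current)
        elif current is None:
            global_opts[key] = value      # option before any entry: global
        else:
            current[key] = value          # option belonging to the entry

    images = [e for e in entries if e["kind"] == "image"]
    if not images:
        return None
    # LILO boots the entry named by 'default', else the first one listed
    wanted = global_opts.get("default")
    chosen = next((e for e in images if e.get("label") == wanted), images[0])
    return {
        "kernel": chosen["path"],
        "initrd": chosen.get("initrd", global_opts.get("initrd")),
        "root": chosen.get("root", global_opts.get("root")),
    }


SAMPLE = """
boot=/dev/hda
default=linux
root=/dev/hda1
image=/boot/vmlinuz-2.4.18
    label=linux
    initrd=/boot/initrd-2.4.18.img
    read-only
image=/boot/vmlinuz.old
    label=old
"""

print(parse_lilo_default(SAMPLE))
```

The three values returned correspond to what the paragraph above says is extracted: the kernel, the optional initial ram disk, and the root file system parameter.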
The Linux network kernel boot supported by The Rembo Wizard is a great way to make sure that a group of machines, such as server clusters and control system computers all work with the same, unique kernel and ram disk. Each time The Rembo Wizard takes a base image, it stores the LILO or GRUB defined default kernel and initial ram disk on the server.
Unattended - once-only installation of the base image
If you and your organization have several machines of the same type to install, The Rembo Wizard Unattended installation tool is your dream come true. Although nothing comes free in this world, with careful preparation of your DHCP-server and the Rembo Server, you can arrange an impressive installation procedure:
Please note that the Unattended installation always works with the base image.
Additional data partition and automated operations
If you have an additional data partition in your installation, The Rembo Wizard creates the partition but it cannot format it. There are two reasons for this:
By design, The Rembo Wizard works only with the system partition, never with any other partition.
If the data partition to create is of type NTFS 3 or greater, you would need a Rembo image in order to format it with Rembo’s DeviceCleanEx().
Suppose that you would still like The Rembo Wizard to format the data partition automatically. You can arrange that by writing your own routine in an autoload file, either at the group level or in the target system’s own autoload file. The following example code covers the case where you have created a FAT32 data partition as the second primary partition of the first system disk. The Rembo Wizard takes care of partitioning the system disk at the primary partition level and of installing the system on the system partition (in this example, partition 1). Then your code is called.
// This is user's code
bool userCodeLoaded;
if (userCodeLoaded)
    goto userCodeEnd;
// Hook at the end of Unattended installation
void UnattendedHook ( void ) {
    // Report progress on the client screen and in the server log
    Printf ("UnattendedHook() executing<br>");
    SysLog( NetInfo.IPAddress+": UnattendedHook() HDClean(0,2)");
    // Format the data partition (disk 0, partition 2)
    HDClean(0,2);
} // UnattendedHook()
userCodeLoaded = true;
userCodeEnd: ;
// end of user's code
Immediately after the above hook has executed, the system is booted, provided that you have the AutoBoot option set. If the data partition is of NTFS type, the hook’s code would be quite complicated, but not impossible, to write. You would need to take a dummy image of the reference system’s data partition. You would then use the dummy image to obtain the structure required by the Rembo function DeviceCleanEx(), which must be used instead of HDClean() for NTFS file systems.
All systems that have a persistent variable of the following type stored in their host-level autoload file can take advantage of the AutoRepair feature of The Rembo Wizard:
AutoRepair - base image reinstallation if a new hard disk detected
str PartitionTable = "NTFS:2048256 EMPTY:0 EMPTY:0 EMPTY:0";
The Rembo Wizard calls the above parameter the disk’s Partition Table Signature. It is created automatically each time you take a base image of the system. The Rembo Wizard also checks the partition table signature each time you enter the configuration dialog on the client machine.
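To make the signature format concrete, here is a minimal Python sketch that splits such a string into its four primary partition slots and compares it with the signature of the disk currently installed. The document does not describe how The Rembo Wizard actually detects a new disk, so the comparison rule is an assumption made for illustration. So are the function names, the reading of the number as a partition size, and the idea that a blank disk reads as four EMPTY:0 slots.

```python
def parse_signature(signature):
    """Split 'TYPE:SIZE TYPE:SIZE ...' into a list of (type, size) tuples."""
    slots = []
    for field in signature.split():
        fs_type, _, size = field.partition(":")
        slots.append((fs_type, int(size)))
    return slots


def needs_autorepair(stored_signature, disk_signature):
    """Assumed rule: reinstall when the disk no longer matches the stored signature."""
    return parse_signature(stored_signature) != parse_signature(disk_signature)


stored = "NTFS:2048256 EMPTY:0 EMPTY:0 EMPTY:0"   # value from the autoload file
blank = "EMPTY:0 EMPTY:0 EMPTY:0 EMPTY:0"         # hypothetical factory-new disk

print(parse_signature(stored))
print(needs_autorepair(stored, blank))    # True: the disk was replaced
print(needs_autorepair(stored, stored))   # False: nothing to do
```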
Once the system is installed and running in normal conditions, it is not likely that the partition table of the system changes. This is especially true in computers that would be ideal candidates for the AutoRepair-operation:
- Nodes of a computer cluster
- Control system computers
Take the example of a node in a large cluster of computers. The node that has suddenly broken down most likely has no screen, keyboard or mouse attached, but it typically has a hot-plug SCSI disk, and in this example it is that disk which has failed. The AutoRepair operation helps us get the system up and running again quickly:
Another example is a control system computer. After a hard disk failure, maintenance personnel can repair it without any particular tools or knowledge of the dedicated system running on it, provided that the system is backed up with The Rembo Wizard.
The AutoRepair operation provides a user-level code hook similar to that of the Unattended installation. The following is example code for the host’s autoload file:
// This is user's code
bool userCodeLoaded;
if (userCodeLoaded)
    goto userCodeEnd;
// Hook at the end of AutoRepair installation
void AutoRepairHook ( void ) {
    // Report progress on the client screen and in the server log
    Printf ("AutoRepairHook() executing<br>");
    SysLog( NetInfo.IPAddress+": AutoRepairHook() HDClean(0,2)");
    // Format the data partition (disk 0, partition 2)
    HDClean(0,2);
} // AutoRepairHook ()
userCodeLoaded = true;
userCodeEnd: ;
// end of user's code
AutoFix - compare and fix the system partition
Typical usage of this feature is with the public access computers and such. For example, you are organizing a conference and you must provide public access computers for people so that they can read their e-mails from their home institute. With the AutoFix-feature of The Rembo Wizard you can make sure that each morning the participants would find a fresh, working installation.
Unlike the Unattended installation, which is a once-only automated installation, the AutoFix feature is executed at every reboot. The system partition is not formatted; instead, its content is compared against the base image:
- Files that exist on the disk but not in the base image are deleted.
- Files that exist in the base image but not on the disk are restored to the disk.
- Files on the hard disk that an MD5 checksum verification shows to differ from those in the base image are replaced on the disk with the original version of each file.
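The three rules above amount to a simple comparison plan, sketched below in Python. This is an illustration of the logic only, not The Rembo Wizard's implementation. The function names are hypothetical, and both the disk and the base image are modelled as in-memory dictionaries mapping a file path to its content.

```python
import hashlib


def md5_of(data):
    """Return the MD5 checksum of a bytes object as a hex string."""
    return hashlib.md5(data).hexdigest()


def plan_autofix(disk_files, image_files):
    """Decide what to do with each file; both arguments map path -> bytes."""
    # Rule 1: on the disk but not in the base image -> delete
    delete = sorted(p for p in disk_files if p not in image_files)
    # Rule 2: in the base image but not on the disk -> restore
    restore = sorted(p for p in image_files if p not in disk_files)
    # Rule 3: present in both but with different MD5 checksums -> replace
    replace = sorted(
        p for p in disk_files
        if p in image_files and md5_of(disk_files[p]) != md5_of(image_files[p])
    )
    return {"delete": delete, "restore": restore, "replace": replace}


disk = {"a.txt": b"same", "b.txt": b"changed", "junk.exe": b"x"}
image = {"a.txt": b"same", "b.txt": b"original", "c.txt": b"missing"}

print(plan_autofix(disk, image))
```

Files that are identical on both sides (a.txt in the sample) appear in none of the three lists, which is why the partition does not need to be formatted.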
Please note that with this procedure the system partition never gets formatted. It would be a good idea to reinstall the system from the base image before the next big conference (in our example) in order to avoid system disk’s fragmentation.
The AutoFix operation provides a user-level code hook similar to that of the Unattended installation. The following is example code for the host’s autoload file:
// This is user's code
bool userCodeLoaded;
if (userCodeLoaded)
    goto userCodeEnd;
// Hook at the end of AutoFix operation
void AutoFixHook ( void ) {
    // Report progress on the client screen and in the server log
    Printf ("AutoFixHook() executing<br>");
    SysLog( NetInfo.IPAddress+": AutoFixHook() executing");
} // AutoFixHook ()
userCodeLoaded = true;
userCodeEnd: ;
// end of user's code
AutoBackup - make a new base image at each reboot
Apart from the obvious use of forcing a backup of a user’s computer, this feature was created for a more complex purpose. Nothing prevents you from using AutoBackup in single-shot cases, but what follows explains cluster-level synchronization using both the AutoBackup and AutoFix features of The Rembo Wizard.
Let’s suppose that you have a cluster of identical computers. They can be general purpose login servers, a Beowulf cluster, an OpenMosix cluster, a group of control system computers, or such. You may have originally set up all the computers to have an equal configuration using the disk cloning techniques provided by The Rembo Wizard and you may even use network kernel booting to assure the identical configuration of all the computers in the cluster.
Typically the configuration of a cluster evolves: new user logins are defined, common shell resources are modified, new NFS mount points are created, new drivers are installed, and much more. You may already have a script-based mechanism to keep the systems in an identical state. But you should ask yourself the following questions:
The solution to the above problems is to use The Rembo Wizard’s AutoBackup feature together with the AutoFix (case 1) or with the AutoRepair (case 2) features. You would select one machine on the cluster as the reference machine, on which you would do all the necessary maintenance modifications. On this machine you would use AutoBackup-feature. All other machines would be set to use AutoFix or AutoRepair-features.
In the above picture, Node 1 is the reference machine of the cluster of computers. Periodically we take a new base image of its system partition using the AutoBackup-feature. All other nodes, from Node 2 to Node n are periodically fixed to the same state as the reference node using the AutoFix-feature.
Of course, you cannot fix the other systems from the base image while The Rembo Wizard is creating the base image. The most obvious solution to this would be to reboot the reference machine Node 1 24 hours before the other nodes of the system. This would give you the advantage of being alerted by The Rembo Wizard if something goes wrong in the base image creation before other nodes are applying the base image.
This can be a problem on many computer clusters where maintenance rebooting of systems is scheduled for once per entire cluster. The problem can be resolved with a configuration which is also the most secure one:
Can I set both AutoFix and AutoRepair on? Yes, it is possible, but only by editing the autoload file by hand. The configuration dialog allows you to select only one automatic operation at a time. When both AutoFix and AutoRepair are selected, AutoRepair takes priority over AutoFix. If the disk has been changed and is therefore AutoRepair’ed, it will also be AutoFix’ed (uselessly, of course).
The AutoBackup operation provides a user-level code hook similar to that of the Unattended installation. The following is example code for the host’s autoload file:
// This is user's code
bool userCodeLoaded;
if (userCodeLoaded)
    goto userCodeEnd;
// Hook at the end of AutoBackup operation
void AutoBackupHook ( void ) {
    // Report progress on the client screen and in the server log
    Printf ("AutoBackupHook() executing<br>");
    SysLog( NetInfo.IPAddress+": AutoBackupHook() executing");
} // AutoBackupHook ()
userCodeLoaded = true;
userCodeEnd: ;
// end of user's code
Server side shared files expansion pitfall with AutoBackup
As with any Rembo Toolkit 2.0 or higher server, the shared files repositories keep expanding on the server. On systems where The Rembo Wizard’s System Snapshot is taken manually, this can become a problem in the long term, and the server’s file system must be monitored. With the AutoBackup feature, the expansion of the shared files repository becomes almost programmed, so you have to analyze the repository’s growth on a regular basis and take action accordingly. The following steps show how the Rembo Toolkit server is used to check and fix its own file system.
Stop the Rembo Toolkit server.
/etc/init.d/rembo stop
The following commands keep the server disabled, so your client computers should not be rebooted while the shared file system is being repaired. Press Ctrl+C to stop the application when the Rembo Toolkit server says “listening on socket…”.
Run the Rembo Toolkit’s equivalent of fsck, first in report-only mode:
cd /usr/local/rembo
./rembo -d -v 3 -chkshared
If there are no serious warnings, you can run the repair mode on the shared file system:
./rembo -d -v 3 -fixshared
You can get an idea of what would be gained by packing the shared file system by issuing the command:
./rembo -d -v 3 -statshared
Finally, if you consider the gains worth the small risk involved (something can go wrong, so you must have a recent backup of the shared file system), you can pack the shared file system:
./rembo -d -v 3 -packshared
Start the Rembo Toolkit server.
/etc/init.d/rembo start