Deliver high availability with a Xen virtual server

These steps help you implement Xen virtual server high availability using the IBM System Director Virtual Availability Manager. They are based on IBM BladeCenter HS21 blade servers and IBM System Storage DS4300 shared storage.

Prerequisites

To start, you need three HS21 blade servers with SUSE Linux 10 SP1 installed. The shared storage should be configured for two of the three blade servers, blade2 and blade3, and should be larger than 19 GB.

Prepare the environment for the host server

Follow these steps to prepare the environment for your host server (a shell sketch for verifying the setup follows the list):

    1. Install the prerequisite software packages from the SUSE 10 SP1 installation image:

        • heartbeat, version 2.0.8 or later

        • ocfs2-tools, version 1.2.2 or later

        • evms-ha, version 2.5.5 or later

        • python-curses, version 2.4.2 or later

    2. If you install these packages from other sources, be sure to get the correct versions.

    3. Install a Xen kernel newer than version 2.6.16.53; that is the version that includes the patch for the ocfs2 package.

    4. Configure the firewall:

        • For the Virtual Availability Manager

        • To allow the NFS Server and SLP Daemon services

        • To allow TCP ports 3268, 3269, 15988, 15989, and 6988, and UDP port 2407

    5. Disable the Service Location Protocol (SLP) system service.

    6. Check the shared storage by running the command ls /dev/mapper. The result should look like this:

              3600a0b800017939900003573484f4be0  3600a0b800017939900003573484f4be0-part1  3600a0b800017939900003573484f4be0-part4  3600a0b800017939900003573484f4be0-part5  control

       If the result doesn't look like this, multipathing is not enabled. Run the commands chkconfig boot.multipath on and chkconfig multipathd on to enable it.

    7. Check that the two hosts can recognize each other. If the host names are configured in DNS, configure the correct DNS server for the two hosts. If the host names are configured manually, edit the file /etc/hosts on both hosts so that each host has one record for itself and another record for the other host.

    8. The date and time settings on the two hosts blade2 and blade3 should be the same as on the server blade1; otherwise, the IBM Director server will assume they are offline.
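
To double-check the host preparation from a shell, here is a minimal sketch that runs the checks described above on one host. It uses only commands mentioned in this section plus standard tools (rpm, grep, ssh, date), and the host names blade1, blade2, and blade3 are the ones used throughout this article:

          # Verify the prerequisite package versions (they should match or exceed the versions listed in step 1)
          rpm -q heartbeat ocfs2-tools evms-ha python-curses

          # The multipath devices for the shared storage should be visible
          ls /dev/mapper

          # Both multipath services should be enabled at boot
          chkconfig --list boot.multipath multipathd

          # Each host should be able to resolve the other one
          grep -E 'blade2|blade3' /etc/hosts

          # Compare the local time with blade1; a large difference makes IBM Director treat the host as offline
          date; ssh blade1 date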

You have successfully prepared the host environment; now let's install the Virtual Availability Manager-related software.

Install the Virtual Availability Manager software

To install the IBM Director Virtual Availability Manager-related software:

    1. Install the management software on the server blade1: IBM Director Server 5.20.2 and IBM Director Virtualization Manager 1.2 Server. The Director Virtualization Manager includes the Virtual Availability Manager server.

    2. Install the agent software on the host servers blade2 and blade3: the IBM Director agent (core services) version 5.20.2 and IBM Director Virtualization Manager 1.2 Agent, which includes the Virtual Availability Manager agent. When installing the Virtualization Manager Agent on the host servers, choose to create the master image later, and do not clone the physical server to a virtual server.

Create the high availability (HA) farm

To create the HA farm, do the following:

    1. Add the hosts to the IBM Director Server and request access to them. Then check the attributes of the hosts: the objects should be online, the agent level should be Level 1, and the supported protocols should include SSH and CIM.

      1. If a host is offline or does not support the CIM protocol, make sure the cimserverd service is running on that host with the command service cimserverd status. If cimserverd is stopped, start it.

      2. If a host's agent level is Level 0, temporarily disable the firewall and re-add the host to the IBM Director Server.

    2. From the Director Console, choose the task Virtual Servers and Hosts; this opens the IBM System Director Virtualization Manager Web interface.

    3. Choose the task Create Virtual Farm under the Hardware and Software node, and then:

        1. Enter the virtual farm name, such as HAFarm.

        2. Choose High Availability with workload management as the farm capability.

        3. Choose the host blade2 as the initial host.

        4. Enter the storage WWN as the shared storage.

        5. Choose Restart as the farm policy.

        6. You will get a timeout error because the Virtual Availability Manager cannot complete the creation within the timeout period.

        7. Check the log file /opt/ibm/director/am/logs/aminfr.log on the host blade2; the HA farm creation process has completed when you see entries like these:

              [07/08/08 13:49:23] startam [INFO]: Exiting startam().
              [07/08/08 13:49:24] amm [INFO]: amm_add_node Exit

        8. After the farm is created, choose the Add Host... task of the HA farm and add the host blade3 to the farm. The addition should succeed, and you should see log entries similar to those in the previous step.

    4. To verify that the HA farm was created successfully (see the sketch after this list):

        • Check the folder /opt/ibm/director/am/mnt/keystore on the two hosts; it should be the same on both hosts and contain four files.

        • Check the folder /opt/ibm/director/am/mnt/clusterdata on the two hosts; it should be the same on both hosts and contain three files.

        • Check the shared storage file system configuration file /etc/ocfs2/cluster.conf; its content should be the same on the two hosts.
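
As a quick way to run these checks (and the log check from step 3) from a shell, here is a minimal sketch. It assumes you run it on blade2 and that blade3 is reachable over ssh; the paths and log messages are the ones quoted in this section:

          # Farm creation is complete once these entries appear in the log
          grep -E 'Exiting startam|amm_add_node Exit' /opt/ibm/director/am/logs/aminfr.log

          # The keystore (four files) and clusterdata (three files) folders should match on both hosts
          ls /opt/ibm/director/am/mnt/keystore /opt/ibm/director/am/mnt/clusterdata
          ssh blade3 ls /opt/ibm/director/am/mnt/keystore /opt/ibm/director/am/mnt/clusterdata

          # The OCFS2 cluster configuration should be identical on both hosts
          ssh blade3 cat /etc/ocfs2/cluster.conf | diff /etc/ocfs2/cluster.conf -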

Create the virtual server

To create the virtual server, do the following (a command-line sketch of steps 1 through 3 follows the list):

    1. Copy the contents of the SUSE 10 SP1 installation image to a folder (for example, /shared/tmp); this folder must be writable.

    2. Run the command /opt/ibm/director/vm/im/suse_inst.py --dest=/shared/vm_master.img --src=/shared/tmp --net=local on the host blade2 to create the master image.

    3. Copy the master image file to the folder /opt/ibm/director/am/mnt/images/masters/.

    4. In the IBM Virtualization Manager Web interface, open the task Create System Template under the Templates and Deployment node, and choose the image you just created as the master image.

    5. Choose the task Create Virtual Server under the Hardware and Software node and create a virtual server on the host blade2, choosing the system template you just created.

    6. You can create more virtual servers by repeating the previous step.
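
Here is a minimal command-line sketch of steps 1 through 3, run on the host blade2. The ISO file name and mount point are assumptions for illustration; the suse_inst.py options are the ones given in step 2:

          # Step 1: make the installation media content available in a writable folder
          mkdir -p /shared/tmp /mnt/iso
          mount -o loop /shared/sles10-sp1.iso /mnt/iso    # assumed location of the SUSE 10 SP1 image
          cp -a /mnt/iso/. /shared/tmp/
          umount /mnt/iso

          # Step 2: build the master image
          /opt/ibm/director/vm/im/suse_inst.py --dest=/shared/vm_master.img --src=/shared/tmp --net=local

          # Step 3: put the master image where the Virtual Availability Manager expects it
          cp /shared/vm_master.img /opt/ibm/director/am/mnt/images/masters/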

That's it for creating the HA farm and virtual server. Now let's test.

Enable and test high availability

To enable HA and test it (a short monitoring sketch follows the list):

    1. Choose the task Activate HA Capabilities to enable high availability. Run the command crm_mon on the two hosts to check the cluster monitor status.

    2. Power off the host blade2. You will see that the virtual server you created migrates to the host blade3, which means the virtual server's HA capability has been successfully implemented.
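
A minimal sketch of how you might watch the failover from a shell; crm_mon comes with the heartbeat package installed earlier, and xm is the standard Xen management tool (the article itself only mentions crm_mon, so the xm check is an assumption):

          # On both hosts: a one-shot view of the cluster status
          crm_mon -1

          # On blade3, after powering off blade2: the migrated virtual server should now be listed
          xm list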

Reset the environment

To reset the environment, do the following (a condensed shell sketch follows the list):

    1. Deactivate the HA farm's High Availability Capabilities.

    2. Remove both hosts from the HA farm.

    3. Delete both hosts and the HA farm from the IBM System Director Virtualization Manager.

    4. Run /opt/ibm/director/am/bin/amDoctor purify on both hosts. If the command fails to complete, reboot the host and try the command again.

    5. Delete the mounted storage from the hosts. Run the command evms_query volumes to query the currently mounted storage. The results should look like this:

              /dev/evms/600a0b8000179399000035794872607e_FS_Volume /dev/evms/sda1 /dev/evms/sda2

    6. Run the command evms, and then use the command dr:/dev/evms/600a0b8000179399000035794872607e_FS_Volume to delete the volume, accepting all the default choices.

    7. Reboot the two hosts again.
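
A condensed shell view of the reset steps, run on each host; the commands and the volume name are the ones quoted above, and the volume deletion in the evms utility is interactive, so the dr: command is shown as a comment:

          # Clean up the Availability Manager state (step 4)
          /opt/ibm/director/am/bin/amDoctor purify

          # List the EVMS volumes that are still present (step 5)
          evms_query volumes

          # Delete the farm volume interactively inside the evms utility (step 6)
          evms
          # then enter:  dr:/dev/evms/600a0b8000179399000035794872607e_FS_Volume

          # Reboot the host (step 7)
          reboot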

Now you're ready to start everything up.

Troubleshooting

If the files in the folders /opt/ibm/director/am/mnt/keystore and /opt/ibm/director/am/mnt/clusterdata differ between the two hosts, or if the contents of the storage configuration file /etc/ocfs2/cluster.conf differ on the two hosts, check whether the two hosts can resolve each other's host name. If they can't, add the other host's name and IP address to the file /etc/hosts.

If you get an error when creating the HA farm, adding hosts to the farm, or activating the HA farm, check the status of the cimserverd service on the two hosts with the command service cimserverd status; the cimserverd service can stop unexpectedly in the current Director agent, and if it has stopped, restart it.
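
Both troubleshooting checks can be run from a shell; here is a minimal sketch, using the host names blade2 and blade3 from the rest of this article:

          # Each host should resolve the other; if not, add a line to /etc/hosts
          grep blade3 /etc/hosts    # run on blade2
          grep blade2 /etc/hosts    # run on blade3

          # Check the CIM server used by the Director agent and restart it if it has stopped
          service cimserverd status
          service cimserverd start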

About the authors

Da Shuang He is a software engineer at the IBM China Development Lab in Shanghai, China. He is currently working on system management software; he focuses on creating self-bootable servers, remote operating system installation, and power management.

Ma Zhuo is a software engineer at the IBM China Development Lab based in Shanghai, China. He is currently working on virtualization solution software; his interests focus on virtualization solutions, system management tools, and CIM and WS-management.

http://www.ibm.com/developerworks/linux/library/l-xenvirt/index.html