Introduction
High availability can, in simple terms, be described as the masking or elimination of both planned and unplanned downtime. PowerHA System Mirror (formerly HACMP) is a high-availability product from IBM. It keeps the cluster highly available by monitoring all of its resources and automatically failing them over to, or falling them back from, other nodes of the cluster when failures occur.
The XD feature of PowerHA System Mirror provides high availability across site-wide failures or large-scale disasters by mirroring business-critical data across sites while enabling failover to the remote site. A site, in PowerHA System Mirror terms, is a location where nodes of the cluster reside; PowerHA System Mirror supports a maximum of two sites.
PowerHA System Mirror/XD uses two different mirroring/replication approaches to achieve its purpose:
Data Replication/Mirroring over TCP/IP
Data Replication using Storage PPRC function
The GLVM and HAGEO features of PowerHA System Mirror/XD are based on mirroring over TCP/IP, while the PPRC (ESS800), SPPRC (ESS800, DS6000, and DS8000), and SVC PPRC features use the storage PPRC function for mirroring across sites. GLVM can also operate in stand-alone mode, in which GLVM handles data mirroring to the other site but any failure at the current site or node has to be handled manually. When the GLVM configuration is integrated with PowerHA System Mirror, failover and recovery after such failures are handled by PowerHA System Mirror.
The Geographic Logical Volume Manager (GLVM) is a new AIX software-based technology (intended as a replacement for HAGEO) that is built upon the AIX Logical Volume Manager and allows a mirror copy of data to be created at a geographically distant location. GLVM relies fully upon LVM for mirroring and is based on the concept of the RPV (Remote Physical Volume), which is explained in a later section of this article.
GLVM supports two modes of mirroring.
1) Synchronous and
2) Asynchronous.
Synchronous Mirroring
GLVM has supported synchronous mirroring since HACMP v5.3, while recovery support in HACMP arrived with HACMP v5.4. The figure below shows the basic concept of RPV and GLVM synchronous mirroring.
Figure 1: Synchronous GLVM
As shown here, Node A is at the production site and Node B is at the disaster/backup site, with TCP/IP connectivity between them. Say 'ex_gmvg' is the GMVG, which resides on PV1 and PV2 of Node A and on PV3 and PV4 of Node B. A volume group that contains one or more RPVs is called a GMVG (Geographically Mirrored Volume Group). How to create a GMVG spanning sites is explained in later sections of this article.
As mentioned, GLVM is based on the concept of the RPV (Remote Physical Volume), which enables LVM at the production site to access disks PV3 and PV4 at the disaster site as if they were locally attached. When an application or file system reads from or writes to a logical volume of ex_gmvg at the production site, LVM at the production site processes the requests for the remote disks by reading from or writing to its RPV clients. Users therefore have to define PV3 and PV4 as RPV clients at the production site. The RPV clients then communicate with the RPV servers at the remote site to perform the actual I/O to PV3 and PV4 (here again, users have to define PV3 and PV4 as RPV servers at the disaster site). The RPV server reads from or writes to the real disk and passes the results back to the RPV client, which in turn passes the results back to LVM.
The figure below gives a clearer view of how GLVM operates.
Figure 2: Synchronous GLVM
It is the RPV device driver of GLVM that allows LVM at the production site to access the disks at the disaster site as if they were locally attached. The RPV device driver consists of two parts. One is the RPV client, a pseudo device on the local system that sends I/O requests to the RPV server. The other is the RPV server, a kernel extension on the remote system that processes I/O requests from a remote RPV client. It is the RPV server that performs the actual I/O to the PVs.
The principle of synchronous mirroring is that a logical volume write is not considered complete until the data has been written to disks at both sites. With synchronous mirroring, both sites therefore hold the same, latest data.
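The write path described above can be sketched as a small Python model. This is only an illustration of the idea, not how the RPV device driver is implemented: the class and function names (Disk, RPVServer, RPVClient, sync_mirrored_write) are hypothetical, disks are modelled as dictionaries, and the TCP/IP round trip is replaced by a direct method call.

```python
class Disk:
    """A physical volume modelled as a block-number -> data mapping."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


class RPVServer:
    """Runs at the remote site and performs the real I/O on its disk."""

    def __init__(self, disk):
        self.disk = disk

    def write(self, block, data):
        self.disk.blocks[block] = data
        return True

    def read(self, block):
        return self.disk.blocks.get(block)


class RPVClient:
    """Pseudo device at the local site; forwards I/O to its RPV server."""

    def __init__(self, server):
        self.server = server

    def write(self, block, data):
        # Stands in for a request sent to the remote site over TCP/IP.
        return self.server.write(block, data)

    def read(self, block):
        return self.server.read(block)


def sync_mirrored_write(local_disk, rpv_client, block, data):
    """Report completion only after both the local and the remote copy are written."""
    local_disk.blocks[block] = data
    remote_ok = rpv_client.write(block, data)
    return remote_ok


pv1 = Disk("PV1")                      # local disk at the production site
pv3 = Disk("PV3")                      # real disk at the disaster site
client = RPVClient(RPVServer(pv3))     # PV3 as seen from the production site
done = sync_mirrored_write(pv1, client, 0, b"payload")
print(done, pv1.blocks == pv3.blocks)
```

In this model the caller is told the write is done only after the RPV server has acknowledged it, which is why both copies are identical as soon as the call returns.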
Advantages
Reads are faster because they can be served from the local disk (both sites have the same data).
No data loss with site failure.
Disadvantages
Reads and writes on remote physical volumes are much slower.
Performance is affected by network latency and network bandwidth.
Asynchronous Mirroring
With asynchronous mirroring, writes are considered complete after the data is written to the local disks. RPV writes are cached at the local site, and mirroring to the remote site takes place in the background. Support for asynchronous mirroring was added in PowerHA System Mirror v5.5.
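A minimal Python sketch of this behaviour follows. It is a conceptual model under simplifying assumptions, not GLVM code: the AsyncMirror class and its method names are hypothetical, both copies are dictionaries, and the background mirroring is an explicit drain() call rather than a separate thread.

```python
from collections import deque


class AsyncMirror:
    """Local copy, remote copy, and a local cache of pending remote writes."""

    def __init__(self):
        self.local = {}
        self.remote = {}
        self.cache = deque()   # pending remote writes, oldest first

    def write(self, block, data):
        """Complete as soon as the local copy and the cache entry are written."""
        self.local[block] = data
        self.cache.append((block, data))
        return True

    def drain(self, max_writes=None):
        """Background step: replay cached writes on the remote copy in order."""
        sent = 0
        while self.cache and (max_writes is None or sent < max_writes):
            block, data = self.cache.popleft()
            self.remote[block] = data
            sent += 1
        return sent


m = AsyncMirror()
m.write(0, "a")
m.write(1, "b")
lagging = m.remote != m.local   # the remote copy is behind until the cache drains
m.drain()
in_sync = m.remote == m.local
print(lagging, in_sync)
```

The window between write() and drain() is the period in which the remote site holds back-level data.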
Before getting into the details of GLVM asynchronous mirroring, it helps to understand the following terms associated with it.
Terms Associated with GLVM Asynchronous mirroring
LVM provides a new feature known as mirror pools, starting with AIX 61D. A "mirror pool" is a user-specified collection of disks that LVM uses to hold one and only one copy of each of the logical volumes in a volume group. Asynchronous mirroring requires the RPV device driver to coordinate the processing of asynchronous write requests across a set of disks in order to maintain data integrity and consistency, and mirror pools provide a convenient way to group disks together for this purpose.
Data divergence is a state in which each site's disks contain data updates that have not been mirrored to the other site. Suppose the production site goes down because of a failure; the disaster site may then hold back-level data, because some writes may not yet have been mirrored. If the production site later comes back up, users have two differing copies of the data.
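The distinction between a site that is merely back-level and two sites that have diverged can be illustrated with a short Python sketch. It assumes, purely for illustration, that every update carries a unique identifier and that the disaster site accepts new writes after taking over; the function name is_divergent is hypothetical and is not part of GLVM.

```python
def is_divergent(site_a_updates, site_b_updates):
    """True when each site holds at least one update the other site lacks."""
    a = set(site_a_updates)
    b = set(site_b_updates)
    return bool(a - b) and bool(b - a)


# Production wrote u1..u3, but only u1 and u2 were mirrored before it failed.
production = ["u1", "u2", "u3"]
disaster = ["u1", "u2"]
back_level_only = is_divergent(production, disaster)

# The disaster site takes over and accepts a new update, u4.
disaster.append("u4")
diverged = is_divergent(production, disaster)
print(back_level_only, diverged)
```

In the first check only one side is missing data, so catching up is enough; in the second, each side holds an update the other lacks, and one of the two copies has to be chosen.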
The decision of whether to mirror synchronously or asynchronously is made at the mirror pool level. Therefore, the user can decide to mirror from the production site to the disaster recovery site asynchronously, and then mirror from the disaster recovery site back to the production site synchronously. This can be accomplished by configuring the mirror pool that contains the disaster recovery site disks as asynchronous while configuring the mirror pool that contains the production site disks as synchronous.
Asynchronous GLVM
Asynchronous mirroring builds upon LVM mirror pools: mirroring happens from one mirror pool, say MP1, to another mirror pool, MP2. Users therefore define the local PVs in one mirror pool (MP1) and the remote disks in the other mirror pool (MP2). As far as the interaction of RPV clients and RPV servers and the replication of data over TCP/IP are concerned, GLVM works the same way in synchronous and asynchronous modes. The difference lies only in how reads and writes are handled. With asynchronous mirroring, users may end up with two copies of the data because of data divergence. In that case, GLVM lets users select either the copy at the production site or the copy at the disaster site. With PowerHA System Mirror, this choice can be automated so that the copy at production site A is always chosen when that site comes back.
Advantages
Application response time can be improved.
Bursts of heavy writes are absorbed by the cache.
Disadvantages
The remote site can hold back-level data, so data loss might happen if the local site goes down suddenly.
Possibility of data divergence.
GMVG Creation
GMVGs can be created through "smitty storage" or with AIX commands. The AIX commands are given below; when using smitty storage, set the corresponding attributes to create the GMVGs.
Steps to Create Sync GMVG
To create Sync GMVG without mirror pools
To Create Volume Group(GMVG),Logical Volumes and Filesystems
mkvg -f -y sgmvg hdisk6 hdisk16
extendvg -f sgmvg hdisk22 hdisk23
mklv -t jfs2 -y slv -s s -u 1 sgmvg 10
mklv -t jfs2log -y slv_log -s s -u 1 sgmvg 10
crfs -v jfs2 -A no -m /sfs -d slv -a logname=slv_log
Create Mirror Copy on RPVs
To Create Mirror Copy on RPVs
mirrorvg -c 2 sgmvg
A sync GMVG can also be created with mirror pools on a scalable volume group, as follows.
To create Sync GMVG with mirror pools
To Create Volume Group(GMVG),Logical Volumes and Filesystems
mkvg -f -S -y sgmvg hdisk6 hdisk16
chpv -p mp1 hdisk6 hdisk16
extendvg -f -p mp2 sgmvg hdisk22 hdisk23
mklv -t jfs2 -y slv -p copy1=mp1 -b n -s s -u 1 sgmvg 10
mklv -t jfs2log -y slv_log -p copy1=mp1 -b n -s s -u 1 sgmvg 10
crfs -v jfs2 -A no -m /sfs -d slv -a logname=slv_log
Create Mirror Copy on RPVs
To Create Mirror Copy on RPVs
mirrorvg -c 2 -p copy2=mp2 sgmvg
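When the same sequence has to be repeated for other volume groups or disks, it can help to generate the command lines from parameters and review them before running anything. The Python sketch below only builds and prints the command strings shown in the listing above; it does not execute them. The function name sync_gmvg_commands and its parameters are hypothetical helpers, not part of AIX or PowerHA System Mirror.

```python
def sync_gmvg_commands(vg, local_disks, remote_disks, lv, log_lv, mount_point,
                       local_pool="mp1", remote_pool="mp2", lps=10):
    """Return the command lines for a sync GMVG with mirror pools (dry run only)."""
    local = " ".join(local_disks)
    remote = " ".join(remote_disks)
    return [
        # Scalable volume group on the local disks, placed in the local mirror pool
        f"mkvg -f -S -y {vg} {local}",
        f"chpv -p {local_pool} {local}",
        # Add the RPV client disks to the remote mirror pool
        f"extendvg -f -p {remote_pool} {vg} {remote}",
        # Logical volume, log volume and file system on the first copy
        f"mklv -t jfs2 -y {lv} -p copy1={local_pool} -b n -s s -u 1 {vg} {lps}",
        f"mklv -t jfs2log -y {log_lv} -p copy1={local_pool} -b n -s s -u 1 {vg} {lps}",
        f"crfs -v jfs2 -A no -m {mount_point} -d {lv} -a logname={log_lv}",
        # Second copy on the remote mirror pool
        f"mirrorvg -c 2 -p copy2={remote_pool} {vg}",
    ]


cmds = sync_gmvg_commands("sgmvg", ["hdisk6", "hdisk16"], ["hdisk22", "hdisk23"],
                          lv="slv", log_lv="slv_log", mount_point="/sfs")
for line in cmds:
    print(line)
```

With the values used in this article, the printed lines match the listing above one for one.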
Now we will see the requirements for Async Mirroring
Requirements for Async Mirroring
Supported only on AIX 6.1 Technology Level 2 and later.
The volume group must be a scalable volume group (SVG).
Super strict mirror pools must be enabled.
The volume group must be non-concurrent.
Auto-varyon and bad block relocation must be disabled.
An async GMVG cannot contain a paging-type device.
The volume group cannot be a snapshot volume group.
rootvg cannot be configured for asynchronous mirroring.
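The requirements above can be treated as a checklist. The following Python sketch checks a plain description of a volume group against them; the dictionary keys and the function name async_requirement_violations are assumptions made for this example and do not correspond to any AIX command output.

```python
def async_requirement_violations(vg):
    """Return the requirements a volume group description fails to meet."""
    checks = [
        (vg["aix_level"] >= (6, 1, 2), "AIX 6.1 TL2 or later is required"),
        (vg["scalable"], "volume group must be a scalable volume group"),
        (vg["super_strict_mirror_pools"], "super strict mirror pools must be enabled"),
        (not vg["concurrent"], "volume group must be non-concurrent"),
        (not vg["auto_varyon"], "auto-varyon must be disabled"),
        (not vg["bad_block_relocation"], "bad block relocation must be disabled"),
        (not vg["has_paging_device"], "paging-type devices are not allowed"),
        (not vg["snapshot"], "snapshot volume groups are not allowed"),
        (vg["name"] != "rootvg", "rootvg cannot be mirrored asynchronously"),
    ]
    return [message for ok, message in checks if not ok]


agmvg = {
    "name": "agmvg",
    "aix_level": (6, 1, 2),          # (version, release, technology level)
    "scalable": True,
    "super_strict_mirror_pools": True,
    "concurrent": False,
    "auto_varyon": False,
    "bad_block_relocation": False,
    "has_paging_device": False,
    "snapshot": False,
}
print(async_requirement_violations(agmvg))
```

An empty list means the description satisfies every item in the checklist.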
Now that the requirements for async mirroring are known, let us get into the details of creating the GMVG.
Steps to Create Async GMVG
Create Volume Group(GMVG),Logical Volumes and Filesystems
To Create Volume Group(GMVG),Logical Volumes and Filesystems
mkvg -f -S -y agmvg hdisk6 hdisk16
chpv -p mp1 hdisk6 hdisk16
extendvg -f -p mp2 agmvg hdisk22 hdisk23
mklv -t jfs2 -y alv -p copy1=mp1 -b n -s s -u 1 agmvg 10
mklv -t jfs2log -y alv_log -p copy1=mp1 -b n -s s -u 1 agmvg 10
crfs -v jfs2 -A no -m /afs -d alv -a logname=alv_log
Create Mirror Copy on RPVs
To Create Mirror Copy on RPVs
mirrorvg -c 2 -p copy2=mp2 agmvg
Create aio_cache Logical Volumes
Asynchronous GLVM mirroring requires a new type of logical volume for caching asynchronous write requests. This logical volume should not be mirrored across sites, and super strict mirror pools handle the new aio_cache logical volume type as a special case.
To Create aio_cache Logical Volumes
mklv -t aio_cache -y mp1_cache -p copy1=mp1 -b n agmvg 10 hdisk16
mklv -t aio_cache -y mp2_cache -p copy1=mp2 -b n agmvg 10 hdisk23
Enable super strict mirror pools and turn off auto-varyon and bad block relocation. The user must configure the volume group to use super strict mirror pools.
To enable super strict mirror pools and turn off auto-varyon and bad block relocation
chvg -b n -a n -M s agmvg
Configure Async properties
The high water mark specifies how full the cache can become before new write requests have to wait for mirroring to catch up. This value is expressed as a percentage of the I/O cache size.
To Set Async Properties
chmp -A -h 80 -m mp2 agmvg
chmp -A -h 50 -m mp1 agmvg
lsmp -A agmvg
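The effect of the high water mark set above can be illustrated with a small Python model. It is a sketch only: the BoundedCache class is hypothetical, and the threshold is counted in cached write requests purely to keep the example short, which is not the unit that chmp uses.

```python
from collections import deque


class BoundedCache:
    """Write cache that stops accepting requests at its high water mark."""

    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark
        self.pending = deque()

    def try_write(self, item):
        """Accept the write if the cache is below the mark; otherwise the caller must wait."""
        if len(self.pending) >= self.high_water_mark:
            return False
        self.pending.append(item)
        return True

    def mirror_one(self):
        """Background mirroring sends the oldest cached write and frees its slot."""
        return self.pending.popleft() if self.pending else None


cache = BoundedCache(high_water_mark=2)
results = [cache.try_write(w) for w in ("w1", "w2", "w3")]
cache.mirror_one()                 # mirroring catches up by one write
retry = cache.try_write("w3")
print(results, retry)
```

The third write is refused while the cache sits at its mark and succeeds once mirroring has freed a slot, which is the waiting behaviour that a lower high water mark makes more frequent.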
Basic Steps of GLVM Configuration (common for both sync and Async)
Install PowerHA System Mirror/XD (GLVM filesets).
To configure the RPV site name and RPV devices, use smitty glvm_utils.
Configure the RPV site name
To define RPV Site name
Do smitty glvm_utils
-> Remote Physical Volume Servers
-> Remote Physical Volume Server Site Name Configuration
-> Define / Change / Show Remote Physical Volume Server Site Name
Configure RPV servers
To create RPV Servers
Do smitty glvm_utils
-> Remote Physical Volume Servers
-> Add Remote Physical Volume Servers
Configure RPV clients
To create RPV Clients
Do smitty glvm_utils
-> Remote Physical Volume Clients
-> Add Remote Physical Volume Clients
Create the GMVG on the RPV devices. Make sure that the RPV clients are up on the node where you create the GMVG and that the RPV servers to which these RPV clients point are up on the other site. Also, the RPV devices should not be up on the remaining nodes of the cluster.
Refer to the GMVG Creation section above for the steps to create a sync or async GMVG.
Import the GMVG onto the remaining nodes. Make sure that the RPV clients and RPV servers are up on the appropriate nodes while importing. Also, the RPV devices should not be up on the remaining nodes of the cluster.
Basic PowerHA System Mirror Configuration Steps
PowerHA System Mirror site names must be the same as the RPV site names.
Configure the XD_data network.
Use node-bound service-ip or persistent service-ip labels.
Set the forced varyon option to true for GMVGs.
Set "allow varyon with missing data updates" to true.
Set the default choice for data divergence recovery (siteA).