Pacemaker is a cluster resource manager: the logic responsible for the life cycle of deployed software (and, indirectly, perhaps whole systems or their interconnections) under its control within a set of computers (nodes), driven by prescribed rules. It achieves maximum availability for your cluster services (resources) by detecting and recovering from node- and resource-level failures, using the messaging and membership capabilities of your preferred cluster infrastructure (either Corosync or Heartbeat), and possibly other parts of the overall cluster stack.
Install the required software
yum install -y pacemaker pcs psmisc policycoreutils-python
Configure the Cluster Software
On each node allow the cluster software through the firewall
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload
On each node enable the PCS Daemon
systemctl start pcsd.service
systemctl enable pcsd.service
On each node set the password for the hacluster user
passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
On either node, authenticate with the hacluster user
pcs cluster auth gfs1 gfs2
On the same node, generate and sync the cluster configuration
pcs cluster setup --name mycluster gfs1 gfs2
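The setup steps above could be collected into a small script. The following is a minimal sketch, not part of pcs: the run helper, the DRY_RUN variable, and the function names are hypothetical. By default it only prints the commands so they can be reviewed first. The hacluster password still has to be set interactively with passwd on each node.

```shell
#!/bin/sh
# Sketch only: run(), DRY_RUN, prepare_node() and create_cluster() are
# illustrative names, not part of pcs or Pacemaker.

# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Steps to repeat on every node.
prepare_node() {
    run firewall-cmd --permanent --add-service=high-availability
    run firewall-cmd --reload
    run systemctl start pcsd.service
    run systemctl enable pcsd.service
}

# Steps to run once, on one node (node and cluster names from this guide).
create_cluster() {
    run pcs cluster auth gfs1 gfs2
    run pcs cluster setup --name mycluster gfs1 gfs2
}

prepare_node
create_cluster
```

Running it as is prints each command prefixed with "+ ". Running it with DRY_RUN=0 in the environment executes the commands instead.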
Useful commands
pcs - Tool used to configure and administer Pacemaker
pcs category help - Used to get help on categories like stonith and resource
pacemakerd --features - List the available cluster stacks
pcs status - Status of the cluster
pcs resource standards - List available resource standards
pcs resource providers - List available resource providers
pcs resource agents ocf:heartbeat - List resource agents specific to an ocf provider
Starting a cluster
It is recommended not to enable Corosync and Pacemaker to start at boot. If a node fails and reboots, you can then troubleshoot it and bring it back into the cluster cleanly.
On all nodes
pcs cluster start --all
On individual nodes
pcs cluster start
Or
systemctl start corosync.service
systemctl start pacemaker.service
Verify Corosync and Pacemaker Installation
corosync-cfgtool -s
corosync-cmapctl | grep members
ps axf | grep -E "corosync|pacemaker"
pcs status
journalctl | grep -i error
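Before running the checks above, it can help to confirm that the tools are actually on the PATH. This is a minimal sketch, assuming a POSIX shell. check_tools is a hypothetical helper name, not a cluster command.

```shell
#!/bin/sh
# Sketch only: check_tools is an illustrative helper, not part of the cluster stack.

# Report, for each name given, whether the shell can find it.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Tools used in the verification steps of this guide.
check_tools corosync-cfgtool corosync-cmapctl pcs journalctl
```

A "missing" line usually points back to the package installation step rather than to a cluster problem.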
Before making any changes, check that the cluster configuration is valid
crm_verify -L -V
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
To guarantee data safety, STONITH is enabled by default, but as you can see above, we have not configured it yet. For now, we will disable STONITH and configure it later.
pcs property set stonith-enabled=false
crm_verify -L
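To confirm that the errors are gone, the check can be scripted. The sketch below assumes that crm_verify reports problems on lines starting with "error:", as in the output shown earlier. count_errors is a hypothetical helper name, not a Pacemaker command.

```shell
#!/bin/sh
# Sketch only: count_errors is an illustrative helper, not a Pacemaker command.

# Read crm_verify output on stdin and print how many lines start with "error:".
count_errors() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that status.
    grep -c '^[[:space:]]*error:' || true
}

# Only query the cluster when crm_verify is installed.
if command -v crm_verify >/dev/null 2>&1; then
    # 2>&1 captures the messages whether they go to stdout or stderr.
    crm_verify -L -V 2>&1 | count_errors
fi
```

A count of 0 after setting stonith-enabled=false suggests the configuration now validates.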
Add a floating IP address that users can target
pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.56.21 cidr_netmask=32 op monitor interval=30s
Verify the IP resource has been added.
pcs status
Test failover of the IP resource
pcs status
pcs cluster stop gfs1
Check the status on both nodes. The cluster should be stopped on gfs1, and the resource should be running on gfs2.
pcs status
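The failover check can also be done by extracting the online nodes from the status output. This is a sketch under one assumption: that pcs status prints a line of the form "Online: [ gfs1 gfs2 ]", which may differ between versions. online_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: online_nodes is an illustrative helper, and the
# "Online: [ ... ]" line format is an assumption about the pcs status output.

# Read status text on stdin and print the online node names, one per line.
online_nodes() {
    sed -n 's/^[ *]*Online: \[ \(.*\) \].*$/\1/p' | tr ' ' '\n'
}

# Only query the cluster when pcs is installed.
if command -v pcs >/dev/null 2>&1; then
    pcs status 2>&1 | online_nodes
fi
```

After stopping gfs1, running this on gfs2 should list only gfs2 if the assumed format holds.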
By default, Pacemaker does not prevent resources from moving between nodes, so it may move the resource back when the other server comes back up. For more control, we can set resource stickiness, which tells Pacemaker that we prefer resources to stay where they are.
pcs resource defaults resource-stickiness=100
pcs resource defaults
resource-stickiness: 100
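The configured value can be read back in a script. This is a minimal sketch that assumes the "resource-stickiness: 100" output format shown above, which other pcs versions may print differently. stickiness_value is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: stickiness_value is an illustrative helper, and the
# "resource-stickiness: N" format is taken from the output shown in this guide.

# Read "pcs resource defaults" output on stdin and print the stickiness value.
stickiness_value() {
    sed -n 's/^[[:space:]]*resource-stickiness:[[:space:]]*//p'
}

# Only query the cluster when pcs is installed.
if command -v pcs >/dev/null 2>&1; then
    pcs resource defaults 2>&1 | stickiness_value
fi
```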