Pacemaker

Overview

Pacemaker is a cluster resource manager: the logic responsible for the life cycle of deployed software (and, indirectly, perhaps whole systems or their interconnections) under its control within a set of computers (a.k.a. nodes), driven by prescribed rules. It aims for maximum availability of your cluster services (a.k.a. resources) by detecting and recovering from node-level and resource-level failures. To do this it uses the messaging and membership capabilities provided by your preferred cluster infrastructure (either Corosync or Heartbeat), and possibly other parts of the overall cluster stack.

Installation and Initial Configuration

Install the required software

yum install -y pacemaker pcs psmisc policycoreutils-python

Configure the Cluster Software

On each node allow the cluster software through the firewall

firewall-cmd --permanent --add-service=high-availability

firewall-cmd --reload

On each node enable the PCS Daemon

systemctl start pcsd.service

systemctl enable pcsd.service

On each node set the password for the hacluster user

passwd hacluster

Changing password for user hacluster.

New password:

Retype new password:

passwd: all authentication tokens updated successfully.
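The firewall and pcsd steps above have to be repeated on each node, so they could be driven from one machine with a small loop. This is only a sketch: it assumes root SSH access to both nodes and the host names gfs1 and gfs2 used later on this page, and the function name run_on_nodes is made up for illustration. Setting the hacluster password is left out because passwd prompts interactively.

```shell
# Hypothetical helper: run one command on every cluster node over SSH.
# Assumes root SSH access to each node; adjust NODES to match your hosts.
NODES="gfs1 gfs2"

run_on_nodes() {
    for node in $NODES; do
        echo "== $node =="
        # Stop at the first node where the command fails.
        ssh "root@$node" "$1" || return 1
    done
}

# Example usage (commented out so that sourcing this file changes nothing):
# run_on_nodes "firewall-cmd --permanent --add-service=high-availability && firewall-cmd --reload"
# run_on_nodes "systemctl start pcsd.service && systemctl enable pcsd.service"
```

Stopping at the first failure keeps a half-configured node from going unnoticed.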

Configure Corosync

On either node, authenticate with the hacluster user

pcs cluster auth gfs1 gfs2

On the same node, generate and synchronize the Corosync configuration

pcs cluster setup --name mycluster gfs1 gfs2

Commands

pcs - Tool used to configure and administer Pacemaker

pcs category help - Used to get help on categories like stonith and resource

pacemakerd --features - List available cluster stacks

pcs status - Status of the cluster

pcs resource standards - List available resource standards

pcs resource providers - List available resource providers

pcs resource agents ocf:heartbeat - List resource agents specific to an ocf provider

Starting a cluster

It is recommended not to start Corosync and Pacemaker automatically at boot. If a node fails and reboots, this gives you the chance to troubleshoot it and bring it back into the cluster cleanly.
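To see whether the two services are currently set to start at boot, something like the following could be used. It is a minimal sketch; the function name autostart_report is invented here, and it only inspects the local node.

```shell
# Hypothetical helper: show whether the cluster services start at boot.
autostart_report() {
    for unit in corosync.service pacemaker.service; do
        # "systemctl is-enabled" prints a state such as "enabled" or "disabled".
        state=$(systemctl is-enabled "$unit" 2>/dev/null)
        echo "$unit: ${state:-unknown}"
    done
}

# To turn boot-time start off on the local node:
# systemctl disable corosync.service pacemaker.service
```

Run autostart_report on each node; both units should report disabled if you follow the recommendation above.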

On all nodes

pcs cluster start --all

On individual nodes (either pcs cluster start, or start the two services directly)

pcs cluster start

systemctl start corosync.service

systemctl start pacemaker.service

Verify Corosync and Pacemaker Installation

corosync-cfgtool -s

corosync-cmapctl | grep members

ps axf | egrep "corosync|pacemaker"

pcs status

journalctl | grep -i error
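The process check above can also be wrapped in a small function that gives a yes or no answer per daemon. This is a sketch under the assumption that the daemons appear under the process names corosync and pacemakerd; the function name check_cluster_daemons is made up for illustration.

```shell
# Hypothetical helper: confirm that the corosync and pacemakerd daemons are running.
# Returns 0 if both are running, 1 otherwise.
check_cluster_daemons() {
    status=0
    for daemon in corosync pacemakerd; do
        # pgrep -x matches the process name exactly.
        if pgrep -x "$daemon" >/dev/null; then
            echo "$daemon: running"
        else
            echo "$daemon: NOT running"
            status=1
        fi
    done
    return $status
}
```

The return code makes it usable in scripts, for example check_cluster_daemons || echo "cluster stack is not fully up".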

Create an Active/Passive Cluster

Before we begin, check the validity of the cluster configuration

crm_verify -L -V

   error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined

   error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option

   error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity

Errors found during check: config not valid

To guarantee data safety, STONITH is enabled by default, but as you can see above we have not configured any STONITH resources yet. For now we will disable STONITH and configure it later.

pcs property set stonith-enabled=false

crm_verify -L

Add a floating IP address that users can target

pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.56.21 cidr_netmask=32 op monitor interval=30s

Verify the IP resource has been added.

pcs status

Test failover of the IP resource

pcs status

pcs cluster stop gfs1

Check the status on both nodes. The cluster should be stopped on gfs1 and the resource should be running on the other node.

pcs status
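When repeating the failover test, it can help to pull out just the node that holds the resource. The sketch below filters pcs status output; it assumes the resource line contains the resource name and ends with "Started" followed by the node name, which may differ between pcs versions. The function name resource_node is made up for illustration.

```shell
# Hypothetical helper: print the node that a resource is reported as started on.
# Reads "pcs status" text on standard input. Assumes the resource line contains
# the resource name and ends with "Started <node>"; other pcs versions may differ.
resource_node() {
    awk -v name="$1" 'index($0, name) && $(NF-1) == "Started" { print $NF }'
}

# Example usage on a cluster node:
# pcs status | resource_node VirtualIP
```

If the resource is stopped, or the output format does not match the assumption, the function prints nothing.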

By default, Pacemaker does not prevent resources from being moved around, so it could move the resource again when the other server is back up. For more control, we can set resource stickiness, which tells Pacemaker that we prefer resources to stay where they are.

pcs resource defaults resource-stickiness=100

pcs resource defaults

  resource-stickiness: 100
