Ensure the required firewall ports are open.
# firewall-cmd --add-service=glusterfs --permanent
# firewall-cmd --add-service=glusterfs
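If your firewalld version does not ship a glusterfs service definition, you can open the ports directly instead. This is a sketch based on the GlusterFS defaults (24007-24008 for management, one port per brick starting at 49152), so adjust the brick range to suit:

# firewall-cmd --permanent --add-port=24007-24008/tcp
# firewall-cmd --permanent --add-port=49152-49156/tcp
# firewall-cmd --reload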
# echo >> "192.168.56.1 gfs1 gfs1.grow4.co.uk"# echo >> "192.168.56.2 gfs2 gfs2.grow4.co.uk"Install GlusterFS on your servers and enable the service to start on boot
Install GlusterFS on your servers and enable the service to start on boot:

# yum install glusterfs-server
...
Dependencies Resolved
================================================================================
 Package                   Arch    Version       Repository  Size
================================================================================
Installing:
 glusterfs-server          x86_64  3.8.15-2.el7  gluster     1.4 M
Installing for dependencies:
 glusterfs                 x86_64  3.8.15-2.el7  gluster     512 k
 glusterfs-api             x86_64  3.8.15-2.el7  gluster      90 k
 glusterfs-cli             x86_64  3.8.15-2.el7  gluster     184 k
 glusterfs-client-xlators  x86_64  3.8.15-2.el7  gluster     784 k
 glusterfs-fuse            x86_64  3.8.15-2.el7  gluster     134 k
 glusterfs-libs            x86_64  3.8.15-2.el7  gluster     380 k
 userspace-rcu             x86_64  0.7.16-1.el7  epel         73 k
...

# systemctl enable glusterd
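Enabling the unit does not start it for the current boot; assuming systemd, the daemon can be started and checked straight away:

# systemctl start glusterd
# systemctl status glusterd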
By default a server belongs only to its own trusted storage pool. Now add additional servers to the pool:

# gluster peer probe gfs2
peer probe: success.

Verify the status of the peers. This can be done on either peer:
# gluster peer status
Number of Peers: 1

Hostname: gfs2
Uuid: 1fef9000-6c0e-452c-a7bc-46639db5635b
State: Peer in Cluster (Connected)

Assign the hostname to the first server (gfs1) by probing it from another server:
[root@gfs2 ~]# gluster peer probe gfs1
peer probe: success. Host gfs1 port 24007 already in peer list
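As a further check, the whole trusted storage pool (including the local node) can be listed from either server:

# gluster pool list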
Prerequisites

Formatting and Mounting Bricks
Creating a Thinly Provisioned Logical Volume
Create a physical volume
pvcreate /dev/sdb

Create a volume group. It is recommended to create only one volume group per device:
vgcreate gfs_vg1 /dev/sdb

Create a thin pool by first creating the data and metadata LVs, then converting them into a pool with lvconvert:
lvcreate -n gfs_pool_meta -L 500M gfs_vg1
lvcreate -n gfs_pool_data -L 4G gfs_vg1
lvconvert --thinpool gfs_vg1/gfs_pool_data --poolmetadata gfs_vg1/gfs_pool_meta
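To check that the pool was created with the expected data and metadata sizes, the LVs can be listed (-a also shows the hidden metadata volume):

lvs -a gfs_vg1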
Create a thin LV using the thin pool created above. It is recommended that only one thin LV is created per thin pool.
lvcreate -V 1G -T gfs_vg1/gfs_pool_data -n gfs_thin_lv1

You can also create the thin pool and thin LV above with a single lvcreate command:
lvcreate -L 4G -V 1G -n gfs_thin_lv1 --thinpool gfs_vg1/gfs_pool_data

Format and mount the new thin LV:
mkfs.xfs /dev/gfs_vg1/gfs_thin_lv1
mkdir /mnt/gfs
mount -t xfs /dev/gfs_vg1/gfs_thin_lv1 /mnt/gfs/
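So that the brick filesystem comes back after a reboot, an /etc/fstab entry along these lines can be added (device and mount point as created above):

/dev/gfs_vg1/gfs_thin_lv1 /mnt/gfs xfs defaults 0 0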
Creating a Replica Volume

On the mount points created on each of the servers, create a new directory which will be used as the brick for the volume. Using a subdirectory helps prevent data loss if the brick filesystem is not mounted.
# mkdir /mnt/gfs/apps
# gluster volume create apps replica 2 gfs1:/mnt/gfs/apps/ gfs2:/mnt/gfs/apps/
volume create: apps: success: please start the volume to access data

# gluster volume list
apps

# gluster volume start apps
volume start: apps: success
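The brick layout and volume state can be confirmed at any time with:

# gluster volume info apps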
Installing the Gluster Native Client

Add the FUSE loadable kernel module (LKM) to the Linux kernel:
# modprobe fuse

Verify that the FUSE module is loaded:
# dmesg | grep -i fuse
fuse init (API version 7.13)

Install Required Prerequisite Packages
sudo yum -y install openssh-server wget fuse fuse-libs openib libibverbs

Install the GlusterFS packages:
yum -y install glusterfs glusterfs-fuse glusterfs-rdma

Mounting the Volumes
The server specified in the mount command is only used to fetch the gluster configuration volfile describing the volume name. Subsequently, the client will communicate directly with the servers mentioned in the volfile (which might not even include the one used for mount).
Manual mount
# mount -t glusterfs gfs1:/apps /mnt/glusterfs
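Because the mount server is only used to fetch the volfile, a backup volfile server can be supplied so the mount still succeeds if gfs1 is down, and the mount can then be verified (the option name is taken from the glusterfs-fuse mount helper; treat this as a sketch):

# mount -t glusterfs -o backup-volfile-servers=gfs2 gfs1:/apps /mnt/glusterfs
# df -h /mnt/glusterfs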
Automatic mount. Add something like the following to /etc/fstab:

gfs1:/apps /apps glusterfs defaults,_netdev 0 0

NFS-Ganesha is a user-space file server for the NFS protocol with support for NFSv3, v4, v4.1 and pNFS. It provides a FUSE-compatible File System Abstraction Layer (FSAL) that allows file-system developers to plug in their own storage mechanism and access it from any NFS client. NFS-Ganesha can access FUSE filesystems directly through its FSAL without copying any data to or from the kernel, potentially improving response times.
Install the required packages
yum install glusterfs-server glusterfs-ganesha glusterfs-api

Stop and disable any other NFS servers:
systemctl stop nfs
systemctl disable nfs

Start and enable nfs-ganesha:
systemctl start nfs-ganesha
systemctl enable nfs-ganesha

Before carrying on, ensure that you have created a GlusterFS volume.
Create the export file for the volume you want to export in /etc/ganesha/exports/export.apps.conf. It should look something like this.
# WARNING : Using Gluster CLI will overwrite manual
# changes made to this file. To avoid it, edit the
# file and run ganesha-ha.sh --refresh-config.

EXPORT{
      Export_Id = 2 ;
      Path = "/apps";
      FSAL {
           name = GLUSTER;
           hostname = "localhost";
           volume = "apps";
      }
      Access_type = RW;
      Disable_ACL = true;
      Squash = "No_root_squash";
      Pseudo = "/apps";
      Protocols = "3", "4";
      Transports = "UDP", "TCP";
      SecType = "sys";
}

Include the export configuration file in the ganesha configuration file:
# vim /etc/ganesha/ganesha.conf
%include "/etc/ganesha/exports/export.apps.conf"
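If you edit the export file and ganesha.conf by hand rather than through the dbus interface described below, nfs-ganesha needs to re-read its configuration; restarting the service is the simplest way to do that:

systemctl restart nfs-ganesha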
The above two steps can also be done with the following script:

# /usr/libexec/ganesha/create-export-ganesha.sh <ganesha directory> <volume name>

By default the ganesha directory is "/etc/ganesha". The script creates the export configuration file in <ganesha directory>/exports/export.<volume name>.conf and adds the %include entry above to ganesha.conf. I have seen the script create the exports directory in the mount itself.
Turn on features.cache-invalidation for that volume
gluster volume set <volume name> features.cache-invalidation on
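For the apps volume used in this example that would be:

gluster volume set apps features.cache-invalidation on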
dbus commands are used to export and unexport volumes.

Export
# dbus-send --system --print-reply --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport string:<ganesha directory>/exports/export.<volume name>.conf string:"EXPORT(Path=/<volume name>)"

Unexport
# dbus-send --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.RemoveExport uint16:<export id>

The export/unexport can also be done with the following script:
# /usr/libexec/ganesha/dbus-send.sh <ganesha directory> [on|off] <volume name>

Check to see if the volume has been exported:
# showmount -e localhost
# dbus-send --type=method_call --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.ShowExports

To see clients:
# dbus-send --type=method_call --print-reply --system --dest=org.ganesha.nfsd /org/ganesha/nfsd/ClientMgr org.ganesha.nfsd.clientmgr.ShowClients
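From an NFS client the exported volume can then be mounted in the usual way. The mount point and NFS version below are assumptions; the export above serves both v3 and v4, and the v4 pseudo path is /apps:

# mkdir -p /mnt/apps
# mount -t nfs -o vers=4 gfs1:/apps /mnt/apps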
To set up a highly available NFS-Ganesha cluster, first ensure the following:

Create Gluster storage volumes as above on all nodes
Install NFS-Ganesha rpms on all nodes as above
Enable IPv6 on all nodes if it is disabled
Ensure all nodes can be resolved in DNS
Disable NetworkManager and start Networking on all nodes
systemctl stop NetworkManager
systemctl disable NetworkManager
systemctl start network
systemctl enable network

Create and mount a Gluster shared volume. This is only required on a single peer:
gluster volume set all cluster.enable-shared-storage enable
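This creates a volume called gluster_shared_storage and mounts it on the cluster nodes. A quick check that the mount is in place (the mount path is assumed from the GlusterFS default):

df -h /var/run/gluster/shared_storage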
Install Pacemaker and Corosync on all machines:

yum install corosync pacemaker

Set the cluster auth password on all nodes:
echo redhat | passwd --stdin hacluster

Passwordless ssh needs to be enabled on all the HA nodes.
On one (primary) node in the cluster, run:
ssh-keygen -f /var/lib/glusterd/nfs/secret.pem

Deploy the public key to ~root/.ssh/authorized_keys on all nodes:
ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@$node

Copy the keys to all nodes in the cluster:
scp /var/lib/glusterd/nfs/secret.* $node:/var/lib/glusterd/nfs/
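Passwordless ssh can then be verified from the primary node using the generated key (replace gfs2 with each peer in turn):

ssh -i /var/lib/glusterd/nfs/secret.pem root@gfs2 hostname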
Create the /etc/ganesha/ganesha-ha.conf file:

# Name of the HA cluster created.
# must be unique within the subnet
HA_NAME="ganesha-ha-360"
#
# The gluster server from which to mount the shared data volume.
HA_VOL_SERVER="gfs1,gfs2"
#
# N.B. you may use short names or long names; you may not use IP addrs.
# Once you select one, stay with it as it will be mildly unpleasant to
# clean up if you switch later on. Ensure that all names - short and/or
# long - are in DNS or /etc/hosts on all machines in the cluster.
#
# The subset of nodes of the Gluster Trusted Pool that form the ganesha
# HA cluster. Hostname is specified.
HA_CLUSTER_NODES="gfs1,gfs2"
#
# Virtual IPs for each of the nodes specified above.
VIP_gfs1="192.168.56.21"
VIP_gfs2="192.168.56.22"

To set up the HA cluster, enable NFS-Ganesha:
gluster nfs-ganesha enable

Export volumes through NFS-Ganesha using the cli:
gluster volume set apps ganesha.enable on

Check that NFS-Ganesha has started and that the volume has been exported:
ps aux | grep ganesha.nfsd
showmount -e localhost

ganesha.nfsd writes its logs to /var/log/ganesha.log.
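Since the HA cluster is managed by Pacemaker, the state of the virtual IPs and ganesha resources can also be inspected, assuming the pcs package is installed alongside pacemaker:

pcs status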
Replica bricks can become out of sync due to server outages. You can run a full heal on the volume to resolve this:
gluster volume heal apps full
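Progress, and any entries still pending heal, can be checked with:

gluster volume heal apps info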
When a brick is added to a Gluster volume, extended attributes are set on it (for example trusted.glusterfs.volume-id). These extended attributes are used to determine which Gluster volume a brick belongs to. When a brick is removed from a volume, the extended attributes remain on the brick's filesystem. This is a protection mechanism intended to prevent data corruption caused by accidentally re-adding the brick to a Gluster volume. Bug 812214 explains the need for this behaviour, and the issues that can occur if these extended attributes are removed while the content is left behind.

List and remove the extended attributes:
# getfattr -m . -d -e hex /mnt/gfs/apps/
getfattr: Removing leading '/' from absolute path names
# file: mnt/gfs/apps/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x3289f87ef449466081c2f848f3a4d501

# setfattr -x trusted.gfid /mnt/gfs/apps/
# setfattr -x trusted.glusterfs.dht /mnt/gfs/apps/
# setfattr -x trusted.glusterfs.volume-id /mnt/gfs/apps/
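If the brick directory is to be reused for a new volume, the hidden .glusterfs directory left behind by the old volume is also usually removed. This extra step is an assumption based on common brick-reuse practice rather than something covered by the attributes above:

# rm -rf /mnt/gfs/apps/.glusterfs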