
GlusterFS - Detailed Installation on Google Cloud Platform

posted 26 Feb 2018, 01:48 by Christophe Noël   [ updated 27 Feb 2018, 05:47 ]

Introduction

This post provides additional information about installing GlusterFS on Google Cloud VMs. The guidelines available at Ansible Playbook to Deploy Gluster in Google Cloud Platform unfortunately do not work out of the box. This post provides additional advice that may help users facing further problems.

Note that the installation may be performed from your local machine with the Google Cloud SDK, but also from the GCloud Shell. In the latter case, you won't be able to mount the volume (because of restricted outgoing ports; see Encountered issues below).

Requirements Explained

The requirements described below consist of the following actions:
1. Configure IAM to allow Ansible to create VM instances
2. Configure the GCloud project with the machine SSH key to allow connections to the created VMs
3. And of course install Ansible itself

Ansible Playbook Purpose

Some words about Ansible are provided in the Ansible Quickstart post (on this blog).

Relying on Ansible, the playbook creates a cluster of 4 hosts which store files on persistent disks.


Configuration of the YAML File:
 
- name: Create disks(s) and instance(s)
  hosts: localhost
  gather_facts: no
  connection: local
  vars:
    state: active
    machine_type: n1-standard-2 # default
    hosts_b:
      - gluster-1b
      - gluster-2b
    hosts_c:
      - gluster-1c
      - gluster-2c
    disk_type: pd-ssd
    disk_size: 20 # GB, per host
    zone: europe-west1 # a region name, not a zone (see below)
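
Note that, despite its name, the zone variable must contain a region (see Encountered issues below). The valid region names can be listed with:

gcloud compute regions list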

Google Cloud Account - IAM

From outside the Cloud Shell / Cloud SDK, you need to authenticate with GCloud in order to perform operations on the platform. Instead of using the "Owner" account, it is suggested to create a GCloud service account ("ansible-gluster" in the example below). Additional info: Creating and Managing Service Accounts.

The approach is based on the concept of Service Account. A service account is a special Google account that belongs to your application or a virtual machine (VM), instead of to an individual end user. Your application uses the service account to call the Google API of a service, so that the users aren't directly involved.

To use a service account outside of the Google Cloud Platform (on other platforms or on premise), you must establish the identity of the service account. Public/private key pairs let you do that. The account keys can be exported to a JSON file.

# First create a new service-account (ansible-gluster)
gcloud iam service-accounts create ansible-gluster --display-name ansible-gluster
# Little hack for setting the var SA_EMAIL with associated email of the created account
export SA_EMAIL=$(gcloud iam service-accounts list --filter="displayName:ansible-gluster" --format='value(email)')
# Set the var of the gcloud project
export PROJECT=$(gcloud info --format='value(config.project)')
# Now we grant authorization on all gcloud operations performed by ANSIBLE playbook
gcloud projects add-iam-policy-binding $PROJECT --role roles/compute.storageAdmin --member serviceAccount:$SA_EMAIL
gcloud projects add-iam-policy-binding $PROJECT --role roles/compute.instanceAdmin.v1 --member serviceAccount:$SA_EMAIL
gcloud projects add-iam-policy-binding $PROJECT --role roles/compute.networkAdmin --member serviceAccount:$SA_EMAIL
gcloud projects add-iam-policy-binding $PROJECT --role roles/compute.securityAdmin --member serviceAccount:$SA_EMAIL
# We create and export the key of the service account 
gcloud iam service-accounts keys create ansible-gluster-sa.json --iam-account $SA_EMAIL
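
Ansible can then consume the exported key. A minimal sketch, assuming the environment variables documented for Ansible's legacy gce modules:

# Expose the service account credentials to the Ansible gce modules
export GCE_EMAIL=$SA_EMAIL
export GCE_PROJECT=$PROJECT
export GCE_CREDENTIALS_FILE_PATH=$(pwd)/ansible-gluster-sa.json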

Allow Ansible to connect to GCloud VM Instances

It is easy to connect over SSH to a GCloud instance using the gcloud command. But a third-party tool (i.e. Ansible!) may need to connect to any instance (with a reachable IP) of the Google cloud (why? to perform configuration on it, for sure!). To achieve this, you need to take your local machine's SSH public key and add it to the GCloud project metadata (you can find additional info in Managing SSH keys in Metadata).

gcloud compute project-info add-metadata can be used to add or update project-wide metadata. Every instance has access to a metadata server that can be used to query metadata that has been set through this tool. In this case it will hold the sshKeys.

The following example creates an SSH key pair and adds the public key to the GCloud metadata.

# if not present, generate a machine SSH key pair (private/public) (without passphrase -N "")
ls ~/.ssh/id_rsa.pub || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# obtain any existing metadata with "sshKeys" (you don't want to override existing keys!)
gcloud compute project-info describe --format=json | jq -r '.commonInstanceMetadata.items[] | select(.key == "sshKeys") | .value' > sshKeys.pub
# append the machine public key (the one we are interested in)
echo "$USER:$(cat ~/.ssh/id_rsa.pub)" >> sshKeys.pub
# Add the file to project metadata
gcloud compute project-info add-metadata --metadata-from-file sshKeys=sshKeys.pub

Ansible (running on that machine) will now be able to connect over SSH to any reachable instance IP of this GCloud project. Note also that, alternatively, you may access remote instances using another mechanism: Managing Instance Access Using OS Login (but this is not relevant for Ansible).
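
As a quick sanity check (once any instance with a reachable IP exists), the plain ssh client should now work without gcloud; INSTANCE_IP below is a placeholder for the external IP of an instance:

# Connect directly with the machine key added above
ssh -i ~/.ssh/id_rsa $USER@INSTANCE_IP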

Other Requirements

In order to run the playbook you will need to install Ansible and Libcloud as follows:

pip install --user "ansible==2.2.2" "apache-libcloud==1.5.0"
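
You can verify that the pinned versions were picked up, for example:

ansible --version
python -c "import libcloud; print(libcloud.__version__)"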

Once those dependencies have been installed, clone the repository and enter the directory:

git clone https://github.com/GoogleCloudPlatform/compute-ansible-gluster
cd compute-ansible-gluster
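
The playbook can then be run against localhost. A hypothetical invocation, assuming the playbook file in the repository is named gluster.yml and the credential variables from the IAM section are exported:

# Create the disks and instances, then configure Gluster
ansible-playbook gluster.yml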

Encountered issues

If the Ansible YAML file contains wrong data, it will trigger a rather vague exception message. Check carefully that the machine_type is correct and compatible with your account (if a trial account is used). The default uses too many CPUs for a trial account (a trial-friendly snippet is shown after the disk-size note below)!

Also check that the zone variable defines a region (and not a zone). For example: europe-west1-b is wrong, but europe-west1 is correct. The list of available regions is given in the GCP documentation.

Also, the SSD size is too large for the trial version. Change 500 to 20 GB. This is the size used per host!
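
For a trial account, a vars block along these lines (a suggestion, not taken from the repository) stays within the quotas:

machine_type: n1-standard-1 # avoids the CPU quota of the trial account
disk_size: 20 # GB per host; 500 exceeds the trial SSD quota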

You cannot remove hosts from this configuration: the scripts are based on exactly that topology.

It is not possible to mount the GlusterFS volume from the Cloud Shell, for 2 reasons:
1. yum is not available to install the client; use sudo apt-get install -y glusterfs-client instead.
2. As stated in the Google Cloud Platform documentation, the outgoing connections of the Google Cloud Shell are limited to the following ports: 20, 21, 22, 80, 443, 2375, 2376, 3306, 8080, 9600, and 50051. The ports used by GlusterFS (e.g. 24007) are therefore blocked.
Therefore, it is suggested to create a VM manually and mount from that VM, as sketched below.
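
For example, a client VM can be created in the same region and the volume mounted from there. This is only a sketch: the volume name gv0 is hypothetical (use the volume name actually created by the playbook), and the zone must belong to the region used above.

# Create a small client VM and connect to it
gcloud compute instances create gluster-client --zone europe-west1-b --machine-type n1-standard-1
gcloud compute ssh gluster-client --zone europe-west1-b
# On the client VM: install the GlusterFS client and mount the volume
sudo apt-get install -y glusterfs-client
sudo mkdir -p /mnt/gluster
sudo mount -t glusterfs gluster-1b:/gv0 /mnt/gluster # gv0 is a hypothetical volume name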

Mounting in Kubernetes

Example YAML files may help to deploy a testing pod that verifies the GlusterFS mount: here

The required memory is about 2 GB for the read cache!
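
As a sketch, a test pod using the in-tree glusterfs volume type could look as follows. This is an assumption-laden example: the endpoints IP must be replaced by the internal IPs of your Gluster hosts, and the volume name gv0 is hypothetical.

apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: 10.132.0.2 # internal IP of a gluster host (example value)
    ports:
      - port: 1 # placeholder, required by the API
---
apiVersion: v1
kind: Pod
metadata:
  name: glusterfs-test
spec:
  containers:
    - name: test
      image: busybox
      command: ["sleep", "3600"]
      resources:
        requests:
          memory: "2Gi" # the read cache needs about 2 GB (see note above)
      volumeMounts:
        - name: glusterfsvol
          mountPath: /mnt/gluster
  volumes:
    - name: glusterfsvol
      glusterfs:
        endpoints: glusterfs-cluster
        path: gv0 # hypothetical volume name
        readOnly: false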
