In this tutorial, we will use Terraform to build EC2 instances dedicated to monitoring our AWS EC2 instances, using Prometheus for data collection and Grafana for visualization.
We will monitor two instances: one hosts our containerized application managed by Docker, and the other serves as an Nginx server.
Build your EC2 instances and security group using Terraform
Prometheus Architecture
Install Prometheus and configure Prometheus to monitor itself
Install Node Exporter on other EC2 Instances
Configure Prometheus for the EC2 Instance
EC2 Service Discovery for Prometheus
Install Grafana
AWS account;
AWS Identity and Access Management (IAM) credentials and programmatic access. The IAM credentials that you need for EC2 can be found here;
AWS credentials set up locally and configured with the AWS Command Line Interface (CLI). You can find further details here;
a VPC configured for EC2. You can find a CloudFormation template to do that here;
a code or text editor.
Prometheus EC2 instance t2.micro
Grafana EC2 instance t2.micro
Node EC2 instances to monitor (Application instance)
Security Groups Configured properly
Clone this git repo
Port 9090 — Prometheus Server
Port 9100 — Prometheus Node Exporter
Port 9323 — Docker
Port 9113 — Nginx Prometheus Exporter
Port 3000 — Grafana
Terraform is an open-source infrastructure as code (IaC) tool created by HashiCorp. It enables you to define and provision infrastructure resources, such as virtual machines, networks, storage, and more, using a declarative configuration language. Terraform allows you to manage and automate the entire lifecycle of your infrastructure in a consistent and reproducible manner.
Installation: you can download Terraform from the official HashiCorp site.
For RHEL/CentOS:
1. sudo yum install -y yum-utils
2. sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
3. sudo yum -y install terraform
(On Ubuntu/Debian, install from HashiCorp's apt repository instead; see the official install docs.)
Verify the installation: type terraform -help to check that Terraform is installed correctly.
$ terraform -help
Usage: terraform [-version] [-help] <command> [args]
Authentication to cloud providers can be done in three ways. For example, in the case of AWS:
1. Through the command line using the `aws configure` command: enter your access key, secret key, AWS region, and output format after each prompt (recommended method). Run the command in the VS Code terminal to configure programmatic access.
aws configure
Grab your AWS credentials for programmatic access and enter them one line at a time.
AWS Access Key ID: ACIR2ZFMGVSZABSEE
Secret Access Key: fYi+iKMoA+tHisDsAfAEEkz123+abcde
Default region name: ap-south-2
Default output format: json
You should see prompts like the ones above.
Let’s put our programmatic access to the test: running aws sts get-caller-identity should print your account ID and user ARN.
2. The AWS credentials file ~/.aws/credentials, which stores the authentication information entered via the aws configure command.
3. Specifying the access key and secret key directly in the .tf file; this is not recommended, since credentials can end up in version control.
provider "aws" {
  region     = "ap-south-2"
  version    = "~> 2.46"
  access_key = ____________________________
  secret_key = ________________________
}
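A safer middle ground, if you don't want keys in your .tf files, is to point the provider at a named profile from your shared credentials file; the AWS provider supports a profile argument for this. A sketch (the profile name is an assumption):

```hcl
provider "aws" {
  region = "ap-south-2"
  # Reads credentials for the named profile from ~/.aws/credentials
  profile = "default"
}
```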
Step 1. Create the main.tf file
Open your text/code editor and create a new directory. Make a file called main.tf. When setting up the main.tf file, you will create and use the Terraform AWS provider — a plugin that enables Terraform to communicate with the AWS platform — and the EC2 instance.
First, add the provider code to ensure you use the AWS provider.
provider "aws" {
  region = "ap-south-2"
}
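Recent Terraform versions (0.13+) also expect provider requirements to be declared explicitly. A minimal sketch you can place at the top of main.tf; the version constraint here is an assumption, so pin whatever major version you test against:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```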
Next, set up your Terraform resource. This will create the security group needed to access the services.
# security group using Terraform
# Note: ports 9323 (Docker) and 9113 (Nginx exporter) from the port list above are opened here too
resource "aws_security_group" "TF_SG" {
  name        = "metrics SG"
  description = "metrics security group using Terraform"
  vpc_id      = "vpc-08513ae8b191fbe25"

  ingress {
    description = "prometheus"
    from_port   = 9090
    to_port     = 9090
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "grafana"
    from_port   = 3000
    to_port     = 3000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "prometheus Node Exporter"
    from_port   = 9100
    to_port     = 9100
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "docker metrics"
    from_port   = 9323
    to_port     = 9323
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "nginx prometheus exporter"
    from_port   = 9113
    to_port     = 9113
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "http"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = {
    Name = "TF_SG"
  }
}
Step 2. Set up your Terraform resources for the EC2 instances; a resource describes an infrastructure object. This will create the instances. Define the instance type and configure the network.
The security group created in Step 1 is attached to each EC2 instance.
The Amazon Machine Image (AMI) of an instance. In the code snippet below, the AMI is an Ubuntu image.
The size of the instance. In the code snippet below, the instance type is t2.micro.
resource "aws_instance" "web1" {
  ami           = "ami-0f5ee92e2d63afc18"
  instance_type = "t2.micro"
  # Reference the security group by ID; referencing by name only works in the default VPC
  vpc_security_group_ids = [aws_security_group.TF_SG.id]
  key_name               = "cn_v1"

  tags = {
    Name = "prometheus"
  }

  # Prometheus installation script, run at first boot
  user_data = filebase64("${path.module}/prometheusInstall.sh")
}

resource "aws_instance" "web2" {
  ami                    = "ami-0f5ee92e2d63afc18"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.TF_SG.id]
  key_name               = "cn_v1"

  tags = {
    Name = "grafana"
  }

  # Grafana installation script, run at first boot
  user_data = filebase64("${path.module}/grafanaInstall.sh")
}
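To avoid hunting for the public IPs in the AWS console after apply, you can optionally add output blocks; a sketch (the output names are arbitrary):

```hcl
output "prometheus_public_ip" {
  description = "Public IP of the Prometheus instance"
  value       = aws_instance.web1.public_ip
}

output "grafana_public_ip" {
  description = "Public IP of the Grafana instance"
  value       = aws_instance.web2.public_ip
}
```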
Step 3. Create the EC2 environment
To deploy the EC2 environment, ensure you’re in the Terraform module/directory in which you write the Terraform code, and run the following commands:
terraform init: Initializes the environment and pulls down the AWS provider.
terraform plan: Creates an execution plan, outputs the outcome for the environment, and confirms no errors are found.
terraform apply -auto-approve: Creates the environment, skipping the interactive approval prompt.
Step 4. Clean up the environment
To destroy all Terraform environments, ensure that you’re in the Terraform module/directory that you used to create the EC2 instance and run terraform destroy.
Prometheus is an open-source tool designed for monitoring and alerting applications. It operates on a multi-dimensional data model where time series data is categorized by metric names and key/value pairs. It harnesses PromQL (Prometheus Query Language) for querying data. This tool employs a pull model over HTTP for the collection of time series data. You can pinpoint the systems you wish to monitor by utilizing Service Discovery or through static configuration within the YAML file.
Below is the diagram of Prometheus architecture and its components
Prometheus Server: the central component that collects metrics from multiple nodes. Prometheus uses the concept of scraping, contacting the target systems’ metric endpoints at regular intervals to fetch data.
Node Exporter: a monitoring agent installed on all the target machines so that Prometheus can fetch the data from their metrics endpoints.
Push Gateway: used by short-lived jobs that cannot be scraped directly; they push their metrics to the gateway, and Prometheus scrapes the gateway in turn.
Alert Manager: sends alerts based on the metrics data collected in Prometheus.
Web UI: the web UI layer provides the end user with an interface to visualize data collected by Prometheus. In this tutorial, we will use Grafana to visualize the data.
Now we will install Prometheus on one of the EC2 instances.
You can download the latest version from here
Clone my git repo
Run the install-prometheus.sh script
This script will install and configure everything; you can change the version as per your project.
This script will do the following steps:
1. Create a new user and add new directories
sudo apt update
sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
2. Download Prometheus, extract it, move the binaries to the /usr/local/bin folder, and finally delete the downloaded archive
wget https://github.com/prometheus/prometheus/releases/download/v2.43.0/prometheus-2.43.0.linux-amd64.tar.gz
tar vxf prometheus*.tar.gz
cd prometheus-2.43.0.linux-amd64
sudo mv prometheus /usr/local/bin
sudo mv promtool /usr/local/bin
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo mv consoles /etc/prometheus
sudo mv console_libraries /etc/prometheus
sudo mv prometheus.yml /etc/prometheus
3. Now we will configure Prometheus to monitor itself. Create a prometheus.yml file at /etc/prometheus/prometheus.yml with the content below (you can validate it afterwards with promtool check config /etc/prometheus/prometheus.yml):
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
4. Now we will run Prometheus as a service, so that it comes back automatically after a server restart.
Let’s create a file /etc/systemd/system/prometheus.service with the below content:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
5. Change the ownership of all the folders and files we created to the user created in the first step
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo chown -R prometheus:prometheus /etc/prometheus/consoles
sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries
sudo chown -R prometheus:prometheus /var/lib/prometheus
6. Now we will configure the service and start it
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
sudo systemctl status prometheus
Now open it on the browser using the below URL:
http://18.220.110.81:9090/
If you are not able to access it, make sure your security group allows port 9090 and that it's open from your IP.
To monitor your servers, you need to install Node Exporter, a monitoring agent, on every target machine.
You can clone this repo and run it directly using the following commands
./nodeexporterInstall.sh
This script will do the following steps:
It will create a new user, download the software using wget, install the binary, and then run Node Exporter as a service (via the service file created below).
sudo useradd --no-create-home node_exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
tar xzf node_exporter-1.6.1.linux-amd64.tar.gz
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/node_exporter
rm -rf node_exporter-1.6.1.linux-amd64.tar.gz node_exporter-1.6.1.linux-amd64
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Next, create the service file with sudo `vi /etc/systemd/system/node_exporter.service` and add this:
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
sudo systemctl status node_exporter
Make sure port 9100 is open from your IP to access this URL. You should be able to see all the metrics coming from this server.
http://3.129.211.10:9100/metrics
Now we configure Docker's metrics endpoint. Create the file with sudo vi /etc/docker/daemon.json and add the metrics address. Bind to localhost if metrics only need to be read on the same host:
{
  "metrics-addr" : "127.0.0.1:9323",
  "experimental" : true
}
OR bind to all interfaces so a remote Prometheus server can scrape the endpoint:
{
  "metrics-addr" : "0.0.0.0:9323",
  "experimental" : true
}
then restart your docker and make sure it is active
sudo systemctl restart docker
sudo systemctl status docker
Make sure port 9323 is open from your IP to access this URL. You should be able to see all the metrics coming from this server.
http://3.129.211.10:9323/metrics
Let’s make a new Nginx configuration file to include an extra server block with our metric module. If you installed Nginx using a different method, such as the default Ubuntu packages, you may have a different location for Nginx configurations.
Switch to the root Linux user before generating a file. We will later modify Linux permissions and ownership.
sudo -s
Now create the configuration file.
vim /etc/nginx/conf.d/status.conf
Optionally you can restrict this plugin to emit metrics to only the local host. It may be useful if you have a single Nginx instance and you install Prometheus exporter on it as well. In case you have multiple Nginx servers, it’s better to deploy the Prometheus exporter on a separate instance and scrape all of them from a single exporter.
We’ll use the location Nginx directive to expose basic metrics on port 8080 at the /status page. Add the following to /etc/nginx/conf.d/status.conf:
server {
    listen 8080;
    # Optionally: allow access only from localhost
    # listen 127.0.0.1:8080;
    server_name _;

    location /status {
        stub_status;
    }
}
Always verify that the configuration is valid before restarting Nginx: nginx -t
To update the Nginx config without downtime, you can use the reload command.
systemctl reload nginx
Now we can access the http://<ip>:8080/status page.
Active connections: 2
server accepts handled requests
4 4 3
Reading: 0 Writing: 1 Waiting: 1
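As an aside, the stub_status text is easy to parse with standard tools. Here is a small sketch that pulls the active-connections count out of the sample output above; it uses an embedded string rather than a live curl against http://localhost:8080/status, so the value shown is an assumption:

```shell
#!/bin/sh
# Sample stub_status output; on a live server you would fetch it with:
#   status=$(curl -s http://localhost:8080/status)
status='Active connections: 2
server accepts handled requests
 4 4 3
Reading: 0 Writing: 1 Waiting: 1'

# The third field of the "Active connections" line is the current count
active=$(printf '%s\n' "$status" | awk '/^Active connections/ {print $3}')
echo "active connections: $active"
```

The same one-liner works in a cron job or a quick health check where a full exporter would be overkill.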
Unfortunately, the open-source Nginx server provides only these basic stats; the richer analytics are reserved for Nginx Plus, the commercial version of Nginx. From here on, I’m going to focus on the active connections metric, and I’ll show how to work around the limitation later in the tutorial.
Still, let’s export all of the accessible stats for the time being. To do this, we’ll use the Nginx Prometheus Exporter, a Go program that ships as a single binary with no external dependencies and is extremely simple to install.
First, we’ll make a folder for the exporter and switch into it.
mkdir /opt/nginx-exporter
cd /opt/nginx-exporter
As a best practice, you should always create a dedicated user for each application you run. Let’s create an nginx-exporter user and group.
sudo useradd --system --no-create-home --shell /bin/false nginx-exporter
Locate the most recent version on the GitHub releases page and copy the URL of the relevant archive. In my case, it’s the ordinary amd64 build.
We can use curl to download the exporter on the Ubuntu machine.
curl -L https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz -o nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
Extract the prometheus exporter from the archive.
tar -zxf nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
You can also remove the archive to save some space.
rm nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
Let’s make sure that we downloaded the correct binary by checking the version of the exporter.
./nginx-prometheus-exporter --version
Optionally, update the ownership of the exporter folder.
chown -R nginx-exporter:nginx-exporter /opt/nginx-exporter
To run it, let’s also create a systemd service file, so that the systemd manager can restart the exporter if it exits. This is the standard way to run Linux daemons.
vim /etc/systemd/system/nginx-exporter.service
Make sure you update the scrape-url to the one you used in Nginx to expose basic metrics. Also, update the Linux user and the group to match yours in case you used different names.
nginx-exporter.service
[Unit]
Description=Nginx Exporter
Wants=network-online.target
After=network-online.target
StartLimitIntervalSec=0
[Service]
User=nginx-exporter
Group=nginx-exporter
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/opt/nginx-exporter/nginx-prometheus-exporter \
-nginx.scrape-uri=http://localhost:8080/status
[Install]
WantedBy=multi-user.target
Enable the service to automatically start the daemon on Linux restart.
systemctl enable nginx-exporter
Then start the nginx prometheus exporter.
systemctl start nginx-exporter
Check the status of the service.
systemctl status nginx-exporter
If your exporter fails to start, you can check logs to find the error message.
journalctl -u nginx-exporter -f --no-pager
To verify that the Prometheus exporter can access Nginx and properly scrape metrics, use the curl command and the default 9113 port for the exporter.
curl localhost:9113/metrics
Now you should be able to get the same metrics from the status page but in Prometheus format.
# HELP nginx_connections_accepted Accepted client connections
# TYPE nginx_connections_accepted counter
nginx_connections_accepted 8
# HELP nginx_connections_active Active client connections
# TYPE nginx_connections_active gauge
nginx_connections_active 1
# HELP nginx_connections_handled Handled client connections
# TYPE nginx_connections_handled counter
nginx_connections_handled 8
# HELP nginx_connections_reading Connections where NGINX is reading the request header
# TYPE nginx_connections_reading gauge
nginx_connections_reading 0
# HELP nginx_connections_waiting Idle client connections
# TYPE nginx_connections_waiting gauge
nginx_connections_waiting 0
# HELP nginx_connections_writing Connections where NGINX is writing the response back to the client
# TYPE nginx_connections_writing gauge
nginx_connections_writing 1
# HELP nginx_http_requests_total Total http requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 8
# HELP nginx_up Status of the last metric scrape
# TYPE nginx_up gauge
nginx_up 1
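The exposition format above is line-oriented, so pulling a single metric value out of it is a one-liner. A sketch against a small embedded sample; on a live host you would pipe `curl -s localhost:9113/metrics` instead, so the sample values are assumptions:

```shell
#!/bin/sh
# Embedded sample in Prometheus text exposition format
metrics='# TYPE nginx_connections_active gauge
nginx_connections_active 1
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 8'

# Match the metric name in field 1; comment lines start with "#" and never match
value=$(printf '%s\n' "$metrics" | awk '$1 == "nginx_connections_active" {print $2}')
echo "nginx_connections_active is $value"
```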
Now we will configure Prometheus for the EC2 instance where we have installed the exporters.
Log in to the Prometheus server and edit /etc/prometheus/prometheus.yml (or clone this file):
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

# Note: all jobs belong under a single scrape_configs key; repeating the key is invalid YAML
scrape_configs:
  - job_name: 'prometheus_node_exporter'
    static_configs:
      - targets: ['18.219.214.162:9100']
  - job_name: 'docker'
    static_configs:
      - targets: ['18.219.214.162:9323']
  - job_name: 'nginx-prometheus-exporter'
    static_configs:
      - targets: ['18.219.214.162:9113']
Restart the Prometheus Service
sudo systemctl restart prometheus
sudo systemctl status prometheus
Now you can open Prometheus using the URL below; you will see the new targets and all of their metrics.
http://18.217.62.18:9090/targets
Now we will use service discovery so that we don’t need to change the Prometheus configuration for each instance.
You can clone this file and update the /etc/prometheus/prometheus.yml file with the below content
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

scrape_configs:
  - job_name: 'node'
    ec2_sd_configs:
      - region: ap-south-2
        access_key: yourkey
        secret_key: yourkey
        port: 9100
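EC2 service discovery also attaches instance metadata to each discovered target as __meta_ec2_* labels, which you can copy onto the scraped series with relabel rules. For example, a sketch that records each instance's Name tag (append it under the same 'node' job; the label name instance_name is an arbitrary choice):

```yaml
    relabel_configs:
      # Copy the EC2 "Name" tag into an "instance_name" label on every target
      - source_labels: [__meta_ec2_tag_Name]
        target_label: instance_name
```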
Specify the AWS region and use an IAM user API key that has the AmazonEC2ReadOnlyAccess policy. If no such user exists, create one and attach the policy below.
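If you prefer a least-privilege inline policy over the managed AmazonEC2ReadOnlyAccess policy, the permission EC2 service discovery actually needs is ec2:DescribeInstances (DescribeAvailabilityZones is included here as a precaution for newer Prometheus releases); a minimal sketch:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ec2:DescribeInstances", "ec2:DescribeAvailabilityZones"],
      "Resource": "*"
    }
  ]
}
```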
Restart the service
sudo systemctl restart prometheus
sudo systemctl status prometheus
Service discovery returns the instances' private IPs, so make sure your security group also allows traffic from these private IPs.
One target shows as down because service discovery fetches all the nodes in the ap-south-2 region, and we have not installed Node Exporter on the Prometheus server itself.
This is how you can use the Service discovery in Prometheus for all the EC2 instances.
Once Prometheus is installed successfully, we can install Grafana and configure Prometheus as a data source.
Grafana is an open-source tool that is used to provide the visualization of your metrics.
You can download the latest version of Grafana from here
Steps to Install
clone this git repo
Run the below file
./install-grafana.sh
This script will do the following steps:
It will add the Grafana apt repository, install the package, and then run Grafana as a service
sudo apt update
sudo apt install -y gnupg2 curl software-properties-common
curl -fsSL https://packages.grafana.com/gpg.key|sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/grafana.gpg
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt update
sudo apt -y install grafana
sudo systemctl enable --now grafana-server
sudo systemctl status grafana-server.service
Now open it on the browser using the below URL:
Make sure that the port 3000 is open for this instance.
http://yourip:3000
Log in with username admin and password admin; you will be prompted to set a new password.
Add Prometheus DataSource
Click on Settings -> Data sources
Click on Explore (highlighted in red) -> select Prometheus as the data source, as shown below
Now you can click on Metrics -> select up
A value of 1 shows that the node is up
There are a lot of other metrics provided by default, and you can use them as per your needs.
Now we will create a dashboard that shows us all the node details like CPU, memory, storage, etc.
Grafana provides many ready-made dashboards that we can import directly into our Grafana instance and use.
In this example, we will use this dashboard
Click on + icon -> Import
This is how the dashboard will look, providing all the metrics for your node.
We’ve achieved proficiency in monitoring both an AWS EC2 instance hosting our containerized application and another EC2 instance running Nginx. We accomplished this by deploying Prometheus to collect metrics from these instances and then visualizing the data through Grafana dashboards.
Linkedin: www.linkedin.com/in/lorettaeyimina