In this tutorial, we will use Terraform to build EC2 instances dedicated to monitoring our AWS EC2 instances, using Prometheus for data collection and Grafana for visualization.
We will monitor two instances: one hosts our containerized application managed by Docker, and the other serves as an Nginx server.
Build your EC2 instances and security group using Terraform
Prometheus Architecture
Install Prometheus and configure Prometheus to monitor itself
Install Node Exporter on other EC2 Instances
Configure Prometheus for the EC2 Instance
EC2 Service Discovery for Prometheus
Install Grafana
AWS account;
AWS Identity and Access Management (IAM) credentials and programmatic access. The IAM credentials that you need for EC2 can be found here;
AWS credentials set up locally and configured with the AWS Command Line Interface (CLI). You can find further details here;
a VPC configured for EC2. You can find a CloudFormation template to do that here;
a code or text editor.
Prometheus EC2 instance t2.micro
Grafana EC2 instance t2.micro
Node EC2 instances to monitor (Application instance)
Security Groups Configured properly
Clone this git repo
Port 9090 — Prometheus Server
Port 9100 — Prometheus Node Exporter
Port 9323 — Docker
Port 9113 — Nginx Prometheus Exporter
Port 3000 — Grafana
Terraform is an open-source infrastructure as code (IaC) tool created by HashiCorp. It enables you to define and provision infrastructure resources, such as virtual machines, networks, storage, and more, using a declarative configuration language. Terraform allows you to manage and automate the entire lifecycle of your infrastructure in a consistent and reproducible manner.
Installation: you can download Terraform from the official HashiCorp site.
For RHEL/CentOS:
1. sudo yum install -y yum-utils
2. sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
3. sudo yum -y install terraform
(On Ubuntu/Debian, install from HashiCorp's apt repository instead; see the official install docs.)
Verify the installation: type terraform -help to check that Terraform is installed correctly.
$ terraform -help
Usage: terraform [-version] [-help] <command> [args]
Authentication to cloud providers can be done in three ways. For example, in the case of AWS:
1. Through the command line using the `aws configure` command: enter your access key, secret key, AWS region, and output format after each prompt (recommended method). Run the command in the VS Code terminal to configure programmatic access.
aws configure
Grab your AWS credentials for programmatic access and enter them one line at a time.
AWS Access Key ID: ACIR2ZFMGVSZABSEE
Secret Access Key: fYi+iKMoA+tHisDsAfAEEkz123+abcde
Default region name: ap-south-2
Default output format: json
You should see prompts like the ones above.
Let’s put our programmatic access to the test: running aws sts get-caller-identity should print your account ID and user ARN.
2. The AWS credentials file ~/.aws/credentials, which stores the authentication information entered via the aws configure command.
3. Specifying the access key and secret key directly in the .tf file; this is not recommended, since credentials can end up in version control.
provider "aws" {
  region     = "ap-south-2"
  version    = "~> 2.46"
  access_key = ____________________________
  secret_key = ________________________
}
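A safer middle ground, if you don't want keys in your .tf files, is to point the provider at a named profile from your shared credentials file; the AWS provider supports a profile argument for this. A sketch (the profile name is an assumption):

```hcl
provider "aws" {
  region = "ap-south-2"
  # Reads credentials for the named profile from ~/.aws/credentials
  profile = "default"
}
```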
Step 1. Create the main.tf file
Open your text/code editor and create a new directory. Make a file called main.tf. When setting up the main.tf file, you will create and use the Terraform AWS provider — a plugin that enables Terraform to communicate with the AWS platform — and the EC2 instance.
First, add the provider code to ensure you use the AWS provider.
provider "aws" {
  region = "ap-south-2"
}
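Recent Terraform versions (0.13+) also expect provider requirements to be declared explicitly. A minimal sketch you can place at the top of main.tf; the version constraint here is an assumption, so pin whatever major version you test against:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```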
Next, set up your Terraform resource. This will create the security group needed to access the services.
# security group using Terraform
# Note: ports 9323 (Docker) and 9113 (Nginx exporter) from the port list above are opened here too
resource "aws_security_group" "TF_SG" {
  name        = "metrics SG"
  description = "metrics security group using Terraform"
  vpc_id      = "vpc-08513ae8b191fbe25"

  ingress {
    description = "prometheus"
    from_port   = 9090
    to_port     = 9090
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "grafana"
    from_port   = 3000
    to_port     = 3000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "prometheus Node Exporter"
    from_port   = 9100
    to_port     = 9100
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "docker metrics"
    from_port   = 9323
    to_port     = 9323
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "nginx prometheus exporter"
    from_port   = 9113
    to_port     = 9113
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "http"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    # ipv6_cidr_blocks = ["::/0"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = {
    Name = "TF_SG"
  }
}
Step 2. Set up your Terraform resources for the EC2 instances; a resource describes an infrastructure object. This will create the instances. Define the instance type and configure the network.
The security group created in Step 1 is attached to each EC2 instance.
The Amazon Machine Image (AMI) of an instance. In the code snippet below, the AMI is an Ubuntu image.
The size of the instance. In the code snippet below, the instance type is t2.micro.
resource "aws_instance" "web1" {
  ami           = "ami-0f5ee92e2d63afc18"
  instance_type = "t2.micro"
  # Reference the security group by ID; referencing by name only works in the default VPC
  vpc_security_group_ids = [aws_security_group.TF_SG.id]
  key_name               = "cn_v1"

  tags = {
    Name = "prometheus"
  }

  # Prometheus installation script, run at first boot
  user_data = filebase64("${path.module}/prometheusInstall.sh")
}

resource "aws_instance" "web2" {
  ami                    = "ami-0f5ee92e2d63afc18"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.TF_SG.id]
  key_name               = "cn_v1"

  tags = {
    Name = "grafana"
  }

  # Grafana installation script, run at first boot
  user_data = filebase64("${path.module}/grafanaInstall.sh")
}
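To avoid hunting for the public IPs in the AWS console after apply, you can optionally add output blocks; a sketch (the output names are arbitrary):

```hcl
output "prometheus_public_ip" {
  description = "Public IP of the Prometheus instance"
  value       = aws_instance.web1.public_ip
}

output "grafana_public_ip" {
  description = "Public IP of the Grafana instance"
  value       = aws_instance.web2.public_ip
}
```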
Step 3. Create the EC2 environment
To deploy the EC2 environment, ensure you’re in the Terraform module/directory in which you write the Terraform code, and run the following commands:
terraform init: Initializes the environment and pulls down the AWS provider.
terraform plan: Creates an execution plan, outputs the outcome for the environment, and confirms no errors are found.
terraform apply -auto-approve: Creates the environment, skipping the interactive approval prompt.
Step 4. Clean up the environment
To destroy all Terraform environments, ensure that you’re in the Terraform module/directory that you used to create the EC2 instance and run terraform destroy.
Prometheus is an open-source tool designed for monitoring and alerting applications. It operates on a multi-dimensional data model where time series data is categorized by metric names and key/value pairs. It harnesses PromQL (Prometheus Query Language) for querying data. This tool employs a pull model over HTTP for the collection of time series data. You can pinpoint the systems you wish to monitor by utilizing Service Discovery or through static configuration within the YAML file.
Below is the diagram of Prometheus architecture and its components
Prometheus Server: the central component that collects metrics from multiple nodes. Prometheus uses the concept of scraping, contacting the target systems’ metric endpoints at regular intervals to fetch data.
Node Exporter: a monitoring agent installed on all the target machines so that Prometheus can fetch the data from their metrics endpoints.
Push Gateway: used by short-lived jobs that cannot be scraped directly; they push their metrics to the gateway, and Prometheus scrapes the gateway in turn.
Alert Manager: sends alerts based on the metrics data collected in Prometheus.
Web UI: the web UI layer provides the end user with an interface to visualize data collected by Prometheus. In this tutorial, we will use Grafana to visualize the data.
Now we will install Prometheus on one of the EC2 instances.
You can download the latest version from here
Clone my git repo
Run the install-prometheus.sh script
This script will install and configure everything; you can change the version as per your project.
This script will do the following steps:
1. Create a new user and add new directories
sudo apt update
sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
2. Download Prometheus, extract it, move the binaries to the /usr/local/bin folder, and finally delete the downloaded archive
wget https://github.com/prometheus/prometheus/releases/download/v2.43.0/prometheus-2.43.0.linux-amd64.tar.gz
tar vxf prometheus*.tar.gz
cd prometheus-2.43.0.linux-amd64
sudo mv prometheus /usr/local/bin
sudo mv promtool /usr/local/bin
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo mv consoles /etc/prometheus
sudo mv console_libraries /etc/prometheus
sudo mv prometheus.yml /etc/prometheus
3. Now we will configure Prometheus to monitor itself. Create a prometheus.yml file at /etc/prometheus/prometheus.yml with the content below (you can validate it afterwards with promtool check config /etc/prometheus/prometheus.yml):
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
4. Now we will run Prometheus as a service, so that it comes back automatically after a server restart.
Let’s create a file /etc/systemd/system/prometheus.service with the below content:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
5. Change the ownership of all the folders and files we created to the user created in the first step
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo chown -R prometheus:prometheus /etc/prometheus/consoles
sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries
sudo chown -R prometheus:prometheus /var/lib/prometheus
6. Now we will configure the service and start it
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
sudo systemctl status prometheus
Now open it on the browser using the below URL:
http://18.220.110.81:9090/
If you are not able to access it, make sure your security group allows port 9090 and that it's open from your IP.
To monitor your servers, you need to install Node Exporter, a monitoring agent, on every target machine.
You can clone this repo and run it directly using the following commands
./nodeexporterInstall.sh
This script will do the following steps:
It will create a new user, download the software using wget, install the binary, and then run Node Exporter as a service (via the service file created below).
sudo useradd --no-create-home node_exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
tar xzf node_exporter-1.6.1.linux-amd64.tar.gz
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/node_exporter
rm -rf node_exporter-1.6.1.linux-amd64.tar.gz node_exporter-1.6.1.linux-amd64
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Next, create the service file with sudo `vi /etc/systemd/system/node_exporter.service` and add this:
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
sudo systemctl status node_exporter
Make sure port 9100 is open from your IP to access this URL. You should be able to see all the metrics coming from this server.
http://3.129.211.10:9100/metrics
Now we configure Docker's metrics endpoint. Create the file with sudo vi /etc/docker/daemon.json and add the metrics address. Bind to localhost if metrics only need to be read on the same host:
{
  "metrics-addr" : "127.0.0.1:9323",
  "experimental" : true
}
OR bind to all interfaces so a remote Prometheus server can scrape the endpoint:
{
  "metrics-addr" : "0.0.0.0:9323",
  "experimental" : true
}
then restart your docker and make sure it is active
sudo systemctl restart docker
sudo systemctl status docker
Make sure port 9323 is open from your IP to access this URL. You should be able to see all the metrics coming from this server.
http://3.129.211.10:9323/metrics
Let’s make a new Nginx configuration file to include an extra server block with our metric module. If you installed Nginx using a different method, such as the default Ubuntu packages, you may have a different location for Nginx configurations.
Switch to the root Linux user before generating a file. We will later modify Linux permissions and ownership.
sudo -s
Now create the configuration file.
vim /etc/nginx/conf.d/status.conf
Optionally you can restrict this plugin to emit metrics to only the local host. It may be useful if you have a single Nginx instance and you install Prometheus exporter on it as well. In case you have multiple Nginx servers, it’s better to deploy the Prometheus exporter on a separate instance and scrape all of them from a single exporter.
We’ll use the location Nginx directive to expose basic metrics on port 8080 at the /status page. Add the following to /etc/nginx/conf.d/status.conf:
server {
    listen 8080;
    # Optionally: allow access only from localhost
    # listen 127.0.0.1:8080;
    server_name _;

    location /status {
        stub_status;
    }
}
Always verify that the configuration is valid before restarting Nginx: nginx -t
To update the Nginx config without downtime, you can use the reload command.
systemctl reload nginx
Now we can access the http://<ip>:8080/status page.
Active connections: 2
server accepts handled requests
4 4 3
Reading: 0 Writing: 1 Waiting: 1
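As an aside, the stub_status text is easy to parse with standard tools. Here is a small sketch that pulls the active-connections count out of the sample output above; it uses an embedded string rather than a live curl against http://localhost:8080/status, so the value shown is an assumption:

```shell
#!/bin/sh
# Sample stub_status output; on a live server you would fetch it with:
#   status=$(curl -s http://localhost:8080/status)
status='Active connections: 2
server accepts handled requests
 4 4 3
Reading: 0 Writing: 1 Waiting: 1'

# The third field of the "Active connections" line is the current count
active=$(printf '%s\n' "$status" | awk '/^Active connections/ {print $3}')
echo "active connections: $active"
```

The same one-liner works in a cron job or a quick health check where a full exporter would be overkill.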
Unfortunately, the open-source Nginx server provides only these basic stats; the richer analytics are reserved for Nginx Plus, the commercial version of Nginx. From here on, I’m going to focus on the active connections metric, and I’ll show how to work around the limitation later in the tutorial.
Still, let’s export all of the accessible stats for the time being. To do this, we’ll use the Nginx Prometheus Exporter, a Go program that ships as a single binary with no external dependencies and is extremely simple to install.
First, we’ll make a folder for the exporter and switch into it.
mkdir /opt/nginx-exporter
cd /opt/nginx-exporter
As a best practice, you should always create a dedicated user for each application you run. Let’s create an nginx-exporter user and group.
sudo useradd --system --no-create-home --shell /bin/false nginx-exporter
Locate the most recent version on the GitHub releases page and copy the URL of the relevant archive. In my case, it’s the ordinary amd64 build.
We can use curl to download the exporter on the Ubuntu machine.
curl -L https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz -o nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
Extract the prometheus exporter from the archive.
tar -zxf nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
You can also remove the archive to save some space.
rm nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
Let’s make sure that we downloaded the correct binary by checking the version of the exporter.
./nginx-prometheus-exporter --version
Optionally, update the ownership of the exporter folder.
chown -R nginx-exporter:nginx-exporter /opt/nginx-exporter
To run it, let’s also create a systemd service file, so that the systemd manager can restart the exporter if it exits. This is the standard way to run Linux daemons.
vim /etc/systemd/system/nginx-exporter.service
Make sure you update the scrape-url to the one you used in Nginx to expose basic metrics. Also, update the Linux user and the group to match yours in case you used different names.
nginx-exporter.service
[Unit]
Description=Nginx Exporter
Wants=network-online.target
After=network-online.target
StartLimitIntervalSec=0
[Service]
User=nginx-exporter
Group=nginx-exporter
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/opt/nginx-exporter/nginx-prometheus-exporter \
-nginx.scrape-uri=http://localhost:8080/status
[Install]
WantedBy=multi-user.target
Enable the service to automatically start the daemon on Linux restart.
systemctl enable nginx-exporter
Then start the nginx prometheus exporter.
systemctl start nginx-exporter
Check the status of the service.
systemctl status nginx-exporter
If your exporter fails to start, you can check logs to find the error message.
journalctl -u nginx-exporter -f --no-pager
To verify that the Prometheus exporter can access Nginx and properly scrape metrics, use the curl command and the default 9113 port for the exporter.
curl localhost:9113/metrics
Now you should be able to get the same metrics from the status page but in Prometheus format.
# HELP nginx_connections_accepted Accepted client connections
# TYPE nginx_connections_accepted counter
nginx_connections_accepted 8
# HELP nginx_connections_active Active client connections
# TYPE nginx_connections_active gauge
nginx_connections_active 1
# HELP nginx_connections_handled Handled client connections
# TYPE nginx_connections_handled counter
nginx_connections_handled 8
# HELP nginx_connections_reading Connections where NGINX is reading the request header
# TYPE nginx_connections_reading gauge
nginx_connections_reading 0
# HELP nginx_connections_waiting Idle client connections
# TYPE nginx_connections_waiting gauge
nginx_connections_waiting 0
# HELP nginx_connections_writing Connections where NGINX is writing the response back to the client
# TYPE nginx_connections_writing gauge
nginx_connections_writing 1
# HELP nginx_http_requests_total Total http requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 8
# HELP nginx_up Status of the last metric scrape
# TYPE nginx_up gauge
nginx_up 1
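The exposition format above is line-oriented, so pulling a single metric value out of it is a one-liner. A sketch against a small embedded sample; on a live host you would pipe `curl -s localhost:9113/metrics` instead, so the sample values are assumptions:

```shell
#!/bin/sh
# Embedded sample in Prometheus text exposition format
metrics='# TYPE nginx_connections_active gauge
nginx_connections_active 1
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 8'

# Match the metric name in field 1; comment lines start with "#" and never match
value=$(printf '%s\n' "$metrics" | awk '$1 == "nginx_connections_active" {print $2}')
echo "nginx_connections_active is $value"
```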
Now we will configure Prometheus for the EC2 instance where we have installed the exporters.
Log in to the Prometheus server and edit /etc/prometheus/prometheus.yml (or clone this file):
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

# Note: all jobs belong under a single scrape_configs key; repeating the key is invalid YAML
scrape_configs:
  - job_name: 'prometheus_node_exporter'
    static_configs:
      - targets: ['18.219.214.162:9100']
  - job_name: 'docker'
    static_configs:
      - targets: ['18.219.214.162:9323']
  - job_name: 'nginx-prometheus-exporter'
    static_configs:
      - targets: ['18.219.214.162:9113']
Restart the Prometheus Service
sudo systemctl restart prometheus
sudo systemctl status prometheus
Now you can open Prometheus using the URL below; you will see the new targets and all of their metrics.
http://18.217.62.18:9090/targets
Now we will use service discovery so that we don’t need to change the Prometheus configuration for each instance.
You can clone this file and update the /etc/prometheus/prometheus.yml file with the below content
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'prometheus'

scrape_configs:
  - job_name: 'node'
    ec2_sd_configs:
      - region: ap-south-2
        access_key: yourkey
        secret_key: yourkey
        port: 9100
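EC2 service discovery also attaches instance metadata to each discovered target as __meta_ec2_* labels, which you can copy onto the scraped series with relabel rules. For example, a sketch that records each instance's Name tag (append it under the same 'node' job; the label name instance_name is an arbitrary choice):

```yaml
    relabel_configs:
      # Copy the EC2 "Name" tag into an "instance_name" label on every target
      - source_labels: [__meta_ec2_tag_Name]
        target_label: instance_name
```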
Specify the AWS region and use an IAM user API key that has the AmazonEC2ReadOnlyAccess policy. If no such user exists, create one and attach the policy below.
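If you prefer a least-privilege inline policy over the managed AmazonEC2ReadOnlyAccess policy, the permission EC2 service discovery actually needs is ec2:DescribeInstances (DescribeAvailabilityZones is included here as a precaution for newer Prometheus releases); a minimal sketch:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ec2:DescribeInstances", "ec2:DescribeAvailabilityZones"],
      "Resource": "*"
    }
  ]
}
```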
Restart the service
sudo systemctl restart prometheus
sudo systemctl status prometheus
Service discovery returns the instances' private IPs, so make sure your security group also allows traffic from these private IPs.
One target shows as down because service discovery fetches all the nodes in the ap-south-2 region, and we have not installed Node Exporter on the Prometheus server itself.
This is how you can use the Service discovery in Prometheus for all the EC2 instances.
Once Prometheus is installed successfully, we can install Grafana and configure Prometheus as a data source.
Grafana is an open-source tool that is used to provide the visualization of your metrics.
You can download the latest version of Grafana from here
Steps to Install
clone this git repo
Run the below file
./install-grafana.sh
This script will do the following steps:
It will add the Grafana apt repository, install the package, and then run Grafana as a service
sudo apt update
sudo apt install -y gnupg2 curl software-properties-common
curl -fsSL https://packages.grafana.com/gpg.key|sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/grafana.gpg
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt update
sudo apt -y install grafana
sudo systemctl enable --now grafana-server
sudo systemctl status grafana-server.service
Now open it on the browser using the below URL:
Make sure that the port 3000 is open for this instance.
http://yourip:3000
Log in with username admin and password admin; you will be prompted to set a new password.
Add Prometheus DataSource
Click on Settings -> Data sources
Click on Explore (highlighted in red) -> select Prometheus as the data source, as shown below
Now you can click on Metrics -> select up
A value of 1 shows that the node is up
There are a lot of other metrics provided by default, and you can use them as per your needs.
Now we will create a dashboard that shows us all the node details like CPU, memory, storage, etc.
Grafana provides many ready-made dashboards that we can import directly into our Grafana instance and use.
In this example, we will use this dashboard
Click on + icon -> Import
This is how the dashboard will look, providing all the metrics for your node.
We’ve achieved proficiency in monitoring both an AWS EC2 instance hosting our containerized application and another EC2 instance running Nginx. We accomplished this by deploying Prometheus to collect metrics from these instances and then visualizing the data through Grafana dashboards.
Linkedin: www.linkedin.com/in/lorettaeyimina