AWS Fault Injection Simulator (AWS FIS) is a managed service that enables you to perform fault injection experiments on your AWS workloads.
Fault injection is based on the principles of chaos engineering.
These experiments stress an application by creating disruptive events so that you can observe how your application responds.
You can then use this information to improve the performance and resiliency of your applications so that they behave as expected.
AWS FIS provides templates that generate disruptions, and the controls and guardrails that you need to run experiments in production, such as automatically rolling back or stopping the experiment if specific conditions are met.
Before you use AWS FIS to run experiments in production, we strongly recommend that you complete a planning phase and run the experiments in a pre-production environment.
You can access AWS FIS through the AWS SDKs, the AWS CLI, the AWS Management Console, and the HTTPS API.
Templates
To run experiments, you first create an experiment template.
An experiment template is the blueprint of your experiment.
It contains the actions, targets, and stop conditions for the experiment.
While your experiment is running, you can track its progress and view its status.
An experiment is complete when all of the actions in the experiment have run.
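As a rough illustration of the three parts named above, an experiment template can be sketched as a Python dictionary that mirrors the JSON shape of a template. The ARNs, the tag, and the names exampleInstances and stopInstances are hypothetical placeholders, and missing_sections is a helper invented for this sketch, not part of AWS FIS.

```python
import json

# Hypothetical placeholders: replace the ARNs, tag, and names with your own.
ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleFisRole"
ALARM_ARN = "arn:aws:cloudwatch:us-east-1:111122223333:alarm:ExampleAlarm"

experiment_template = {
    "description": "Stop one tagged EC2 instance, then restart it",
    # Targets: which resources the experiment acts on.
    "targets": {
        "exampleInstances": {
            "resourceType": "aws:ec2:instance",
            "resourceTags": {"env": "test"},
            "selectionMode": "COUNT(1)",
        }
    },
    # Actions: what AWS FIS does to those targets.
    "actions": {
        "stopInstances": {
            "actionId": "aws:ec2:stop-instances",
            "parameters": {"startInstancesAfterDuration": "PT2M"},
            "targets": {"Instances": "exampleInstances"},
        }
    },
    # Stop conditions: CloudWatch alarms that halt the experiment.
    "stopConditions": [
        {"source": "aws:cloudwatch:alarm", "value": ALARM_ARN}
    ],
    "roleArn": ROLE_ARN,
}


def missing_sections(template):
    """Return the sections named in the text that are absent or empty."""
    expected = ("actions", "targets", "stopConditions")
    return [name for name in expected if not template.get(name)]


print(json.dumps(experiment_template, indent=2))
print("Missing sections:", missing_sections(experiment_template))
```

A check like missing_sections is only a local sanity test before submitting a template; the service performs its own validation.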
Actions
An action is an activity that AWS FIS performs on an AWS resource during an experiment.
AWS FIS provides a set of preconfigured actions based on the type of AWS resource.
Each action runs for a specified duration during an experiment, or until you stop the experiment. Actions can run sequentially or simultaneously (in parallel).
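To make the difference between sequential and parallel actions concrete, the sketch below groups actions into "waves" based on the startAfter field of each action in a template. The action names are hypothetical, and execution_waves is an illustration written for these notes (it needs Python 3.9 or later for graphlib); it is not how AWS FIS schedules actions internally.

```python
from graphlib import TopologicalSorter

# Hypothetical action names; "startAfter" lists actions that must finish first.
actions = {
    "stopInstances": {"actionId": "aws:ec2:stop-instances"},
    "pause": {"actionId": "aws:fis:wait", "startAfter": ["stopInstances"]},
    "rebootDb": {"actionId": "aws:rds:reboot-db-instances", "startAfter": ["pause"]},
    "injectThrottle": {"actionId": "aws:fis:inject-api-throttle-error"},
}


def execution_waves(actions):
    """Group action names into waves; actions in the same wave can run in parallel."""
    # Map each action to the set of actions it has to wait for.
    graph = {name: set(spec.get("startAfter", [])) for name, spec in actions.items()}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(execution_waves(actions), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Actions without startAfter land in the first wave and run simultaneously; each startAfter entry pushes an action into a later wave.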
Each AWS FIS action has an identifier with the following format:
aws:service-name:action-type
For example: aws:ec2:stop-instances
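The identifier format can be checked with a few lines of Python. This is a minimal sketch: ActionId and parse_action_id are names made up for this example, and the check covers only the three-part shape described above, not whether the action actually exists.

```python
from typing import NamedTuple


class ActionId(NamedTuple):
    service: str
    action_type: str


def parse_action_id(action_id: str) -> ActionId:
    """Split an identifier of the form aws:service-name:action-type."""
    parts = action_id.split(":")
    # Expect exactly three non-empty parts, the first being "aws".
    if len(parts) != 3 or parts[0] != "aws" or not all(parts):
        raise ValueError(f"not an AWS FIS action identifier: {action_id!r}")
    return ActionId(service=parts[1], action_type=parts[2])


parsed = parse_action_id("aws:ec2:stop-instances")
print(parsed.service, parsed.action_type)
```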
Parameters:
Some actions take parameters, which pass information to AWS FIS when the action is run.
Types:
Fault injection actions
aws:fis:inject-api-internal-error
aws:fis:inject-api-throttle-error
aws:fis:inject-api-unavailable-error
Wait action
aws:fis:wait
CloudWatch actions
aws:cloudwatch:assert-alarm-state: Verifies that the specified alarms are in one of the specified alarm states.
Amazon EC2 actions
aws:ec2:send-spot-instance-interruptions
aws:ec2:reboot-instances
aws:ec2:stop-instances
aws:ec2:terminate-instances
Amazon ECS actions
aws:ecs:drain-container-instances: Drains the specified percentage of underlying Amazon EC2 instances on the target clusters.
aws:ecs:stop-task: Runs the Amazon ECS API action StopTask to stop the target task.
Amazon EKS actions
aws:eks:terminate-nodegroup-instance: Runs the Amazon EC2 API action TerminateInstances on the target node group.
Amazon RDS actions
aws:rds:failover-db-cluster: Runs the Amazon RDS API action FailoverDBCluster on the target Aurora DB cluster.
aws:rds:reboot-db-instances: Runs the Amazon RDS API action RebootDBInstance on the target DB instance.
Systems Manager actions
aws:ssm:send-command
aws:ssm:start-automation-execution
Targets
A target is one or more AWS resources on which AWS FIS performs an action during an experiment. You can choose specific resources, or you can select a group of resources based on specific criteria, such as tags or state.
AWS FIS supports actions on target resources for the following AWS services: Amazon EC2, Amazon ECS, Amazon EKS, and Amazon RDS.
When you define a target, you specify the following:
The resource type:
aws:ec2:instance – An Amazon EC2 instance
aws:ec2:spot-instance – An Amazon EC2 Spot Instance
aws:ecs:cluster – An Amazon ECS cluster
aws:ecs:task – An Amazon ECS task
aws:eks:nodegroup – An Amazon EKS node group
aws:iam:role – An IAM role
aws:rds:cluster – An Amazon Aurora DB cluster
aws:rds:db – An Amazon RDS DB instance
How to identify the resources (through resource IDs, filters, or tags)
Resource IDs – The resource IDs of specific AWS resources. All resource IDs must represent the same type of resource.
Resource tags – The tags applied to specific AWS resources.
Resource filters – The path and values that represent resources with specific attributes. For more information, see Resource filters.
Resource filters are queries that identify target resources according to specific attributes.
Example 1 (general syntax):
"filters": [
{
"path": "component.component.component",
"values": [
"string"
]
}
]
Example 2:
"filters": [
{
"path": "Placement.AvailabilityZone",
"values": [ "us-east-1a" ]
}
]
Example 3:
"filters": [
{
"path": "State.Name",
"values": [ "running" ]
}
]
Resource parameters – The parameters that represent resources that meet specific criteria. For more information, see Resource parameters.
Which of the identified resources to run the action on (the selection mode)
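The resource filters shown in the examples above can be read as dotted paths into a resource description. The sketch below applies the two example filters to a few made-up instance records. It assumes that a resource must satisfy every filter, and that within one filter any of the listed values counts as a match. The instance IDs and the helper names lookup and matches are hypothetical, and the matching is done locally rather than by AWS FIS.

```python
def lookup(resource, path):
    """Follow a dotted path through nested dicts; return None if a part is missing."""
    current = resource
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(resource, filters):
    """True if the resource satisfies all filters (any listed value per filter)."""
    return all(lookup(resource, f["path"]) in f["values"] for f in filters)


# The filters from Example 2 and Example 3, combined.
filters = [
    {"path": "Placement.AvailabilityZone", "values": ["us-east-1a"]},
    {"path": "State.Name", "values": ["running"]},
]

# Made-up records shaped like Amazon EC2 instance descriptions.
instances = [
    {"InstanceId": "i-aaa", "Placement": {"AvailabilityZone": "us-east-1a"}, "State": {"Name": "running"}},
    {"InstanceId": "i-bbb", "Placement": {"AvailabilityZone": "us-east-1b"}, "State": {"Name": "running"}},
    {"InstanceId": "i-ccc", "Placement": {"AvailabilityZone": "us-east-1a"}, "State": {"Name": "stopped"}},
]

selected = [i["InstanceId"] for i in instances if matches(i, filters)]
print(selected)  # ['i-aaa']
```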
Stop conditions
AWS FIS provides the controls and guardrails that you need to run experiments safely on your AWS workloads.
A stop condition is a mechanism to stop an experiment if it reaches a threshold that you define as an Amazon CloudWatch alarm.
If a stop condition is triggered while the experiment is running, AWS FIS stops the experiment.
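In a template, stop conditions appear as a list of entries that point at CloudWatch alarms. The following sketch builds that list and models the stopping rule locally. The alarm ARN is a hypothetical placeholder, and build_stop_conditions and should_stop are helpers invented for this illustration; AWS FIS evaluates the alarms itself.

```python
def build_stop_conditions(alarm_arns):
    """Build the stopConditions list of a template from CloudWatch alarm ARNs."""
    if not alarm_arns:
        # A template without alarms declares this explicitly.
        return [{"source": "none"}]
    return [{"source": "aws:cloudwatch:alarm", "value": arn} for arn in alarm_arns]


def should_stop(alarm_states):
    """Local model of the rule: stop as soon as any watched alarm is in ALARM."""
    return any(state == "ALARM" for state in alarm_states.values())


# Hypothetical alarm ARN for illustration only.
alarm = "arn:aws:cloudwatch:us-east-1:111122223333:alarm:ExampleHighErrorRate"

print(build_stop_conditions([alarm]))
print(should_stop({alarm: "OK"}))     # False
print(should_stop({alarm: "ALARM"}))  # True
```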